3. Analysis with kallisto
RumBall also supports kallisto for RNA-seq analysis. The following example shows how to run kallisto on the same samples used in Tutorial (human, RSEM).
3.1. Mapping reads with kallisto
Build the index for kallisto:
build=GRCh38 # specify the build that you need
Ddir=Ensembl-$build/
ncore=12 # number of CPUs
build-index-RNAseq.sh -p $ncore kallisto $build $Ddir
Run kallisto:
ID=("SRR710092" "SRR710093" "SRR710094" "SRR710095")
NAME=("HEK293_Control_rep1" "HEK293_Control_rep2" "HEK293_siCTCF_rep1" "HEK293_siCTCF_rep2")
index=$Ddir/kallisto-indexes/genome
ncore=12 # number of CPUs
mkdir -p log
for ((i=0; i<${#ID[@]}; i++))
do
    echo ${NAME[$i]}
    fq1=fastq/${ID[$i]}_1.fastq.gz
    fq2=fastq/${ID[$i]}_2.fastq.gz
    kallisto.sh -p $ncore ${NAME[$i]} "$fq1 $fq2" $Ddir reverse
done
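Because the loop above expects one `_1`/`_2` FASTQ pair per sample ID, it can help to confirm that the files are in place before starting a long run. The following is a minimal sketch; `check_fastq_pairs` is a hypothetical helper written for this example, not a RumBall command, and it assumes the `fastq/<ID>_1.fastq.gz` and `fastq/<ID>_2.fastq.gz` naming used above.

```shell
# Hypothetical helper (not part of RumBall): report missing paired-end FASTQ files.
# Usage: check_fastq_pairs <fastq_dir> <ID>...
# The ( ) body runs in a subshell, so its variables do not leak to the caller.
check_fastq_pairs() (
    dir=$1
    shift
    missing=0
    for id in "$@"; do
        for f in "$dir/${id}_1.fastq.gz" "$dir/${id}_2.fastq.gz"; do
            if [ ! -s "$f" ]; then
                echo "missing or empty: $f" >&2
                missing=$((missing + 1))
            fi
        done
    done
    # Succeed only if every expected file was found
    [ "$missing" -eq 0 ]
)

if check_fastq_pairs fastq SRR710092 SRR710093 SRR710094 SRR710095; then
    echo "all FASTQ pairs present"
else
    echo "some FASTQ files are missing; fix this before running kallisto.sh"
fi
```

The check only tests that each file exists and is non-empty; it does not validate the FASTQ content.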
Then you can merge the output of kallisto into a single count matrix using kallisto_merge.sh:
s=""
for ((i=0; i<${#ID[@]}; i++))
do
    s="$s kallisto/${NAME[$i]}/abundance.tsv"
done
mkdir -p Matrix_kallisto
kallisto_merge.sh "$s" Matrix_kallisto/HEK293 $Ddir
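The merge step above depends on every sample having produced an `abundance.tsv`. As a hedged sketch, the list of input files could be built and verified in one step. `list_abundance_files` is a hypothetical helper introduced here for illustration, not a RumBall command; it assumes the `kallisto/<sample>/abundance.tsv` layout shown above.

```shell
# Hypothetical helper (not part of RumBall): print a space-separated list of
# <dir>/<sample>/abundance.tsv paths, failing if any file is missing or empty.
# Usage: list_abundance_files <kallisto_dir> <sample>...
list_abundance_files() (
    dir=$1
    shift
    list=""
    for name in "$@"; do
        f="$dir/$name/abundance.tsv"
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            exit 1    # leaves only this function's subshell
        fi
        list="$list $f"
    done
    echo "${list# }"  # strip the leading space
)

if s=$(list_abundance_files kallisto HEK293_Control_rep1 HEK293_Control_rep2 \
                                     HEK293_siCTCF_rep1 HEK293_siCTCF_rep2); then
    echo "ready to merge: $s"
else
    echo "run kallisto.sh for the missing samples first"
fi
```

When the check succeeds, `$s` holds the same kind of list that the loop above builds, so it could be passed to kallisto_merge.sh in the same way.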
3.2. Differential analysis
The differential analysis step is the same as in the STAR example in Tutorial (human, RSEM).
Add the -k option to DESeq2.sh and edgeR.sh to use the output of kallisto_merge.sh as input.
Ctrl="kallisto/HEK293_Control_rep1 kallisto/HEK293_Control_rep2"
siCTCF="kallisto/HEK293_siCTCF_rep1 kallisto/HEK293_siCTCF_rep2"
# For DESeq2
mkdir -p Matrix_deseq2_kallisto
kallisto_merge.sh "$Ctrl $siCTCF" Matrix_deseq2_kallisto/HEK293 $Ddir
DESeq2.sh -k Matrix_deseq2_kallisto/HEK293 2:2 Control:siCTCF Human
# For edgeR
mkdir -p Matrix_edgeR_kallisto
kallisto_merge.sh "$Ctrl $siCTCF" Matrix_edgeR_kallisto/HEK293 $Ddir
edgeR.sh -k Matrix_edgeR_kallisto/HEK293 2:2 Control:siCTCF Human
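The commands above pass a replicate specification (`2:2`) and a group specification (`Control:siCTCF`) alongside the merged matrix. The sketch below checks that the two agree with the number of merged samples. It rests on an assumption not stated in this section: that `2:2` gives the number of replicates per group and `Control:siCTCF` gives the group labels in the same order. `check_design` is a hypothetical helper, not a RumBall command.

```shell
# Hypothetical helper (not part of RumBall).
# ASSUMPTION: "2:2" lists replicates per group and "Control:siCTCF" lists the
# group labels in the same order.
# Usage: check_design <reps> <labels> <number_of_samples>
check_design() (
    reps=$1
    labels=$2
    nsamples=$3
    # Count the colon-separated fields in each specification
    nrep_fields=$(echo "$reps" | awk -F: '{print NF}')
    nlabel_fields=$(echo "$labels" | awk -F: '{print NF}')
    # Sum the replicate counts
    total=$(echo "$reps" | awk -F: '{s = 0; for (i = 1; i <= NF; i++) s += $i; print s}')
    if [ "$nrep_fields" -ne "$nlabel_fields" ]; then
        echo "group count mismatch: $reps vs $labels" >&2
        exit 1
    fi
    if [ "$total" -ne "$nsamples" ]; then
        echo "replicates sum to $total, but $nsamples samples were merged" >&2
        exit 1
    fi
    echo "design OK: $nlabel_fields groups, $total samples"
)

check_design 2:2 Control:siCTCF 4
```

Under that assumption, the order of samples passed to kallisto_merge.sh (`$Ctrl` first, then `$siCTCF`) should match the order of the group labels.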
Note
For differential analysis of kallisto output, sleuth is recommended over edgeR and DESeq2. See the sleuth walkthroughs for more details.