nf-core/smRNAseq
https://nf-co.re/smrnaseq/2.2.1/
Pipeline summary
1. Raw read QC (FastQC)
2. Adapter trimming (Trim Galore!)
   1. Insert Size calculation
   2. Collapse reads (seqcluster)
3. Contamination filtering (Bowtie2)
4. Alignment against miRBase mature miRNA (Bowtie1)
5. Alignment against miRBase hairpin
   1. Unaligned reads from step 3 (Bowtie1)
   2. Collapsed reads from step 2.2 (Bowtie1)
6. Post-alignment processing of miRBase hairpin
   1. Basic statistics from step 3 and step 4.1 (SAMtools)
   2. Analysis on miRBase, or MirGeneDB hairpin counts (edgeR)
      - TMM normalization and a table of top expression hairpin
      - MDS plot clustering samples
      - Heatmap of sample similarities
   3. miRNA and isomiR annotation from step 4.1 (mirtop)
7. Alignment against host reference genome (Bowtie1)
   1. Post-alignment processing of alignment against host reference genome (SAMtools)
8. Novel miRNAs and known miRNAs discovery (MiRDeep2)
   1. Mapping against reference genome with the mapper module
   2. Known and novel miRNA discovery with the mirdeep2 module
9. miRNA quality control (mirtrace)
10. Present QC for raw read, alignment, and expression results (MultiQC)
Running nf-core/smRNAseq
navigate to smallRNA run directory
cd ~/workshop/smallRNA
ls -ls
tree
You can see the same directory structure for the pipeline. This was run with a ’nextflow run’ kickoff shell script and the samplesheet, as shown below.
samplesheet
cat samplesheet.csv
nextflow run script
cat nf-smRNAseq.sh
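The contents of nf-smRNAseq.sh are not reproduced on this page. As a rough sketch only, a kickoff script for this pipeline could look something like the following. The release tag is taken from the URL at the top of the page; the profile, output directory, genome key and miRTrace species code are assumptions for a human dataset and may differ from what the workshop script actually uses. By default the sketch only prints the command so it can be reviewed first.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed values for illustration; they are NOT copied from nf-smRNAseq.sh.
cmd=(
  nextflow run nf-core/smrnaseq
  -r 2.2.1                  # pipeline release, from the URL at the top of the page
  -profile singularity      # assumption: container engine available on the VM
  --input samplesheet.csv   # the samplesheet shown above
  --outdir results          # assumption: output directory name
  --genome GRCh38           # assumption: human iGenomes reference key
  --mirtrace_species hsa    # assumption: miRBase species code for human
)

# Print the command for review; set DRY_RUN=0 to actually launch it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%s ' "${cmd[@]}"
  echo
else
  "${cmd[@]}"
fi
```

Keeping the arguments in an array means each option sits on its own commented line, and the same list can be either printed or executed without retyping it.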
These are the same samples that were run for the mRNA experiment.
- 16 Samples sequenced with an MGI400 sequencer at SAGC using the Tecan Universal RNA-seq library protocol.
- 2 different cancer cell lines (human)
- treatment vs control
- 4 replicates for each
We will go over how this was set up.
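To make the design above concrete (2 cell lines x 2 conditions x 4 replicates = 16 samples), here is a minimal sketch that generates a samplesheet skeleton. It assumes the two-column `sample,fastq_1` layout used for single-end small RNA input; the cell line names, condition labels and FASTQ paths are hypothetical placeholders, and the real values are the ones in samplesheet.csv.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical labels and path; replace with the real cell lines and FASTQ location.
cell_lines=(lineA lineB)
conditions=(control treatment)
fastq_dir="/path/to/fastq"

out="samplesheet_example.csv"
echo "sample,fastq_1" > "$out"

# One row per cell line x condition x replicate.
for line in "${cell_lines[@]}"; do
  for cond in "${conditions[@]}"; do
    for rep in 1 2 3 4; do
      sample="${line}_${cond}_rep${rep}"
      echo "${sample},${fastq_dir}/${sample}.fastq.gz" >> "$out"
    done
  done
done

# Count the sample rows (everything after the header line).
tail -n +2 "$out" | wc -l
```

Encoding the cell line, condition and replicate in the sample name keeps the grouping visible in downstream plots such as the MDS plot and heatmap.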
small RNA multiqc (link to small RNA multiqc)
Downloading the data to your local PC
Working in an HPC environment has some drawbacks, one being that viewing images and HTML files isn’t as straightforward as it is on your PC. So that we can view the outputs together, the next step is to download the data from the Nectar cloud to your PC.
As good practice, we will first bundle the whole directory into a single archive using tar (note that tar -cvf archives the directory without compressing it),
and record an md5sum checksum, so that we can later show that the download hasn’t corrupted the data.
cd ~/workshop
tar -cvf smallRNA.tar smallRNA
md5sum smallRNA.tar > smallRNA.md5sum
md5sum smallRNA.tar
The md5sum value should print to your screen. The first md5sum command above also saved it to the file smallRNA.md5sum, so you can view it later.
bee6e0161d40eb934c2ad0b4c2db1898 smallRNA.tar
You can do the same on your downloaded copy, to make sure it’s exactly the same data.
downloading with scp
We will be using scp (secure copy, which transfers files over ssh).
You will need the same details you used to log in with ssh:
| username | workshop |
| password | Sagc_2024 |
| IP address | given on the day |
# open a local shell terminal, make a directory and move to it
mkdir -p ~/workshop_RNAseq/smallRNA
cd ~/workshop_RNAseq/smallRNA
# example only, use your own IP address
IP=[yourIPaddress]
scp -r workshop@${IP}:/home/workshop/workshop/smallRNA.tar .
#eg. scp -r workshop@${IP}:/home/workshop/workshop/smallRNA .
- this file is 617M, so it should take a minute or less to download
now on your local environment, you can run the same md5sum command
md5sum smallRNA.tar
The value should be exactly the same, showing that the two files are identical bit for bit.
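Comparing two 32-character hashes by eye is error-prone. As a sketch of an alternative, `md5sum -c` reads a saved checksum file and does the comparison for you. The demo below is self-contained and runs in a scratch directory, with demo.tar standing in for smallRNA.tar; the directory and file names are illustrative only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing in the workshop folders is touched.
workdir="$(mktemp -d)"
cd "$workdir"
mkdir demo
echo "example" > demo/file.txt
tar -cf demo.tar demo

# "Remote" side: record the checksum in a file, as was done for smallRNA.md5sum.
md5sum demo.tar > demo.md5sum

# "Local" side: -c recomputes the hash and compares it with the recorded one.
md5sum -c demo.md5sum
# prints: demo.tar: OK
```

For the workshop data, this would mean also copying smallRNA.md5sum with scp and running `md5sum -c smallRNA.md5sum` in the directory that holds smallRNA.tar. On macOS, where md5sum is not installed by default, `md5 smallRNA.tar` prints the same hash for a manual comparison.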
You can now extract the directory from the archive:
tar -xvf smallRNA.tar
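For reference, the tar round trip used in this section can be sketched on a small scratch directory: create an archive, list its contents without extracting, extract it elsewhere, and confirm the copy matches the original. The names below are illustrative only. The `-z` variant is included to show how gzip compression would be added, since plain `-c` only bundles the files.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory with a tiny stand-in for the smallRNA folder.
workdir="$(mktemp -d)"
cd "$workdir"
mkdir -p demo/results
echo "example" > demo/results/report.txt

tar -cf demo.tar demo          # archive only, no compression
tar -czf demo.tar.gz demo      # same content, gzip-compressed via -z

tar -tf demo.tar               # list the archive contents without extracting

mkdir restored
tar -xf demo.tar -C restored   # extract into a separate directory

# diff -r exits non-zero if any file differs between the two trees.
diff -r demo restored/demo && echo "round trip OK"
```

Listing with `-t` before extracting is a cheap way to check what an archive will unpack and where, which matters more as archives get larger.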
Results
We will work through the multiqc report (link to small RNA multiqc)
And navigate through the smRNA outputs from the downloaded local copy.