Perform basic read QC at command line prior to mapping
Reads can be biased in many different ways. Checking for read bias before attempting any sophisticated analysis is mandatory. FastQC runs both at the command line and through a handy Java graphical interface [1].
Apply FastQC on each read file
The command-line version is used here in batch mode with a simple bash loop.
Assuming you can use 8 CPU threads and do not want verbose output, the following command saves the results in a freshly created 'FastQC.results' folder.
The '--noextract' parameter ensures that each result is kept as a compressed archive rather than being unpacked. All these options can be modified.
Distributing reads to parallel jobs requires sufficient bandwidth; it is not advised when your data is stored on a USB disk or your I/O bandwidth is otherwise limited.
    mkdir -p FastQC.results
    for f in *.fastq.gz; do
        # FastQC detects the Phred quality encoding by itself; it has no '-Q 33' option
        fastqc --noextract -q -t 8 -o FastQC.results "$f"
    done
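A possible variant, given as a sketch rather than as the method used on this page: the FastQC '-t' option sets how many files are processed at the same time, so a loop that hands over one file per call leaves little for the 8 threads to do. Passing all read files in a single call lets FastQC spread them over the threads itself. The function name 'run_fastqc_batch' and the 'FASTQC' override variable are hypothetical helpers introduced for this example; they are not part of FastQC.

```shell
#!/usr/bin/env bash
# Sketch: run FastQC once on all read files so that -t can work on
# several files simultaneously.
# FASTQC is a hypothetical override variable (defaults to 'fastqc').

run_fastqc_batch() {
    local outdir="$1"; shift      # first argument: results folder
    mkdir -p "$outdir"
    # remaining arguments: the read files, all given to one FastQC call
    "${FASTQC:-fastqc}" --noextract -q -t 8 -o "$outdir" "$@"
}
```

It could then be called as `run_fastqc_batch FastQC.results *.fastq.gz`. The same caveat about limited I/O bandwidth applies, since several files are read concurrently.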
The content of each archive can be decompressed and the report (fastqc_report.html) viewed in your favourite web browser; the PNG pictures it contains can also be included in your own report.
Details (PNG pictures found in each archive):
- per_base_quality.png
- per_sequence_quality.png
- per_base_sequence_content.png
- per_base_gc_content.png
- per_sequence_gc_content.png
- per_base_n_content.png
- sequence_length_distribution.png
- duplication_levels.png
- kmer_profiles.png
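Unpacking the archives and locating the pictures listed above can be scripted. The following is a minimal sketch, assuming the archives are named '*_fastqc.zip' and sit in the 'FastQC.results' folder, and that the 'unzip' tool is installed. The function name 'extract_fastqc_reports' is a hypothetical helper introduced for this example.

```shell
#!/usr/bin/env bash
# Sketch: unpack every FastQC archive and list the PNG plots it contains.
# Assumes archives are named *_fastqc.zip inside the results folder.

extract_fastqc_reports() {
    local resdir="${1:-FastQC.results}"
    local z
    for z in "$resdir"/*_fastqc.zip; do
        [ -e "$z" ] || continue          # no archive found: nothing to do
        unzip -q -o "$z" -d "$resdir"    # -o overwrites an earlier extraction
    done
    # list the extracted plots, ready to be copied into a report
    find "$resdir" -name '*.png' | sort
}
```

Calling `extract_fastqc_reports FastQC.results` prints one path per PNG file; each fastqc_report.html sits in the top-level folder of its extracted archive.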
References: