FastQC
FastQC is a widely used tool for assessing the quality of raw and processed sequencing data. It runs a set of quality checks and reports metrics such as per-base quality scores, GC content, and adapter contamination.
Options:
<file{R1,R2}.fastq>: Input FASTQ files. Gzip-compressed files (e.g., file1.fastq.gz) are also supported.
-o <output_directory>: Directory where reports are saved. If omitted, each report is written to the same directory as its input file.
-t <number_of_threads>: Number of files to process simultaneously. Each file is handled by one thread, so values above the number of input files give no further speedup.
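As a rough illustration of how the options above combine, the following Python sketch assembles a fastqc command line and prints it. The file names and the output directory are hypothetical, and the helper functions are not part of FastQC; only the -o and -t flags come from the option list above.

```python
import shlex
import subprocess
from pathlib import Path


def build_fastqc_command(fastq_files, outdir=None, threads=None):
    """Return the fastqc argument list for the given inputs."""
    cmd = ["fastqc"]
    if outdir is not None:
        cmd += ["-o", str(outdir)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    cmd += [str(f) for f in fastq_files]
    return cmd


def run_fastqc(fastq_files, outdir, threads=1):
    """Run FastQC (assumes the fastqc executable is on PATH)."""
    # Creating the output directory first is a harmless precaution.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(fastq_files, outdir, threads), check=True)


# Hypothetical paired-end inputs; two threads means both files run in parallel.
cmd = build_fastqc_command(
    ["sample_R1.fastq.gz", "sample_R2.fastq.gz"],
    outdir="fastqc_reports",
    threads=2,
)
print(shlex.join(cmd))
# fastqc -o fastqc_reports -t 2 sample_R1.fastq.gz sample_R2.fastq.gz
```

Calling run_fastqc with the same arguments would execute the command. It is left uncalled here so the sketch runs without FastQC installed.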
Interpreting FastQC Results
For each input file, FastQC generates:
HTML Report: Visual summary of the quality metrics.
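ZIP File: Contains the raw data used to generate the report, including fastqc_data.txt (the per-module metrics) and summary.txt (a PASS, WARN, or FAIL status for each module).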
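The ZIP file is handy for checking many samples without opening each HTML report. The sketch below is one possible way to read the per-module statuses from summary.txt inside the archive. It assumes the usual tab-separated layout of status, module name, and file name. The function name is illustrative and not part of FastQC.

```python
import zipfile


def read_fastqc_summary(zip_path):
    """Return {module name: status} from the summary.txt inside a FastQC ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        # The archive holds one top-level folder, so locate the file by name.
        name = next(n for n in archive.namelist() if n.endswith("summary.txt"))
        text = archive.read(name).decode("utf-8")

    statuses = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is: STATUS <tab> module name <tab> input file name
        status, module, _filename = line.split("\t", 2)
        statuses[module] = status
    return statuses
```

For example, read_fastqc_summary("sample_R1_fastqc.zip") (a hypothetical file name) would return a dictionary mapping module names such as "Adapter Content" to their status, which can then be filtered for WARN or FAIL entries.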
Key metrics in the HTML report:
Per Base Sequence Quality:
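Boxplots showing the distribution of quality scores at each position in the reads.
The green background band marks high-quality calls (Q28 and above), orange marks reasonable quality (Q20 to Q28), and red marks poor quality (below Q20).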
Per Sequence Quality Scores:
Distribution of mean quality scores per read, which shows whether a subset of reads has low overall quality.
Per Base GC Content:
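GC content at each position along the length of the reads. Recent FastQC releases report Per Sequence GC Content instead, which shows the GC distribution across whole reads.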
Adapter Content:
Shows the cumulative percentage of reads in which known adapter sequences have been seen at each position.
Overrepresented Sequences:
Identifies frequently occurring sequences (e.g., adapters or contaminants).
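The same metrics can be inspected programmatically from fastqc_data.txt. The following sketch, under the assumption that modules are delimited by ">>Module name" and ">>END_MODULE" lines with a "#"-prefixed header row, extracts the Per Base Sequence Quality table and lists positions whose mean quality falls below a chosen threshold. The sample text and the threshold of 28 are made up for illustration, and the helper names are not part of FastQC.

```python
def parse_module(data_text, module_name):
    """Return (status, header, rows) for one module of fastqc_data.txt."""
    status, header, rows = None, [], []
    inside = False
    for line in data_text.splitlines():
        if line.startswith(">>END_MODULE"):
            if inside:
                break
            continue
        if line.startswith(">>"):
            name, _, module_status = line[2:].partition("\t")
            inside = name == module_name
            if inside:
                status = module_status
            continue
        if not inside:
            continue
        if line.startswith("#"):
            header = line[1:].split("\t")  # column names
        elif line:
            rows.append(line.split("\t"))
    return status, header, rows


def low_quality_positions(data_text, threshold=28.0):
    """List base positions whose mean quality is below the threshold."""
    _status, header, rows = parse_module(data_text, "Per base sequence quality")
    mean_col = header.index("Mean")
    return [row[0] for row in rows if float(row[mean_col]) < threshold]


# Made-up excerpt in the fastqc_data.txt layout, for illustration only.
SAMPLE = "\n".join([
    "##FastQC\t0.11.9",
    ">>Per base sequence quality\twarn",
    "#Base\tMean\tMedian\tLower Quartile\tUpper Quartile\t10th Percentile\t90th Percentile",
    "1\t33.5\t34.0\t32.0\t36.0\t30.0\t37.0",
    "2\t33.1\t34.0\t32.0\t36.0\t30.0\t37.0",
    "140-144\t26.4\t28.0\t22.0\t32.0\t15.0\t35.0",
    ">>END_MODULE",
])

print(low_quality_positions(SAMPLE))
# ['140-144']
```

Positions are kept as strings because FastQC groups later bases into ranges such as 140-144 for long reads.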