cutadapter

Note that FastQC only warns about overrepresented sequences that appear in the first 200,000 sequences of a file.

Suppose read1.fastq and read2.fastq are the paired-end data, with 4 lines per read.

Check each adapter listed below in turn, e.g. by sampling 1 million reads from read1.fastq and searching them for the truseq-forward-contam adapter:

Adapters:

>multiplexing-forward
GATCGGAAGAGCACACGTCT
>solexa-forward
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
>truseq-forward-contam
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
>truseq-reverse-contam
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
>nextera-forward-read-contam
CTGTCTCTTATACACATCTCCGAGCCCACGAGAC
>nextera-reverse-read-contam
CTGTCTCTTATACACATCTGACGCTGCCGACGA
>solexa-reverse
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG

Examples:

zcat JRH2_ATGTCA_R1.fastq.gz | head -n 10000 | grep AGATCGGAAGAGC
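
The same kind of check can be wrapped in a small helper that counts hits instead of printing them. This is only a sketch: `count_adapter_reads` is a hypothetical name, it matches the adapter as an exact substring, and it looks only at the sequence line of each 4-line record.

```shell
# count_adapter_reads FILE ADAPTER [NREADS]
# Hypothetical helper: prints how many of the first NREADS reads
# (default 1 million) in FILE contain ADAPTER as an exact substring.
count_adapter_reads() {
    car_file=$1
    car_adapter=$2
    car_nreads=${3:-1000000}
    # Decompress .gz input; read anything else as plain text.
    case "$car_file" in
        *.gz) gzip -cd "$car_file" ;;
        *)    cat "$car_file" ;;
    esac \
        | head -n $(( car_nreads * 4 )) \
        | awk -v a="$car_adapter" \
            'NR % 4 == 2 && index($0, a) { n++ } END { print n + 0 }'
}

# Example (not run here): truseq-forward-contam in the first 1 million reads
# count_adapter_reads read1.fastq AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
```

Dividing the printed count by the number of reads sampled gives a rough fraction of adapter-containing reads for each adapter in the list.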

cutadapt

3' Adapters

Before:

MYSEQUEN
MYSEQUENCEADAP
MYSEQUENCEADAPTER
MYSEQUENCEADAPTERSOMETHINGELSE

After:

MYSEQUEN
MYSEQUENCE
MYSEQUENCE
MYSEQUENCE
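
The Before/After behaviour above can be reproduced with a toy shell function. This is a sketch for illustration, not cutadapt's algorithm: `trim3` is a hypothetical name, it matches exactly (cutadapt also tolerates a configurable error rate), and its default minimum overlap of 3 is meant to mirror cutadapt's `-O` default.

```shell
# Real invocation for a 3' adapter (not run here):
#   cutadapt -a ADAPTER -o output.fastq input.fastq

# trim3 READ ADAPTER [MIN_OVERLAP]
# Toy 3' adapter trimming with exact matching only.
trim3() {
    t3_read=$1
    t3_adapter=$2
    t3_min=${3:-3}
    case "$t3_read" in
        # Full adapter present: drop it and everything after it.
        *"$t3_adapter"*)
            printf '%s\n' "${t3_read%%"$t3_adapter"*}"
            return ;;
    esac
    # Otherwise look for a partial adapter (a prefix of it) at the
    # end of the read, trying the longest prefix first.
    t3_len=$(( ${#t3_adapter} - 1 ))
    while [ "$t3_len" -ge "$t3_min" ]; do
        t3_prefix=$(printf '%s' "$t3_adapter" | cut -c1-"$t3_len")
        case "$t3_read" in
            *"$t3_prefix")
                printf '%s\n' "${t3_read%"$t3_prefix"}"
                return ;;
        esac
        t3_len=$(( t3_len - 1 ))
    done
    # No adapter found: the read is returned unchanged.
    printf '%s\n' "$t3_read"
}

for r in MYSEQUEN MYSEQUENCEADAP MYSEQUENCEADAPTER MYSEQUENCEADAPTERSOMETHINGELSE; do
    trim3 "$r" ADAPTER
done
```

Run on the four Before reads, the loop prints the four After reads in the same order.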
