cutadapter
Note that FastQC only warns about overrepresented sequences that appear in the first 200,000 sequences.
Suppose read1.fastq and read2.fastq are paired-end data with 4 lines per read.
Check each adapter listed below, e.g. by sampling 1 million reads from read1.fastq and searching for the truseq-forward-contam adapter:
Adapters:
>multiplexing-forward
GATCGGAAGAGCACACGTCT
>solexa-forward
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
>truseq-forward-contam
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
>truseq-reverse-contam
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
>nextera-forward-read-contam
CTGTCTCTTATACACATCTCCGAGCCCACGAGAC
>nextera-reverse-read-contam
CTGTCTCTTATACACATCTGACGCTGCCGACGA
>solexa-reverse
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
Examples:
zcat JRH2_ATGTCA_R1.fastq.gz | head -n 10000 | grep AGATCGGAAGAGC
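The same kind of check can be wrapped in a small helper that looks only at the sequence lines (line 2 of every 4-line record) and reports how many sampled reads contain the adapter. This is a minimal sketch: the function name count_adapter_hits and its default sample size of 1 million reads are illustrative assumptions, and matching is exact (no mismatches allowed).

```shell
# Count reads whose sequence line contains an adapter string (exact match).
# Usage: count_adapter_hits ADAPTER [N_READS] < reads.fastq
# count_adapter_hits is a hypothetical helper, not part of FastQC or cutadapt.
count_adapter_hits() {
    adapter=$1
    n_reads=${2:-1000000}          # assumed default: sample the first 1 million reads
    # Each FASTQ record is 4 lines, so N reads = 4 * N lines.
    head -n "$((n_reads * 4))" \
        | awk 'NR % 4 == 2' \
        | grep -c -F -- "$adapter" || true   # grep -c prints 0 but exits 1 when nothing matches
}
```

For example, `zcat JRH2_ATGTCA_R1.fastq.gz | count_adapter_hits AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC` would print the number of the first 1 million reads containing the truseq-forward-contam sequence. Restricting the search to sequence lines avoids counting accidental matches in header or quality lines.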
cutadapt
3' Adapters
Before:
MYSEQUEN
MYSEQUENCEADAP
MYSEQUENCEADAPTER
MYSEQUENCEADAPTERSOMETHINGELSE
After:
MYSEQUEN
MYSEQUENCE
MYSEQUENCE
MYSEQUENCE
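The before/after pairs above can be reproduced with a simplified model of 3' adapter trimming, taking ADAPTER as the adapter sequence. The sketch below is an illustration only, not cutadapt's actual algorithm: the function trim_3prime is hypothetical, it matches exactly (cutadapt tolerates a configurable error rate), and the minimum overlap of 3 bases is an assumption chosen to resemble cutadapt's usual default.

```shell
# Simplified, exact-match illustration of 3' adapter trimming.
# Usage: trim_3prime READ ADAPTER [MIN_OVERLAP]
# trim_3prime is a hypothetical helper; real cutadapt also allows mismatches.
trim_3prime() {
    read_seq=$1
    adapter=$2
    min_overlap=${3:-3}            # assumed minimum overlap for partial matches
    # Case 1: the full adapter occurs in the read.
    # Remove it and everything that follows it.
    case $read_seq in
        *"$adapter"*)
            trimmed=${read_seq%%"$adapter"*}
            printf '%s\n' "$trimmed"
            return
            ;;
    esac
    # Case 2: the read ends with a prefix of the adapter.
    # Try the longest prefix first and shorten it one base at a time.
    prefix=$adapter
    while [ "${#prefix}" -ge "$min_overlap" ]; do
        case $read_seq in
            *"$prefix")
                trimmed=${read_seq%"$prefix"}
                printf '%s\n' "$trimmed"
                return
                ;;
        esac
        prefix=${prefix%?}
    done
    # Case 3: no adapter found, so the read is returned unchanged.
    printf '%s\n' "$read_seq"
}
```

Calling `trim_3prime MYSEQUENCEADAP ADAPTER` prints MYSEQUENCE, while `trim_3prime MYSEQUEN ADAPTER` prints MYSEQUEN unchanged. With cutadapt itself, a 3' adapter is given with the -a option, e.g. `cutadapt -a ADAPTER -o output.fastq input.fastq`, where the file names are placeholders.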