How to get FastQ reads from a SAM/BAM file

Using ngless, of course. Just run the following on the command line to take the sequences from

ngless -e 'write(as_reads(samfile("input.bam")), ofile="sequences.fq.gz")'

Let me unpack that a bit. This is equivalent to writing the following script:

ngless "0.0"

# Load a SAM or BAM file
sam_or_bam = samfile("input.bam")

# Get the reads out
reads = as_reads(sam_or_bam)

# Write them out
write(reads, ofile="sequences.fq.gz")

The exact behaviour will depend on whether you have paired or single-ended reads. If you have single-ended reads, they will be saved to sequences.fq.gz. If you have paired-end reads, they will be saved to sequences.pair.1.fq.gz and sequences.pair.2.fq.gz (and if there are any single-ends mixed in, sequences.singles.fq.gz.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s