Bio examples
Examples from the bio library
The 'bio' library contains a directory, /examples, of small programs intended to demonstrate various useful functionality, while also being useful tools in their own right. Their main purpose is educational, however, so they are all very simple, typically just a couple of lines each, and with a minimum of dependencies. So don't expect informative error messages, help output, or even sensible option handling!
Building or installing the bio library also builds or installs these executables. To avoid that, unset the 'examples' flag when running cabal, e.g.:
cabal configure -f-examples
The library provides a generic function to read sequence data from any supported format (Fasta, FastQ, SFF, 2bit, ACE, PHD...). FastOut uses this to convert from any of these to either Fasta or FastQ.
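To give a feel for the kind of conversion involved, here is a minimal sketch that turns FastQ text into Fasta. It deliberately does not use the bio library's reader: the function name fastqToFasta is made up for this illustration, and it assumes strict four-line FastQ records (header, sequence, '+' line, qualities).

```haskell
-- Illustration only: a hand-rolled converter, not the bio library's API.
-- Assumes every FastQ record occupies exactly four lines.

-- | Convert four-line FastQ records to Fasta, dropping the quality lines.
fastqToFasta :: String -> String
fastqToFasta = unlines . go . lines
  where
    -- Replace the leading '@' with '>' and keep only header and sequence.
    go (('@' : name) : sq : _plus : _qual : rest) = ('>' : name) : sq : go rest
    -- Anything that does not look like a full record silently ends the output.
    go _ = []
```

In the spirit of the examples themselves, malformed input is not reported; the conversion simply stops at the first record that does not match.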
Apparently, it is common to clip long Titanium 454 reads down to the shorter length of the previous (FLX) generation. This is a tool that does exactly that, clipping flowgrams down to 400 flows, and adjusting read length accordingly.
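The clipping step can be sketched roughly as follows. The record type Read454 and the function clipFlows are hypothetical stand-ins, not the bio library's SFF types. The sketch assumes the usual SFF layout, where each base stores its flow position as an offset from the previous base, and it leaves out the clipping fields of the read header, which a real tool would also need to adjust.

```haskell
import Data.Word (Word8, Word16)

-- | Simplified, hypothetical stand-in for an SFF read (not the library's type).
data Read454 = Read454
  { flowgram  :: [Word16]  -- one signal value per flow
  , flowIndex :: [Word8]   -- per base: flow offset from the previous base
  , bases     :: String    -- called bases
  , quals     :: [Word8]   -- one quality score per base
  } deriving (Eq, Show)

-- | Keep the first n flows, and only the bases called within those flows.
clipFlows :: Int -> Read454 -> Read454
clipFlows n r = r
  { flowgram  = take n (flowgram r)
  , flowIndex = take keep (flowIndex r)
  , bases     = take keep (bases r)
  , quals     = take keep (quals r)
  }
  where
    -- Absolute (1-based) flow position of each base: running sum of offsets.
    positions = scanl1 (+) (map fromIntegral (flowIndex r)) :: [Int]
    -- Positions never decrease, so the bases to keep form a prefix.
    keep = length (takeWhile (<= n) positions)
```

With this representation, clipping a Titanium read to FLX length would be clipFlows 400.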
This is a more generic trimming program for SFF files, working on sequence coordinates rather than flowgram coordinates. It can clip reads in SFF files from either side, at positions given relative to the beginning (positive coordinates) or the end (negative coordinates) of the read.
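One way to interpret such signed coordinates is sketched below. The names resolve and trim are made up for this illustration, and the convention used (0-based start, exclusive end, out-of-range values clamped to the read) is an assumption, not necessarily what the actual tool does.

```haskell
-- | Turn a possibly negative coordinate into an offset from the start
-- of a read of length len, clamped to the range [0, len].
resolve :: Int -> Int -> Int
resolve len c
  | c < 0     = max 0 (len + c)  -- negative: counted from the end
  | otherwise = min len c        -- non-negative: counted from the start

-- | Keep the elements in the half-open interval [from, to).
trim :: Int -> Int -> [a] -> [a]
trim from to xs = take (t - f) (drop f xs)
  where
    len = length xs
    f   = resolve len from
    t   = resolve len to
```

For example, trim 2 (-3) drops the first two and the last three bases of a read. The same offsets would apply to the quality values, since those are stored per base.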
Recovers corrupt SFF files. This was useful on one occasion, but I haven't seen many corrupt files since. Anyway, the functionality is there, should you need it.
When using FlowSim to simulate reads, it is possible that different reads get the same name, which might confuse some tools. This renames all reads in a set of SFF files by tacking on a serial number.
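The renaming idea can be sketched on plain read names, with one list of names per input file. The function name renumber and the "_" separator are assumptions made for this illustration; the actual tool may format the serial number differently.

```haskell
import Data.List (mapAccumL)

-- | Append a serial number to every read name, counting consecutively
-- across all files so that no two reads end up with the same name.
renumber :: [[String]] -> [[String]]
renumber = snd . mapAccumL file 1
  where
    -- Number one file's reads from 'start'; pass on the next free number.
    file :: Int -> [String] -> (Int, [String])
    file start names = (start + length names, zipWith tag [start ..] names)
    -- The "_" separator is an arbitrary choice for this sketch.
    tag i name = name ++ "_" ++ show i
```

Threading a single counter through all files is what makes the names unique across the whole set, not just within each file.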