ngsderive

Forensic analysis tool for backwards computing information from next-generation sequencing data and for annotating splice junctions.
Explore the docs »

Request Feature · Report Bug · ⭐ Consider starring the repo! ⭐

Notice: ngsderive is largely a forensic analysis tool for backwards computing information from next-generation sequencing data. Most results are provided as a 'best guess': the tool does not claim 100% accuracy, and results should be interpreted with that in mind. The exception is the junction-annotation tool, which analyzes more concrete evidence than the other tools.

🎨 Features

The following attributes can be guessed using ngsderive:

  • Illumina Instrument. Infer which Illumina instrument was used to generate the data by matching against known instrument and flowcell naming patterns. Each guess comes with a confidence score.
  • RNA-Seq Strandedness. Infer from the data whether RNA-Seq data was generated using a Stranded-Forward, Stranded-Reverse, or Unstranded protocol.
  • Pre-trimmed Read Length. Compute the distribution of read lengths in the file and attempt to guess what the original read length of the experiment was.
  • PHRED Score Encoding. Infer which encoding scheme was used to store PHRED scores as ASCII characters.
  • Junction Annotation. Annotate splice junctions as novel, partial novel, or known in comparison to a reference gene model.
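
To give a feel for what a 'best guess' means in practice, here is a minimal sketch of one way PHRED score encoding could be inferred from the ASCII range of the quality characters. It is not ngsderive's implementation: the function name `guess_phred_encoding` and the returned labels are hypothetical. It relies only on the commonly documented ranges (Phred+33 starts at ASCII 33, Solexa+64 at ASCII 59, Phred+64 at ASCII 64).

```python
def guess_phred_encoding(quality_strings):
    """Guess the quality encoding from an iterable of FASTQ quality strings.

    Hypothetical helper for illustration only; not part of ngsderive.
    """
    # Collect the ASCII code of every quality character seen.
    codes = [ord(ch) for quals in quality_strings for ch in quals]
    if not codes:
        raise ValueError("no quality characters supplied")

    lowest, highest = min(codes), max(codes)

    # Printable quality characters span ASCII 33 ('!') to 126 ('~').
    if lowest < 33 or highest > 126:
        return "unknown"
    # Anything below ASCII 59 can only occur with an offset of 33.
    if lowest < 59:
        return "phred+33"
    # ASCII 59-63 is only valid for Solexa scores, which can be negative.
    if lowest < 64:
        return "solexa+64"
    # Everything at or above ASCII 64 is consistent with an offset of 64.
    return "phred+64"


if __name__ == "__main__":
    print(guess_phred_encoding(["IIII#!5", "FFFF,,:"]))
```

The guess is only as good as the evidence: a Phred+33 file that happens to contain nothing but very high scores (every character at or above ASCII 64) would be labeled `phred+64` by this sketch, which is why such results should be treated as estimates.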

📚 Getting Started

Installation

You can install ngsderive using the Python Package Index (PyPI).

pip install ngsderive

🖥️ Development

If you are interested in contributing to the code, please first review our CONTRIBUTING.md document.

To bootstrap a development environment, please use the following commands.

# Clone the repository
git clone git@github.com:stjudecloud/ngsderive.git
cd ngsderive

# Install the project using poetry
poetry install

🚧️ Tests

ngsderive provides a (currently patchy) set of unit and end-to-end tests. Run them with:

py.test

🤝 Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the issues page. You can also take a look at the contributing guide.

📝 License

This project is licensed as follows:

  • All code related to the instrument subcommand is licensed under the AGPL v2.0. This is not due to any strict requirement. Out of deference to some code I drew inspiration from (and copied patterns from), I decided to license this code consistently with it.
  • The rest of the project is licensed under the MIT License - see the LICENSE.md file for details.

Copyright © 2020 St. Jude Cloud Team.