crimson

crimson converts non-standard bioinformatics tool outputs to JSON or YAML.

Currently it can convert outputs of the following tools:

Each conversion can be run in two ways: as a command-line tool or as a Python library function. The first invokes the crimson executable from your shell. The second requires importing the crimson library in your program.

Installation

crimson is available on the Python Package Index and you can install it via pip:

$ pip install crimson

It is also available on BioConda, either through the conda package manager or as a Docker container.

For Docker execution, you may also use the GitHub Docker registry. This registry hosts the latest version, but does not host versions 1.1.0 or earlier.

$ docker pull ghcr.io/bow/crimson

Usage

As a command line tool

The general command is crimson {tool_name}. By default, the output is written to stdout. For example, to use the picard parser, you would execute:

$ crimson picard /path/to/a/picard.metrics

You can also write the output to a file by specifying a file name. The following command writes the output to a file named converted.json:

$ crimson picard /path/to/a/picard.metrics converted.json

Some parsers may accept additional input formats. The FastQC parser, for example, also accepts a path to a FastQC output directory as its input:

$ crimson fastqc /path/to/a/fastqc/dir

It also accepts a path to a zipped result:

$ crimson fastqc /path/to/a/fastqc_result.zip

When in doubt, use the --help flag:

$ crimson --help            # for the general help
$ crimson fastqc --help     # for the parser-specific help, in this case FastQC
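
The command-line invocations above can also be scripted from Python. The following is a minimal sketch, not part of crimson itself: the helper names build_crimson_command and run_crimson are hypothetical, and run_crimson assumes that the output written to stdout is JSON by default.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, List, Optional, Union

PathLike = Union[str, Path]


def build_crimson_command(
    tool_name: str,
    input_path: PathLike,
    output_path: Optional[PathLike] = None,
) -> List[str]:
    """Build the argument list for `crimson {tool_name} INPUT [OUTPUT]`.

    Hypothetical helper: it only mirrors the command shapes shown above.
    """
    cmd = ["crimson", tool_name, str(input_path)]
    if output_path is not None:
        # With an output file name, crimson writes there instead of stdout.
        cmd.append(str(output_path))
    return cmd


def run_crimson(tool_name: str, input_path: PathLike) -> Any:
    """Run crimson and decode what it prints to stdout.

    Assumption: the default output format on stdout is JSON.
    Requires the `crimson` executable to be on PATH.
    """
    cmd = build_crimson_command(tool_name, input_path)
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(done.stdout)


# Example (needs crimson installed and a real metrics file):
# metrics = run_crimson("picard", "/path/to/a/picard.metrics")
```

Keeping the command construction separate from the subprocess call makes it possible to inspect or log the exact command before running it.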

As a Python library function

The specific function to import is generally located at crimson.{tool_name}.parse. So to use the picard parser in your program, you can do:

from crimson import picard

# You can specify the input file name as a string or path-like object...
parsed = picard.parse("/path/to/a/picard.metrics")

# ... or a file handle
with open("/path/to/a/picard.metrics") as src:
    parsed = picard.parse(src)
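
When several files need converting in one program, the parse call above can be wrapped in a small loop. The sketch below is illustrative only: parse_many and to_json are hypothetical helpers, and it assumes that whatever the parser returns can be serialized with the standard json module, which seems reasonable given that crimson targets JSON output.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, Union

PathLike = Union[str, Path]


def parse_many(
    parse: Callable[[PathLike], Any],
    paths: Iterable[PathLike],
) -> Dict[str, Any]:
    """Apply a parser function to each path, keyed by the path as a string.

    `parse` is any callable taking a file name, e.g. `picard.parse`.
    """
    return {str(path): parse(path) for path in paths}


def to_json(results: Dict[str, Any], indent: int = 2) -> str:
    """Serialize the collected results.

    Assumption: the parsed values are JSON-serializable.
    """
    return json.dumps(results, indent=indent, sort_keys=True)


# Example (needs crimson installed and real metrics files):
# from crimson import picard
# results = parse_many(picard.parse, ["a.metrics", "b.metrics"])
# print(to_json(results))
```

Passing the parser in as an argument keeps the helper independent of any single tool, so the same loop can be reused with other crimson parsers.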

Why?

  • Not enough tools use standard output formats.
  • Writing and re-writing the same parsers across different scripts is not a productive way to spend the day.

Local Development

Setting up a local development environment requires installing all of the supported Python versions. We use pyenv for this.

# Clone the repository and cd into it.
$ git clone https://github.com/bow/crimson
$ cd crimson

# Create your local development environment. This command also installs
# all supported Python versions using `pyenv`.
$ make dev

# Run the test and linter suite to verify the setup.
$ make lint test

# When in doubt, just run `make` without any arguments.
$ make

Contributing

If you are interested, crimson accepts the following types of contribution:

  • Documentation updates / tweaks (if anything seems unclear, feel free to open an issue)
  • Bug reports
  • Support for tools' outputs which can be converted to JSON or YAML

For any of these, feel free to open an issue in the issue tracker or submit a pull request.

License

crimson is BSD-licensed. Refer to the LICENSE file for the full license.