🎨 minor updates to JOSS paper
Wytamma authored Nov 25, 2024
1 parent 6125f33 commit 6069b34
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions paper.md
@@ -35,9 +35,9 @@ Snk (pronounced "snek") is a workflow management tool designed to simplify the u

The integration of bioinformatic analyses into comprehensive pipelines (aka workflows) has revolutionised the field by improving the robustness and reproducibility of analyses. One of the most popular workflow frameworks is Snakemake [@10.12688/f1000research.29032.2]. Snakemake is a user-friendly and adaptable make-style workflow framework with a powerful specification language built atop the Python programming language. Despite its success, Snakemake workflows are often developed for specific research analyses rather than as general-purpose reusable tools. That is, Snakemake workflows are typically built for the reproducibility of a single analysis but not necessarily built for flexibility.

-To improve their utility, Snakemake workflows developers often encapsulate workflows within CLI tools by producing wrapper code to abstract the workflow execution, sometimes called workflows-as-applications or workflows packages [@roach_ten_2022]. These wrappers serve as intermediaries between the end-user (via the CLI) and the workflow execution, enabling developers to tailor the Snakemake experience to specific use cases. For example, the pangolin CLI tool wraps a snakemake workflow for SARS-CoV-2 lineage assignment [@otoole_pango_2022]. Initiatives like Snaketool have simplified the development of Snakemake-based CLIs by offering a template for developers [@roach_ten_2022]. Nonetheless, the onus remains on the developer to create and maintain the CLI wrappers for their workflow.
+To improve their utility, Snakemake workflows developers often encapsulate workflows within CLI tools by producing wrapper code to abstract the workflow execution, sometimes called workflows-as-applications or workflows packages [@roach_ten_2022]. These wrappers serve as intermediaries between the end-user (via the CLI) and the workflow execution, enabling developers to tailor the Snakemake experience to specific use cases. For example, the pangolin CLI tool wraps a Snakemake workflow for SARS-CoV-2 lineage assignment [@otoole_pango_2022]. Initiatives like Snaketool have simplified the development of Snakemake-based CLIs by offering a template for developers [@roach_ten_2022]. Nonetheless, the onus remains on the developer to create and maintain the CLI wrappers for their workflow.

-Here we present Snk, a Snakemake workflow management system that allows users to install Snakemake workflows as dynamically generated Command Line Interfaces. Thus users can create a CLI for their (or others') Snakemake workflows with minimal to no code changes required. The Snk-generated CLIs follow best practices and include several features out of the box that improve user experience. The CLIs can be configured at install time or via a `snk.yaml` configuration file. Snk is readily available for installation via PyPI and Conda, using the commands `pip install snk` and `conda install snk`, respectively.
+Here we present Snk, a Snakemake workflow management system that allows users to install Snakemake workflows as dynamically generated CLIs. Thus users can create a CLI for their (or others') Snakemake workflows with minimal to no code changes required. The Snk-generated CLIs follow best practices and include several features out of the box that improve user experience. The CLIs can be configured at install time or via a `snk.yaml` configuration file. Snk is readily available for installation via PyPI and Conda, using the commands `pip install snk` and `conda install snk`, respectively.

Snk has two distinct major functions: managing the installation of workflows, and dynamically generating CLIs from Snakemake configuration files. To install a workflow as a CLI, users can specify the file path, URL, or GitHub name (username/repo) of a workflow. Snk copies (clones) workflows into a managed directory structure, creates a CLI entry point, and optionally creates an isolated virtual environment for each workflow. Workflows can be installed from specific commits, tags, or branches, ensuring reproducibility. The advent of Snk allows users to utilise the Snakemake workflow catalog (\url{https://snakemake.github.io/snakemake-workflow-catalog}) as a searchable package index of Snk-installable Snakemake tools. The `snk install` command is flexible and can be used to install diverse workflows using installation options. For example, the [dna-seq-gatk-variant-calling workflow](https://github.com/snakemake-workflows/dna-seq-gatk-variant-calling) (release tag v2.1.1) can be installed as a CLI named `variant-calling` with Snakemake v8.10.8 and Pandas and NumPy dependencies using the following command:

@@ -51,15 +51,15 @@ snk install \
-t v2.1.1
```

-The workflow will then be accessible via the `variant-calling` CLI in the terminal (Figure 1). Additionally, the snk command can be used to list and uninstall workflows installed with Snk. The complete documentation for managing workflows can be found at \url{https://snk.wytamma.com/managing_workflows}.
+The workflow will then be accessible via the `variant-calling` CLI in the terminal (Figure 1). Additionally, the `snk` command can be used to list and uninstall workflows installed with Snk. The complete documentation for managing workflows can be found at \url{https://snk.wytamma.com/managing_workflows}.

![The `variant-calling` CLI generated by Snk.](docs/images/variant-calling-cli.png)

The core functionality of Snk is the dynamic creation of CLIs. Internally, Snk uses the `Snk-CLI` sister package to generate the CLI. By default, key-value pairs of the Snakemake configfile are mapped to CLI options. For example, `samples: samples.tsv` in the configfile will generate a `--samples` option in the CLI with the default value `samples.tsv` (Figure 2). The CLI generated by Snk is highly customisable and can be configured via a `snk.yaml` file placed in the workflow directory. The `snk.yaml` file can configure many aspects of the CLI, including subcommands, ASCII art, help messages, resource files, and default values. Complete documentation for the Snk config file can be found at \url{https://snk.wytamma.com/snk_config_file}.

![The run command of the `variant-calling` CLI dynamically generated from the Snakemake configfile. Several standard options are provided in the Options section, e.g., `--dry` (equivalent to Snakemake's `--dry-run`), `--dag` to create a DAG plot of the workflow, and `--cores`, which defaults to all. The Workflow Configuration section contains the options dynamically generated from the configfile. Snk-CLI automatically infers the defaults and types of the options and creates flags for boolean options.](docs/images/variant-calling-cli-run.png)
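
The mapping from configfile key-value pairs to CLI options can be illustrated with a minimal sketch. This is not Snk-CLI's actual implementation: it uses Python's standard `argparse` module, and the helper `build_parser` and the example `config` dictionary are hypothetical names introduced purely for illustration.

```python
import argparse


def build_parser(config: dict) -> argparse.ArgumentParser:
    """Hypothetical helper: create one CLI option per top-level config key."""
    parser = argparse.ArgumentParser(prog="workflow")
    for key, default in config.items():
        flag = "--" + key.replace("_", "-")
        if isinstance(default, bool):
            # Booleans become on/off flags (--trim / --no-trim).
            # Checked before int because bool is a subclass of int.
            parser.add_argument(
                flag,
                dest=key,
                action=argparse.BooleanOptionalAction,
                default=default,
            )
        else:
            # Infer the option type from the default value; fall back to str.
            opt_type = type(default) if isinstance(default, (int, float, str)) else str
            parser.add_argument(flag, dest=key, type=opt_type, default=default)
    return parser


# Stand-in for a parsed Snakemake configfile (assumed values).
config = {"samples": "samples.tsv", "min_depth": 10, "trim": False}
parser = build_parser(config)

# Options not given on the command line keep their configfile defaults.
args = parser.parse_args(["--min-depth", "20", "--trim"])
```

In this sketch, `args.samples` keeps its configfile default, `args.min_depth` is parsed as an integer because the default is an integer, and `--trim` toggles the boolean value.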

-Developers can also directly use the Snk-CLI package to generate CLIs for their Snakemake workflows. By using the CLI class from Snk-CLI workflow, developers can build a fully featured workflow package without having to write a Snakemake wrapper. We provide a guide for using Snk-CLI to build self-contained workflow packages at \url{https://snk.wytamma.com/workflow_packages}. The Snk-CLI package is available via PYPI and can be installed using the command `pip install snk-cli`.
+Developers can also directly use the Snk-CLI package to generate CLIs for their Snakemake workflows. By using the `CLI` class from Snk-CLI, developers can build a fully featured workflow package without having to write a Snakemake wrapper. We provide a guide for using Snk-CLI to build self-contained workflow packages at \url{https://snk.wytamma.com/workflow_packages}. The Snk-CLI package is available via PyPI and can be installed using the command `pip install snk-cli`.

Snk is a powerful tool that simplifies the use of Snakemake workflows by dynamically generating CLIs. Snk is open-source software released under the MIT license. Snk documentation, source code, and issue tracker are available at \url{https://github.com/Wytamma/snk}. We welcome contributions and feedback from the community to improve Snk and make it a valuable tool for the Snakemake community and reproducible research at large.
