Snakemake

We recommend trying out Snakemake as a replacement for GNU Make for managing workflows with several parts, such as:

  • download raw data
  • process raw data into clean data
  • divide dataset into train and test
  • train models
    • train model A
    • train model B
  • make result figure (using results from A and B)

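With Snakemake, you describe each step of a data science workflow as a rule in a Python-like syntax, and Snakemake works out from the rules' input and output files which steps need to run and in what order.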
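As a rough sketch, the example workflow listed above might look like the following Snakefile. Every file path and script name here (`data/raw.csv`, `scripts/train.py`, and so on) is hypothetical and only for illustration; adapt them to your own project.

```python
# Snakefile -- illustrative sketch only; all paths and scripts are hypothetical.

MODELS = ["A", "B"]

# The first rule is the default target when `snakemake` is run without arguments.
rule all:
    input:
        "results/figure.png"

rule download_raw_data:
    output:
        "data/raw.csv"
    shell:
        "python scripts/download.py --out {output}"

rule clean_data:
    input:
        "data/raw.csv"
    output:
        "data/clean.csv"
    shell:
        "python scripts/clean.py --in {input} --out {output}"

rule split_train_test:
    input:
        "data/clean.csv"
    output:
        train="data/train.csv",
        test="data/test.csv"
    shell:
        "python scripts/split.py --in {input} --train {output.train} --test {output.test}"

# One rule covers both models: the {model} wildcard is inferred from the
# requested output file name (model_A.json or model_B.json).
rule train_model:
    input:
        train="data/train.csv",
        test="data/test.csv"
    output:
        "results/model_{model}.json"
    shell:
        "python scripts/train.py --model {wildcards.model} "
        "--train {input.train} --test {input.test} --out {output}"

# The figure depends on the results of every model in MODELS.
rule make_figure:
    input:
        expand("results/model_{model}.json", model=MODELS)
    output:
        "results/figure.png"
    shell:
        "python scripts/make_figure.py --results {input} --out {output}"
```

Running `snakemake -n` from the directory containing the Snakefile prints the steps that would run without executing them, and `snakemake --cores 1` runs them. Because the two training jobs do not depend on each other, giving Snakemake more cores lets it run them in parallel.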

Tutorial: https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html

Example Repos that use Snakemake

Demo of prediction and feature selection with sklearn

https://github.com/tufts-ml/mastre-predict-and-downselect (Request access from Mike)

See especially the toy data workflow and the 'movie reviews' workflows.

Time series prediction repo

https://github.com/tufts-ml/mastre-predict-and-downselect