Module Usage in Projects


Stephan Reichl edited this page Aug 7, 2024 · 9 revisions

As a concrete example, we will apply the unsupervised_analysis module to MyData, stored in data/MyData.

First, we provide the respective configuration file using this specific, predefined structure: https://github.com/epigen/mr.pareto/blob/ffa00e74f1227f5c4f526e2a84fdc832c18ad720/config/config.yaml#L12-L15

Second, within the main Snakefile (workflow/Snakefile), we have to do three things:

- Load and parse all configurations into a structured dictionary. https://github.com/epigen/mr.pareto/blob/ffa00e74f1227f5c4f526e2a84fdc832c18ad720/workflow/Snakefile#L19-L28
- Include the MyData analysis Snakefile from the rules subfolder (see below). https://github.com/epigen/mr.pareto/blob/ffa00e74f1227f5c4f526e2a84fdc832c18ad720/workflow/Snakefile#L31-L32
- Require all outputs from the used module as inputs to the target rule all. https://github.com/epigen/mr.pareto/blob/ffa00e74f1227f5c4f526e2a84fdc832c18ad720/workflow/Snakefile#L35-L40
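The first of these steps, loading every configuration into one structured dictionary, could look roughly like the following plain-Python sketch. It is not the code from the linked Snakefile: the function name `load_module_configs`, the nested `workflows` mapping (data name, then module name, then config path), and the `result_path` key are assumptions made for illustration. The demo file is written as JSON so the sketch runs with the standard library only; in a YAML-based project the `parse` argument would typically be `yaml.safe_load` from PyYAML.

```python
import json
import tempfile
from pathlib import Path


def load_module_configs(workflows, parse=json.loads):
    """Return a dict keyed by '{data_name}_{module_name}'.

    workflows: hypothetical nested mapping data_name -> module_name -> config path.
    parse: turns file text into a dict (assumption: yaml.safe_load in a real project).
    """
    configs = {}
    for data_name, modules in workflows.items():
        for module_name, path in modules.items():
            key = f"{data_name}_{module_name}"
            configs[key] = parse(Path(path).read_text())
    return configs


def demo():
    # Write a throwaway config file so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        cfg_path = Path(tmp) / "MyData_unsupervised_analysis_config.json"
        cfg_path.write_text(json.dumps({"result_path": "results/MyData"}))
        workflows = {"MyData": {"unsupervised_analysis": str(cfg_path)}}
        return load_module_configs(workflows)


module_configs = demo()
print(sorted(module_configs))
```

Keying the dictionary by data name plus module name means each module instance can later be handed exactly its own configuration, even when the same module is applied to several datasets.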

In the dedicated Snakefile for the analysis of MyData, workflow/rules/MyData.smk, we load the specified version of the unsupervised_analysis module directly from GitHub, provide it with the previously loaded configuration, and set a prefix for all loaded rules. Recommended prefix: {data_name}_{module_name}_. https://github.com/epigen/mr.pareto/blob/ffa00e74f1227f5c4f526e2a84fdc832c18ad720/workflow/rules/MyData.smk#L1-L10
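To illustrate why the recommended prefix is useful, here is a small plain-Python sketch of the naming convention. It is not Snakemake code: the helper `prefixed_rule_names`, the rule names `pca` and `umap`, and the second dataset `OtherData` are hypothetical. The point is only that a `{data_name}_{module_name}_` prefix keeps rule names distinct when the same module is loaded for more than one dataset.

```python
def prefixed_rule_names(data_name, module_name, rule_names):
    """Map each original rule name to its prefixed name."""
    prefix = f"{data_name}_{module_name}_"
    return {name: prefix + name for name in rule_names}


# Hypothetical rule names; the real module defines its own.
module_rules = ["all", "pca", "umap"]

mydata_rules = prefixed_rule_names("MyData", "unsupervised_analysis", module_rules)
other_rules = prefixed_rule_names("OtherData", "unsupervised_analysis", module_rules)

# The two sets of prefixed names do not overlap.
name_clashes = set(mydata_rules.values()) & set(other_rules.values())

print(mydata_rules["pca"])  # MyData_unsupervised_analysis_pca
```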