Module Usage in Projects
As a concrete example, we will apply the unsupervised_analysis module to MyData, stored in data/MyData.
First, we specify the configuration file for applying the unsupervised_analysis module to MyData, using this predefined structure within your project's config/config.yaml.
#### Datasets and Workflows to include ###
workflows:
    MyData:
        unsupervised_analysis: "config/MyData/MyData_unsupervised_analysis_config.yaml"
Tip
Recommended folder and naming scheme for config files: config/{dataset_name}/{dataset_name}_{module}_config.yaml.
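The recommended naming scheme can be sketched as a small helper. This is only an illustration: the function names `config_path` and `workflows_entry` are hypothetical and not part of the module or of Snakemake.

```python
def config_path(dataset_name, module):
    """Build the recommended config file path for one dataset/module pair."""
    return f"config/{dataset_name}/{dataset_name}_{module}_config.yaml"


def workflows_entry(dataset_name, modules):
    """Build the nested mapping expected under `workflows:` in config/config.yaml."""
    return {dataset_name: {module: config_path(dataset_name, module) for module in modules}}


# Hypothetical usage for the example dataset from this section.
entry = workflows_entry("MyData", ["unsupervised_analysis"])
print(entry)
```

Generating the paths from the dataset and module names keeps the config files discoverable when more datasets or modules are added.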
Second, within the main Snakefile (workflow/Snakefile), we have to do three things:
- load and parse all configurations into a structured dictionary.
    # load configs for all workflows and datasets
    import yaml  # needed for yaml.safe_load, if not already imported in the Snakefile
    config_wf = dict()
    for ds in config["workflows"]:
        for wf in config["workflows"][ds]:
            with open(config["workflows"][ds][wf], 'r') as stream:
                try:
                    # key scheme: {dataset_name}_{module}
                    config_wf[ds+'_'+wf] = yaml.safe_load(stream)
                except yaml.YAMLError as exc:
                    print(exc)
- include the MyData analysis snakefile from the rules subfolder (see last step).
    ##### load rules (one per dataset) #####
    include: os.path.join("rules", "MyData.smk")
- require all outputs from the used module as inputs to the target rule all.
    #### Target Rule ####
    rule all:
        input:
            #### MyData Analysis
            rules.MyData_unsupervised_analysis_all.input,
            ...
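The three steps share one naming convention: the key used in the loaded configuration dictionary is `{dataset_name}_{module}`, and the same string, followed by `_all`, names the rule whose inputs the target rule requires. A minimal plain-Python sketch of that convention follows. It does not read any YAML files, and the helper `config_keys` is a hypothetical name introduced here for illustration.

```python
def config_keys(workflows):
    """Map '{dataset}_{module}' keys to config paths, mirroring the loading loop's key scheme."""
    return {
        f"{ds}_{wf}": path
        for ds, modules in workflows.items()
        for wf, path in modules.items()
    }


# Same structure as the `workflows:` section of config/config.yaml.
workflows = {
    "MyData": {
        "unsupervised_analysis": "config/MyData/MyData_unsupervised_analysis_config.yaml",
    },
}

keys = config_keys(workflows)
# Expression to list under `input:` of rule all, one per dataset/module pair.
target_inputs = [f"rules.{key}_all.input" for key in keys]

for key, path in keys.items():
    print(f"{key} <- {path}")
print(target_inputs)
```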
Finally, within the dedicated snakefile for the analysis of MyData, workflow/rules/MyData.smk, we load the specified version of the unsupervised_analysis module directly from GitHub, provide it with the previously loaded configuration, and use a prefix for all (*) loaded rules.
# MyData Analysis
### MyData - Unsupervised Analysis ####
module MyData_unsupervised_analysis:
    snakefile:
        github("epigen/unsupervised_analysis", path="workflow/Snakefile", tag="v2.0.0")
    config:
        config_wf["MyData_unsupervised_analysis"]

use rule * from MyData_unsupervised_analysis as MyData_unsupervised_analysis_*
Tip
Recommended file name for the analysis-specific snakefile: workflow/rules/{dataset_name}.smk.
Recommended prefix for the loaded rules: {dataset_name}_{module}_.
====================== COMING SOON ======================