Data processing pipeline

Methods for building a data processing pipeline in MATLAB. Intended for use with MRI data in general and diffusion MRI in particular.

General structure

A node (dp_node_base.m) is a class that executes a single processing step. It has the following key methods:

  • input = po2i(obj, previous_output), which takes an output structure from a previous node and converts it into an input to the present node. This method may, for example, rename fields.
  • output = i2o(obj, input), which builds the output structure from the input structure. This declares which files are expected to be generated by the node.
  • output = execute(obj, input, output), which executes the code that generates the output from the input.

Apart from these methods, the class has some important properties:

  • previous_node, which links the node to the previous processing step.
  • output_test, which determines which fields of the output structure will be checked for file existence before the node is executed.

These properties should be set in the constructor.

Apart from the methods and properties mentioned here, there are additional ones to help the execution of the data processing.
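Putting these methods and properties together, a custom node might be sketched as follows. This is a minimal illustration, not a node from the repository; field names such as mask_fn and nii_fn are hypothetical, and the actual processing call is left as a placeholder.

```matlab
% Minimal sketch of a custom node (hypothetical field names, for illustration)
classdef my_mask_node < dp_node

    methods
        function obj = my_mask_node()
            obj.previous_node = dp_node_dcm2nii(); % link to the previous step
            obj.output_test   = {'mask_fn'};       % check this file before executing
        end

        function input = po2i(obj, previous_output)
            % adapt the previous node's output to this node's expected input
            input         = previous_output;
            input.nii_fn  = previous_output.nii_fn;
        end

        function output = i2o(obj, input)
            % declare which files this node is expected to generate
            output         = input;
            output.mask_fn = fullfile(input.op, 'mask.nii.gz');
        end

        function output = execute(obj, input, output)
            % the actual processing that creates output.mask_fn goes here
        end
    end
end
```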

Node types

There are different types of nodes:

  • dp_node_primary, which only generates output structures for later nodes.
  • dp_node, which is intended to act on image data, e.g., NIfTI files.
  • dp_node_workflow, which glues multiple nodes together into one.
  • dp_node_items, which acts on an unstructured set of items, e.g., imaging data that has not yet been identified.

In addition, there are nodes that only deal with inputs and outputs:

  • dp_node_io_rename, which only renames fields.
  • dp_node_io_append, which appends fields.
  • dp_node_io, which appends a single field.

These nodes take as input a translation table of the form { {'field_1'}, {value} }, where value can be a function handle, e.g. @(x) x.field, which will be passed the input structure. In this example, output.field_1 would be set to input.field. Finally, dp_node_io takes only a single pair as input, e.g. dp_node_io('field_1', value), which achieves the same thing as above. The rename node outputs only field_1 (on top of the standard fields), whereas the other two append to the input structure.
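As a concrete sketch of the translation-table form described above (the field names dmri_fn and nii_fn are hypothetical here):

```matlab
% Translation table: set output.dmri_fn from input.nii_fn
% (field names are hypothetical, for illustration)
node = dp_node_io_rename({ {'dmri_fn'}, {@(x) x.nii_fn} });

% Equivalent single-pair form
node = dp_node_io('dmri_fn', @(x) x.nii_fn);
```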

Examples of nodes with more specific functions are:

  • dp_node_dcm2nii.m, which converts a folder of DICOM files to a NIfTI file.
  • dp_node_dmri_denoise.m, which applies denoising via MRtrix.

To use the more specific nodes in your project, you have two options. First, start from scratch: inherit from dp_node and implement, at minimum, i2o and execute. Second, inherit from an existing, more specialized node and customize it by overloading po2i and possibly i2o. See the examples.
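The second option might look like the following sketch, which only overloads po2i to adapt the previous node's output; the field name raw_fn is hypothetical.

```matlab
% Sketch: customize an existing node by overloading po2i
% ('raw_fn' is a hypothetical field name from the previous node)
classdef my_denoise < dp_node_dmri_denoise

    methods
        function input = po2i(obj, previous_output)
            input         = previous_output;
            input.dmri_fn = previous_output.raw_fn; % rename to the expected field
        end
    end
end
```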

Diffusion nodes

Nodes for processing dMRI data are prefixed with dp_node_dmri. They assume that the input structure has a field called dmri_fn. For nodes that need metadata, it is assumed that an xps structure can be loaded from a correspondingly named xps.mat file. See the mdm framework for details.

Data processing with different modes

A node can support one or more data processing modes, which are accessed via the run method. For example, my_node().run(mode) would start the data processing in the given mode. Examples of modes are:

  • report, which prints a report showing which input and output files exist, along with an example of an output structure.
  • iter, which generates a list of outputs of the present node.
  • execute, which runs the execute method on all outputs generated by the previous_node of the present node.
  • debug, which is identical to execute except that errors are not enclosed in a try/catch structure.
  • visualize, which saves visualizations of the data managed by the node.
  • mgui, which opens the output of the node in a graphical user interface.

Normally, you would use iter and report to troubleshoot a pipeline under development.
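A typical development session using these modes might look as follows; my_node is a placeholder for your own node class.

```matlab
% 'my_node' is a placeholder for your own node class
node = my_node();

node.run('report');          % check which input and output files exist
outputs = node.run('iter');  % inspect the generated output structures
node.run('execute');         % run the processing
```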

Options

An options structure can be supplied to the data processing, as in my_node().run(mode, opt), where opt is a structure with one or more of the following fields:

  • do_try_catch, a boolean that determines whether errors in the data processing are caught or rethrown.
  • verbose, which tells the pipeline how much information to display (range 0-3).
  • do_overwrite, a boolean that determines whether existing files will be overwritten (note: output files older than input files will always be overwritten).

See dp_node_base.m for a full list (static method: default_opt).

Input structure

Mandatory fields

  • id, which holds the identity of the data being processed

Optional, but near mandatory fields

  • op, the output path (this is where, e.g., dp_node_dmri_denoise puts its output).
  • bp, the base path from which paths are created (e.g. bp/id/nii/file.nii.gz).

These three fields will always be carried over from the input structure to the output structure, even if they are not mentioned in your code. However, the framework will not overwrite changes made within a node.
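For illustration, the mandatory and near-mandatory fields might be set up like this in a primary node; the id and paths are hypothetical.

```matlab
% Hypothetical example of the mandatory and near-mandatory fields
output.id = 'subject_001';                         % identity of the data
output.bp = '/data/my_study';                      % base path
output.op = fullfile(output.bp, output.id, 'nii'); % output path
```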

Output structure

All fields are optional, but some rules apply:

  • Fields ending with _fn are assumed to refer to files, meaning they will be part of the input/output checks.
  • Setting output.tmp.bp to a temporary path and output.tmp.do_delete = 1 allows the execute method to put data in a temporary path that will be deleted once the execution is done.
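Inside an i2o method, these rules might be applied as in the following sketch; the field and file names are hypothetical.

```matlab
% Declared output file: the _fn suffix makes it part of the existence checks
output.denoised_fn = fullfile(input.op, 'dwi_denoised.nii.gz');

% Temporary working folder, deleted automatically once execute finishes
output.tmp.bp        = fullfile(input.op, 'tmp');
output.tmp.do_delete = 1;
```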

Tips and tricks

When setting up a new node, start by defining the po2i, i2o, and execute methods as empty functions. Run the node in report mode. Set a breakpoint within the empty functions, then write and test your code. Once you can run report mode without errors, you are ready to try the execute mode.

Running the node without catching errors may cause it to stop early on a subject with input/output errors that you may wish to ignore. To deal with this, run the node for a subject where all preceding nodes work correctly. For example:

```matlab
my_node().run('report', struct('do_try_catch', 0, 'verbose', 3))
```

Stand alone use

The nodes can also be used in a stand-alone fashion. For example, denoising:

```matlab
input.dmri_fn = 'my_path/your_dwi_volume.nii.gz';
input.op = msf_fileparts(input.dmri_fn);

a = dp_node_dmri_denoise();
a.execute(input, a.i2o(input));
```

Dependencies

Acknowledgements

If you use this in your project, please acknowledge it by citing this repository and its author: Markus Nilsson at Lund University.
