erindiel

This notebook covers an example of image analysis using fluorescence WSI data from IDC as input to MCMICRO. Some caveats include:

Note that the markers.csv and params.yml files needed for MCMICRO analysis are included in this PR as well.

cc @melissalinkert @dclunie @fedorov

@DanielaSchacherer
Contributor

Hey Erin,
I am Daniela, also part of the IDC team, and Andrey asked me to check out your notebook and give you my feedback! First of all, MCMICRO seems to be a pretty cool tool that I had not come across before. I tried to run the notebook locally on Windows and Linux. On Windows I was unfortunately not successful: s5cmd and docker did not work at all. On Linux, however, I was able to run your notebook with some adaptations (I had to run part of the code from the command line instead of from within the notebook).

I agree with all the points you mentioned above. I think it would be valuable to have a list of requirements (nextflow, docker, java...) and maybe separate notebooks for Windows and Linux (I can't say anything about macOS). Alternatively, the notebooks could be converted to a script plus a README that explains what to expect from the code.

@fedorov
Member

fedorov commented Sep 4, 2024

@DanielaSchacherer thank you for the review!

@erindiel I am working on my review, and adding some small features to idc-index to help simplify some steps. I will follow up shortly.

@DanielaSchacherer
Contributor

As an update to what I had to adapt in the code: I had to manually create the folder /raw in MCMICRO-example-IDC/OME-TIFF and move markers.csv into MCMICRO-example-IDC/OME-TIFF. I guess that's not something I was supposed to do, is it?

Member

@fedorov fedorov left a comment

In the end, I was not able to complete the run, because the Linux VM I used for the test did not have enough disk space! I will set up a different VM, but I thought the comments below stand on their own.

@@ -0,0 +1,37 @@
channel_number,cycle_number,marker_name,Filter,excitation_wavelength,emission_wavelength,background,exposure,remove
Member

Information about those markers is contained in the DICOM file. Doesn't the Bio-Formats DICOM reader extract it into a user-accessible location? If this file needs to be created by the user somehow, this notebook is not generalizable.
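
If the file does have to be supplied by hand, one stopgap is to generate it from a list of marker names. The following is a minimal sketch, assuming the column layout shown in the markers.csv header in this diff. The helper `write_markers_csv`, its `channels_per_cycle` parameter, and the idea of leaving the remaining columns empty are all hypothetical, and how to obtain the marker names from the DICOM metadata is not shown here.

```python
import csv
from pathlib import Path

# Column names copied from the markers.csv header in this PR.
MARKER_COLUMNS = [
    "channel_number", "cycle_number", "marker_name", "Filter",
    "excitation_wavelength", "emission_wavelength",
    "background", "exposure", "remove",
]


def write_markers_csv(path, marker_names, channels_per_cycle=4):
    """Write a markers.csv with one row per channel (hypothetical helper).

    channels_per_cycle is an assumption about the acquisition. Columns
    that cannot be derived from the names alone are left empty.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=MARKER_COLUMNS, restval="")
        writer.writeheader()
        for index, name in enumerate(marker_names):
            writer.writerow({
                "channel_number": index + 1,
                "cycle_number": index // channels_per_cycle + 1,
                "marker_name": name,
            })
    return path
```

This does not address the generalizability concern itself; it only makes the manual step explicit and repeatable.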

Member

Somewhere in the opening I would include a statement about the prerequisites that will be used in this notebook, and also what platform it was confirmed to work on.

workflow:
  start-at: segmentation
  stop-at: quantification
  segmentation: ilastik
Member

Can you add something in the notebook text (in the beginning) that would explain what this workflow is actually doing, how to access documentation, how to find alternative workflows? There is a reference to https://mcmicro.org/parameters/other.html#ilastik, but it is deep in the notebook, and even then it does not say what analysis this actual workflow is doing.

Member

@fedorov fedorov Sep 4, 2024

Instead of using s5cmd, it is more convenient to pip install --upgrade idc-index and then do this:

!idc download-from-selection --download-dir=/media/volume/sdb/MCMICRO-example-IDC/DICOM/ --dir-template="" --series-instance-uid=1.3.6.1.4.1.5962.99.1.2343322182.1764456793.1655905763910.4.0

Among other things, the command above will check if the destination directory has enough space, and will report download progress, which is quite helpful for this large image series.

It would be best to parameterize this and the other cells that work with input or output images with the destination directory. I tested this on a cloud VM that did not have enough disk space (the image is ~60GB), so I had to use a mounted volume. On the first try, I did not realize that the subsequent cells depend on files in the repo (parameters/markers), so when I ran the analysis on the data downloaded and converted in a directory on that volume, it failed. It worked after I copied the missing files.
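
One way to sketch this parameterization is a single setup cell that takes the base directory, creates the expected layout, checks free space, and copies the repo files that later cells rely on. This is only an illustration under assumptions: the helpers `prepare_project` and `idc_download_command` are hypothetical, the DICOM/ and OME-TIFF/raw layout follows the paths mentioned in this thread, the 60 GB threshold is a rough lower bound taken from the image size above, and placing params.yml next to markers.csv is an assumption.

```python
import shutil
from pathlib import Path


def prepare_project(base_dir, repo_dir, required_gb=60):
    """Create the working layout under base_dir and stage repo files.

    Hypothetical helper: the folder names follow this review thread, and
    putting params.yml next to markers.csv is an assumption.
    """
    base_dir, repo_dir = Path(base_dir), Path(repo_dir)
    dicom_dir = base_dir / "DICOM"
    project_dir = base_dir / "OME-TIFF"
    for folder in (dicom_dir, project_dir / "raw"):
        folder.mkdir(parents=True, exist_ok=True)

    # Fail early if the volume cannot hold the image series.
    free_gb = shutil.disk_usage(base_dir).free / 1e9
    if free_gb < required_gb:
        raise OSError(
            f"{base_dir} has {free_gb:.0f} GB free, need about {required_gb} GB"
        )

    # Copy the files that later cells expect in the project directory.
    for name in ("markers.csv", "params.yml"):
        shutil.copy(repo_dir / name, project_dir / name)
    return dicom_dir, project_dir


def idc_download_command(dicom_dir, series_instance_uid):
    """Build the idc-index command quoted above for a given directory."""
    return [
        "idc", "download-from-selection",
        f"--download-dir={dicom_dir}",
        "--dir-template=",
        f"--series-instance-uid={series_instance_uid}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, so the destination directory only has to be set in one place.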

Member

I would recommend installing and checking all of the prerequisites at the beginning of the notebook.
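
A first cell along these lines could do the check. This is a minimal sketch, assuming the tools named in this thread (java, nextflow, docker, and the `idc` command from idc-index) are the ones the notebook needs; the helper name `missing_prerequisites` is hypothetical, and the check only confirms that an executable is on PATH, not that it works or has the right version.

```python
import platform
import shutil

# Tools named in this review thread; adjust to what the notebook really uses.
PREREQUISITES = ("java", "nextflow", "docker", "idc")


def missing_prerequisites(tools=PREREQUISITES):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Report the platform so readers can compare it with the tested one.
print("Platform:", platform.platform())

missing = missing_prerequisites()
if missing:
    print("Missing prerequisites:", ", ".join(missing))
else:
    print("All prerequisites found on PATH.")
```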

3 participants