-
Hi again Dave! I'm hoping someone else might have input on question 1. For #2...
We do generally advocate that raw data should always be included whenever possible. This can indeed produce very large individual files, but both NWB-supported APIs have data tools for handling such operations. The DANDI Archive also has very high-performance upload/download operations, as well as streaming functionality that allows you to interact with any file on the archive as if it were on your system, returning only the portions of data you need on an 'as-needed' basis. I've personally used all of these tools to handle individual files as big as 372 GB and datasets in the tens of TB (which DANDI hosts completely free of charge), so if you have specific questions on how to do certain things for this please don't hesitate to reach out at any time.

Are you thinking of using Python or MATLAB to perform your conversion? Both APIs support data tools to manage this quantity of data: one example would be the PyNWB tutorial on iterative data write. This loads only a small but specifiable amount of data into memory at any one point in time, and iteratively writes the entire series of data to the NWBFile in that manner. I also ask because, if you use Python, we've already implemented some automated tools for these sorts of advanced data-engineering tasks over on the NWB Conversion Tools project. Specific to TIFF stacks, one should be able to simply construct a little script like so:

```python
from datetime import datetime

from dateutil import tz
from nwb_conversion_tools import TiffImagingInterface

# Change the file_path to the location on your system
file_path = "my_tiff_file.tif"  # point to a single tiff file
sampling_frequency = 30.0  # replace with the sampling rate (in Hz) of your images

interface = TiffImagingInterface(file_path=file_path, sampling_frequency=sampling_frequency)

# Extract what metadata we can from the source files
metadata = interface.get_metadata()

# session_start_time is required for conversion. If it cannot be inferred
# automatically from the source files you must supply one.
session_start_time = datetime(2020, 1, 1, 12, 30, 0, tzinfo=tz.gettz("US/Pacific"))
metadata["NWBFile"] = dict(session_start_time=session_start_time)

# Choose a path for saving the nwb file and run the conversion
save_path = "./saved_file.nwb"  # change to wherever you want the NWB file written
interface.run_conversion(save_path=save_path, metadata=metadata)
```

which will technically write them as a two-photon series, but we'd be happy to extend that to whatever the conclusion of #1 happens to be. (We should also allow a list of tiff files to be written to the same series; I'll bring that up over there right now.)

Cheers,
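P.S. For reference, here is a minimal sketch of the iterative-write approach mentioned above, assuming the raw frames are streamed through PyNWB's DataChunkIterator into a plain TimeSeries with gzip compression. The frame generator, frame shape, rate, and names are placeholders, not a definitive implementation:

```python
from datetime import datetime

import numpy as np
from dateutil import tz
from hdmf.data_utils import DataChunkIterator
from pynwb import H5DataIO, NWBFile, NWBHDF5IO, TimeSeries


def frame_generator(num_frames=10, height=512, width=512):
    """Yield one frame at a time; in practice, read frames from your TIFF files here."""
    for _ in range(num_frames):
        yield np.zeros((height, width), dtype=np.uint16)  # placeholder frame


# Wrap the generator so only a small buffer of frames is in memory at any time
data = DataChunkIterator(data=frame_generator(), maxshape=(None, 512, 512), dtype=np.dtype("uint16"))

nwbfile = NWBFile(
    session_description="mesoscale imaging session",  # placeholder
    identifier="example-session",                      # placeholder
    session_start_time=datetime(2020, 1, 1, 12, 30, tzinfo=tz.gettz("US/Pacific")),
)
nwbfile.add_acquisition(
    TimeSeries(
        name="RawImaging",
        data=H5DataIO(data, compression="gzip"),  # lossless compression on write
        unit="a.u.",
        rate=30.0,  # placeholder frame rate in Hz
    )
)

with NWBHDF5IO("iterative_write_example.nwb", "w") as io:
    io.write(nwbfile)
```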
-
Hi again Cody! Thank you for the detailed response. I am definitely going to share the raw data; preprocessed data may follow, but that is a work in progress. I have no doubt DANDI can handle these types of things, I am more concerned about our local infrastructure. In any case I will try to make the files. I am going to use Python, so the package you detailed looks really useful. I'll give it a shot and keep an eye on the issue you raised about loading/writing multiple tiffs. I have also found a calcium-imaging-specific tutorial in PyNWB that I think I can mimic, with some editing, to meet the requirements (https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html). I can mock things up with TwoPhotonSeries and pivot if there is a better solution. Dave
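P.S. For reference, a rough mock-up of the kind of TwoPhotonSeries script from the ophys tutorial I have in mind; the device, wavelengths, indicator, frame rate, and identifiers below are placeholders, not our actual acquisition parameters:

```python
from datetime import datetime

import numpy as np
from dateutil import tz
from pynwb import NWBFile, NWBHDF5IO
from pynwb.ophys import OpticalChannel, TwoPhotonSeries

nwbfile = NWBFile(
    session_description="dual-wavelength mesoscale imaging (mocked as two-photon)",
    identifier="sub-01_ses-01_run-01",  # placeholder identifier
    session_start_time=datetime(2020, 1, 1, 12, 30, tzinfo=tz.gettz("US/Pacific")),
)

device = nwbfile.create_device(name="Mesoscope", description="widefield imaging system")
optical_channel = OpticalChannel(
    name="OpticalChannel",
    description="signal-sensitive channel",
    emission_lambda=520.0,  # placeholder, in nm
)
imaging_plane = nwbfile.create_imaging_plane(
    name="ImagingPlane",
    optical_channel=optical_channel,
    imaging_rate=30.0,          # placeholder frame rate in Hz
    description="cortical surface",
    device=device,
    excitation_lambda=470.0,    # placeholder, in nm
    indicator="GCaMP6f",        # placeholder
    location="cortex",
)

data = np.zeros((10, 512, 512), dtype=np.uint16)  # placeholder frames (time, y, x)
series = TwoPhotonSeries(
    name="RawImaging",
    data=data,
    imaging_plane=imaging_plane,
    rate=30.0,
    unit="a.u.",
)
nwbfile.add_acquisition(series)

with NWBHDF5IO("ophys_mockup.nwb", "w") as io:
    io.write(nwbfile)
```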
-
Hey again @DaveOC90, it looks like what we'll have to do for the one-photon data is make a new NWB data type for it. This work began a couple of years ago but fell through the cracks: NeurodataWithoutBorders/nwb-schema#283. I'll be taking it over now to try to get this done once and for all. It would be a great help if you could take a look at what they proposed back in the day and see if it works for you. Also, not required but it would be enormously helpful to this process if you would be willing to share a single session of the one-photon data with me while I work on this, and possibly meet to go over any questions I might have?
-
Hey Dave, I'm finishing up work on the schema now, and I'd love to go over it with you one-on-one! Whenever you have the time, feel free to schedule something automatically via my Calendly. Looking forward to meeting in person,
-
Hello Cody, We're also acquiring mesoscale imaging data and would like to store raw ScanImage tif files in a remote file repository, losslessly compressed. We would also like the ability to stream small numbers of frames in case we ever need to run quality control or test new ROI detection algorithms on the data. I understand that NWB allows one to store the raw files in a compressed form and allows random access of individual files (correct me if I'm wrong here); however, it's not clear to me whether it is possible to stream individual frames from a compressed tif with this API. Perhaps you could clarify for us?
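(For reference, a minimal sketch of the kind of lazy, per-frame access we're hoping for, assuming the raw frames have already been converted into an NWB acquisition; the file name and series name are placeholders. The same slicing pattern would presumably apply when streaming from a remote archive, as mentioned earlier in the thread.)

```python
from pynwb import NWBHDF5IO

# Open the file without loading the imaging data into memory
with NWBHDF5IO("converted_session.nwb", "r") as io:
    nwbfile = io.read()
    series = nwbfile.acquisition["RawImaging"]  # placeholder name
    # The underlying HDF5 dataset is chunked and (optionally) compressed,
    # so slicing reads and decompresses only the requested frames from disk
    first_ten_frames = series.data[0:10]
```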
-
Hi All,
I would like to convert dual-wavelength, one-photon mesoscale calcium imaging data into NWB. I have two questions in particular:
1. I do not know what class to use for the data. It seemed like an ImageSeries or OpticalSeries object might be best for these, but I realize these are non-specific. Perhaps I could create a one-photon class that inherits from the two-photon one? I am not familiar with two-photon data, so I don't know how appropriate this would be.
2. I also have concerns about how to aggregate the data into a single session file, as we have upwards of 36 GB of imaging data per session, and in some cases triple that. Does NWB generally advocate for really large files (tens of GBs)? I think the computers we have may struggle with this, and it may make transferring the data more difficult.
Some data details below:
Data Description
The data are currently in NIfTI format, in a BIDS-like organization, that is to say, organized by subject ID, session, and run (afaik there is no BIDS standard for this type of data yet). The data are 2D x time. We have at least 6 runs/scans per session, and around 31 sessions. The 10-minute runs can yield ~5.5-6 GB of data, so they are usually split into three tif files when output (we then converted to nii). A further complication is that they were acquired using a dual-wavelength protocol, which means signal-sensitive frames were interleaved with signal-insensitive frames. We have split each tif file into two separate nii files. So for now we have 6 image files (as nii) per run, at least 6 runs per session (in some cases > 12), and 31 sessions across 23 subjects.
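(For reference, the wavelength split is essentially frame de-interleaving along the time axis; a toy NumPy illustration with placeholder shapes and ordering is below.)

```python
import numpy as np

# Placeholder stack: (time, y, x) with the two wavelengths interleaved frame-by-frame
interleaved = np.zeros((1000, 256, 256), dtype=np.uint16)

# Even-indexed frames are signal-sensitive, odd-indexed frames are signal-insensitive
# (the actual ordering depends on the acquisition protocol)
signal_sensitive = interleaved[0::2]
signal_insensitive = interleaved[1::2]
```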
Thanks for any help you can offer.
Dave