Replies: 4 comments
-
Hello! Mapping things properly would mean something like assuming that the activity of each subject comes from some common manifold. Projecting each subject onto that manifold would create more comparable features. At this point the hard part is not the classification but finding an appropriate manifold. As for the logit link, we would need a different observation model (Bernoulli, i.e. Binomial with a single trial, which covers the 2-class case). We don't have it yet, but we are planning to add it. If you get to the point of applying a logistic regression, you can use scikit-learn, and I can also help you set up the model design once you have the abstract features I was talking about.
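As a rough illustration of that last step, here is a minimal sketch of a scikit-learn logistic regression (the logit-link, Bernoulli case) evaluated by leaving one animal out at a time. The feature matrix is synthetic and only stands in for the abstract per-animal features; the number of features, the group means and the regularization strength `C` are assumptions, and the animal counts follow the question text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)

# Hypothetical stand-in for the abstract features: one row per animal,
# one column per shared (manifold) feature. All values are synthetic.
n_wt, n_ko, n_features = 7, 8, 5
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_wt, n_features)),
    rng.normal(1.0, 1.0, size=(n_ko, n_features)),
])
# Labels: 0 = wildtype, 1 = knockout.
y = np.concatenate([np.zeros(n_wt, dtype=int), np.ones(n_ko, dtype=int)])

# Logistic regression = Bernoulli observations with a logit link.
clf = LogisticRegression(C=1.0, max_iter=1000)

# Each fold holds out one animal, so every score is either 0 or 1.
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print("leave-one-animal-out accuracy:", scores.mean())
```

With only 15 animals, holding out whole animals is probably the most honest way to estimate how well the classifier would do on a new animal.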
-
Actually, I know Pedro (@pedroherrerovidal), who works on the olfactory bulb and has done a lot of cross-animal alignment, see this. He might have a better understanding of this problem. Thanks in advance Pedro :)
-
Hi @BalzaniEdoardo and @vkonan,
If I understood the question correctly, the final goal is to determine whether signals coming from one animal belong to one of two underlying classes. If that is the case, I agree with Edo, and we can think about performing this task in a shared feature space.
Before that, here is how I conceptualize the problem. At a high level, you can think of your recordings as features that define a space in which the genotype (the target class/label) can be separated. Given known recordings and their associated classes, you want to train a model that predicts the class of new animals.
As a brute-force approach, one could take all neurons and time bins as features to train a classifier, but this wouldn't work because i) the features (neurons, time bins and conditions) are not matched across animals (different feature spaces), and ii) it would overlook known statistics of the signal (temporal correlations, conditioning on the odor presented, different scales).
A slightly more elaborate approach would be to engineer features from the original signal and train the classifier on those. Some of these could be the amplitude or synchrony of the signal (a shared feature space). However, we know these will change depending on the odor and the animal, so we want to normalize (condition) based on signal and animal (note that normalizing/conditioning is critical for the classifier to work well, because it puts the features on a shared scale). Even so, this approach would discard, or at least not explicitly model, the variability in the data.
As Edo suggested, a more principled way of addressing this problem is to define a shared feature space (conditioned on animals and conditions) using latent space models that capture temporal dynamics (that is, models that take the temporal correlation of the signal into account), and then to train a classifier in this space (a shared feature space with the same scale). Marginalizing over the animal, condition/odor and temporal variability would be the best way to get a good label/class/genotype classification.
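To make the engineered-features option concrete, here is a minimal sketch, not a recommended pipeline. It z-scores each ROI within an animal (one possible reading of "normalize per animal") and then computes, for each odor, a response amplitude and a synchrony measure. The specific definitions (peak of the population-average trace, mean pairwise ROI correlation), the function names and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def zscore_rois(traces):
    """Z-score each ROI over all odors and frames (shared scale within an animal)."""
    mean = traces.mean(axis=(1, 2), keepdims=True)
    std = traces.std(axis=(1, 2), keepdims=True)
    return (traces - mean) / np.where(std > 0, std, 1.0)

def amplitude_and_synchrony(traces):
    """Return an (n_odors, 2) array: [amplitude, synchrony] for each odor.

    traces has shape (n_rois, n_odors, n_frames).
    """
    z = zscore_rois(traces)
    n_rois, n_odors, _ = z.shape
    upper = np.triu_indices(n_rois, k=1)  # indices of distinct ROI pairs
    feats = np.empty((n_odors, 2))
    for odor in range(n_odors):
        resp = z[:, odor, :]  # (n_rois, n_frames)
        # Amplitude: peak of the population-average trace.
        feats[odor, 0] = resp.mean(axis=0).max()
        # Synchrony: average pairwise correlation between ROIs.
        feats[odor, 1] = np.corrcoef(resp)[upper].mean()
    return feats

rng = np.random.default_rng(0)
# Synthetic animals: different ROI counts, same odors and frames.
animals = [rng.normal(size=(n_rois, 33, 85)) for n_rois in (125, 176, 90)]
features = np.stack([amplitude_and_synchrony(a) for a in animals])
print(features.shape)  # (3, 33, 2)
```

Because the ROI axis is reduced inside each animal, every animal ends up with the same (odors x features) layout, which is what makes the feature space shared even though the ROI counts differ.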
There is already a mention of my work here, which could be a good reference on how to align spaces across animals. Then one would have to decode genotype, which could be done hierarchically or by training an ML model on top. Different models make different assumptions, and it is good to understand which one fits your goal best. For some other references, you can look into:
-
@vkonan we added a Binomial observation model in the development branch, if you want to try that out!
-
My original question was posted here and evolved into the following question:
The experiment setup: wildtype (n=7) and knockout (n=8) animals received 33 odors, and the neural responses (GCaMP6m) in the olfactory bulb were recorded using 2-photon microscopy. My goal is to predict whether a trace from the calcium imaging recording (either deconvolved spike timestamps or raw calcium traces) is from a wildtype animal or from a knockout animal. I know there are classifiers in scikit-learn that may do this, but I'm using the data I have as an exercise to learn nemos better and understand what it can and cannot do.
My data structure is as follows:
A list of numpy arrays, each array containing data from one animal (data_wt[0] is one animal), in the shape <ROI x odors x time (in frames)>.
For example:
len(data_wt) = 8
data_wt[0].shape = (125, 33, 85)
len(data_ko) = 7
data_ko[0].shape = (176, 33, 85)
The number of ROIs differs from animal to animal; however, the number of odors and the time dimension are the same. Some data are missing for some of the odors (not all animals received all the odors). Contributors have already described how I should organize my data (see above link) so that it is compatible with nemos.
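For readers who want to reproduce this layout, here is a small synthetic sketch of it. It assumes (this is not stated above) that odors an animal did not receive are filled with NaN, and it uses a simple average over ROIs as one hypothetical way to bring all animals to the same shape. The helper names and the ROI counts other than 125 and 176 are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
n_odors, n_frames = 33, 85

def fake_animal(n_rois, missing_odors=()):
    """Synthetic (ROI x odors x time) array; odors not presented are NaN."""
    x = rng.normal(size=(n_rois, n_odors, n_frames))
    x[:, np.asarray(missing_odors, dtype=int), :] = np.nan
    return x

# Two short lists standing in for the real wildtype / knockout recordings.
data_wt = [fake_animal(125), fake_animal(140, missing_odors=[3, 7])]
data_ko = [fake_animal(176, missing_odors=[0]), fake_animal(98)]

def population_response(animal):
    """Average over ROIs so every animal maps to the same (odors x time) shape."""
    return animal.mean(axis=0)

X = np.stack([population_response(a) for a in data_wt + data_ko])
y = np.array([0] * len(data_wt) + [1] * len(data_ko))  # 0 = wildtype, 1 = knockout

# Boolean mask of which odors each animal actually received.
presented = ~np.isnan(X).any(axis=2)  # (animals, odors)

print(X.shape, y.shape)       # (4, 33, 85) (4,)
print(presented.sum(axis=1))  # [33 31 32 33]
```

Keeping an explicit `presented` mask makes it easy to restrict any later analysis to the odors that all animals share, or to treat missing odors separately.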
What is unclear is whether or not I can use nemos to analyze my data in the way I described earlier.