Feature request: compute a connectivity matrix from all BOLD runs/sessions #176

Open
victoris93 opened this issue Sep 20, 2024 · 4 comments

Comments

@victoris93

Your idea

For many connectivity analyses it is advised that one maximizes time series length, so one would expect to be able to use all BOLD data to compute a connectivity matrix (e.g., a classic case of a single matrix per subject). Any chance you guys would consider an option to concatenate all runs or sessions before computing a matrix?

@victoris93 victoris93 changed the title An option to compute a connectivity matrix from all BOLD runs/sessions Feature request: compute a connectivity matrix from all BOLD runs/sessions Sep 20, 2024
@htwangtw
Collaborator

Thanks for the suggestion!
We had some internal suggestions but never really reached a conclusion on how different runs / sessions should be handled. A subject-level summary sounds very reasonable.
This is indeed a common use case and I have done something similar myself.
In terms of implementation, I would probably calculate the connectivity metrics for each scan and then average them. This is more memory efficient and, in principle, gives much the same result. I am happy to show it with a minimal example if you wish!

@victoris93
Author

I've thought about taking the average of run-specific matrices myself, too, but I wasn't sure whether, from a methods standpoint, it's equivalent to computing a single matrix from all runs concatenated. Still, if at some point you decided to implement it, it would be interesting to be able to:

  • concatenate all runs and sessions
  • concatenate all runs within session (e.g., if one wanted to assess subject discriminability across sessions)
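
To make the two options concrete, here is a rough sketch of what they could look like on the user side. The `runs` dictionary, its `(session, run)` keys, and the `concat_connectome` helper are all made up for illustration; none of this is part of the tool.

```python
from collections import defaultdict

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: (session, run) -> time series of shape (timepoints, parcels).
runs = {
    ("ses-01", "run-1"): rng.standard_normal((100, 10)),
    ("ses-01", "run-2"): rng.standard_normal((100, 10)),
    ("ses-02", "run-1"): rng.standard_normal((100, 10)),
}


def concat_connectome(series):
    """Correlation matrix of time series concatenated along the time axis."""
    return np.corrcoef(np.concatenate(series, axis=0).T)


# Option 1: one matrix per subject, from all runs and sessions.
subject_connectome = concat_connectome(list(runs.values()))

# Option 2: one matrix per session, from the runs within that session.
by_session = defaultdict(list)
for (session, _run), ts in runs.items():
    by_session[session].append(ts)
session_connectomes = {ses: concat_connectome(ts) for ses, ts in by_session.items()}
```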

If it's not at the top of the agenda, I understand. Just thought I'd leave it here for future reference.

@htwangtw
Collaborator

htwangtw commented Sep 23, 2024

The original reasons I decided not to implement a group connectome are:

  1. This is relatively simple for researchers to compute on their own.
  2. It adds too many small customisations and deviates from the aim of having a simple tool; the CLI would become bulky if they were all exposed as options to users.
  3. The outputs would be bulky if all possible ways of combining data were created.

If anything along these lines is implemented, I would prefer to have one thing that's useful for more researchers, rather than all possible combinations.

Also, for averaging vs concatenating, IMHO this is a necessary compromise for efficient computing. The results are not going to be numerically identical, but the similarity is extremely high (see code below).
Concatenating all the time series leads to inefficient RAM usage: for the bulk of the computing time the RAM usage would be low, and on a computing cluster with a scheduler, this is the kind of behaviour that gets a user's priority penalised.
In the same vein, concatenation will not scale to longer scans.
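
As a rough illustration of the memory argument, the averaging approach can be written so that only one run is held in memory at a time. This is just a sketch: `running_mean_connectome` and `fake_runs` are hypothetical names, and `fake_runs` stands in for whatever loads one run at a time from disk.

```python
import numpy as np


def running_mean_connectome(run_loader):
    """Average per-run correlation matrices, holding one run in memory at a time."""
    total = None
    n_runs = 0
    for ts in run_loader:  # ts has shape (timepoints, parcels)
        conn = np.corrcoef(ts.T)
        total = conn if total is None else total + conn
        n_runs += 1
    if n_runs == 0:
        raise ValueError("no runs provided")
    return total / n_runs


def fake_runs(n_runs=5, n_timepoints=100, n_parcels=20, seed=0):
    """Hypothetical stand-in for lazily loading one run at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(n_runs):
        yield rng.standard_normal((n_timepoints, n_parcels))


connectome = running_mean_connectome(fake_runs())
```

Peak memory here is one time series plus two parcels-by-parcels matrices, regardless of how many runs there are.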

The following code shows that the averaging approach and the concatenating approach produce very similar results:

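import numpy as np

# Simulate 5 short scans: 100 time points (typical for short scans) x 444 parcels.
time_series = [np.random.rand(100, 444) for _ in range(5)]

# Approach 1: concatenate all scans in time, then compute one correlation matrix.
connectome_concat = np.corrcoef(np.concatenate(time_series).T)

# Approach 2: compute one correlation matrix per scan, then average them.
connectome_average = np.mean([np.corrcoef(ts.T) for ts in time_series], axis=0)

# Compare the two connectomes by correlating their flattened entries.
similarity = np.corrcoef(connectome_concat.flatten(), connectome_average.flatten())[0, 1]

assert similarity > 0.99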
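
On the question of equivalence: the gap between the two approaches comes from per-run differences in mean and variance. As a sketch under two assumptions (all runs have the same length, and each run is z-scored per parcel before concatenation), the concatenated Pearson correlation matches the average of the per-run matrices up to floating-point error. The `zscore` helper below is written out for illustration and the parcel count is reduced to keep it fast.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumption: equal-length runs (100 time points each), 50 parcels.
time_series = [rng.random((100, 50)) for _ in range(5)]


def zscore(ts):
    """Standardise each parcel (column) within a single run."""
    return (ts - ts.mean(axis=0)) / ts.std(axis=0)


# Average of per-run correlation matrices.
connectome_average = np.mean([np.corrcoef(ts.T) for ts in time_series], axis=0)

# Correlation of the concatenated, per-run z-scored time series.
connectome_concat_z = np.corrcoef(np.concatenate([zscore(ts) for ts in time_series]).T)

assert np.allclose(connectome_concat_z, connectome_average)
```

With runs of unequal length the match is no longer exact, because concatenation implicitly weights each run by its number of time points.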

@victoris93
Author

Ok I see, this sounds totally reasonable! Thanks.
