selection of sparse canonical variates (w_1, ..., w_N) #29
Comments
Hi @yugeji, thank you for your interest in our method, and apologies for the late response. You are asking an important question. The optimization is done iteratively (using UpdateW), so that the Ws are selected by considering all the pairwise combinations and the total value of the objective function (summed over all the pairs). You can see the documentation of MultiCCA here: https://rdrr.io/cran/PMA/man/MultiCCA.html
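To make the iterative scheme described above concrete, here is a minimal NumPy sketch of a block-coordinate update on the summed pairwise objective. It is an illustration under stated assumptions, not the PMA code: the function names, the random initialization, and the fixed soft-threshold `lam` are all hypothetical simplifications (the constrained formulation bounds the L1 norm of each $w_i$ rather than fixing a threshold). The point it shows is that each dataset keeps exactly one $w_i$, which is updated against the sum of all its partners.

```python
import numpy as np

def soft_threshold(a, lam):
    # Elementwise soft-thresholding; lam = 0 leaves the vector unchanged.
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def l2_normalize(v):
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def objective(xs, ws):
    # Sum over all pairs i < j of w_i^T X_i^T X_j w_j.
    total = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            total += (xs[i] @ ws[i]) @ (xs[j] @ ws[j])
    return total

def multicca_sketch(xs, lam=0.0, n_iter=25, seed=0):
    # Hypothetical simplification: random start and a fixed threshold lam.
    rng = np.random.default_rng(seed)
    ws = [l2_normalize(rng.standard_normal(x.shape[1])) for x in xs]
    history = []
    for _ in range(n_iter):
        for i, x_i in enumerate(xs):
            # One w_i per dataset, updated against ALL other datasets at once.
            partner_sum = sum(xs[j] @ ws[j] for j in range(len(xs)) if j != i)
            ws[i] = l2_normalize(soft_threshold(x_i.T @ partner_sum, lam))
        history.append(objective(xs, ws))
    return ws, history

# Toy data: three "cell types" sharing 20 samples, with different feature counts.
rng = np.random.default_rng(1)
xs = [rng.standard_normal((20, p)) for p in (5, 4, 6)]
ws, history = multicca_sketch(xs)
print(len(ws), history[-1])
```

With `lam=0`, each block update maximizes the summed objective over a unit-norm $w_i$ while the others are held fixed, so the recorded objective should not decrease from sweep to sweep.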
Hi @livnatje, thank you very much for your response!! (And no worries.) In the meantime, we've been investigating the behavior of the update_w function as implemented in the link you posted above, and we'd like to note that due to the specific pairwise iteration of the maximization implementation, the order in which datasets (cell types in your case) are passed matters for the final latent factor ( But, this is just an addendum point. I'll leave the issue open in case of additional comment on the topic but otherwise please feel free to close, and thank you very much for your response!
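One way to probe the order sensitivity mentioned above is to run the same sequential sweep with two different dataset orders and compare the results. The sketch below is a hypothetical check, not the PMA or sparsecca implementation: the helper names, the SVD-based start, the absence of a sparsity penalty, and the toy data are all assumptions made for illustration. It only reports the two objective values and the largest per-vector gap; whether the gap is substantial depends on the data.

```python
import numpy as np

def first_right_singular_vector(x):
    # Deterministic start (assumed here for illustration only).
    return np.linalg.svd(x, full_matrices=False)[2][0]

def sweep_in_order(xs, order, n_iter=50):
    # Sequential updates: each w_i uses the most recent values of the other w_j,
    # so the visiting order can influence the trajectory.
    ws = [first_right_singular_vector(x) for x in xs]
    for _ in range(n_iter):
        for i in order:
            a = xs[i].T @ sum(xs[j] @ ws[j] for j in range(len(xs)) if j != i)
            ws[i] = a / np.linalg.norm(a)
    return ws

def total_objective(xs, ws):
    n = len(xs)
    return sum((xs[i] @ ws[i]) @ (xs[j] @ ws[j])
               for i in range(n) for j in range(i + 1, n))

def compare_orders(xs, order_a, order_b):
    ws_a = sweep_in_order(xs, order_a)
    ws_b = sweep_in_order(xs, order_b)
    # Largest elementwise difference, ignoring the sign of each vector.
    gap = max(min(np.abs(a - b).max(), np.abs(a + b).max())
              for a, b in zip(ws_a, ws_b))
    return total_objective(xs, ws_a), total_objective(xs, ws_b), gap

rng = np.random.default_rng(0)
xs = [rng.standard_normal((30, p)) for p in (6, 5, 4)]
obj_a, obj_b, gap = compare_orders(xs, [0, 1, 2], [2, 1, 0])
print(obj_a, obj_b, gap)
```

Both runs index the weight vectors by the original dataset position, so `ws_a[i]` and `ws_b[i]` always refer to the same dataset regardless of the visiting order.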
Oh, interesting and thanks for surfacing that! We will look into that too and examine sensitivity. DIALOGUE outputs have been pretty robust (as CCA is convex and the HLM gives unique solutions). Did you find substantial variation?
We have yet to investigate the robustness of the A demonstration of this permutation invariance and the effect of permutation on the current MultiCCA implementation on a toy example can be found here: https://github.com/theislab/sparsecca/blob/main/examples/linear_programming_multicca.ipynb
OK, so we will explore this as well. I will leave this issue open as you suggested and we can follow up.
Hello,
The section on the selection of sparse canonical variates in the Methods of your paper states that $w_i$ and $w_j$ are fit according to the multi-factor PMD algorithm in Witten 2009. The difference in DIALOGUE, however, is that an additional summation term $\sum_{i \lt j}$ is added such that $w_i$ and $w_j$ are optimized over all pairwise combinations of cell types.
We are confused about how $w_1, ..., w_N$ (specifically, a single $w_i$ for each cell type) is selected when there are multiple $w_i$ output by MultiCCA per cell type (due to the pairwise combinations).
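For example, given three cell types 1, 2, and 3, DIALOGUE computes $w$ such that

$$\max_{w_1, w_2} w_1^T X_1^T X_2 w_2, \quad \max_{w_1, w_3} w_1^T X_1^T X_3 w_3, \quad \max_{w_2, w_3} w_2^T X_2^T X_3 w_3$$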
In this case we end up computing two $w_1$, two $w_2$, and two $w_3$ because each $\max$ is calculated independently with MultiCCA. How do you choose which of these $w$ to use for $w_1, ..., w_N$?
Alternatively, does MultiCCA actually maximize $\sum_{i \lt j} w_i^T X_i^T X_j w_j$ jointly, including the summation?