Set tau=0 when running TCA? #21

Open
YushaLiu opened this issue Dec 22, 2023 · 10 comments

@YushaLiu

Thanks for developing this great package! I have two quick questions.

  1. If I want to estimate the cell type-specific methylation values for each sample, can I call tca with tau=0? Does that still give sensible results? I'm asking because the way you calculate the posterior for Z_ijh (equations 5-9 in the Methods section of the paper) seems to no longer hold if tau is exactly 0.
  2. Can I use the tca function to estimate cell type-specific expression values as well, if I set X to be the gene expression data matrix? That is, in your algorithm the values of X do not have to be beta values that range between 0 and 1, is that right?

Thanks very much!

@E-R
Contributor

E-R commented Dec 22, 2023

  1. Right, tau cannot be zero (this is indicated in the documentation of the tca function).
  2. Yes, you can in principle feed the functions in the package with expression data and the values will not be assumed or restricted to be in [0,1]. However, keep in mind that TCA does assume normality of the data. We recently developed a new method that is also theoretically justified for gene expression -- see here; preprint will follow soon.
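
For readers following the tau=0 question, the following is a sketch of the standard Gaussian conditioning result. It assumes that, for sample i and feature j, the model is an observed value equal to a weighted sum of cell type-specific values plus noise with variance tau^2. The notation (w_i, z_ji, mu_j, Sigma_j, S_ji) is shorthand introduced here and may not match equations 5-9 of the paper term for term.

```latex
\begin{aligned}
x_{ji} &= w_i^\top z_{ji} + e_{ji}, \qquad
z_{ji} \sim \mathcal{N}(\mu_j, \Sigma_j), \qquad
e_{ji} \sim \mathcal{N}(0, \tau^2), \\
S_{ji} &= \left( \frac{w_i w_i^\top}{\tau^2} + \Sigma_j^{-1} \right)^{-1}, \\
\mathbb{E}\left[ z_{ji} \mid x_{ji} \right] &= S_{ji} \left( \frac{w_i \, x_{ji}}{\tau^2} + \Sigma_j^{-1} \mu_j \right).
\end{aligned}
```

Written this way, both the posterior covariance and the posterior mean divide by tau^2, so the expressions are undefined at tau = 0. This is consistent with the requirement that tau be strictly positive.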

@YushaLiu
Author

Thanks for your prompt reply!
I'm looking forward to reading the preprint for your new method. What's the major difference between the new method and TCA, other than that the new method can model omics data besides methylation and has theoretical justification? Are there key differences in the modeling assumptions of the new method? Thanks!

@E-R
Contributor

E-R commented Dec 23, 2023

Yes, there is one additional key difference: we model the covariance of the variation in expression/methylation between cell types.

@YushaLiu
Author

I see, thanks!

@YushaLiu YushaLiu reopened this Dec 24, 2023
@YushaLiu
Author

YushaLiu commented Dec 24, 2023

I have a follow-up question regarding the weights input W of the function tca. Are zero values allowed for some elements of W? The documentation says all the weights must be positive, but in the dataset I'm analyzing, certain cell types do not exist at all in some samples, so the weight should be exactly 0 in those cases. Can tca handle that? If not, should I set those weights to a very small value, say 1e-4? Would running tca then still give sensible estimates of cell type-specific expression values for the samples with nonzero weights? Thanks very much!

@YushaLiu
Author

YushaLiu commented Dec 25, 2023

Hi again! I tried running tca with a W that contains some elements that are exactly zero, and the call doesn't throw an error. Can I trust the estimates of Zhat (sample-specific, cell type-specific estimates) for samples with nonzero weights in that cell type?

I have another question about tau. I called tca with tau = 0.05, but the returned model fit indicates that TCA still estimates tau instead of using the value I provided: the tau_hat component of the output is the same estimated tau that I would obtain by setting tau = NULL in the tca call. I'm using TCA_1.2.1. Could you please look into it?

@E-R
Contributor

E-R commented Dec 27, 2023

What do you mean by W that contains zero elements? An empty matrix? This should return an error.

Re the tau problem: you can open a new issue and I will look into it. However, why would you want to set a fixed tau instead of estimating it?

@YushaLiu
Author

YushaLiu commented Dec 29, 2023

No, I mean that some elements of the weight matrix W are exactly 0, which means that some samples do not contain any of a particular cell type. That can happen in real data.

Re the tau problem: for my exploratory analyses of data using TCA, I would like to minimize tau and let the hidden source-specific error term (sigma_{hj}^2) explain as much of the variation observed in the data as possible. I think that should be doable with the current TCA implementation, based on the documentation?

@E-R
Contributor

E-R commented Dec 31, 2023

The code is not expected to throw an error in that case. The Z_hat estimates of a cell type with fraction 0 in a certain sample are expected to be tiny (but not necessarily exactly zero).
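
If downstream analysis should only use estimates for samples that actually contain a given cell type, one option is to mask the rest after fitting. The sketch below is plain Python rather than part of the TCA package. The function names, the samples-by-cell-types layout of W, and the per-sample list of estimates are all assumptions made for illustration, and the numbers are made up.

```python
def samples_with_cell_type(W, h):
    """Indices of samples whose weight for cell type h is strictly positive."""
    return [i for i, row in enumerate(W) if row[h] > 0]


def mask_zero_weight_estimates(z_hat_h, W, h):
    """Replace estimates for samples lacking cell type h with None.

    z_hat_h: per-sample estimates for cell type h at one feature
             (hypothetical layout, not the package's output format).
    W:       samples x cell types weight matrix as nested lists.
    """
    keep = set(samples_with_cell_type(W, h))
    return [z if i in keep else None for i, z in enumerate(z_hat_h)]


# Toy data (made up): 3 samples x 2 cell types; sample 1 lacks cell type 1.
W = [[0.7, 0.3], [1.0, 0.0], [0.4, 0.6]]
z_hat_celltype1 = [0.52, 0.01, 0.48]

masked = mask_zero_weight_estimates(z_hat_celltype1, W, h=1)
print(masked)  # [0.52, None, 0.48]
```

Masking keeps the zero-weight entries from being interpreted as real measurements while leaving the estimates for the other samples untouched.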

Decreasing tau too much is expected to compromise the quality of the model and the accuracy of the estimates, so this is not advisable.
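
One possible intuition for this advice can be seen in a simplified, single-feature version of the model, assuming independent cell types with means mu_h and variances sigma2_h. This is a sketch of textbook Gaussian conditioning, not the package's code, and every name and number in it is an assumption made for illustration. It computes the posterior mean of the cell type-specific values and reports how much of the observed value is left unexplained by the weighted estimates.

```python
def posterior_mean(x, w, mu, sigma2, tau):
    """Conditional mean of z given x, assuming
    x = sum_h w[h] * z[h] + e, z[h] ~ N(mu[h], sigma2[h]) independent,
    and e ~ N(0, tau^2)."""
    residual = x - sum(wh * mh for wh, mh in zip(w, mu))
    denom = sum(wh * wh * s for wh, s in zip(w, sigma2)) + tau ** 2
    return [mh + s * wh * residual / denom for wh, mh, s in zip(w, mu, sigma2)]


def reconstruction_gap(x, w, z_hat):
    """Part of x not explained by the weighted cell type estimates."""
    return x - sum(wh * zh for wh, zh in zip(w, z_hat))


# Toy values (made up) for one sample and one feature with two cell types.
w, mu, sigma2, x = [0.6, 0.4], [0.5, 0.2], [0.01, 0.01], 0.45

gaps = []
for tau in (0.1, 0.01, 0.001):
    z_hat = posterior_mean(x, w, mu, sigma2, tau)
    gaps.append(reconstruction_gap(x, w, z_hat))
    print(f"tau={tau}: z_hat={z_hat}, gap={gaps[-1]:.2e}")
```

As tau shrinks, the gap goes to zero: the estimates are forced to reproduce the observed value exactly, so any measurement noise in x is attributed to cell type-specific variation.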

@YushaLiu
Author

Thanks for your reply! I see. But is it still possible to fix tau at some small value, rather than having TCA estimate it?
