Correlation values could be fisher z-transformed before t-tests? #370
Comments
Dear @alexeperon, first of all: thanks for your comments! This is a valid concern: the t-test does indeed assume a normal distribution, and this is of course not really true for correlation values (nor for any of the other comparison measures, for that matter). Whether we really should apply this transformation is a different question, though. The proofs that this transform improves the match with the normal distribution generally assume that the underlying data come from a normal distribution (or make similar distributional assumptions), which is also not true in our case. I do not think this would be the most important correction to add for three reasons:
Thus, I am happy for people to use the tests without transformations and believe this is not an error per se.
Implementation
That being said, I don't see immediate reasons why doing the transformation should hurt or could not help. Thus, adding this as a possibility seems like a good idea. Rather than implementing this in the eval functions, I would tend to just code a transform-results function that takes a results object, applies the transform to all the model evaluation results, and returns a new results object after the transformation. For consistency, one should then display the transformed correlations in plots too, I think.
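A minimal sketch of such a transform-results function, for illustration only. `SimpleResult` is a hypothetical stand-in with just an `evaluations` array and a `method` string; it is not the rsatoolbox results class, and `fisher_transform_result` is an assumed name. Restricting the transform to 'corr', 'rho-a' and 'tau-a' follows the list in the original post and is a design choice, not something settled in this thread.

```python
from dataclasses import dataclass, replace

import numpy as np

# Assumption: the correlation-type methods named in the original post.
CORRELATION_METHODS = ("corr", "rho-a", "tau-a")


@dataclass(frozen=True)
class SimpleResult:
    """Hypothetical stand-in for a results object (not the rsatoolbox class)."""
    evaluations: np.ndarray  # model evaluation values, any shape
    method: str              # name of the comparison method


def fisher_transform_result(result, eps=1e-12):
    """Return a new result whose evaluations are Fisher z-transformed.

    The input object is left unchanged.
    """
    if result.method not in CORRELATION_METHODS:
        raise ValueError(
            f"Fisher transform is only applied to {CORRELATION_METHODS}, "
            f"got method={result.method!r}"
        )
    # Clip so that values of exactly +/-1 do not map to +/-infinity.
    clipped = np.clip(result.evaluations, -1 + eps, 1 - eps)
    return replace(result, evaluations=np.arctanh(clipped))
```

Returning a new object keeps the untransformed correlations available, so plots and tests can each be pointed at whichever version is wanted.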
Hi Heiko, Thank you for the speedy and thorough response! This sounds good - and it's reassuring to know that false positive rates are about right with or without correction. A separate transform results function seems like a good way to handle things. Thanks again and have a good evening,
@alexeperon are you happy for me to use the code above in a PR to add this?
Hi Jasper, I'd add the following as a separate function, following discussions with Heiko above. I've checked it and it seems to work well.
Hi all,
When performing group-level analysis using RSA, we might want to test correlation values against 0. This can be done in rsatoolbox using the .inference.eval_fixed function, for example, in which the distribution of correlation values is tested against 0.
Conventionally, the correlation values (r or rho) are z-transformed using a Fisher transform prior to running t-tests. This brings the distribution of the correlation scores closer to normal (normality being an assumption of the t-test).
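As a rough illustration of this convention, independent of rsatoolbox: the Fisher transform is z = arctanh(r) = 0.5 * ln((1 + r) / (1 - r)), applied to each subject's correlation before a one-sample t-test against 0. The `fisher_z` helper and the per-subject values below are made up for the example.

```python
import numpy as np
from scipy import stats


def fisher_z(r, eps=1e-12):
    """Fisher z-transform: z = arctanh(r) = 0.5 * log((1 + r) / (1 - r))."""
    r = np.asarray(r, dtype=float)
    # Clip so that r = +/-1 does not map to +/-infinity.
    r = np.clip(r, -1 + eps, 1 - eps)
    return np.arctanh(r)


# Made-up per-subject correlation values, for illustration only.
subject_r = np.array([0.62, 0.71, 0.55, 0.80, 0.67, 0.74])

# One-sample t-test against 0 on the raw correlations ...
t_raw, p_raw = stats.ttest_1samp(subject_r, popmean=0.0)

# ... and on the Fisher z-transformed values.
t_z, p_z = stats.ttest_1samp(fisher_z(subject_r), popmean=0.0)

print(f"raw:      t = {t_raw:.2f}, p = {p_raw:.2g}")
print(f"Fisher z: t = {t_z:.2f}, p = {p_z:.2g}")
```

The transform is close to the identity near 0 and stretches values near ±1, so it matters most for groups with high correlation scores.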
It appears that rsatoolbox does not currently do this - which might result in biased results for groups of subjects with high correlation scores.
There are multiple points where this could be added, but probably the logical point is in the rsatoolbox.inference.evaluate module, specifically the eval_fixed function. This means that (1) the compare functions still provide correlation scores, and (2) any downstream functions use z-scored evaluation values.
The current version of the function (0.1.5) looks like this:
Instead, we could use the method variable to see if our scores are correlations. If they are, they should be z-scored. Let's also add a 'zscore' input in case we want to avoid this for other downstream functions. This results in the following:
Please note I am not an expert statistician, and so this suggestion comes with the caveat that I don't know how the distribution of other methods (cosine, cosine_cov, corr_cov) works with a t-test - which is why only 'corr', 'tau-a' and 'rho-a' are included in the above code. It is also worth noting that this is a substantial change, as it means all downstream functions using correlation values will then use z-scored values, whether a t-test or not. However, this does avoid changing all other downstream functions, and the added 'zscore' input gives user control over whether they want to use raw correlation values or not.
A further caveat is that I have only implemented this for the eval_fixed function. It may also be useful for other functions in rsatoolbox.inference.evaluate, but honestly I'm not quite sure so don't want to tamper with it!
I hope this is somewhat useful. For a couple of quick online sources, see:
https://en.wikipedia.org/wiki/Fisher_transformation
https://www.newbi4fmri.com/tutorial-9-mvpa-rsa
https://dartbrains.org/content/RSA.html
Best wishes,
Alex