Interpretation problem when comparing identical high-level terms #32

Open
tivdnbos opened this issue Aug 13, 2020 · 3 comments

Comments

@tivdnbos
Contributor

When a term is compared with itself, the score drops as the term becomes more general, e.g.:
GO:0030170 (pyridoxal phosphate binding) vs GO:0030170 gives 99% similarity
GO:0043167 (ion binding) vs GO:0043167 gives 55% similarity
GO:0003674 (molecular function) vs GO:0003674 gives 0% similarity

I also tested what happens when the term appears multiple times in the list (e.g. 10x GO:0043167 vs 1x GO:0043167), but this gives the same result: 55% in this case.
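
One aggregation that would behave exactly this way is a best-match average, where each term is paired with its best-scoring counterpart in the other list. Whether the tool actually aggregates like this is an assumption, and `best_match_average` and `toy_term_similarity` are hypothetical names; the 0.55 is simply the score reported above. A minimal Python sketch:

```python
def toy_term_similarity(a, b):
    # Hypothetical stand-in for the real term similarity: identical terms
    # get the 0.55 reported above for GO:0043167, anything else gets 0.0.
    return 0.55 if a == b else 0.0


def best_match_average(terms_a, terms_b, sim=toy_term_similarity):
    # Assumed aggregation (not confirmed for this tool): each term keeps
    # its best match in the other list, then all best matches are averaged.
    # Both lists must be non-empty.
    best_a = [max(sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(sim(b, a) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


once = best_match_average(["GO:0043167"], ["GO:0043167"])
ten_times = best_match_average(["GO:0043167"] * 10, ["GO:0043167"])
print(once, ten_times)
```

Under this assumption every copy of the term finds the same best match, so repeating it cannot move the average.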

@rababerladuseladim
Contributor

According to the authors, the simrel method is aimed at comparing gene products rather than functional profiles. Generic terms are therefore penalized: “Generic terms do not have a high relevance for the comparison of the exact function of different gene products.” In my opinion, this penalty does not make sense when comparing profiles. Without the penalty, the simrel method reduces to the simLin method.
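
To make the penalty concrete: as I understand the two definitions, simLin is 2·log p(MICA) / (log p(c1) + log p(c2)), and simrel multiplies that by (1 − p(MICA)), where p is the annotation probability of a term and MICA is the most informative common ancestor. For a term compared with itself, simLin is 1 and simrel is 1 − p. A minimal Python sketch follows; the function names are hypothetical, and the probabilities are back-calculated from the percentages reported in this issue, not taken from real annotation data:

```python
import math


def sim_lin(p_c1, p_c2, p_mica):
    # Lin similarity from term probabilities (all in (0, 1]).
    denom = math.log(p_c1) + math.log(p_c2)
    if denom == 0.0:
        # Both terms are the root (p = 1), so the formula is 0/0.
        # Returning 1.0 here is an assumption of this sketch.
        return 1.0
    return 2.0 * math.log(p_mica) / denom


def sim_rel(p_c1, p_c2, p_mica):
    # Lin similarity scaled by the relevance factor (1 - p(MICA)).
    return sim_lin(p_c1, p_c2, p_mica) * (1.0 - p_mica)


# Hypothetical probabilities inferred from the scores in this issue.
assumed_p = {"GO:0030170": 0.01, "GO:0043167": 0.45, "GO:0003674": 1.0}

for term, p in assumed_p.items():
    # A term compared with itself is its own MICA.
    print(term, round(sim_lin(p, p, p), 2), round(sim_rel(p, p, p), 2))
```

With these assumed probabilities the simrel self-similarities come out near 0.99, 0.55 and 0.0, in line with the percentages reported above, while simLin stays at 1 for all three terms.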

@tivdnbos
Contributor Author

tivdnbos commented Aug 14, 2020

I suggest creating a separate branch where we test this with simLin. What do you think, @rababerladuseladim @pverscha?

@rababerladuseladim
Contributor

I redid the analysis with the simLin metric; the results are here: https://github.com/MEGA-GO/manuscript-data-analysis/tree/use_lin_metric
Sample clustering is not affected; the similarity ranges shift slightly toward higher values.
