-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistency between python and C++ versions #4
Comments
Can you share the phenotype file?
…On Fri, Jun 26, 2020, 04:47 Devin R Leopold ***@***.***> wrote:
I was able to get the C++ version compiled, but now I am seeing large
discrepancies with the results compared to the python version. The problem
seems to be with the C++ version. For example, here is the output for 2
phenotypes using the python version:
(base) ***@***.***:~/Projects/Populus_GWAS$ python/bin/python3 python/feather/permutation_testing.py --kinship_eigenvalues output/scratch/e.vals --kinship_eigenvectors output/scratch/e.vecs --phenotypes output/scratch/Y --permutation --samc --n_permutations 100000
Note: No covariates supplied, using a constant intercept covariate.
Calculating heritability estimates...
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 24.56it/s]
Calculating permutation p-values...
100%|████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [06:31<00:00, 195.90s/it]
n_phen h2_est param_p perm_p RSE
0 0.92585 1.967e-09 4.49e-06 0.03457
1 0.53729 1.094e-14 0.0001163 0.0514
With the exact same data, here is some output from the C++ version:
(base) ***@***.***:~/Projects/Populus_GWAS$ ./python/feather/permutation_testing_samc --eigenvalues_filename output/scratch/e.vals --eigenvectors_filename output/scratch/e.vecs --phenotypes_filename output/scratch/Y --heritabilities_filename output/scratch/heritability_estimates.txt --phenotype_indices 0,1 --n_permutations 1000000 --samc true
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
0.889689
0.772495
Without using samc, jut the fast permutation method, the results are more
similar, though the python version seems to consistently give lower
p-values over many runs of each method. Here example of the same 2
phenotypes using fast permutations in the python version:
(base) ***@***.***:~/Projects/Populus_GWAS$ python/bin/python3 python/feather/permutation_testing.py --kinship_eigenvalues output/scratch/e.vals --kinship_eigenvectors output/scratch/e.vecs --phenotypes output/scratch/Y --permutation --fast --n_permutations 100000
Note: No covariates supplied, using a constant intercept covariate.
Calculating heritability estimates...
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 22.86it/s]
Calculating permutation p-values...
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.54s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.54s/it]
100%|████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [10:57<00:00, 328.74s/it]
n_phen h2_est param_p perm_p
0 0.92585 1.967e-09 0
1 0.53729 1.094e-14 0.00011
And here is the C++:
(base) ***@***.***:~/Projects/Populus_GWAS$ ./python/feather/permutation_testing_samc --eigenvalues_filename output/scratch/e.vals --eigenvectors_filename output/scratch/e.vecs --phenotypes_filename output/scratch/Y --heritabilities_filename output/scratch/heritability_estimates.txt --phenotype_indices 0,1 --n_permutations 100000
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
0.0016
0.00572
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#4>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAEJVP6X7BFQXJ7TGDJFK7DRYP42XANCNFSM4OI4WUSA>
.
|
Sure. Here is the phenotype file, as well as the eigenvalues and eigenvectors I used in the example posted above. I also just tested the C++ version on the example data and the result was similar to what is shown in the documentation, but I do not see an obvious reason why my data would be behaving differently. |
I think I have pinpointed the problem. The input files I was using were tab delimited, but it appears that the C++ version of feather does not function properly with this format. I noticed that the example files are space delimited, and when I switched my input files to this format the output is as expected. I do not know C++, so I did not poke around in the code to try and figure out why this is and why the fast permutation in C++ gives closer to correct (but not quite right) output than the samc approach. I think it would be good to generalize the code it to accept tab delimited files or have the program throw an error if the input format in not what is required, particularly given that the tab delimited files are common and work with the python version of feather. |
Thanks for noticing this issue. I am away from the computer for the rest of
the week, I hope to address it soon next week.
…On Mon, Jun 29, 2020, 21:37 Devin R Leopold ***@***.***> wrote:
I think I have pinpointed the problem. The input files I was using were
tab delimited, but it appears that the C++ version of feather does not
function properly with this format. I noticed that the example files are
space delimited, and when I switched my input files to this format the
output is as expected. I do not know C++, so I did not poke around in the
code to try and figure out why this is and why the fast permutation in C++
gives closer to correct (but not quite right) output than the samc
approach. I think it would be good to generalize the code it to accept tab
delimited files or have the program throw an error if the input format in
not what is required, particularly given that the tab delimited files are
common and work with the python version of feather.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#4 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAEJVP7KRBCS4OE2CAKFSRLRZDNM7ANCNFSM4OI4WUSA>
.
|
I was able to get the C++ version compiled, but now I am seeing large discrepancies with the results compared to the python version. The problem seems to be with the C++ version. For example, here is the output for 2 phenotypes using the python version:
With the exact same data, here is some output from the C++ version:
Without using samc, jut the fast permutation method, the results are more similar, though the python version seems to consistently give lower p-values over many runs of each method. Here example of the same 2 phenotypes using fast permutations in the python version:
And here is the C++:
The text was updated successfully, but these errors were encountered: