-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
outdated instruction "pheweb phenolist glob-files" #217
Comments
Thanks for pointing out “glob” vs “glob-files”! I just fixed it.
For rsids, I think you’re right. Anyways, it’s a harmless change, right?
If you make a PR with that change I’ll merge it. Or did you already make
one? I can’t remember anymore. Actually, I think it would be even better
to just match on chr:pos and ignore ref/alt.
…On Wed, Feb 14, 2024 at 9:34 PM Jie Huang ***@***.***> wrote:
Hi, there:
I noticed *pheweb phenolist* *glob-files* on this instruction page
https://github.com/statgen/pheweb/blob/master/etc/detailed-loading-instructions.md#configuration-options.
However, it won't work. It should be just "glob", not "glob-files".
The first time when I run *pheweb process*, I think pheweb automatically
download the huge dbsnp reference files. If I already have these huge dbsnp
files downloaded for another software, how could I speficy its location in
the config.py file so that it won't need to be re-downloaded?
Previously, I have pointed out that many GWAS simply report A1/A2, not the
strict REF/ALT. Once the reference genomes get bigger and bigger, the
REF/ALT specification would become really indefinite and inconsistent.
Therefore, I strongly suggest that pheweb would accommodate the use of
A1/A2 and be able to map rsIDs with a better flexibility. Otherwise, most
SNPs in a GWAS would have no mapped rsIDs after running pheweb. You guys *only
need to change one line (line 140), for add_rsids.py*.
*Before*:
rsids = [rsid['rsid'] for rsid in rsid_group if cpra['ref'] == rsid['ref']
and are_match(cpra['alt'], rsid['alt'])]
*After*:
rsids = [rsid['rsid'] for rsid in rsid_group if (cpra['ref'] ==
rsid['ref'] and are_match(cpra['alt'], rsid['alt'])) or (cpra['ref'] ==
rsid['alt'] and are_match(cpra['alt'], rsid['ref']))]
Please kindly consider this.
Thank you & best regards,
Jie
—
Reply to this email directly, view it on GitHub
<#217>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGSPCJMLLTFKU6FZBELGITYTVX4HAVCNFSM6AAAAABDJPBWHWVHI2DSMVQWIX3LMV43ASLTON2WKOZSGEZTKNJSGQ4TGNQ>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Dear Peter:
Thanks!
I am not sure what “make a PR with that change” refers to.
Previously we discussed about this, at #173 (comment)
It is critical that we match by CHR:POS:A1:A2, not just CHR:POS.
I simply propose that we match by alleles, not demanding that the REF allele must mean the reference allele from the wild monkeys.
One day when all monkeys and all humans are sequenced, we might have difficulty to tell who is “reference” vs. “alternative”.
So as long as we are talking about the same two alleles, we should allow users to use different notations such as “effect” vs. “non-effect”.
As you see from the code that I proposed, it is just one line. I fully tested it and it worked perfectly.
Best regards,
Jie
发件人: Peter VandeHaar ***@***.***>
发送时间: 2024年2月16日 0:20
收件人: statgen/pheweb ***@***.***>
抄送: Jie Huang ***@***.***>; Author ***@***.***>
主题: Re: [statgen/pheweb] outdated instruction "pheweb phenolist glob-files" (Issue #217)
Thanks for pointing out “glob” vs “glob-files”! I just fixed it.
For rsids, I think you’re right. Anyways, it’s a harmless change, right?
If you make a PR with that change I’ll merge it. Or did you already make
one? I can’t remember anymore. Actually, I think it would be even better
to just match on chr:pos and ignore ref/alt.
On Wed, Feb 14, 2024 at 9:34 PM Jie Huang ***@***.*** <mailto:***@***.***> > wrote:
Hi, there:
I noticed *pheweb phenolist* *glob-files* on this instruction page
https://github.com/statgen/pheweb/blob/master/etc/detailed-loading-instructions.md#configuration-options.
However, it won't work. It should be just "glob", not "glob-files".
The first time when I run *pheweb process*, I think pheweb automatically
download the huge dbsnp reference files. If I already have these huge dbsnp
files downloaded for another software, how could I speficy its location in
the config.py file so that it won't need to be re-downloaded?
Previously, I have pointed out that many GWAS simply report A1/A2, not the
strict REF/ALT. Once the reference genomes get bigger and bigger, the
REF/ALT specification would become really indefinite and inconsistent.
Therefore, I strongly suggest that pheweb would accommodate the use of
A1/A2 and be able to map rsIDs with a better flexibility. Otherwise, most
SNPs in a GWAS would have no mapped rsIDs after running pheweb. You guys *only
need to change one line (line 140), for add_rsids.py*.
*Before*:
rsids = [rsid['rsid'] for rsid in rsid_group if cpra['ref'] == rsid['ref']
and are_match(cpra['alt'], rsid['alt'])]
*After*:
rsids = [rsid['rsid'] for rsid in rsid_group if (cpra['ref'] ==
rsid['ref'] and are_match(cpra['alt'], rsid['alt'])) or (cpra['ref'] ==
rsid['alt'] and are_match(cpra['alt'], rsid['ref']))]
Please kindly consider this.
Thank you & best regards,
Jie
—
Reply to this email directly, view it on GitHub
<#217>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGSPCJMLLTFKU6FZBELGITYTVX4HAVCNFSM6AAAAABDJPBWHWVHI2DSMVQWIX3LMV43ASLTON2WKOZSGEZTKNJSGQ4TGNQ>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.*** <mailto:***@***.***> >
—
Reply to this email directly, view it on GitHub <#217 (comment)> , or unsubscribe <https://github.com/notifications/unsubscribe-auth/AGNS6774HZLPCQIW2UO7Y7TYTYYS3AVCNFSM6AAAAABDJPBWHWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBWGQ3DMMZRGU> .
You are receiving this because you authored the thread. <https://github.com/notifications/beacon/AGNS677HD6UKMZH7R7SN3I3YTYYS3A5CNFSM6AAAAABDJPBWHWWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTTUAS4AW.gif> Message ID: ***@***.*** ***@***.***> >
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi, there:
I noticed pheweb phenolist glob-files on this instruction page https://github.com/statgen/pheweb/blob/master/etc/detailed-loading-instructions.md#configuration-options. However, it won't work. It should be just "glob", not "glob-files".
The first time when I run pheweb process, I think pheweb automatically download the huge dbsnp reference files. If I already have these huge dbsnp files downloaded for another software, how could I speficy its location in the config.py file so that it won't need to be re-downloaded?
Previously, I have pointed out that many GWAS simply report A1/A2, not the strict REF/ALT. Once the reference genomes get bigger and bigger, the REF/ALT specification would become really indefinite and inconsistent. Therefore, I strongly suggest that pheweb would accommodate the use of A1/A2 and be able to map rsIDs with a better flexibility. Otherwise, most SNPs in a GWAS would have no mapped rsIDs after running pheweb. You guys only need to change one line (line 140), for add_rsids.py.
Before:
rsids = [rsid['rsid'] for rsid in rsid_group if cpra['ref'] == rsid['ref'] and are_match(cpra['alt'], rsid['alt'])]
After:
rsids = [rsid['rsid'] for rsid in rsid_group if (cpra['ref'] == rsid['ref'] and are_match(cpra['alt'], rsid['alt'])) or (cpra['ref'] == rsid['alt'] and are_match(cpra['alt'], rsid['ref']))]
Please kindly consider this.
Thank you & best regards,
Jie
The text was updated successfully, but these errors were encountered: