Versioning and docs for API #111

Open
3 tasks
welchr opened this issue Sep 19, 2018 · 6 comments

@welchr
Member

welchr commented Sep 19, 2018

Current API endpoint:

http://pheweb.sph.umich.edu:5000/api/variant/10:114758349-C-T

If the T2D portal is going to hit this for their UKBB PheWAS queries, we need to provide a versioned URL in case of backwards-incompatible changes. We should also provide some documentation, or possibly a metadata endpoint describing the current dataset (e.g. which imputation panel was used, which genome build, etc.).

  • Version URL
  • Documentation
  • Metadata endpoint
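
As a rough illustration of the first item, here is a minimal client-side sketch of how a versioned variant URL could be built. The `/api/<version>/` prefix and the helper name `variant_url` are assumptions for illustration, not something the server currently exposes.

```python
from urllib.parse import quote

# Host and port taken from the current endpoint quoted above.
BASE_URL = "http://pheweb.sph.umich.edu:5000"


def variant_url(variant_id: str, version: str = "v1") -> str:
    """Build a versioned variant URL (the /api/<version>/ layout is assumed)."""
    # Leave ':' and '-' unescaped so the variant ID reads as it does today.
    return f"{BASE_URL}/api/{version}/variant/{quote(variant_id, safe=':-')}"


print(variant_url("10:114758349-C-T"))
```

With a scheme like this, a backwards-incompatible change would ship under a new prefix (e.g. `v2`) while existing clients keep calling the old one.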

@pjvandehaar Would you be able to look at this sometime soon?

@welchr
Member Author

welchr commented Sep 20, 2018

Also, does this API endpoint include the phenotype groupings? Is that what is prefixed onto the phenostrings?

@pjvandehaar
Collaborator

In a title like "20002_1220: Non-cancer illness code, self-reported: diabetes", "20002_1220" is the UKB code for the phenotype, and "Non-cancer..." is the UKB name for the phenotype. In the JSON they're called phenocode and phenostring. We don't have categories loaded for this dataset, but I think I can add them quickly if you would use them.
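
To make the title layout concrete, here is a small sketch that splits a displayed title back into the two JSON fields. The helper name `split_title` is hypothetical, and it assumes the code never contains ": " itself, so the first ": " is the separator.

```python
def split_title(title: str) -> dict:
    """Split 'CODE: NAME' into phenocode and phenostring at the first ': '."""
    # partition() only splits on the first separator, so any later ': '
    # inside the phenotype name is preserved.
    phenocode, _, phenostring = title.partition(": ")
    return {"phenocode": phenocode, "phenostring": phenostring}


fields = split_title("20002_1220: Non-cancer illness code, self-reported: diabetes")
print(fields["phenocode"])
print(fields["phenostring"])
```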

@pjvandehaar
Collaborator

pjvandehaar commented Sep 21, 2018

Unfortunately, the categories are not very balanced, so they will be hard to render on a PheWAS plot. There are large categories and dozens of categories with <10 phenotypes. To display them, I can manually merge some similar categories.
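
A sketch of what that manual merge could look like, assuming a hand-written mapping from small categories to a merged label. The category names and phenotype IDs below are made up for illustration and are not from the dataset.

```python
from collections import Counter


def merge_categories(pheno_to_category, merge_map):
    """Relabel each phenotype's category through a hand-written merge map."""
    return {
        pheno: merge_map.get(category, category)
        for pheno, category in pheno_to_category.items()
    }


def undersized(pheno_to_category, min_size=10):
    """List categories that still have fewer than min_size phenotypes."""
    counts = Counter(pheno_to_category.values())
    return sorted(cat for cat, n in counts.items() if n < min_size)


# Made-up toy data: two singleton categories get merged into one.
pheno_to_category = {"p1": "cat_a", "p2": "cat_a", "p3": "cat_b", "p4": "cat_c"}
merge_map = {"cat_b": "cat_bc", "cat_c": "cat_bc"}

merged = merge_categories(pheno_to_category, merge_map)
print(undersized(pheno_to_category, min_size=2))
print(undersized(merged, min_size=2))
```

Running `undersized` after each round of merging shows which categories would still be too small to render.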

@pjvandehaar
Collaborator

pjvandehaar commented Sep 21, 2018

I believe that this API endpoint is still missing some data. Some phenotypes are missing sections of the genome, because we ran out of disk space on the loading machine, and fixing it was never a high priority. We've discussed this before, but I still want to be sure you're okay with that.

@pjvandehaar
Collaborator

pjvandehaar commented Sep 21, 2018

For the metadata endpoint, what are you looking for? How's /api/v1/metadata.json:

{
  "build": "GRCh37",
  "description": "Analysis of UKB data by Ben Neale's lab, round 1, imputed using UK10K and HRC.",
  "link": "http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank"
}

Should imputation get its own field? How are you hoping to use it? Perhaps maximum sample size as well?

@welchr
Member Author

welchr commented Sep 23, 2018

Apologies, Peter - I think we can actually put this on hold for now (at least from my end).

So far the LZ API seems to be able to serve the UKBB SAIGE HRC analysis, and my guess is we will probably go with that barring any major issues appearing. It makes it easier on the Broad since they're already set up to handle our PheWAS requests and responses.

In the long run, though, I don't think Postgres will be able to handle much more than this. So we may need to revert to a PheWeb-like API backed by your storage solution. It's still worth considering the points above about versioning, docs, metadata, etc., I think.

Regarding metadata - an imputation field would be good, to let people know which panel was used when imputing the genotypes used in the analysis. Something as simple as "HRC" or "1000G Phase 3" or "TOPMED" is at least helpful and better than nothing. Including it in the description is another option, but without making it a required field, it can be forgotten.
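
One way to keep the field from being forgotten is a check at load time. This is only a sketch: the field list and the helper name `missing_metadata_fields` are assumptions, and "HRC" is used purely as a placeholder value.

```python
# Assumed set of required fields: the three from the proposed
# metadata.json plus a dedicated imputation field.
REQUIRED_FIELDS = ("build", "description", "link", "imputation")


def missing_metadata_fields(metadata: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


proposed = {
    "build": "GRCh37",
    "description": "Analysis of UKB data by Ben Neale's lab, round 1, "
                   "imputed using UK10K and HRC.",
    "link": "http://www.nealelab.is/blog/2017/7/19/"
            "rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank",
}

print(missing_metadata_fields(proposed))

# Placeholder value only; the real panel name would come from the analysis.
with_imputation = {**proposed, "imputation": "HRC"}
print(missing_metadata_fields(with_imputation))
```

A dataset whose metadata fails this check could be rejected before it is served, which is what making the field required would amount to in practice.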
