feat: ingest Loss-of-Function variant data from OTAR2075 #105

Open
wants to merge 3 commits into
base: dev

Conversation

vivienho
Contributor

This PR adds a new DAG to ingest LOF variant data curated in OTAR2075. The ingested output is provided as an input to the variant_index step to annotate an existing variant index.

Collaborator

project-defiant left a comment

Well done, a new DAG for harmonising the OTAR2075 results!

@@ -150,6 +150,7 @@ nodes:
step.variant_index_path: '{release_dir}/variant_index'
step.amino_acid_change_annotations:
- gs://otar013-ppp/OTAR2081_foldx/foldx_variant_annotation
step.lof_curation_variant_annotations_path: gs://otar013-ppp/OTAR2075_lof_curation/lof_curation_variant_annotations
Collaborator

Should that not be another annotation in the list above for consistency?

Contributor Author

These are different types: one is an AminoAcidVariants object and the other is a VariantIndex object, so they should be separate inputs.

Collaborator

I see your point, though we already have another variant index from gnomAD.

Contributor Author

Ah, I see what you mean. I can add it to the variant_index_path instead of making it a separate parameter.
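
For readers following this thread, the point about input types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `group_sources_by_type` is a hypothetical helper, not part of gentropy or the orchestration code, and the gnomAD entry is a placeholder because its real path does not appear in this thread. The idea is that sources of the same dataset type can share one list-valued parameter, while sources of different types need separate inputs.

```python
from collections import defaultdict


def group_sources_by_type(sources):
    """Group (dataset_type, path) pairs so each type maps to one list of paths.

    Hypothetical helper for illustration only; not a gentropy API.
    """
    grouped = defaultdict(list)
    for dataset_type, path in sources:
        grouped[dataset_type].append(path)
    return dict(grouped)


# The two gs:// paths are taken from the diff above; the gnomAD entry is a
# placeholder, since its actual location is not given in this thread.
sources = [
    ("AminoAcidVariants", "gs://otar013-ppp/OTAR2081_foldx/foldx_variant_annotation"),
    ("VariantIndex", "gs://otar013-ppp/OTAR2075_lof_curation/lof_curation_variant_annotations"),
    ("VariantIndex", "<gnomad-variant-annotations-path>"),
]

grouped = group_sources_by_type(sources)
```

Under this reading, the LoF curation output and the gnomAD annotations would land in the same group because both are VariantIndex-shaped, which is the consistency argument the reviewer raises.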

dataproc:
  python_main_module: gs://genetics_etl_python_playground/initialisation/cli.py
  cluster_metadata:
    GENTROPY_REF: v2.0.1
Collaborator

Given that v2.1.0 is already out, I think we first have to freeze gentropy once your PR is merged, and only then can we merge this one.

Contributor Author

Yes.
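
As a rough illustration of the version concern above, the pinned `GENTROPY_REF` can be compared against the latest release tag mentioned in the review. This is a minimal sketch that assumes tags follow a plain `vMAJOR.MINOR.PATCH` pattern; `parse_ref` is a hypothetical helper, not something provided by gentropy or the orchestration repository.

```python
def parse_ref(ref: str) -> tuple[int, ...]:
    """Turn a tag such as 'v2.0.1' into (2, 0, 1) so tags compare numerically.

    Assumes a plain vMAJOR.MINOR.PATCH tag; pre-release suffixes are not handled.
    """
    return tuple(int(part) for part in ref.removeprefix("v").split("."))


pinned = "v2.0.1"  # GENTROPY_REF from the config in this diff
latest = "v2.1.0"  # release mentioned by the reviewer

# True when the pinned reference is older than the latest release.
pin_is_stale = parse_ref(pinned) < parse_ref(latest)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "v2.10.0" sorting before "v2.9.0".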

@project-defiant
Collaborator

@vivienho please see the comments

Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scoping integration of LoF project data into the Variant Page
2 participants