Example showing how to update a dataset in ord-data #758

Draft: wants to merge 7 commits into main

Conversation

bdeadman
Collaborator

No description provided.

@bdeadman
Collaborator Author

New example showing how I updated the Golden dataset (open-reaction-database/ord-data#214) after we were notified that some reactant and product SMILES strings were incorrect. The idea behind this example is to document the process of:

  1. loading a dataset and some replacement data from a .csv file
  2. looping over the reactions in the dataset and updating the required fields, including adding a new record_modified entry
  3. comparing the old and new datasets to review the changes
  4. preparing the file for upload to ord-data
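
The update-and-review steps listed above can be sketched roughly as follows. This is not the notebook from the PR: it uses only the Python standard library, with plain dicts standing in for ORD reaction messages, and every name in it (`load_replacements`, `update_reactions`, `diff_reactions`, the `reaction_id` / `reactant_smiles` / `product_smiles` keys, the sample SMILES, and the person name) is a hypothetical placeholder. In the real workflow the dataset would be loaded and written with the ord-schema tooling instead.

```python
import copy
import csv
import io
from datetime import datetime, timezone

SMILES_FIELDS = ("reactant_smiles", "product_smiles")  # hypothetical field names


def load_replacements(csv_text):
    """Map reaction_id -> CSV row holding the corrected SMILES."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["reaction_id"]: row for row in reader}


def update_reactions(reactions, replacements, person, details):
    """Return an updated copy; the original list is left untouched for review."""
    updated = copy.deepcopy(reactions)
    for reaction in updated:
        fix = replacements.get(reaction["reaction_id"])
        if fix is None:
            continue  # no correction supplied for this reaction
        changed = False
        for field in SMILES_FIELDS:
            new_value = fix.get(field)
            if new_value and new_value != reaction[field]:
                reaction[field] = new_value
                changed = True
        if changed:
            # Stand-in for adding a new record_modified entry to the provenance.
            reaction.setdefault("record_modified", []).append({
                "time": datetime.now(timezone.utc).isoformat(),
                "person": person,
                "details": details,
            })
    return updated


def diff_reactions(old, new):
    """List (reaction_id, field, old_value, new_value) for every changed field."""
    changes = []
    for before, after in zip(old, new):
        for field in SMILES_FIELDS:
            if before[field] != after[field]:
                changes.append(
                    (before["reaction_id"], field, before[field], after[field]))
    return changes


# Made-up sample data, for illustration only.
old_reactions = [
    {"reaction_id": "rxn-1", "reactant_smiles": "CCO",
     "product_smiles": "CC=O", "record_modified": []},
    {"reaction_id": "rxn-2", "reactant_smiles": "c1ccccc1",
     "product_smiles": "Clc1ccccc1", "record_modified": []},
]
replacement_csv = (
    "reaction_id,reactant_smiles,product_smiles\n"
    "rxn-2,c1ccccc1,Brc1ccccc1\n"
)

new_reactions = update_reactions(
    old_reactions,
    load_replacements(replacement_csv),
    person="A. Curator",
    details="Corrected product SMILES",
)
changes = diff_reactions(old_reactions, new_reactions)
for change in changes:
    print(change)
```

Working on a deep copy keeps the original dataset available, so the comparison in step 3 is a direct field-by-field diff rather than a reload from disk.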

Example enumerating over 2 template files (to account for differences in reaction analysis), and then merging the dataset.
Write the resulting dataset to a pb.gz file with the assigned ord id.
@bdeadman
Collaborator Author

Additional example showing how to generate a dataset by enumerating template files over spreadsheets. Two templates are used to accommodate some differences in the tabulated data, and the resulting datasets are then merged together.
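
As a rough illustration of the enumerate-then-merge idea, the sketch below fills one copy of a template per spreadsheet row and concatenates the results from two templates. It is a standard-library stand-in, not the code in this PR: ord-schema provides its own templating helpers, which are not used here, and the `$column$` placeholder style, the template strings, the column names, and the sample rows are all assumptions made for the example.

```python
import csv
import io


def enumerate_template(template, table_text):
    """Fill one copy of the template for each row of a CSV table."""
    records = []
    for row in csv.DictReader(io.StringIO(table_text)):
        filled = template
        for column, value in row.items():
            # Assumed placeholder convention: $column_name$
            filled = filled.replace(f"${column}$", value)
        records.append(filled)
    return records


# Two hypothetical templates that differ only in how the analysis is recorded.
TEMPLATE_A = "product=$product$ analysis=LCMS yield=$yield$"
TEMPLATE_B = "product=$product$ analysis=NMR conversion=$conversion$"

# Made-up tables, one per template.
table_a = "product,yield\nCCO,85\nCCN,72\n"
table_b = "product,conversion\nCCC,90\n"

# Enumerate each template over its own table, then merge into one dataset.
merged = (enumerate_template(TEMPLATE_A, table_a)
          + enumerate_template(TEMPLATE_B, table_b))
for record in merged:
    print(record)
```

Keeping one template per table layout avoids conditional logic inside a single template; the merge is then a simple concatenation of the generated reactions.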

2 participants