Adios dataset name 319 #320

kshitij-v-mehta · 2025-01-27T17:42:48Z

Update Feb 03: PR ready for merging.

~~PR for review only. Not ready for merging yet.~~


kshitij-v-mehta · 2025-01-27T17:47:07Z

examples/csce/train_gap.py

@@ -288,6 +288,7 @@ def __getitem__(self, idx):
                data = generate_graphdata_from_smilestr(
                    smilestr, ytarget, csce_node_types, var_config
                )
+                data.dataset_name = 'csce'


@jychoi-hpc @allaffa Should the dataset_name be added this way manually, or should we read it from the json file?
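For comparison, the second option raised in the question (reading the name from the JSON file rather than hard-coding it) might look roughly like the sketch below. The `Dataset` / `name` key layout and the helper `dataset_name_from_config` are assumptions made for illustration only; they are not taken from this PR's diff.

```python
import json
from types import SimpleNamespace


def dataset_name_from_config(config_text, default=None):
    """Hypothetical helper: pull the dataset name out of a JSON config string.

    Assumes a layout like {"Dataset": {"name": "..."}}; returns `default`
    when either key is missing.
    """
    config = json.loads(config_text)
    return config.get("Dataset", {}).get("name", default)


# Stand-in for the contents of the example's JSON configuration file.
config_text = '{"Dataset": {"name": "csce"}}'

# Stand-in for a graph data sample; the real object is built elsewhere.
data = SimpleNamespace()
data.dataset_name = dataset_name_from_config(config_text, default="unknown")
```

Reading the name from the config keeps it in one place, while the manual assignment in the diff above is simpler and does not depend on the config layout.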


jychoi-hpc

Thank you for the update. It looks good to me.

allaffa · 2025-01-30T15:58:18Z

hydragnn/utils/datasets/adiosdataset.py

+
+        # Look for the dataset_name in any one of the Data samples and add it as an ADIOS attribute
+        dataset_name = self._get_dataset_name()
+        if dataset_name is not None:


@kshitij-v-mehta
Why do we need this if the dataset_name attribute is already taken care of in the parent class AbstractBaseDataset?

kshitij-v-mehta · Jan 30, 2025

This code is in the ADIOS writer. It checks whether dataset_name is present on the object and, if so, adds it to the ADIOS file.
Similarly, if the reader finds dataset_name in the file, it reads it and adds it to the object. The get method of the parent class then reads this attribute.
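The writer/reader behaviour described in this reply can be sketched without ADIOS. The snippet below only illustrates the pattern: a JSON file stands in for the ADIOS file, and `find_dataset_name`, `write_store`, and `read_store` are hypothetical helpers, not HydraGNN or ADIOS functions.

```python
import json
import os
import tempfile
from types import SimpleNamespace


def find_dataset_name(samples):
    """Return the first dataset_name found on any sample, or None."""
    for sample in samples:
        name = getattr(sample, "dataset_name", None)
        if name is not None:
            return name
    return None


def write_store(path, samples):
    """Writer side: record dataset_name once as a file-level attribute."""
    store = {"attributes": {}, "samples": [vars(s).copy() for s in samples]}
    name = find_dataset_name(samples)
    if name is not None:
        store["attributes"]["dataset_name"] = name
        # Only remove the per-sample copy if the key is actually present.
        for record in store["samples"]:
            record.pop("dataset_name", None)
    with open(path, "w") as f:
        json.dump(store, f)


def read_store(path):
    """Reader side: if the attribute exists, attach it to every sample."""
    with open(path) as f:
        store = json.load(f)
    name = store["attributes"].get("dataset_name")
    samples = []
    for record in store["samples"]:
        sample = SimpleNamespace(**record)
        if name is not None:
            sample.dataset_name = name
        samples.append(sample)
    return samples


def roundtrip(samples):
    """Write the samples to a temporary file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.json")
        write_store(path, samples)
        return read_store(path)


named = roundtrip([SimpleNamespace(y=1.0, dataset_name="csce"), SimpleNamespace(y=2.0)])
unnamed = roundtrip([SimpleNamespace(y=3.0)])
```

In this sketch a name found on any one sample is applied to all samples on read, and a file written without a name yields samples with no `dataset_name` attribute at all.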


kshitij-v-mehta · 2025-02-06T15:16:25Z

Can this PR be merged? The check is failing due to a formatting issue in
examples/multidataset_deepspeed/train.py, which is unrelated to this PR's code.

kshitij-v-mehta · 2025-02-06T21:23:31Z

@pzhanggit This PR adds the dataset name to the ADIOS file. Does it also need to be added to the Pickle output?

pzhanggit · 2025-02-07T15:20:08Z

> @pzhanggit This PR adds the dataset name to the ADIOS file. Does it also need to be added to the Pickle output?

Thanks, Kshitij! I think so, as long as we continue to support Pickle. @allaffa what do you think?

kshitij-v-mehta · 2025-02-07T17:48:13Z

> > @pzhanggit This PR adds the dataset name to the ADIOS file. Does it also need to be added to the Pickle output?
>
> Thanks, Kshitij! I think so, as long as we continue to support Pickle. @allaffa what do you think?

Never mind, I added it to the pickle output as well.

* add multiple branches to Base model
* adjust based on Max's new datasets with dataset_name
* add json for compute grad energy physics-informed force prediction
* revert time change
* reverse prior change
* Add check to ensure correct output dim for energy in physics-informed force prediction
* Implement the check after setting var_config
* revert change
* Shorten comment

---------

Co-authored-by: Rylie Weaver <[email protected]>
Co-authored-by: Rylie Weaver <[email protected]>

* data attributes updated for consistency across datasets
* non-normalized chemical composition added as data attribute
* download_dataset.sh added for transition1x example
* download dataset flag updated
* scripts updated
* development of tranistion1x scripts continues
* transiton1x scripts completed
* black formatting fixed
* printouts removed
* parallelizatin of data reading introduced
* blsck formatting fixed
* detach().clone() used to defined normalized energy per atom and black formatting
* add compute_grad_energy=False as explicit argument
* add data name as attributed to each data object
* compute_grad_energy is parsed as input argument with default value set to False
* edge_index, edge_attr, and edge_shifts explicitly itnroduced in the definition of the Data object
* changed data.force into data.forces for ani1x and qm7x examples
* smiles_string added as data attribute
* remove redundant logic on energy normalization from omat24 example
* force threshold value increased to 1000 for ani-1x
* Reverted smiles_utils.py to version from commit 3c3c434
* xyz2mol functionalities put in a separate file
* download dataset script added for qm7x
* renamed data.force as data.forces in ani1x
* natoms converted into a tensor
* verbosity level ntroduced for ani1x
* Z corrected into atomic_numbers for qm7x example
* bug fixed for data attributes in transition1x
* try-except in transition1x rescoped
* transform coordinates fixed in transition1x
* iterate_tqdm used in utils.create_graph_data for transition1x example
* total_enerfgy replaced with energy

---------

Co-authored-by: Massimiliano Lupo Pasini <[email protected]>
Co-authored-by: Massimiliano Lupo Pasini <[email protected]>


Kshitij V. Mehta added 2 commits January 27, 2025 12:38

adds dataset_name attribute to the base class and reads it into a data sample ORNL#319

083ae32

adds dataset_name to the Data class a kwarg ORNL#319

edafeb5

kshitij-v-mehta requested review from allaffa and jychoi-hpc January 27, 2025 17:42

dataset_name example for csce ORNL#319

8fca19f

kshitij-v-mehta commented Jan 27, 2025


allaffa and others added 4 commits January 27, 2025 13:06

GraphGPS arguments added to JSON file for CSCE examplke

c859eaf

Merge remote-tracking branch 'max_fork/update_csce_ising_examples' into adios_dataset_name_319

4d62ab9

formatting edits ORNL#319

ca57aa4

formatting ORNL#319

c62961d

kshitij-v-mehta marked this pull request as ready for review January 28, 2025 15:36

jychoi-hpc approved these changes Jan 28, 2025


kshitij-v-mehta assigned kshitij-v-mehta and unassigned kshitij-v-mehta Jan 28, 2025

allaffa reviewed Jan 30, 2025


checking to see if dataset_name is in keys before attempting to remove it ORNL#319

4c3e53c

pzhanggit and others added 10 commits February 10, 2025 13:36

Merge remote-tracking branch 'upstream/main' into adios_dataset_name_319

b425edc

Add multidataset example with deepspeed support (ORNL#316)

924532e

* Add multidataset example with deepspeed support * Change base.json to follow GPS' requirement

data attributes updated for consistency across datasets

f03174e

non-normalized chemical composition added as data attribute

7689405

scripts updated

a09f17f

development of tranistion1x scripts continues

214c28e

black formatting fixed

95150b8

detach().clone() used to defined normalized energy per atom and black formatting

d65a975

allaffa and others added 24 commits February 11, 2025 15:00

smiles_string added as data attribute

7898c88

Reverted smiles_utils.py to version from commit 3c3c434

1edc5fd

xyz2mol functionalities put in a separate file

a0919dc

total_energy and total_energy_per_atom replaced with energy and energy_per_atom

99f883f

commented out fields for transition1x

2147739

duplicated liens removed

e8c313f

First draft full

0eecdf7

Merge remote-tracking branch 'upstream/main' into adios_dataset_name_319

d916ae8

Utility functions to read pbc uniformly as a tensor and applying positional transforms accounting for pbc

517bd1b

Convert data.pbc bool --> int during writing and int --> bool during reading

75014f1

Alexandria fully updated to have the same data_object attributes, type and shape for each attribute, and appropriate positional transform

3038ee8

re-order linesf

2991aa4

make sure on cpu for ASE

386edff

Update examples full

7108e84

Revise comment

8e4592b

correct bool check for tensor

ab52d94

Remove unnecessary imports

dae490f

adjust comments

faad02e

Use PBC as int instead full

3bec7c9

Make sure to view cell as (3,3)

0b78cf2

Keep samples with default if we cant read pbc and tweak pbc transforms

385c62d

Merge pull request ORNL#25 from RylieWeaver/fix-examples-pbc

a6e01a3

Draft Fix for PBC Examples

Merge branch 'Predictive_GFM_2025_fix_examples' into adios_dataset_name_319

8c9cf28

Removed adding a blank attributed 'dataset_name' to the Data class. It not required. ORNL#319

8232537
