Feature/virtual nodes #26

Open · wants to merge 146 commits into main

Conversation

@sidnb13 (Member) commented May 22, 2023

Summary

This is a very large PR: it contains all of the current work on virtual-node GNNs, along with a variety of improvements and features for the preprocessing and config-management pipelines. Details on the integrated features are below. They may not be comprehensive, so I'll update and expand them based on feedback. The main open issues to discuss:

  • Updating example configs to reflect the new developments, and creating documentation to match.
  • Ensuring that the new features don't break existing pipelines (i.e. backward compatibility).

General

  • Conceptual changes to the main.py entrypoint, adding support for WandB sweeps. MatDeepLearn can now serve as 1) the original functionality, in which a config file is specified for a task; 2) the entrypoint for an automated sweep, in which the config is read from WandB rather than a file and the task is self-contained; and 3) a generator of sweeps given a sweep configuration. The last of these integrates with jobs (discussed next): rather than training immediately, it generates the commands needed to launch asynchronous sweeps. Sweeps can be run sequentially or in parallel (this feature is still being ironed out); parallel is recommended, since runs are prone to failure.
  • The concept of jobs, which are run either locally or via Slurm and can be integrated into the registry. Each job type creates a custom entrypoint, either a command or a batch file to be run by the user. MatDeepLearn can operate in "job-script" mode, where it generates these entrypoints for the user to run; this is useful for iterative experiments. Jobs integrate with sweeps, as discussed above (a rough sketch of the job-script idea follows this list).
  • New custom dataclasses derived from torch_geometric.data.Data. These implement custom batching routines to handle the new attributes introduced by virtual nodes (see the batching sketch after this list).
  • Additional utilities implementing complex dictionary-merging features and profiling methods.
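
To make "job-script" mode concrete, here is a minimal sketch under assumptions, not the PR's actual generator: `write_slurm_script`, the sbatch directive values, and the example command are all illustrative placeholders rather than MatDeepLearn's real API.

```python
# Hypothetical sketch of "job-script" mode: instead of training immediately,
# emit a batch file for the user to submit with `sbatch`.
from pathlib import Path

SBATCH_TEMPLATE = """\
#!/bin/bash
#SBATCH --job-name={name}
#SBATCH --gres=gpu:1
#SBATCH --time=24:00:00

{command}
"""

def write_slurm_script(name: str, command: str, out_dir: str = ".") -> Path:
    """Write an sbatch entrypoint and return its path."""
    path = Path(out_dir) / f"{name}.sbatch"
    path.write_text(SBATCH_TEMPLATE.format(name=name, command=command))
    return path

# e.g. one generated entrypoint per sweep run (command string is illustrative):
script = write_slurm_script("vn_sweep_0", "python main.py --config example.yml")
```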
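
And a minimal sketch of the kind of batching customization the custom dataclasses need (not the actual MatDeepLearn classes; `VirtualNodeData`, `z_vn`, and `edge_index_vn` are illustrative names): PyG collates a batch by concatenating attributes and offsetting index tensors, and a Data subclass controls this via `__inc__` and `__cat_dim__`.

```python
import torch
from torch_geometric.data import Batch, Data

class VirtualNodeData(Data):
    """Hypothetical Data subclass: `z_vn` holds virtual-node features and
    `edge_index_vn` connects real nodes (row 0) to virtual nodes (row 1)."""

    def __inc__(self, key, value, *args, **kwargs):
        if key == "edge_index_vn":
            # Offset the two rows by the number of real and virtual nodes,
            # respectively, when examples are collated into one graph.
            return torch.tensor([[self.num_nodes], [self.z_vn.size(0)]])
        return super().__inc__(key, value, *args, **kwargs)

    def __cat_dim__(self, key, value, *args, **kwargs):
        if key == "edge_index_vn":
            return 1  # edge indices concatenate along the column dimension
        return super().__cat_dim__(key, value, *args, **kwargs)

# Usage: the second graph's edge_index_vn rows are offset by 3 and 1.
d1 = VirtualNodeData(x=torch.randn(3, 8), z_vn=torch.randn(1, 8),
                     edge_index_vn=torch.tensor([[0, 1, 2], [0, 0, 0]]))
d2 = VirtualNodeData(x=torch.randn(2, 8), z_vn=torch.randn(1, 8),
                     edge_index_vn=torch.tensor([[0, 1], [0, 0]]))
batch = Batch.from_data_list([d1, d2])
```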

Models

  • Added two new models to the registry (CGCNN-VN and CGCNN-VN-HG, the heterogeneous variant).
  • Added TorchMD-NET and GemNet-OC.
  • A new routines concept: a way to organize model-augmenting sub-features such as custom pooling methods (a sketch of the registry pattern follows this list).
  • A new layers concept: a way to organize pieces of network architectures that do not function as standalone components.
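
The PR text doesn't show the routines API itself, so the following is only a minimal sketch of the decorator-registry pattern such a concept could follow; `Registry`, `register_routine`, and `mean_pool` are hypothetical names.

```python
from typing import Callable, Dict

from torch_geometric.nn import global_mean_pool

class Registry:
    """Hypothetical registry mapping routine names to callables."""
    routines: Dict[str, Callable] = {}

    @classmethod
    def register_routine(cls, name: str) -> Callable:
        def wrapper(fn: Callable) -> Callable:
            cls.routines[name] = fn
            return fn
        return wrapper

@Registry.register_routine("mean_pool")
def mean_pool(x, batch):
    # A pooling routine a model can select by its config key.
    return global_mean_pool(x, batch)

pool_fn = Registry.routines["mean_pool"]  # looked up from config
```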

Preprocessing

  • Batch processing, which takes advantage of PyG's batching internals. We specify the batch size in the config, process batches of data, then convert the processed data back to individual examples (see the sketch after this list; it will also be helpful to document the general approach, since the implementation is a bit involved).
  • Batch-processing-capable transforms. These can be specified in the config, and unsupported transforms raise appropriate errors; each transform must choose whether or not to implement this.
  • Transforms can now share a common set of arguments with preprocessing; this is done via the config file.
  • Addition of new transforms for the virtual node feature.
  • Additional helper methods.
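
A minimal sketch of the batched-preprocessing approach, assuming only PyG's public collation API (`process_batch` is a hypothetical stand-in for an arbitrary batch-capable transform):

```python
from torch_geometric.data import Batch

def preprocess_in_batches(data_list, batch_size, process_batch):
    """Collate examples into PyG batches, transform each batch, then
    split the result back into individual Data objects."""
    processed = []
    for i in range(0, len(data_list), batch_size):
        batch = Batch.from_data_list(data_list[i : i + batch_size])
        batch = process_batch(batch)            # batched transform
        processed.extend(batch.to_data_list())  # back to single examples
    return processed
```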

@sidnb13 (Member, Author) commented Jun 1, 2023

Some more notes:

  • The task configuration has a separate section for WandB configuration; see an example config file for details. This allows only a subset of hyperparameters to be tracked (preventing clutter) and allows for sweep configuration. Sweeps (a feature still pending more testing) are disabled by default; I plan to add more documentation on how they work for future use.
  • Model hyperparameters now live under model.hyperparams in the config rather than directly under model, and this nested object is passed directly to the model. This was done to improve organization, and it requires existing configs to be modified (a sketch of the idea follows).
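
Illustrative only, since the exact schema isn't shown here; the hyperparameter names, the wandb keys, and `DummyModel` are placeholders. The point is that the nested model.hyperparams object is forwarded verbatim to the model constructor:

```python
config = {
    "wandb": {"log_params": ["lr", "hidden_dim"]},  # hypothetical section
    "model": {
        "name": "CGCNN-VN",
        "hyperparams": {"hidden_dim": 128, "num_layers": 4},
    },
}

class DummyModel:  # stand-in for a registered model class
    def __init__(self, hidden_dim: int, num_layers: int):
        self.hidden_dim, self.num_layers = hidden_dim, num_layers

# The registry maps config["model"]["name"] to a class; the nested
# hyperparams dict is unpacked directly into its constructor:
model = DummyModel(**config["model"]["hyperparams"])
```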

@sidnb13 added the enhancement label Jun 7, 2023
@sidnb13 (Member, Author) commented Jun 9, 2023

Fixed a couple more minor config-related incompatibility bugs. Also resolved the ambiguity introduced by rank in the single-device case: the _forward implementation only uses rank if distributed training is enabled, and otherwise allocates the device based on the config (using min_alloc_gpu as a fallback). A sketch of this logic follows.
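
A minimal sketch of that device-resolution logic in plain PyTorch; the config keys are illustrative, and min_alloc_gpu is assumed here to mean "the GPU with the least memory currently allocated":

```python
import torch
import torch.distributed as dist

def resolve_device(rank: int, config: dict) -> torch.device:
    if dist.is_available() and dist.is_initialized():
        # Distributed training: rank unambiguously selects the device.
        return torch.device(f"cuda:{rank}")
    if config.get("device"):  # illustrative config key
        return torch.device(config["device"])
    if torch.cuda.is_available():
        # min_alloc_gpu fallback: pick the GPU with the least allocated memory.
        idx = min(range(torch.cuda.device_count()),
                  key=torch.cuda.memory_allocated)
        return torch.device(f"cuda:{idx}")
    return torch.device("cpu")
```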

Implemented (WIP) a model with a simpler attention-based pooling mechanism. It encodes a bit into the node features to indicate virtual (0) or real (1) and performs self-attention graph pooling. For now the pooling is global, but it would be straightforward to implement a hierarchical approach.
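
A minimal sketch of the idea, not the PR's model: append the real/virtual indicator bit as an extra feature channel, then pool globally with PyG's attention-based pooling. `is_real` and the layer sizes are illustrative assumptions.

```python
import torch
from torch import nn
from torch_geometric.nn import GlobalAttention

hidden = 64
# Gate network scores each node; the +1 channel is the real/virtual bit.
pool = GlobalAttention(gate_nn=nn.Linear(hidden + 1, 1))

def attention_pool(x, is_real, batch):
    bit = is_real.float().unsqueeze(-1)  # 1 = real node, 0 = virtual node
    return pool(torch.cat([x, bit], dim=-1), batch)

# Usage: one graph with 4 real nodes and 1 virtual node.
x = torch.randn(5, hidden)
is_real = torch.tensor([True, True, True, True, False])
batch = torch.zeros(5, dtype=torch.long)
out = attention_pool(x, is_real, batch)  # shape [1, hidden + 1]
```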
