Open
Labels: bug (Something isn't working)
Description
Describe the bug
When packaging the latest NequIP models on a FasRC login node, I see this error in the output:
[2025-10-01 21:34:57,523][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
/n/home03/skavanagh/miniconda3/envs/pytorch_2.7.1/lib/python3.12/site-packages/torch/package/package_exporter.py:911: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage_type_str = obj.pickle_storage_type()
Error while loading libcue_ops.so: libcuda.so.1: cannot open shared object file: No such file or directory
It doesn't seem to break packaging, but I'm flagging it just in case. Sorry if this has been posted before; I don't recall seeing it. Checking Slack, I found that I hit the same error previously when trying to compile Allegro models with CuEq.
Full output:
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] Version Information:
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] torch 2.7.1+cu128
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] e3nn 0.5.6
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] nequip 0.15.0
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] allegro 0.7.1
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] talaria 0.1.0
[2025-10-01 21:34:52,369][nequip.scripts.package][INFO] - [rank: 0] Building `eager` model for packaging ...
[2025-10-01 21:34:52,369][nequip.model.saved_models.load_utils][INFO] - [rank: 0] Loading model from ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:52,370][nequip.model.saved_models.checkpoint][INFO] - [rank: 0] Loading model from checkpoint file: ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:53,207][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
[2025-10-01 21:34:53,262][nequip.data.datamodule._base_datamodule][INFO] - [rank: 0] Found 1 training dataset(s), 1 validation dataset(s), 0 test dataset(s), and 0 predict dataset(s).
[2025-10-01 21:34:56,646][nequip.scripts.package][INFO] - [rank: 0] Building `compile` model for packaging ...
[2025-10-01 21:34:56,646][nequip.model.saved_models.load_utils][INFO] - [rank: 0] Loading model from ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:56,646][nequip.model.saved_models.checkpoint][INFO] - [rank: 0] Loading model from checkpoint file: ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:57,523][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
/n/home03/skavanagh/miniconda3/envs/pytorch_2.7.1/lib/python3.12/site-packages/torch/package/package_exporter.py:911: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage_type_str = obj.pickle_storage_type()
Error while loading libcue_ops.so: libcuda.so.1: cannot open shared object file: No such file or directory
[2025-10-01 21:34:58,404][nequip.scripts.package][INFO] - [rank: 0] Packaged model saved to l_edge_3_M_weighted_metric_0.0513-196_cpu.nequip.zip
My guess is that it comes from trying to load the CuEq libraries before packaging, to ensure every library that might be needed is included?
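For anyone who wants to check whether the message simply reflects the NVIDIA driver library being absent on the node, here is a minimal sketch. It assumes the error is a plain dynamic-loader failure for `libcuda.so.1`, as the message suggests. The helper name `can_load_shared_library` is hypothetical and is not part of NequIP or CuEq; it only uses Python's standard `ctypes` module.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for illustration only; not part of NequIP or CuEq.
    """
    try:
        # ctypes.CDLL asks the system loader to open the library and
        # raises OSError when the file cannot be found or loaded.
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # libcuda.so.1 is the library named in the error message above.
    if can_load_shared_library("libcuda.so.1"):
        print("libcuda.so.1 is loadable on this node")
    else:
        print("libcuda.so.1 is not loadable on this node")
```

If this reports that the library is not loadable on the login node but is loadable on a GPU node, that would be consistent with the message coming from the environment rather than from the packaging step itself.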