[BUG]: Error message while packaging #552

@kavanase

Describe the bug
When packaging the latest NequIP models on a FASRC login node, I see this error in the output:

[2025-10-01 21:34:57,523][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
/n/home03/skavanagh/miniconda3/envs/pytorch_2.7.1/lib/python3.12/site-packages/torch/package/package_exporter.py:911: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage_type_str = obj.pickle_storage_type()
Error while loading libcue_ops.so: libcuda.so.1: cannot open shared object file: No such file or directory

It doesn't seem to break packaging (the packaged model is still saved), but I'm flagging it just in case! Sorry if this has been posted before; I don't remember seeing it. From checking Slack, I hit the same error before when trying to compile Allegro models with CuEq.

Full output:

[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] Version Information:
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] torch 2.7.1+cu128
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] e3nn 0.5.6
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] nequip 0.15.0
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] allegro 0.7.1
[2025-10-01 21:34:52,366][nequip.utils.versions.package_versions][INFO] - [rank: 0] talaria 0.1.0
[2025-10-01 21:34:52,369][nequip.scripts.package][INFO] - [rank: 0] Building `eager` model for packaging ...
[2025-10-01 21:34:52,369][nequip.model.saved_models.load_utils][INFO] - [rank: 0] Loading model from ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:52,370][nequip.model.saved_models.checkpoint][INFO] - [rank: 0] Loading model from checkpoint file: ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:53,207][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
[2025-10-01 21:34:53,262][nequip.data.datamodule._base_datamodule][INFO] - [rank: 0] Found 1 training dataset(s), 1 validation dataset(s), 0 test dataset(s), and 0 predict dataset(s).
[2025-10-01 21:34:56,646][nequip.scripts.package][INFO] - [rank: 0] Building `compile` model for packaging ...
[2025-10-01 21:34:56,646][nequip.model.saved_models.load_utils][INFO] - [rank: 0] Loading model from ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:56,646][nequip.model.saved_models.checkpoint][INFO] - [rank: 0] Loading model from checkpoint file: ckpts/nequip_MPA/weighted_metric_0.0513-196.ckpt ...
[2025-10-01 21:34:57,523][nequip.train.ema][INFO] - [rank: 0] Loading EMA weights for evaluation model.
/n/home03/skavanagh/miniconda3/envs/pytorch_2.7.1/lib/python3.12/site-packages/torch/package/package_exporter.py:911: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  storage_type_str = obj.pickle_storage_type()
Error while loading libcue_ops.so: libcuda.so.1: cannot open shared object file: No such file or directory
[2025-10-01 21:34:58,404][nequip.scripts.package][INFO] - [rank: 0] Packaged model saved to l_edge_3_M_weighted_metric_0.0513-196_cpu.nequip.zip

My guess is that packaging tries to load the CuEq libraries to make sure every library that might be needed is available, and loading libcue_ops.so fails because libcuda.so.1 can't be found on the login node?
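One way to check that guess is to see whether the dynamic loader can open libcuda.so.1 at all on the node in question. The snippet below is only a minimal diagnostic sketch using the standard-library `ctypes` module; the helper name `can_load_shared_library` is made up for illustration and is not part of NequIP or CuEq.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open `name`, False otherwise.

    Hypothetical helper for illustration only (not a NequIP or CuEq API).
    """
    try:
        # CDLL asks the system loader to open the library; a missing
        # library raises OSError.
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # libcuda.so.1 is the library named in the error message above.
    lib = "libcuda.so.1"
    status = "found" if can_load_shared_library(lib) else "NOT found"
    print(f"{lib}: {status}")
```

If this reports that the library is not found on the login node but is found on a GPU node, that would be consistent with the message being harmless for CPU packaging, though it does not confirm where in the packaging step the load is attempted.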

Labels: bug