
[BUG] Warnings with pytorch_2.9 #560


Description

@kavanase

Using the latest nequip develop (private) with pytorch_2.9, I see the following warnings in my training output:

First, TF32-related warnings:

/n/home03/skavanagh/miniconda3/envs/pytorch_2.9/lib/python3.13/site-packages/torch/__init__.py:1551: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)
  return _C._get_float32_matmul_precision()
You are using a CUDA device ('NVIDIA A100-SXM4-40GB MIG 3g.20gb') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision

This implies that TF32 is not being used, but I am using `TF32Scheduler` with `0: true`, which worked fine with previous versions (and did not produce any of these warnings), and I thought the latest nequip develop already had the required PyTorch 2.9 TF32 backend updates. These warnings appear after `VAL RUN START` and before `Initializing distributed...`.
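For reference, the deprecation warning quotes the new precision attributes. A minimal, version-guarded sketch of enabling TF32 under both APIs (assuming only the attribute names quoted in the warning above; the helper name is hypothetical, not nequip's API) might look like:

```python
def enable_tf32():
    """Enable TF32 for matmul and cuDNN convolutions.

    Prefers the new PyTorch >= 2.9 ``fp32_precision`` API named in the
    deprecation warning, falling back to the old ``allow_tf32`` flags
    (deprecated after PyTorch 2.9) on older versions.
    """
    import torch

    if hasattr(torch.backends.cuda.matmul, "fp32_precision"):
        # New API (PyTorch >= 2.9), as quoted in the deprecation warning
        torch.backends.cuda.matmul.fp32_precision = "tf32"
        torch.backends.cudnn.conv.fp32_precision = "tf32"
    else:
        # Old flags, deprecated after PyTorch 2.9
        torch.backends.cuda.matmul.allow_tf32 = True
        torch.backends.cudnn.allow_tf32 = True
```

If the scheduler only sets the old flags, PyTorch 2.9 would still warn even though TF32 is effectively requested, which would match the behavior seen here.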

Then I also get these warnings, before the first validation/training loop:

/n/home03/skavanagh/miniconda3/envs/pytorch_2.9/lib/python3.13/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
/n/home03/skavanagh/miniconda3/envs/pytorch_2.9/lib/python3.13/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
...
Validation DataLoader 0:   0%|                                                                                            | 0/6916 [00:00<?, ?it/s][rank0]:W1020 16:19:48.288000 3814862 site-packages/torch/fx/experimental/symbolic_shapes.py:6833] _maybe_guard_rel() was called on non-relation expression Eq(s52, s86) | Eq(s86, 1)
[rank0]:W1020 16:19:59.949000 3814862 site-packages/torch/fx/experimental/symbolic_shapes.py:6833] _maybe_guard_rel() was called on non-relation expression Eq(s52, s86) | Eq(s86, 1)
/n/home03/skavanagh/miniconda3/envs/pytorch_2.9/lib/python3.13/site-packages/torch/_inductor/compile_fx.py:312: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
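Until these upstream warnings are fixed, one stdlib-only workaround (a sketch, not nequip's approach) is to filter them by message pattern with the `warnings` module:

```python
import warnings

# Sketch: suppress the pydantic `Field()` metadata warnings by message
# pattern (stdlib-only; avoids importing pydantic's warning class).
PYDANTIC_FIELD_MSG = r".*was provided to the `Field\(\)` function.*"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message=PYDANTIC_FIELD_MSG)
    # Emit a warning shaped like the ones in the log; it should be swallowed.
    warnings.warn(
        "The 'repr' attribute with value False was provided to the "
        "`Field()` function, which has no effect in the context it was used."
    )

assert caught == []  # the matching warning was suppressed
```

The same pattern could silence the TF32 deprecation warning by message, though actually migrating to the new API is the cleaner fix.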


Labels: bug (Something isn't working)