Oral at ICML 2026
Hao Wei, Bjoern List, Nils Thuerey
Technical University of Munich
ReViT is the first Vision Transformer framework that enforces strict rotational equivariance on grid-based physical fields. It maps scalar and vector inputs into locally invariant representations derived from physics-based canonical bases, so standard self-attention can be applied without violating the symmetry. This yields significant accuracy gains across 2D and 3D PDE benchmarks.
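To make the idea of a locally invariant representation concrete, here is a minimal 2D sketch. It is not the ReViT implementation: the choice of frame (aligned with a reference vector field `v`), the function names `local_frame`, `canonicalize`, and `rotate_vectors`, and the random test fields are all assumptions made for illustration. The point is only that coefficients of a vector expressed in a frame that rotates with the data do not change when the data is rotated.

```python
import numpy as np

def local_frame(v, eps=1e-12):
    """Per-point orthonormal frame from a 2D vector field v of shape (2, H, W).

    e1 points along v; e2 is e1 rotated by +90 degrees.
    (Hypothetical choice of canonical basis, for illustration only.)
    """
    norm = np.sqrt((v ** 2).sum(axis=0, keepdims=True))
    e1 = v / (norm + eps)
    e2 = np.stack([-e1[1], e1[0]])
    return e1, e2

def canonicalize(w, v):
    """Express vector field w in the local frame built from v.

    Returns an array of shape (2, H, W) holding the coefficients (w.e1, w.e2).
    """
    e1, e2 = local_frame(v)
    return np.stack([(w * e1).sum(axis=0), (w * e2).sum(axis=0)])

def rotate_vectors(v, theta):
    """Rotate every vector of the field by angle theta (grid positions unchanged)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.stack([c * v[0] - s * v[1], s * v[0] + c * v[1]])

rng = np.random.default_rng(0)
v = rng.normal(size=(2, 8, 8))   # reference field that defines the frame
w = rng.normal(size=(2, 8, 8))   # field to be canonicalized
theta = 0.7                      # arbitrary rotation angle in radians

coeffs = canonicalize(w, v)
coeffs_rot = canonicalize(rotate_vectors(w, theta), rotate_vectors(v, theta))

# The frame coefficients agree up to floating-point error.
max_diff = np.abs(coeffs - coeffs_rot).max()
print(max_diff)
```

This sketch only rotates the vector values at each grid point. A full rotation of a physical field also moves the grid points themselves, in which case the invariant coefficients are simply carried along to their new locations like scalars.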
- Strict rotational equivariance for Vision Transformers on grid-based PDE data
- Local canonicalization via physics-based canonical bases — no group lifting needed
- Up to 65% MSE reduction over state-of-the-art baselines on 3D turbulence benchmarks
- ~53× memory reduction compared to lifted equivariant alternatives
- Exact equivariance under the chiral octahedral group O and approximate equivariance under SO(3)
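The equivariance property in the list above can be checked numerically. The sketch below is a 2D analogue under stated assumptions, not code from the ReViT repository: it uses the four 90° grid rotations (the planar counterpart of the 24 cube rotations in O), a periodic central-difference gradient as the operator under test, and hypothetical helper names (`grad`, `biased_grad`, `rot_scalar`, `rot_vector`, `equivariance_error`). An operator `f` is equivariant if rotating the input and then applying `f` gives the same result as applying `f` and then rotating the output.

```python
import numpy as np

def grad(s):
    """Central-difference gradient with periodic boundaries.

    s has shape (N, N); axis 0 is treated as x and axis 1 as y.
    Returns a vector field of shape (2, N, N).
    """
    gx = (np.roll(s, -1, axis=0) - np.roll(s, 1, axis=0)) / 2.0
    gy = (np.roll(s, -1, axis=1) - np.roll(s, 1, axis=1)) / 2.0
    return np.stack([gx, gy])

def biased_grad(s):
    """Gradient plus a fixed offset along x: deliberately NOT equivariant."""
    return grad(s) + np.array([1.0, 0.0])[:, None, None]

def rot_scalar(s, k=1):
    """Rotate a scalar field by k * 90 degrees (from axis 0 towards axis 1)."""
    return np.rot90(s, k, axes=(0, 1))

def rot_vector(v, k=1):
    """Rotate a vector field by k * 90 degrees: move the grid AND mix components."""
    for _ in range(k % 4):
        # A +90 degree rotation maps (vx, vy) to (-vy, vx).
        v = np.stack([-np.rot90(v[1]), np.rot90(v[0])])
    return v

def equivariance_error(op, s, k):
    """Max absolute difference between op(rotate(s)) and rotate(op(s))."""
    return np.abs(op(rot_scalar(s, k)) - rot_vector(op(s), k)).max()

rng = np.random.default_rng(0)
s = rng.normal(size=(16, 16))

grad_errors = [equivariance_error(grad, s, k) for k in range(4)]
biased_errors = [equivariance_error(biased_grad, s, k) for k in range(1, 4)]

print("grad:", grad_errors)      # at floating-point level
print("biased:", biased_errors)  # clearly nonzero
```

The same pattern extends to 3D by looping over the 24 axis-aligned rotations of the cube. For arbitrary SO(3) rotations the grid has to be resampled, so the error is not expected to vanish exactly, which is consistent with the page describing SO(3) equivariance as approximate.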
@inproceedings{ReViT2026,
title = {{ReViT}: Rotational-equivariant Vision Transformers for Neural {PDE} Solvers},
author = {Hao Wei and Bjoern List and Nils Thuerey},
booktitle = {Forty-Third International Conference on Machine Learning},
year = {2026},
}