GaussRender: Learning 3D Occupancy with Gaussian Rendering.
Loick Chambon1,2, Eloi Zablocki2, Alexandre Boulch2, Mickael Chen3, Matthieu Cord1,2.
1Valeo.ai, 2Sorbonne University, 3Hcompany.ai.
GaussRender is a module that can be plugged into any 3D occupancy model to enhance its predictions and enforce 2D-3D consistency, improving mIoU, IoU, and RayIoU.
Understanding the 3D geometry and semantics of driving scenes is critical for safe autonomous driving. Recent advances in 3D occupancy prediction have improved scene representation but often suffer from spatial inconsistencies, leading to floating artifacts and poor surface localization. Existing voxel-wise losses (e.g., cross-entropy) fail to enforce geometric coherence. In this paper, we propose GaussRender, a module that improves 3D occupancy learning by enforcing projective consistency. Our key idea is to project both predicted and ground-truth 3D occupancy into 2D camera views, where we apply supervision. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure. To achieve this efficiently, we leverage differentiable rendering with Gaussian splatting. GaussRender seamlessly integrates with existing architectures while maintaining efficiency and requiring no inference-time modifications. Extensive evaluations on multiple benchmarks (SurroundOcc-nuScenes, Occ3D-nuScenes, SSCBench-KITTI360) demonstrate that GaussRender significantly improves geometric fidelity across various 3D occupancy models (TPVFormer, SurroundOcc, Symphonies), achieving state-of-the-art results, particularly on surface-sensitive metrics. The code is open-sourced at https://github.com/valeoai/GaussRender.
GaussRender can be plugged into any model. The core idea is to transform voxels into Gaussians before performing depth and semantic rendering.
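For intuition, here is a minimal sketch of that pipeline in PyTorch. It is illustrative only, not the actual GaussRender implementation: `render_gaussians` stands in for the differentiable splatting call (e.g. the compiled diff-gaussian-rasterization extension), and the tensor shapes, fixed Gaussian scale, and loss composition are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def voxels_to_gaussians(logits, voxel_centers, empty_idx, scale=0.4):
    """logits: (N, C) per-voxel class logits; voxel_centers: (N, 3) in the ego frame."""
    probs = logits.softmax(-1)                         # (N, C) class probabilities
    opacity = 1.0 - probs[:, empty_idx:empty_idx + 1]  # empty voxels become transparent
    means = voxel_centers                              # one Gaussian per voxel center
    scales = torch.full_like(means, scale)             # isotropic, roughly voxel-sized
    return means, scales, opacity, probs

def rendering_loss(pred_logits, gt_labels, voxel_centers, cameras, empty_idx, render_gaussians):
    """Render predicted and ground-truth occupancy into each camera and compare in 2D."""
    pred_g = voxels_to_gaussians(pred_logits, voxel_centers, empty_idx)
    gt_logits = 10.0 * F.one_hot(gt_labels, pred_logits.shape[-1]).float()  # sharp GT "logits"
    gt_g = voxels_to_gaussians(gt_logits, voxel_centers, empty_idx)

    loss = 0.0
    for cam in cameras:
        sem_p, depth_p = render_gaussians(*pred_g, cam)  # (H, W, C) semantics, (H, W) depth
        sem_g, depth_g = render_gaussians(*gt_g, cam)
        loss = loss + F.l1_loss(depth_p, depth_g)
        loss = loss + F.cross_entropy(sem_p.permute(2, 0, 1)[None], sem_g.argmax(-1)[None])
    return loss
```

Because only the rendered 2D views are supervised, 3D configurations that project inconsistently (e.g. floating voxels) are penalized without any change to the base model's architecture.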
- [19/03/2025] GaussRender code has been uploaded.
- [07/02/2025] GaussRender is on arXiv.
- Release other checkpoints.
GaussRender can be plugged into any 3D model. We provide dedicated experiments on multiple 3D benchmarks (SurroundOcc-nuScenes, Occ3D-nuScenes, SSCBench-KITTI360) and multiple models (TPVFormer, SurroundOcc, Symphonies) to evaluate its performance.
Occ3D-nuScenes

Models | TPVFormer (ours) | TPVFormer | SurroundOcc (ours) | SurroundOcc | OccFormer | RenderOcc
---|---|---|---|---|---|---
Type | w/ GaussRender | base | w/ GaussRender | base | base | base
mIoU | 30.48 🥇 | 27.83 | 30.38 🥈 | 29.21 | 21.93 | 26.11
RayIoU | 38.3 🥇 | 37.2 | 37.5 🥈 | 35.5 | - | 19.5
SurroundOcc-nuScenes

Models | TPVFormer (ours) | TPVFormer | SurroundOcc (ours) | SurroundOcc | OccFormer | GaussianFormerv2
---|---|---|---|---|---|---
Type | w/ GaussRender | base | w/ GaussRender | base | base | base
IoU | 32.05 🥈 | 30.86 | 32.61 🥇 | 31.49 | 31.39 | 30.56
mIoU | 20.58 🥈 | 17.10 | 20.82 🥇 | 20.30 | 19.03 | 20.02
SSCBench-KITTI360

Models | SurroundOcc (ours) | SurroundOcc | Symphonies (ours) | Symphonies | OccFormer | MonoScene
---|---|---|---|---|---|---
Type | w/ GaussRender | base | w/ GaussRender | base | base | base
IoU | 38.62 | 38.51 | 44.08 🥇 | 43.40 🥈 | 40.27 | 37.87
mIoU | 13.34 | 13.08 | 18.11 🥇 | 17.82 🥈 | 13.81 | 12.31
Environment
# Create basic env
micromamba create -n gaussrender python=3.8.16 -y -c conda-forge
micromamba activate gaussrender
# Install torch
pip install torch==2.0.0 torchvision==0.15.1 --index-url https://download.pytorch.org/whl/cu118
# Install mmlibs.
pip install -U openmim
mim install mmcv==2.0.1
mim install mmdet==3.0.0
mim install "mmdet3d==1.1.1"
mim install "mmsegmentation==1.0.0"
# Install other libraries
pip install uv
uv pip install pillow==8.4.0 typing_extensions==4.8.0 torchmetrics==0.9.3 timm==0.9.2
uv pip install spconv-cu118 einops ipykernel
uv pip install protobuf==4.25.3
# Compile extensions
cd extensions/diff-gaussian-rasterization
rm -rf build dist  # clean any previous build artifacts (ignored if absent)
python setup.py build install
cd -
Occ3D-nuScenes
Follow instructions on the official repository here.
- Download pickle files (cf. here):
You should have two pickle files: 'bevdetv2-nuscenes_infos_train.pkl' and 'bevdetv2-nuscenes_infos_val.pkl'.
- Download annotations:
cd ./data/occ3d_nuscenes
wget -O gts.tar.gz "https://drive.usercontent.google.com/download?id=17HubGsfioQr1d_39VwVPXelobAFo4Xqh&export=download&authuser=0&confirm=t&uuid=59c53966-3370-4393-b1f6-b35ad8ab45d4&at=AEz70l7o-wie2--xDlpvY0XGvAU3:1740048920248"
tar -xvzf gts.tar.gz
cd -
The folder should have the following structure:
./data
- nuscenes
- occ3d_nuscenes
- bevdetv2-nuscenes_infos_train.pkl
- bevdetv2-nuscenes_infos_val.pkl
- gts
- scene-*
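As an optional sanity check that the annotations unpacked correctly, a snippet like the one below can be used; the per-frame `labels.npz` naming, key names, and grid shape follow the public Occ3D-nuScenes release and should be verified against your download.

```python
# Optional sanity check for the Occ3D-nuScenes ground truth (file layout, key names,
# and grid shape follow the public Occ3D release; verify against your own download).
import glob
import numpy as np

files = sorted(glob.glob("./data/occ3d_nuscenes/gts/scene-*/**/labels.npz", recursive=True))
print(f"{len(files)} annotated frames found")

sample = np.load(files[0])
print(sample["semantics"].shape)       # expected (200, 200, 16) voxel label grid
print(np.unique(sample["semantics"]))  # class indices; 17 is the free/empty class
```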
SurroundOcc-nuScenes
Follow the instructions on the official repository here.
- Download pickle files:
mkdir data/surroundocc_nuscenes
cd ./data/surroundocc_nuscenes/
wget -O nuscenes_infos_train_sweeps_occ.pkl "https://cloud.tsinghua.edu.cn/d/bb96379a3e46442c8898/files/?p=%2Fnuscenes_infos_train_sweeps_occ.pkl&dl=1"
wget -O nuscenes_infos_val_sweeps_occ.pkl "https://cloud.tsinghua.edu.cn/d/bb96379a3e46442c8898/files/?p=%2Fnuscenes_infos_val_sweeps_occ.pkl&dl=1"
cd -
- Download annotations:
cd ./data/surroundocc_nuscenes/
wget -O samples_train.zip https://cloud.tsinghua.edu.cn/seafhttp/files/92f71370-3686-44ab-814e-8af648ba01e6/train.zip
wget -O samples_val.zip https://cloud.tsinghua.edu.cn/seafhttp/files/0499c96e-2176-4f6c-b30a-968c81dd7bdd/val.zip
unzip samples_train.zip
unzip samples_val.zip
cd -
The folder should have the following structure:
./data
- nuscenes
- surroundocc_nuscenes
- nuscenes_infos_train_sweeps_occ.pkl
- nuscenes_infos_val_sweeps_occ.pkl
- samples
- *.pcd.bin.npy
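Optionally, the unpacked labels can be sanity checked with a snippet like the following; the sparse (N, 4) layout [x, y, z, class] matches how the SurroundOcc occupancy labels are usually described, but verify it against your files.

```python
# Optional sanity check for the SurroundOcc-nuScenes labels (the sparse (N, 4) layout
# [x, y, z, class] is an assumption; verify against your own download).
import glob
import numpy as np

files = sorted(glob.glob("./data/surroundocc_nuscenes/samples/*.pcd.bin.npy"))
print(f"{len(files)} occupancy label files found")

occ = np.load(files[0])
print(occ.shape, occ.dtype)  # expected (N, 4): occupied voxel indices plus class index
```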
SSCBench-KITTI360
mkdir ./data/sscbench_kitti360
cd ./data/
for part in aa ab ac ad ae af ag ah ai aj; do
wget "https://huggingface.co/datasets/ai4ce/SSCBench/resolve/main/sscbench-kitti/sscbench-kitti-part_${part}?download=true" -O "sscbench-kitti-part_${part}"
done
cat sscbench-kitti-part_* > combined.sqfs
sudo apt-get update && sudo apt-get install squashfs-tools
unsquashfs combined.sqfs
mv squashfs-root/sscbench-kitti ./
cd -
Launch a simple training
To train a model, you need to specify the config file and the associated dataset. Config files are in the ./config folder.
For instance, to train a TPVFormer model on the Occ3D-nuScenes dataset, run the following command:
python train.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/occ3d/tpv
Fast prototyping
To prototype, develop, or debug a customized model, you can use the mini option (on nuScenes) to load only 20% of the dataset. Note also that --cfg-options
allows you to modify the mmcv configuration from the command line.
python train.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/debug --cfg-options train_dataset_config.is_mini=True val_dataset_config.is_mini=True
Simple evaluation
To evaluate a trained model, use the following command:
python eval.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval/ --resume-from ckpts/final/occ3d_tpv_render.pth
Note that with the '--short' option you can perform a sanity-check evaluation on only 100 samples.
Evaluate image metrics
If you want to evaluate image metrics, you should overwrite the camera strategy, leading to this command:
python eval.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval/ --resume-from ckpts/final/occ3d_tpv_render.pth --cfg-options model.aggregator.render_kwargs.render_gt_mode=sensor model.aggregator.render_kwargs.cam_idx="[0,1,2,3,4,5]" model.aggregator.pre_render_kwargs.overwrite_opacity=True
Here 'sensor' denotes the sensor strategy, i.e., rendering in the sensor reference frame. The argument 'overwrite_opacity' ensures that empty voxels have zero opacity.
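Conceptually, the opacity overwrite amounts to something like the snippet below (an illustrative paraphrase, not the actual implementation): Gaussians whose predicted class is the empty class are given zero opacity, so they do not contribute to the rendered depth and semantic maps.

```python
import torch

def overwrite_opacity(class_logits, opacity, empty_idx):
    """Zero the opacity of Gaussians predicted as empty (illustrative paraphrase)."""
    is_occupied = class_logits.argmax(dim=-1) != empty_idx  # (N,) boolean mask
    return opacity * is_occupied.unsqueeze(-1).float()      # (N, 1) hard-masked opacity
```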
To evaluate a base model, add --no-strict-state to avoid conflicts when resuming the checkpoint.
Install environment
To avoid conflicts with the existing environment, we recommend creating a separate one:
micromamba create -n mayavi python==3.8 -c conda-forge
micromamba activate mayavi
pip install numpy vtk pyqt5
pip install mayavi
pip install pyside2
pip install scipy jupyter ipywidgets ipyevents configobj
pip install https://github.com/enthought/mayavi/zipball/main
Visualize predictions
Before visualizing outputs with mayavi, you should save them to a folder during inference using:
python out/save_indexed_preds.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval_local/occ3d/tpv/std --resume-from ckpts/final/occ3d_tpv_std.pth --no-strict-state --indices 0 1 2 3 4 5 6 7 8 9
The previous command saves the indices [0,1,2,3,4,5,6,7,8,9], corresponding to the first 10 frames of the validation set, in the following directory: inspect/DATASET/MODEL/SCENE_TAG/FRAME_TAG.
Then activate your mayavi environment and perform the rendering:
python visualisation/create_rendered_imgs.py --dataset occ3d --folder inspect/results/occ3d/surroundocc_render
And create an animation using:
python visualisation/anims_rendered_imgs.py --rendered-folder rendered/occ3d/surroundocc_render
Many thanks to these excellent open source projects:
If you liked our work, do not hesitate to also see:
- PointBeV: sparse BeV 2D segmentation.
If this work is helpful for your research, please consider citing the following BibTeX entry and starring this repository.
@misc{chambon2025gaussrenderlearning3doccupancy,
title={GaussRender: Learning 3D Occupancy with Gaussian Rendering},
author={Loick Chambon and Eloi Zablocki and Alexandre Boulch and Mickael Chen and Matthieu Cord},
year={2025},
eprint={2502.05040},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2502.05040},
}