GaussRender: Learning 3D Occupancy with Gaussian Rendering (official repository)



Official implementation of: GaussRender: Learning 3D Occupancy with Gaussian Rendering.

arXiv: https://arxiv.org/abs/2502.05040

GaussRender: Learning 3D Occupancy with Gaussian Rendering.
Loick Chambon^1,2, Eloi Zablocki^2, Alexandre Boulch^2, Mickael Chen^3, Matthieu Cord^1,2.
^1 Valeo.ai, ^2 Sorbonne University, ^3 Hcompany.ai.

GaussRender is a 3D occupancy module that can be plugged into any 3D occupancy model to enhance its predictions, enforce 2D-3D consistency, and improve mIoU, IoU, and RayIoU.

Abstract

Understanding the 3D geometry and semantics of driving scenes is critical for safe autonomous driving. Recent advances in 3D occupancy prediction have improved scene representation but often suffer from spatial inconsistencies, leading to floating artifacts and poor surface localization. Existing voxel-wise losses (e.g., cross-entropy) fail to enforce geometric coherence. In this paper, we propose GaussRender, a module that improves 3D occupancy learning by enforcing projective consistency. Our key idea is to project both predicted and ground-truth 3D occupancy into 2D camera views, where we apply supervision. Our method penalizes 3D configurations that produce inconsistent 2D projections, thereby enforcing a more coherent 3D structure. To achieve this efficiently, we leverage differentiable rendering with Gaussian splatting. GaussRender seamlessly integrates with existing architectures while maintaining efficiency and requiring no inference-time modifications. Extensive evaluations on multiple benchmarks (SurroundOcc-nuScenes, Occ3D-nuScenes, SSCBench-KITTI360) demonstrate that GaussRender significantly improves geometric fidelity across various 3D occupancy models (TPVFormer, SurroundOcc, Symphonies), achieving state-of-the-art results, particularly on surface-sensitive metrics. The code is open-sourced at https://github.com/valeoai/GaussRender.

GaussRender can be plugged into any model. The core idea is to transform voxels into Gaussians before performing depth and semantic rendering, as sketched below.
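
Below is a minimal, self-contained PyTorch sketch of that idea. It is not the official implementation (the repository renders with the differentiable CUDA rasterizer compiled in the Setup section); the naive z-buffer splat and all names (`voxel_size`, `empty_class`, `K`, `T_cam`) are illustrative assumptions.

```python
# Minimal sketch of the GaussRender idea -- NOT the official implementation.
# The real code renders with the differentiable diff-gaussian-rasterization
# CUDA extension; this naive z-buffer "splat" only illustrates the data flow.
import torch

def voxels_to_gaussians(occ_logits, voxel_size=0.5, empty_class=17):
    """(X, Y, Z, C) class logits -> Gaussian centers, opacities, semantics."""
    probs = occ_logits.softmax(-1)
    X, Y, Z, C = probs.shape
    idx = torch.stack(torch.meshgrid(
        torch.arange(X), torch.arange(Y), torch.arange(Z), indexing="ij"), -1)
    means = (idx.reshape(-1, 3).float() + 0.5) * voxel_size  # voxel centers (m)
    sem = probs.reshape(-1, C)
    opacity = 1.0 - sem[:, empty_class]  # empty voxels become transparent
    return means, opacity, sem

def render(means, opacity, sem, K, T_cam, hw=(90, 160)):
    """Project Gaussian centers into one camera; nearest point wins per pixel."""
    H, W = hw
    pts = (T_cam[:3, :3] @ means.T + T_cam[:3, 3:]).T  # world -> camera frame
    z = pts[:, 2].clamp(min=1e-3)
    uv = (K @ (pts / z.unsqueeze(-1)).T).T[:, :2].long()
    keep = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H) \
        & (opacity > 0.5)
    depth = torch.full((H, W), 1e6)
    sem_img = torch.zeros(H, W, sem.shape[-1])
    for i in keep.nonzero().squeeze(-1).tolist():  # simple z-buffer
        u, v = uv[i, 0], uv[i, 1]
        if z[i] < depth[v, u]:
            depth[v, u], sem_img[v, u] = z[i], sem[i]
    return depth, sem_img

# Training-time use: render predicted AND ground-truth occupancy with the same
# cameras, then supervise in 2D (e.g. L1 on depth, cross-entropy on semantics).
```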

Updates:

⌛ Todo:

  • Release other checkpoints.

🚀 Main results

🔥 3D Occupancy

GaussRender can be plugged into any 3D model. We ran dedicated experiments on multiple 3D benchmarks (SurroundOcc-nuScenes, Occ3D-nuScenes, SSCBench-KITTI360) and multiple models (TPVFormer, SurroundOcc, Symphonies) to evaluate its performance.

Occ3D-nuScenes

mIoU and RayIoU of several models on the Occ3D-nuScenes dataset. Best result marked with 🥇, second best with 🥈.

| Models | TPVFormer (ours) | TPVFormer | SurroundOcc (ours) | SurroundOcc | OccFormer | RenderOcc |
|---|---|---|---|---|---|---|
| Type | w/ GaussRender | base | w/ GaussRender | base | base | base |
| mIoU | 30.48 🥇 | 27.83 | 30.38 🥈 | 29.21 | 21.93 | 26.11 |
| RayIoU | 38.3 🥇 | 37.2 | 37.5 🥈 | 35.5 | - | 19.5 |

SurroundOcc-nuScenes

3D IoU and mIoU of several models on the SurroundOcc-nuScenes dataset. Best marked with 🥇, second with 🥈.

| Models | TPVFormer (ours) | TPVFormer | SurroundOcc (ours) | SurroundOcc | OccFormer | GaussianFormerv2 |
|---|---|---|---|---|---|---|
| Type | w/ GaussRender | base | w/ GaussRender | base | base | base |
| IoU | 32.05 🥈 | 30.86 | 32.61 🥇 | 31.49 | 31.39 | 30.56 |
| mIoU | 20.58 🥈 | 17.10 | 20.82 🥇 | 20.30 | 19.03 | 20.02 |

SSCBench-KITTI360

3D IoU and mIoU of several models on SSCBench-KITTI360. Best 🥇, second 🥈.

| Models | SurroundOcc (ours) | SurroundOcc | Symphonies (ours) | Symphonies | OccFormer | MonoScene |
|---|---|---|---|---|---|---|
| Type | w/ GaussRender | base | w/ GaussRender | base | base | base |
| IoU | 38.62 | 38.51 | 44.08 🥇 | 43.40 🥈 | 40.27 | 37.87 |
| mIoU | 13.34 | 13.08 | 18.11 🥇 | 17.82 🥈 | 13.81 | 12.31 |

🔨 Setup

➡️ Install

Environment

```bash
# Create basic env
micromamba create -n gaussrender python=3.8.16 -y -c conda-forge
micromamba activate gaussrender

# Install torch
pip install torch==2.0.0 torchvision==0.15.1 --index-url https://download.pytorch.org/whl/cu118

# Install mmlibs
pip install -U openmim
mim install mmcv==2.0.1
mim install mmdet==3.0.0
mim install "mmdet3d==1.1.1"
mim install "mmsegmentation==1.0.0"

# Install other libraries
pip install uv
uv pip install pillow==8.4.0 typing_extensions==4.8.0 torchmetrics==0.9.3 timm==0.9.2
uv pip install spconv-cu118 einops ipykernel
uv pip install protobuf==4.25.3

# Compile the Gaussian rasterization extension
cd extensions/diff-gaussian-rasterization
rm -rf build dist
python setup.py build install
cd -
```
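
As an optional sanity check that the environment and the compiled extension work, the snippet below assumes the extension exposes the usual `diff_gaussian_rasterization` module name; adjust if your build differs.

```python
# Environment sanity check: torch, CUDA, and the compiled rasterizer.
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# The diff-gaussian-rasterization package normally exposes this module name;
# if the import fails, the extension did not build correctly.
import diff_gaussian_rasterization  # noqa: F401
print("diff_gaussian_rasterization imported OK")
```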

➡️ Dataset

Occ3D-nuScenes

Follow the instructions on the official repository here.

1. Download the pickle files (cf. here):

You should have two pickle files: 'bevdetv2-nuscenes_infos_train.pkl' and 'bevdetv2-nuscenes_infos_val.pkl'.

2. Download the annotations:

```bash
cd ./data/occ3d_nuscenes
wget -O gts.tar.gz "https://drive.usercontent.google.com/download?id=17HubGsfioQr1d_39VwVPXelobAFo4Xqh&export=download&authuser=0&confirm=t&uuid=59c53966-3370-4393-b1f6-b35ad8ab45d4&at=AEz70l7o-wie2--xDlpvY0XGvAU3:1740048920248"
tar -xvzf gts.tar.gz
cd -
```

The folder should have the following structure:

```
./data
    - nuscenes
    - occ3d_nuscenes
        - bevdetv2-nuscenes_infos_train.pkl
        - bevdetv2-nuscenes_infos_val.pkl
        - gts
            - scene-*
```
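
As an optional check, you can inspect the downloaded pickle files; the exact keys inside depend on the BEVDet tooling, so the 'infos' field below is only an assumption.

```python
# Quick look at a downloaded info file (key names are assumptions).
import pickle

with open("./data/occ3d_nuscenes/bevdetv2-nuscenes_infos_val.pkl", "rb") as f:
    data = pickle.load(f)

# BEVDet-style pickles usually wrap the sample list in an 'infos' key;
# fall back to treating the loaded object as the list itself.
samples = data["infos"] if isinstance(data, dict) and "infos" in data else data
print(type(data), "->", len(samples), "validation samples")
```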
SurroundOcc-nuScenes

Follow the instructions on the official repository here.

1. Download the pickle files:

```bash
mkdir data/surroundocc_nuscenes
cd ./data/surroundocc_nuscenes/
wget -O nuscenes_infos_train_sweeps_occ.pkl "https://cloud.tsinghua.edu.cn/d/bb96379a3e46442c8898/files/?p=%2Fnuscenes_infos_train_sweeps_occ.pkl&dl=1"
wget -O nuscenes_infos_val_sweeps_occ.pkl "https://cloud.tsinghua.edu.cn/d/bb96379a3e46442c8898/files/?p=%2Fnuscenes_infos_val_sweeps_occ.pkl&dl=1"
cd -
```

2. Download the annotations:

```bash
cd ./data/surroundocc_nuscenes/
wget -O samples_train.zip https://cloud.tsinghua.edu.cn/seafhttp/files/92f71370-3686-44ab-814e-8af648ba01e6/train.zip
wget -O samples_val.zip https://cloud.tsinghua.edu.cn/seafhttp/files/0499c96e-2176-4f6c-b30a-968c81dd7bdd/val.zip
unzip samples_train.zip
unzip samples_val.zip
cd -
```

The folder should have the following structure:

```
./data
    - nuscenes
    - surroundocc_nuscenes
        - nuscenes_infos_train_sweeps_occ.pkl
        - nuscenes_infos_val_sweeps_occ.pkl
        - samples
            - *.pcd.bin.npy
```
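
Each `.pcd.bin.npy` file stores the occupancy annotation for one sample. A minimal check that the files load; the array layout is whatever SurroundOcc ships, so the printed shape is informative only.

```python
# Load one SurroundOcc annotation file and report its shape/dtype.
import glob
import numpy as np

files = sorted(glob.glob("./data/surroundocc_nuscenes/samples/*.pcd.bin.npy"))
print(len(files), "annotation files found")
arr = np.load(files[0])
print("first file:", files[0], "| shape:", arr.shape, "| dtype:", arr.dtype)
```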
SSCBench-KITTI360

```bash
mkdir ./data/sscbench_kitti360
cd ./data/
for part in aa ab ac ad ae af ag ah ai aj; do
  wget "https://huggingface.co/datasets/ai4ce/SSCBench/resolve/main/sscbench-kitti/sscbench-kitti-part_${part}?download=true" -O "sscbench-kitti-part_${part}"
done
cat sscbench-kitti-part_* > combined.sqfs
sudo apt-get update && sudo apt-get install squashfs-tools
unsquashfs combined.sqfs
mv squashfs-root/sscbench-kitti ./
cd -
```
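
After extraction you should end up with ./data/sscbench-kitti. A quick look at what was unpacked (the inner layout follows the SSCBench release and is not assumed here):

```python
# List the top-level contents of the extracted SSCBench-KITTI360 data.
from pathlib import Path

root = Path("./data/sscbench-kitti")
for p in sorted(root.iterdir()):
    print(p.name)
```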

🔄 Training

Launch a simple training

To train a model, you need to specify the config file and the associated dataset. Config files are in the ./config folder.

For instance, to train a TPVFormer model on the Occ3D-nuScenes dataset, run the following command:

```bash
python train.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/occ3d/tpv
```
Fast prototyping

To prototype, develop, or debug a customized model, you can use the mini option (on nuScenes) to load only 20% of the dataset. Note also that --cfg-options lets you modify the mmcv configuration from the command line, as sketched after the command below.

```bash
python train.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/debug --cfg-options train_dataset_config.is_mini=True val_dataset_config.is_mini=True
```
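
Under the hood, mmcv-style configs merge dotted command-line keys into the nested config dict. A minimal sketch of that mechanism, assuming an mmengine-style Config (installed alongside mmcv 2.x); the exact loader used by train.py may differ.

```python
# How dotted --cfg-options keys override a nested config (sketch).
from mmengine.config import Config

cfg = Config(dict(train_dataset_config=dict(is_mini=False),
                  val_dataset_config=dict(is_mini=False)))

# Equivalent of: --cfg-options train_dataset_config.is_mini=True ...
cfg.merge_from_dict({"train_dataset_config.is_mini": True,
                     "val_dataset_config.is_mini": True})
print(cfg.train_dataset_config.is_mini)  # True
```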

🔄 Evaluation

Simple evaluation. To evaluate a trained model, use the following command:

```bash
python eval.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval/ --resume-from ckpts/final/occ3d_tpv_render.pth
```

Note that with the '--short' option you can run a sanity-check evaluation on 100 samples.

Evaluate image metrics. If you want to evaluate image metrics, you should overwrite the camera strategy, leading to this command:

```bash
python eval.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval/ --resume-from ckpts/final/occ3d_tpv_render.pth --cfg-options model.aggregator.render_kwargs.render_gt_mode=sensor model.aggregator.render_kwargs.cam_idx="[0,1,2,3,4,5]" model.aggregator.pre_render_kwargs.overwrite_opacity=True
```

Here 'sensor' selects the sensor strategy, i.e., rendering in the sensor reference frame. The 'overwrite_opacity' argument ensures that empty voxels get zero opacity, as sketched below.
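
A minimal sketch of what such an opacity overwrite does, assuming a per-voxel semantic grid where one class index denotes empty space (names are illustrative, not the repo's actual API):

```python
# Give empty voxels zero opacity so they contribute nothing to the rendering.
import torch

def overwrite_opacity(sem_labels: torch.Tensor, empty_class: int = 17) -> torch.Tensor:
    """sem_labels: (N,) per-Gaussian class ids -> (N,) opacities in {0, 1}."""
    return (sem_labels != empty_class).float()

labels = torch.tensor([3, 17, 8, 17])
print(overwrite_opacity(labels))  # tensor([1., 0., 1., 0.])
```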

To evaluate a base model, add --no-strict-state to avoid state-dict conflicts when resuming the checkpoint.

➡️ Visualization

Install environment. To avoid conflicts with the existing environment, we recommend creating a separate one:

```bash
micromamba create -n mayavi python==3.8 -c conda-forge
micromamba activate mayavi
pip install numpy vtk pyqt5
pip install mayavi
pip install pyside2
pip install scipy jupyter ipywidgets ipyevents configobj
pip install https://github.com/enthought/mayavi/zipball/main
```
Visualize predictions. Before visualizing outputs with mayavi, save them to a folder during inference using:

```bash
python out/save_indexed_preds.py --dataset occ3d --py-config config/tpvformer/render.py --work-dir out/eval_local/occ3d/tpv/std --resume-from ckpts/final/occ3d_tpv_std.pth --no-strict-state --indices 0 1 2 3 4 5 6 7 8 9
```

The previous command saves predictions for indices [0,1,2,3,4,5,6,7,8,9], corresponding to the first 10 frames of the validation set, to the following directory layout: inspect/DATASET/MODEL/SCENE_TAG/FRAME_TAG.
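
To check what was written before rendering, you can walk that directory; the glob below follows the DATASET/MODEL/SCENE_TAG/FRAME_TAG pattern stated above (the file format inside each frame folder is not assumed).

```python
# List the saved prediction folders: inspect/DATASET/MODEL/SCENE_TAG/FRAME_TAG.
from pathlib import Path

for frame_dir in sorted(Path("inspect").glob("*/*/*/*")):
    print(frame_dir)
```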

Then activate your mayavi environment and perform the rendering:

```bash
python visualisation/create_rendered_imgs.py --dataset occ3d --folder inspect/results/occ3d/surroundocc_render
```

And create an animation using:

```bash
python visualisation/anims_rendered_imgs.py --rendered-folder rendered/occ3d/surroundocc_render
```

πŸ‘ Acknowledgements

Many thanks to these excellent open source projects:

❤️ Other repositories

If you liked our work, do not hesitate to also see:

✏️ Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry and starring this repository.

```bibtex
@misc{chambon2025gaussrenderlearning3doccupancy,
      title={GaussRender: Learning 3D Occupancy with Gaussian Rendering},
      author={Loick Chambon and Eloi Zablocki and Alexandre Boulch and Mickael Chen and Matthieu Cord},
      year={2025},
      eprint={2502.05040},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.05040},
}
```
