pfnet-research/blowin-in-the-wild

Blowin' in the Wild: Dynamic Looping Gaussians from Still Images

Blowin' in the Wild is a method for 4D Gaussian splatting with in-the-wild still images. The photos are casually taken at irregular intervals, NOT as a video, which makes 4D novel view synthesis more challenging.

We reconstruct in-the-wild scenes, "blowin' in the wind," by optimizing per-shot embeddings, per-Gaussian embeddings, and a transformation MLP in addition to the 3DGS parameters. After reconstruction, the real-time viewer provides 4D novel view synthesis with manual or automatic looping dynamics, controlled by manipulating the per-shot embedding.
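To make the parameterization concrete, here is a minimal sketch (not the repository's actual code; all names, sizes, and the MLP architecture are illustrative assumptions) of how per-Gaussian embeddings and a per-shot embedding could condition a transformation MLP that displaces Gaussian means:

```python
# Hypothetical sketch: a per-shot embedding and per-Gaussian embeddings
# feed a small MLP that predicts a displacement for each Gaussian mean.
# Sizes, names, and the architecture are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_GAUSSIANS, G_DIM, S_DIM, HIDDEN = 1000, 16, 8, 32

# Learnable state (randomly initialized here; in training these would be
# optimized jointly with the other 3DGS parameters).
means = rng.normal(size=(N_GAUSSIANS, 3))             # 3DGS positions
per_gaussian = rng.normal(size=(N_GAUSSIANS, G_DIM))  # per-Gaussian embeddings
per_shot = rng.normal(size=(S_DIM,))                  # one embedding per photo

# Tiny two-layer MLP mapping [per-Gaussian ; per-shot] -> position offset.
W1 = rng.normal(size=(G_DIM + S_DIM, HIDDEN)) * 0.1
W2 = rng.normal(size=(HIDDEN, 3)) * 0.1

def deform(means, per_gaussian, shot_embedding):
    """Predict a displacement for every Gaussian, conditioned on one shot."""
    shot = np.broadcast_to(shot_embedding,
                           (means.shape[0], shot_embedding.shape[0]))
    x = np.concatenate([per_gaussian, shot], axis=1)
    h = np.tanh(x @ W1)
    return means + h @ W2

deformed = deform(means, per_gaussian, per_shot)
print(deformed.shape)  # (1000, 3)
```

Because each photo gets its own shot embedding, the MLP can explain a different object configuration per image without assuming any temporal ordering between photos.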

Please read the blog post for more details.

Demo videos: marigold_nvs, viewer.mp4

Let us emphasize again: no video source was used for this novel view synthesis.

Installation

This code is partially based on gsplat, an open-source library for Gaussian splatting.

Dependencies: please install PyTorch first, then install the remaining requirements:

pip install -r requirements.txt

Data Preparation

For synthetic scenes:
The dataset provided by D-NeRF is supported; it is available from their Dropbox.

For real scenes:
Any COLMAP data can be used. Use this script.
We also provide two sample datasets.

Training

To train D-NeRF scenes, run

CUDA_VISIBLE_DEVICES=0 python scripts/trainer.py default --data_dir <path-to-dnerf-data> --result_dir <path-to-result-dir> --ckpt None --init_extent 1.0 --interpolate-val --dataset_type dnerf

To train COLMAP scenes, run

CUDA_VISIBLE_DEVICES=0 python scripts/trainer.py default --data_dir <path-to-colmap-data> --result_dir <path-to-result-dir> --ckpt None --init_extent 1.0 --dataset_type colmap --strategy.refine_stop_iter 4000 --data_factor 4

Rendering

To launch the viewer for a trained scene, run with a checkpoint:

# for dnerf data
CUDA_VISIBLE_DEVICES=0 python scripts/trainer.py default --data_dir <path-to-dnerf-data> --result_dir <path-to-result-dir> --ckpt <path-to-ckpt> --init_extent 1.0 --interpolate-val --dataset_type dnerf

# for real data
CUDA_VISIBLE_DEVICES=0 python scripts/trainer.py default --data_dir <path-to-colmap-data> --result_dir <path-to-result-dir> --ckpt <path-to-ckpt> --init_extent 1.0 --dataset_type colmap --strategy.refine_stop_iter 4000 --data_factor 4

Related Work

E-D3DGS (Bae et al., 2024) proposes a similar framework for 4D reconstruction, but based on well-captured multi-view videos. We found that, even when observed images are taken at irregular and long intervals (i.e., when we can no longer assume temporal smoothness and continuity), our method can still reconstruct each image well using per-Gaussian and per-shot embeddings. Furthermore, we demonstrated that the embedding space allows for smoothly controllable dynamic novel view synthesis.
WildGaussians (Kulhanek et al., 2024) extends 3DGS to in-the-wild settings where the appearance may vary. Analogously, our method extends 3DGS to in-the-wild settings where objects may move.
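The smoothly controllable looping dynamics mentioned above can be sketched as traversing a closed path through per-shot embedding space. The snippet below is a hypothetical illustration (not the repository's code; names and dimensions are assumptions): it linearly interpolates between consecutive shot embeddings and wraps around, so the rendered motion loops seamlessly.

```python
# Hypothetical sketch: automatic looping by cycling through the learned
# per-shot embeddings. Names and dimensions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
S_DIM, N_SHOTS, N_FRAMES = 8, 5, 60

shot_embeddings = rng.normal(size=(N_SHOTS, S_DIM))  # one per training photo

def looped_embedding(t):
    """Periodic interpolation between consecutive shot embeddings, t in [0, 1)."""
    u = (t % 1.0) * N_SHOTS
    i = int(u) % N_SHOTS
    j = (i + 1) % N_SHOTS          # wrap back to the first shot at the end
    w = u - int(u)
    return (1 - w) * shot_embeddings[i] + w * shot_embeddings[j]

# One embedding per output frame; feeding these to the transformation MLP
# would yield a smoothly looping animation.
frames = np.stack([looped_embedding(k / N_FRAMES) for k in range(N_FRAMES)])
print(frames.shape)  # (60, 8)
```

Manual control corresponds to picking a point in this embedding space by hand instead of sweeping the path automatically.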

Citation

If you use this repository or refer to the ideas in this work, please cite it as follows:

@misc{kohyama2024blowin,
    author = {Kai Kohyama and Toru Matsuoka and Sosuke Kobayashi and Hiroharu Kato},
    title = {Blowin' in the Wild: Dynamic Looping Gaussians from Still Images},
    howpublished = {\url{https://github.com/pfnet-research/blowin-in-the-wild}},
    year = {2024}
}

This work is derived from a 2024 internship project at Preferred Networks, Inc.