Added VGGT as an option for sfm-tool #3642


Open
wants to merge 6 commits into main
Conversation

jckhng

@jckhng jckhng commented Apr 25, 2025

Uses VGGT (also added as a submodule) to calculate poses and an initial sparse point cloud, and saves them for ns-train reconstruction.

Example usage:
ns-process-data images --sfm-tool vggt --data xxx --output-dir xxx

jckhng and others added 6 commits April 16, 2025 00:05
… structure for ns-train to work on.

example usage:
ns-process-data images-vggt --data data/nerfstudio/poster/images/ --output-dir data/nerfstudio/poster-vggt --conf-threshold 75
subsequently run:
ns-train nerfacto --data data/nerfstudio/poster-vggt
…fm-tool vggt to better describe the functionality
@abrahamezzeddine

abrahamezzeddine commented Apr 27, 2025

Hi,

I just tried it out, and what I noticed is that the VGGT tool uses around 0.3–0.35 GB of VRAM per image (resolution does not seem to matter much). I tried downsampled images to see if that would make a difference, but it did not.

Question: would it be possible to split a folder of images into batches sized to roughly match the GPU's VRAM, run the SfM tool one batch at a time, and then align the resulting models?

In addition, VGGT really struggled with my datasets: duplicate layers of all the points and a quite compressed environment... :(
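The batch-and-align idea asked about above could work in principle: split the image list into overlapping batches, reconstruct each batch separately, then register neighbouring reconstructions with a similarity transform estimated from the camera centers they share. A minimal sketch of the two generic pieces (the batching helper and a Umeyama alignment; the per-batch VGGT call itself is out of scope here, and both function names are hypothetical):

```python
import numpy as np

def chunk_with_overlap(items, batch_size, overlap):
    """Split `items` into consecutive batches that share `overlap`
    elements, so neighbouring reconstructions have common cameras."""
    step = batch_size - overlap
    return [items[i:i + batch_size]
            for i in range(0, max(len(items) - overlap, 1), step)]

def umeyama(src, dst):
    """Least-squares similarity transform (Umeyama 1991):
    find s, R, t such that dst ≈ s * (src @ R.T) + t."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)          # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # guard against reflections
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    scale = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - scale * R @ mu_s
    return scale, R, t
```

Camera centers shared between batch k and batch k+1 would feed `umeyama` as `src`/`dst` to map one reconstruction into the other's frame; drift across many batches would still need some global adjustment.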

@NiklasVoigt

NiklasVoigt commented May 13, 2025

c16adb9 causes an index-out-of-bounds error: the image dir has original-sized images, but transforms.json has w and h from the downsized ones.

The transformation is incorrect when using the original-sized images.

Original size (3840×2160): [screenshot: original_sized]

Downsized (518×294): [screenshot: downsized]
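The mismatch reported in this comment is easy to detect programmatically before training. A minimal sketch (the helper name is hypothetical, assuming the nerfstudio convention of global `w`/`h` keys in transforms.json):

```python
def find_size_mismatches(transforms, image_sizes):
    """Return file paths whose on-disk (width, height) disagree with
    the global `w`/`h` recorded in a nerfstudio-style transforms.json.
    `image_sizes` maps file_path -> (width, height) read from disk."""
    expected = (transforms["w"], transforms["h"])
    return [frame["file_path"] for frame in transforms["frames"]
            if image_sizes.get(frame["file_path"]) != expected]
```

Running a check like this before ns-train would flag exactly the case above: 3840×2160 images on disk versus 518×294 recorded in the JSON.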

@jckhng
Author

jckhng commented May 16, 2025

> c16adb9 causes an index-out-of-bounds error: the image dir has original-sized images, but transforms.json has w and h from the downsized ones.
>
> The transformation is incorrect when using the original-sized images.

Sorry, I messed up the references when checking in and submitting the pull request. I will fix it later. There is an update in vggt_to_colmap.py (see below) that handles this problem. We prefer to use the original images with ns-train rather than the downsampled ones, but VGGT outputs transforms.json for the downsampled images, so we can't use the original images right away.

The updated code can be found here:
https://github.com/jckhng/vggt/blob/59d1b24ad7953b35fe3cd51bc8c53123cd02a8cc/vggt_to_colmap.py#L65-L82

Basically, it calculates the point cloud (sparse_pc.ply) using intrinsic_downsampled (from the downsampled images), but uses the original images to calculate the intrinsic matrix for transforms.json. This allows us to use the original images for a higher-quality reconstruction.

With this change, using the original images should work.
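The fix described above boils down to scaling the focal lengths and principal point by the ratio between the original and downsampled resolutions. A minimal sketch of that idea (not the actual vggt_to_colmap.py code), assuming a pinhole intrinsic matrix:

```python
import numpy as np

def rescale_intrinsics(K_down, down_wh, orig_wh):
    """Scale a pinhole intrinsic matrix estimated at the downsampled
    resolution so it applies to the original-resolution images:
    fx and cx scale with width, fy and cy with height."""
    sx = orig_wh[0] / down_wh[0]
    sy = orig_wh[1] / down_wh[1]
    K = np.array(K_down, dtype=float)
    K[0, 0] *= sx  # fx
    K[0, 2] *= sx  # cx
    K[1, 1] *= sy  # fy
    K[1, 2] *= sy  # cy
    return K
```

For the 518×294 → 3840×2160 case above, a principal point at the downsampled image center (259, 147) maps to the original image center (1920, 1080), as expected.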

@jckhng
Author

jckhng commented May 16, 2025

> In addition, VGGT really struggled with my datasets: duplicate layers of all the points and a quite compressed environment... :(

I suspect it's the same issue that @NiklasVoigt encountered. I will update the pull request to reference the correct version of vggt_to_colmap.py.

But in the meantime you can refer to the latest code at https://github.com/jckhng/vggt.
