
What's the actual input to NICER-SLAM? #9

Closed

be-hd opened this issue Apr 23, 2024 · 2 comments

Comments

@be-hd

be-hd commented Apr 23, 2024

I have a monocular video. What are the steps to run NICER-SLAM to get to the final output of the demo video, with the 3D map and camera trajectory?
It looks like NICER-SLAM requires not only monocular video/frames as input, but also:

  1. COLMAP output (cameras.bin, images.bin, points3D.bin) --> How do I generate these given a monocular video? What do these values stand for?
  2. Depth info (monocular depth estimation using a 3rd-party package)
  3. Normal info (monocular normal estimation using a 3rd-party package)
  4. Optical flow (estimated using a 3rd-party package)
  5. Camera pose???
  6. Anything else?

How does the accuracy of these inputs impact the final output of NICER-SLAM?

Thank you.

@OnHaDe

OnHaDe commented May 22, 2024

You are correct: despite what the paper implies, you need more data than just the image sequence. As far as I can tell, these extra inputs are only used to calculate the losses, not as direct inputs.

  1. Get COLMAP. Put the video frames in a directory of your choice and run something like:
    colmap automatic_reconstructor --workspace_path /your/workspace/path --image_path /your/dataset/frames --vocab_tree_path path/to/vocab_tree.bin --single_camera 1 --num_threads 12
    The last three arguments are dependent on your circumstances; you can refer to the COLMAP documentation for those.
  2.-4. These are handled by the preprocessing scripts (presuming you set up the conda environments).
  5. The COLMAP output includes the camera poses. They are transformed to a new format in the preprocessing scripts (see the sketch below).
  6. That's all.
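
In case it helps, here is a minimal sketch of how you might inspect the poses stored in images.bin. This is not NICER-SLAM's own preprocessing code; it assumes you have read_write_model.py from the official COLMAP repo (scripts/python/) on your PYTHONPATH, and the path reflects automatic_reconstructor's default output layout:

    from read_write_model import read_images_binary, qvec2rotmat

    # Adjust the path to wherever your COLMAP workspace lives.
    images = read_images_binary("/your/workspace/path/sparse/0/images.bin")
    for image_id, image in images.items():
        # COLMAP stores each pose as a world-to-camera quaternion + translation.
        R_w2c = qvec2rotmat(image.qvec)   # 3x3 rotation (world -> camera)
        t_w2c = image.tvec                # translation (world -> camera)
        cam_center = -R_w2c.T @ t_w2c     # camera center in world coordinates
        print(image.name, cam_center)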

@Zzh2000
Collaborator

Zzh2000 commented Jun 22, 2024

The COLMAP output and the camera poses are not a necessity for running NICER-SLAM. If you check the code, you will find that they are only used for getting the scene bound and normalizing the scene scale to [-1, 1] to satisfy the VolSDF/MonoSDF code base. The gt poses are useful for debugging (e.g., gt_cam=True means debugging by giving every camera its gt pose), but they are not involved in the tracking/mapping process.
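
To make the normalization concrete, it amounts to something like the following sketch. This is a hypothetical illustration, not the repo's actual preprocessing code (which may differ in details such as padding); the input could be the COLMAP sparse points and/or camera centers:

    import numpy as np

    def normalize_to_unit_cube(points):
        """Shift and scale an (N, 3) point set to fit inside [-1, 1]^3."""
        lo, hi = points.min(axis=0), points.max(axis=0)
        center = (lo + hi) / 2.0
        scale = (hi - lo).max() / 2.0
        return (points - center) / scale, center, scale

    points = np.random.rand(100, 3) * 10.0  # stand-in data
    normalized, center, scale = normalize_to_unit_cube(points)
    # The same center/scale would also be applied to the camera translations
    # so that poses and geometry stay consistent.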

Zzh2000 closed this as completed Jun 22, 2024