Semantic segmentation + running Hydra on custom data #1
Thank you for your interest in our work!

On the choice of a 2D semantic segmentation network: We actually use an off-the-shelf network for 2D semantic segmentation as mentioned in our paper (we use HRNet from the MIT Scene-Parsing benchmark). The pre-trained pytorch models can be found here. Using other networks or sources of ground-truth semantics is also possible as long as you post-process the semantic labels and set up your configs correctly downstream.

Actually setting up 2D semantic segmentation for use with Hydra (and Kimera-Semantics): The code we haven't released yet just loads an ONNX model (which we exported from pytorch) into TensorRT, subscribes to an RGB image topic, runs inference on received RGB images, and then condenses and colors the resulting semantic label image. You could also do the same with pytorch directly (we just wanted to avoid python2 to python3 issues in 18.04). There are typically two steps when setting up a new segmentation network:
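As a rough, hedged illustration of the overall inference-plus-condense-and-color flow described above (not the unreleased node), here is a minimal Python sketch; onnxruntime stands in for the TensorRT path, and the topic names, preprocessing, label grouping, and colormap are placeholders you would adapt to your own model and configs:

```python
# Hedged sketch of a 2D segmentation node: subscribe to RGB, run an exported
# ONNX model, condense the raw labels into a smaller set, and publish a
# colored label image. onnxruntime stands in for TensorRT; topic names,
# preprocessing, label groups, and colors are placeholders.
import numpy as np
import onnxruntime as ort
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

CONDENSE = {0: 0, 1: 0, 3: 1, 5: 2}  # hypothetical raw-label -> condensed-label map
COLORS = {0: (128, 64, 128), 1: (70, 70, 70), 2: (0, 255, 0), 255: (0, 0, 0)}

class SegmentationNode:
    def __init__(self):
        self.bridge = CvBridge()
        self.session = ort.InferenceSession(rospy.get_param("~model_path"))
        self.input_name = self.session.get_inputs()[0].name
        self.pub = rospy.Publisher("semantic/image_raw", Image, queue_size=1)
        rospy.Subscriber("rgb/image_raw", Image, self.callback, queue_size=1)

    def callback(self, msg):
        rgb = self.bridge.imgmsg_to_cv2(msg, desired_encoding="rgb8")
        # placeholder preprocessing; a real model needs its own resize/normalization
        blob = rgb.astype(np.float32).transpose(2, 0, 1)[None] / 255.0
        logits = self.session.run(None, {self.input_name: blob})[0]  # assumed 1xCxHxW
        labels = np.argmax(logits[0], axis=0).astype(np.uint8)
        condensed = np.full_like(labels, 255)  # 255 = unknown / unmapped
        for raw, new in CONDENSE.items():
            condensed[labels == raw] = new
        colored = np.zeros((*labels.shape, 3), dtype=np.uint8)
        for label, color in COLORS.items():
            colored[condensed == label] = color
        out = self.bridge.cv2_to_imgmsg(colored, encoding="rgb8")
        out.header = msg.header  # keep the original timestamp for downstream sync
        self.pub.publish(out)

if __name__ == "__main__":
    rospy.init_node("semantic_segmentation_node")
    SegmentationNode()
    rospy.spin()
```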
Once you have that, you just need to combine the depth and semantic image using this nodelet (see the point cloud sketch after this comment for the general idea).

Running with a different pose source: Our plan is to add instructions for configuring Hydra to work with new datasets and sensors, but it may be a little bit before I have a chance to write them up, and they will be specific to Kimera-VIO when I do write them up. That said, it's certainly possible to set up a launch file for a different sensor and odometry source. The base launch files that we share between our different datasets are here and here. You can look at how we specialize our launch files here and here respectively. The two inputs that are required for Hydra to function are a semantic pointcloud (by default, Hydra subscribes to a dedicated topic for this) and a pose source for the sensor.

Finally, if you end up standing up your own pipeline, I'd kindly request that, if possible, you take notes on the process (that we can add as documentation) to help others also use Hydra (your experience may be more informative to other users than the instructions that I write up). Please also don't hesitate to reach out with other questions! I'll leave this issue open for now as temporary documentation, as I'm sure others have similar questions.
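Regarding the depth-plus-semantics combination mentioned at the top of this comment: conceptually it produces a semantic (colored) point cloud. A hedged numpy sketch of that back-projection is below; it is not the actual nodelet, and the intrinsics and image encodings are assumptions:

```python
# Sketch of combining a depth image and a colored semantic label image into a
# semantic point cloud (what the depth/semantic nodelet does conceptually).
# Camera intrinsics and image encodings are assumed placeholders.
import numpy as np

def semantic_pointcloud(depth_m, color, fx, fy, cx, cy):
    """depth_m: HxW float32 depth in meters, color: HxWx3 uint8 labels-as-colors."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth_m > 0.0
    z = depth_m[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)  # Nx3, expressed in the camera frame
    colors = color[valid]                 # Nx3, color encodes the semantic label
    return points, colors

# Example with synthetic data (placeholder intrinsics):
depth = np.ones((480, 640), dtype=np.float32)
labels = np.zeros((480, 640, 3), dtype=np.uint8)
pts, cols = semantic_pointcloud(depth, labels, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```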
Hi @feramozur, have you been successful in using Hydra on data from your robot?
Do you think the ZED 2i camera is a suitable sensor setup for trying Hydra in real-time use?
Hi, yes the ZED 2i is likely suitable for use as an input to Hydra (and it's on our list of cameras to try out). We've used similar cameras in the past for Hydra (the Azure Kinect for the RSS paper and the realsense D455 in some of our testing), though the underlying stereo technologies will probably affect the reconstruction quality in different environments (i.e. the time-of-flight sensor in the kinect vs. the active/passive D455 combo vs. the passive zed with a larger baseline). If you're trying to run with Kimera-VIO, then the zed or the d455 are reasonable options: the zed's reconstruction quality won't suffer but it isn't global shutter, and the d455 is global shutter on the infrared cameras (I believe) but requires turning off the IR emitter, which can reduce the depth quality.
Thanks for your reply! The mesh coming in RVIZ looks like this: Can you please point me in the right direction? I've also had issues with ORB-SLAM3, and one opinion there is that the ZED camera is a rolling shutter camera, while the algorithm requires a global shutter. Could this be an issue? Or is it just poor calibration of the camera? Many thanks in advance, any hint would be highly appreciated!
I'm unfortunately not a maintainer for Kimera VIO so I probably won't be able to help too much, but I will say that the mesher output that you're seeing isn't really related to the quality of the mesh that Kimera Semantics produces, which is what Hydra actually uses for reconstruction (i.e. getting the mesher in Kimera VIO to work is irrelevant for Hydra). As long as you get a reasonable pointcloud out of the zed camera and a reasonable pose estimate from Kimera VIO, you should be set for running with Hydra (after setting up a new config and launch files for it).

w.r.t. rolling shutter / calibration: Rolling shutter effects can affect pose estimation quality, but are most severe under high velocity or fast rotations, and I doubt it would be causing catastrophic issues. There could be issues with the intrinsics calibration for the zed if you did it yourself, as the left/right images might already be undistorted to do passive stereo (and calibrating an already undistorted image can lead to problems). However, having never directly worked with the camera myself, I'm not sure what the factory calibration looks like / whether the left/right images are already undistorted/rectified.
Thanks! Do you know how I can configure Kimera Semantics to take the pose and point cloud directly from the ZED ros wrapper and avoid using Kimera VIO at all? An example of the launch file would be super useful!
I put together an untested sample launch file that tries to configure both of the pieces you mentioned.

You will still need to set up a 2D semantic segmentation network external to the launch file, which is discussed in the first two sections of my response earlier in the issue here. It might also be worth looking at the pointers in that response to how the launch files for our other datasets are set up.
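As a side note, if your pose source only gives you nav_msgs/Odometry on a topic, one generic way to expose it to a TF-driven reconstruction pipeline is a small relay node like the hedged sketch below. The topic and frame names are assumptions, and the ZED ros wrapper may already broadcast the equivalent TF, in which case this is unnecessary:

```python
# Sketch of re-broadcasting an odometry topic as TF so that a TF-driven
# reconstruction pipeline can look up the sensor pose. Topic and frame
# names are placeholders.
import rospy
import tf2_ros
from geometry_msgs.msg import TransformStamped
from nav_msgs.msg import Odometry

class OdomToTf:
    def __init__(self):
        self.broadcaster = tf2_ros.TransformBroadcaster()
        rospy.Subscriber("odom", Odometry, self.callback, queue_size=10)

    def callback(self, msg):
        t = TransformStamped()
        t.header = msg.header                  # stamped in the odometry (parent) frame
        t.child_frame_id = msg.child_frame_id  # typically the camera/base frame
        t.transform.translation.x = msg.pose.pose.position.x
        t.transform.translation.y = msg.pose.pose.position.y
        t.transform.translation.z = msg.pose.pose.position.z
        t.transform.rotation = msg.pose.pose.orientation
        self.broadcaster.sendTransform(t)

if __name__ == "__main__":
    rospy.init_node("odom_to_tf")
    OdomToTf()
    rospy.spin()
```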
Hi, thanks a lot for your reply! It helped a lot and I managed to launch semantic mesh building; it works pretty well. I still have to figure out loop closure, so I have a couple of questions:

Below I attach my launch file and the csv file with the colors remapped (is this the way it's supposed to be?)
No, sorry (when I wrote my response I forgot that Hydra depends on some unreleased kimera-vio code to actually compute the relative pose of the two keyframes for the vision-based loop closures). However, if you have information from the ZED as to when a loop closure was detected (and what the relative pose is), then you could publish that as a
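As a very rough illustration (not confirmed against Hydra's actual interface), publishing such an externally detected loop closure as a pose-graph edge could look like the sketch below. The PoseGraph/PoseGraphEdge layout is assumed to follow pose_graph_tools-style messages, and the topic name, keyframe indices, and relative pose are placeholders you would need to check against your setup:

```python
# Hedged sketch of publishing an externally detected loop closure as a
# pose-graph edge. The message layout (key_from, key_to, relative pose,
# LOOPCLOSE type) follows a pose_graph_tools-style definition; verify the
# field names against the message files in your workspace.
import rospy
from geometry_msgs.msg import Pose
from pose_graph_tools.msg import PoseGraph, PoseGraphEdge

def publish_loop_closure(pub, key_from, key_to, relative_pose, stamp):
    edge = PoseGraphEdge()
    edge.header.stamp = stamp
    edge.key_from = key_from             # keyframe index where the loop starts
    edge.key_to = key_to                 # keyframe index it closes against
    edge.type = PoseGraphEdge.LOOPCLOSE  # constant assumed from pose_graph_tools
    edge.pose = relative_pose            # relative pose between the two keyframes

    graph = PoseGraph()
    graph.header.stamp = stamp
    graph.edges = [edge]
    pub.publish(graph)

if __name__ == "__main__":
    rospy.init_node("external_loop_closure_publisher")
    pub = rospy.Publisher("external_loop_closures", PoseGraph, queue_size=10)
    rospy.sleep(0.5)  # give the publisher time to connect
    relative = Pose()
    relative.orientation.w = 1.0  # identity rotation as a placeholder
    publish_loop_closure(pub, 12, 85, relative, rospy.Time.now())  # placeholder keys
```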
No, you shouldn't have to do this (that's only an argument for the
Many thanks for your reply, the problem with humans got solved! I'm still unsure about what exactly you meant by:
So would I have to rewrite the pose_graph_publisher_node to construct the pose_graph based on the odometry coming from the ZED? I'd greatly appreciate it if you could expand on this further. Also, would it be easier to do LCD in Kimera-PGMO instead of the ZED SDK? In this case, I'd only have odometry and depth images coming out of the ZED. How can I enable LCD in this case if I provide the BoW? Or is it something that kimera-vio (unreleased) is supposed to do?
I'm glad you were able to resolve the issue with dynamic labels! If you get a chance, adding a brief statement about what was wrong may help other people who are having similar issues (editing your comment above is probably the best way to do this).
No, Kimera-PGMO doesn't generate loop closure proposals. Hydra and Kimera-PGMO share the same machinery for processing visual loop closures (
My apologies, I made a bad assumption about the zed (that it gave you access to information about loop closures that it detected), which definitely made my previous instructions confusing.

More documentation as to how Hydra handles loop closures: The backend (and the frontend) subscribe to a topic carrying externally detected loop closures.

Hydra can also detect "internal" loop closures. An odometric pose graph is enough for the scene graph-based loop closures, but the object-based registration (to solve for the relative pose between two keyframes) can be poorly conditioned if you don't have enough objects in the scene or the semantic segmentation is noisy. Hydra's loop closure detector will proceed down the descriptor hierarchy (places, then objects, then BoW; see the paper for exact details). If BoW descriptors are not present, Hydra will skip BoW detection and visual registration. To enable BoW-based detection, you'd have to publish the BoW descriptor for each keyframe via the corresponding topic.

Note that Hydra will always respect external loop closures, even if internal loop closure detection is enabled. We have switches in kimera to disable detecting loop closures so we don't get duplicate visual loop closures when running with internal loop closures.

All that said, there are four ways to proceed here:
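Whichever route you take, the common prerequisite mentioned above is an odometric pose graph. A hedged sketch of constructing one from an odometry stream is below; the message layout is again assumed to be pose_graph_tools-style (verify the field names against the definitions in your workspace), and the keyframe spacing and topic names are placeholders:

```python
# Sketch of building an incremental odometric pose graph from an odometry
# stream: a new keyframe node is added whenever the robot has moved far
# enough, together with an ODOM edge to the previous keyframe. The message
# layout follows an assumed pose_graph_tools-style PoseGraph; check the
# field names and constants against your workspace. Thresholds and topic
# names are placeholders.
import numpy as np
import rospy
from nav_msgs.msg import Odometry
from scipy.spatial.transform import Rotation
from pose_graph_tools.msg import PoseGraph, PoseGraphEdge, PoseGraphNode

KEYFRAME_DIST = 0.2  # meters between keyframes (placeholder)

def pose_to_matrix(pose):
    """geometry_msgs/Pose -> 4x4 homogeneous transform."""
    T = np.eye(4)
    q = pose.orientation
    T[:3, :3] = Rotation.from_quat([q.x, q.y, q.z, q.w]).as_matrix()
    T[:3, 3] = [pose.position.x, pose.position.y, pose.position.z]
    return T

def matrix_to_pose(T, pose):
    """Fill a geometry_msgs/Pose from a 4x4 homogeneous transform."""
    pose.position.x, pose.position.y, pose.position.z = T[:3, 3]
    q = Rotation.from_matrix(T[:3, :3]).as_quat()  # [x, y, z, w]
    pose.orientation.x, pose.orientation.y, pose.orientation.z, pose.orientation.w = q
    return pose

class OdomPoseGraph:
    def __init__(self):
        self.pub = rospy.Publisher("pose_graph_incremental", PoseGraph, queue_size=10)
        self.last_pose = None  # 4x4 transform of the previous keyframe
        self.key = 0
        rospy.Subscriber("odom", Odometry, self.callback, queue_size=50)

    def callback(self, msg):
        T = pose_to_matrix(msg.pose.pose)
        if self.last_pose is not None and \
                np.linalg.norm(T[:3, 3] - self.last_pose[:3, 3]) < KEYFRAME_DIST:
            return  # not far enough from the previous keyframe

        node = PoseGraphNode()
        node.header = msg.header
        node.key = self.key
        node.pose = msg.pose.pose

        graph = PoseGraph()
        graph.header = msg.header
        graph.nodes = [node]
        if self.key > 0:
            edge = PoseGraphEdge()
            edge.header = msg.header
            edge.key_from = self.key - 1
            edge.key_to = self.key
            edge.type = PoseGraphEdge.ODOM  # constant assumed from pose_graph_tools
            # relative pose of the new keyframe expressed in the previous keyframe
            matrix_to_pose(np.linalg.inv(self.last_pose) @ T, edge.pose)
            graph.edges = [edge]
        self.pub.publish(graph)

        self.last_pose = T
        self.key += 1

if __name__ == "__main__":
    rospy.init_node("odom_pose_graph_publisher")
    OdomPoseGraph()
    rospy.spin()
```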
Hi @nathanhhughes, thanks for all the great information you provide in this issue. I was just wondering if there were any updates regarding the release of the modified kimera-vio so one could have access to loop closures in hydra? Thank you!
Hi @araujokth, sorry for the delay in responding. It's looking like we'll have the version of Kimera-VIO that we used pushed as a pre-release in the next couple of weeks (we're trying to have it out before the IROS submission deadline).
Thanks so much for the reply @nathanhhughes! Really looking forward to that update, and good luck with the IROS submission!
@araujokth (and others) Thank you for your patience! We've added our changes to a pre-release version of Kimera and updated the documentation on the Hydra side here.
Fantastic, thanks for the update! I will give it a try!
Hi @nathanhhughes @MikeDmytrenko, I also tried running Hydra on the ZED 2i camera. I created my own dataset and did not use semantic images (only RGB images) for testing. My launch file is the same as @MikeDmytrenko's, but I got an error: Do you know where the problem is? Is it a problem with the dataset, or do I need to change some parameters? I am using version 1.0 of Hydra.
First of all, thank you for releasing such interesting work in the field of semantic SLAM.
Considering that the semantic segmentation network used by Hydra is not yet ready for publication, I'd like to ask if there are alternatives that can be considered in the meantime, including the use of other networks or the supply of custom "ground truth" data (in that case, I'd like to know which format Hydra expects).
Additionally, I was wondering if there are plans to add support for custom bag files/sequences, or alternatively to add instructions for doing so, reusing the existing launch file infrastructure. We are very interested in testing Hydra with data recorded from our own robot, which is fitted with an RGBD camera and could also provide odometry information directly (nav_msgs/Odometry) instead of requiring a visual-inertial odometry component. It is also not presently clear to us which topics are needed as inputs to the system.