Issue with bbox #44

Open
madaan-nikhil opened this issue Jul 30, 2023 · 3 comments

@madaan-nikhil
I encountered an issue with the bbox coordinates provided in the dataset. Specifically, when attempting to plot the bbox and point cloud for scene_00000 and room_id 485142, I noticed discrepancies. The point cloud was generated using the method outlined in #9 (comment), and to better visualize the bbox coordinates, I reduced the opacity of the point cloud and marked the bbox with red points.
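For reference, a minimal sketch of the kind of overlay shown in the screenshot below (not the exact script used): it assumes `coords` is an (N, 3) array of point-cloud coordinates and `bbox_corners` is an (8, 3) array of box corner points, loaded here from hypothetical .npy files; the point cloud is drawn with low opacity and the bbox corners in red.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical inputs: point cloud (N, 3) and bbox corner points (8, 3).
coords = np.load("room_485142_points.npy")
bbox_corners = np.load("room_485142_bbox.npy")

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# Low-opacity point cloud so the box corners stay visible.
ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2], s=0.1, alpha=0.05, c="gray")
# Bbox corners marked as red points.
ax.scatter(bbox_corners[:, 0], bbox_corners[:, 1], bbox_corners[:, 2], s=30, c="red")
plt.show()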

[Screenshot: Screen Shot 2023-07-30 at 1 54 20 AM]

@madaan-nikhil (Author)

@bertjiazheng can you please help with this?

@bertjiazheng (Owner)

Hi,

Sorry for the late reply.

The coordinate systems for the point cloud and the bounding box annotations may differ. The point cloud is derived from the RGB-D image, with the camera center as the origin of its coordinate system, whereas the bounding boxes are labeled in the world coordinate system. To align the point cloud with the world coordinate system, you can add the position of the camera.
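A minimal sketch of that alignment, with `points_cam` and `camera_position` as placeholder names (not part of the dataset API):

import numpy as np

def to_world(points_cam, camera_position):
    # Shift camera-centered points into the world frame by adding the camera position.
    return np.asarray(points_cam) + np.asarray(camera_position)

# Example with placeholder values:
points_cam = np.array([[1.0, 2.0, 0.5], [0.0, -1.0, 1.2]])  # points centered at the camera
camera_position = np.array([3.0, 4.0, 1.6])                 # camera center in world coordinates
points_world = to_world(points_cam, camera_position)        # now in the same frame as the bboxes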

Best,
Jia

@madaan-nikhil (Author)

Hi,
Thanks for your reply. I am already adding the camera center to make sure the point cloud is in the world coordinate system (the image shared above is after adding the camera center). Here is the method I am using to generate the per-room point cloud.

# Module-level imports this method relies on:
# import cv2
# import numpy as np
# from sklearn.preprocessing import normalize

def generate_per_room_point_cloud(self, random_level=0, color=False, normal=False):
    normals = []
    points = {'coords': [], 'colors': []}

    # Convert each panoramic depth image into 3D coordinates
    for i in range(len(self.depth_paths)):
        try:
            coords = []
            print(self.depth_paths[i])
            depth_img = cv2.imread(self.depth_paths[i], cv2.IMREAD_ANYDEPTH)
            x_tick = 180.0 / depth_img.shape[0]   # degrees of elevation per pixel row
            y_tick = 360.0 / depth_img.shape[1]   # degrees of azimuth per pixel column
            for x in range(depth_img.shape[0]):
                for y in range(depth_img.shape[1]):
                    # elevation alpha runs from 90 deg (top row) to -90 deg (bottom row)
                    alpha = 90 - (x * x_tick)
                    # azimuth beta runs from -180 deg to 180 deg
                    beta = y * y_tick - 180
                    depth = depth_img[x, y] + np.random.random() * random_level
                    z_offset = depth * np.sin(np.deg2rad(alpha))
                    xy_offset = depth * np.cos(np.deg2rad(alpha))
                    x_offset = xy_offset * np.sin(np.deg2rad(beta))
                    y_offset = xy_offset * np.cos(np.deg2rad(beta))
                    point = np.asarray([x_offset, y_offset, z_offset])
                    # shift from the camera frame into the world frame
                    coords.append(point + self.camera_centers[i])
            points['coords'].append(np.asarray(coords))
        except Exception:
            with open("logs.txt", "a") as log_file:  # append mode
                log_file.write(f"{self.depth_paths[i]} missing \n")

    if color:
        # Collect per-pixel RGB colors
        for i in range(len(self.rgb_paths)):
            try:
                colors = []
                rgb_img = cv2.imread(self.rgb_paths[i])
                for x in range(rgb_img.shape[0]):
                    for y in range(rgb_img.shape[1]):
                        colors.append(rgb_img[x, y])
                points['colors'].append(np.asarray(colors) / 255.0)
            except Exception:
                with open("logs.txt", "a") as log_file:  # append mode
                    log_file.write(f"{self.rgb_paths[i]} missing \n")

    if normal:
        # Collect per-pixel normals
        for i in range(len(self.normal_paths)):
            normal_img = cv2.imread(self.normal_paths[i])
            for x in range(normal_img.shape[0]):
                for y in range(normal_img.shape[1]):
                    normals.append(normalize(normal_img[x, y].reshape(-1, 1)).ravel())
        points['normals'] = normals
    return points
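As a side note, the nested pixel loops above can be vectorized with NumPy; a sketch under the same equirectangular assumptions (elevation 90° to -90° top to bottom, azimuth -180° to 180° left to right), with the random noise term omitted and `depth_to_points` a hypothetical helper name:

import numpy as np

def depth_to_points(depth_img, camera_center):
    # Vectorized equivalent of the per-pixel loop: one 3D point per depth pixel.
    h, w = depth_img.shape
    x_tick, y_tick = 180.0 / h, 360.0 / w
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    alpha = np.deg2rad(90.0 - rows * x_tick)   # elevation per pixel
    beta = np.deg2rad(cols * y_tick - 180.0)   # azimuth per pixel
    depth = depth_img.astype(np.float64)
    z = depth * np.sin(alpha)
    xy = depth * np.cos(alpha)
    x = xy * np.sin(beta)
    y = xy * np.cos(beta)
    coords = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return coords + np.asarray(camera_center)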
