-
Notifications
You must be signed in to change notification settings - Fork 59
Grids Ontology Improvement #827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
a27d779
a195f51
1e494c8
2547fbb
b366978
b901387
ab63ad8
db52114
854c271
d7dbbc8
e40e396
e238385
fb8cf49
3bcc8ab
e759466
210c52c
222968b
4c34bf4
2f4b36c
6b5fff3
ae2f871
79ac80e
a2318a3
3035cd4
091e6cc
803f960
e222677
04323d1
5babd23
2881a7e
0037402
acdefdb
fc1d537
7276830
0007f78
9588ac7
c3b2ba8
c0b3d8c
7203b23
5fd59ca
83083fc
3dacf0d
deb2a30
bfed96f
4023891
5e8c483
618a974
e7e0cc4
7d7d04a
416cbbc
90612b7
d2b10c0
d49a319
286428f
8deb0c9
884cb6f
d48b693
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -23,6 +23,7 @@ | |
| from typing import ( | ||
| Iterable, | ||
| Optional, | ||
| Tuple, | ||
| Type, | ||
| Hashable, | ||
| TypeVar, | ||
|
|
@@ -33,7 +34,7 @@ | |
| overload, | ||
| List, | ||
| ) | ||
|
|
||
| import math | ||
| import numpy as np | ||
|
|
||
| from forte.data.container import ContainerType | ||
|
|
@@ -49,6 +50,7 @@ | |
| "FList", | ||
| "FNdArray", | ||
| "MultiEntry", | ||
| "Grid", | ||
| ] | ||
|
|
||
| default_entry_fields = [ | ||
|
|
@@ -635,5 +637,223 @@ def index_key(self) -> int: | |
| return self.tid | ||
|
|
||
|
|
||
| class Grid: | ||
| """ | ||
| Regular grid with a grid configuration dependent on the image size. | ||
| It is a data structure used to retrieve grid-related objects such as grid | ||
| cells from the image. Grid itself doesn't store any data. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the core functionality of a Btw, is there any use case you find that we need to retrieve the image inside the grid cell? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think bounding box prediction is usually based on only image on inside a grid cell (anchor box) in YOLO algorithm. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
|
||
| Based the image size and the grid shape, | ||
| we compute the height and the width of grid cells. | ||
| For example, if the image size (image_height,image_width) is (640, 480) | ||
| and the grid shape (height, width) is (2, 3) | ||
| the size of grid cells (self.c_h, self.c_w) will be (320, 160). | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess you might want to discuss how we handle the cases where the numbers are indivisible? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. discussed |
||
|
|
||
| However, when the image size is not divisible by the grid shape, we round | ||
| up the resulting size(floating number) to an integer. | ||
| In this way, as each grid | ||
| cell taking one more pixel, we make the last grid cell per column and row | ||
| size(height and width) to be the remainder of the image size divided by the | ||
| grid cell size which is smaller than other grid cell. | ||
| For example, if the image | ||
| size is (128, 128) and the grid shape is (13, 13), the first 11 grid cells | ||
| per column and row will have a size of (10, 10) since 128/13=9.85, so we | ||
| round up to 10. The last grid cell per column and row will have a size of | ||
| (8, 8) since 128%10=8. | ||
|
|
||
|
|
||
| We require each grid to be bounded/intialized with one image size since | ||
| the number of different image shapes are limited per computer vision task. | ||
| For example, we can only have one image size (640, 480) from a CV dataset, | ||
| and we could augment the dataset with few other image sizes | ||
| (320, 240), (480, 640). Then there are only three image sizes. | ||
| Therefore, it won't be troublesome to | ||
| have a grid for each image size, and we can check the image size during the | ||
| initialization of the grid. | ||
| By contrast, if the grid is totally "free-form" | ||
| that we don't initialize it with any | ||
| image size and pass the image size directly into the method/operation on | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You said "we don't have to initialize it with any image size", but your There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rewrote the docstring. |
||
| the fly, the API would be more complex and image size check would be | ||
| repeated everytime the method is called. | ||
|
|
||
| Args: | ||
| height: the number of grid cell per column, the unit is one grid cell. | ||
| width: the number of grid cell per row, the unit is one grid cell. | ||
| image_height: the number of pixels per column in the image. | ||
| image_width: the number of pixels per row in the image. | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| height: int, | ||
| width: int, | ||
| image_height: int, | ||
| image_width: int, | ||
| ): | ||
| if image_height <= 0 or image_width <= 0: | ||
| raise ValueError( | ||
| "both image height and width must be positive" | ||
| f"but the image shape is {(image_height, image_width)}" | ||
| "please input a valid image shape" | ||
| ) | ||
| if height <= 0 or width <= 0: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why don't we also check for |
||
| raise ValueError( | ||
| f"height({height}) and " | ||
| f"width({width}) both must be larger than 0" | ||
| ) | ||
| if height >= image_height or width >= image_width: | ||
| raise ValueError( | ||
| "Grid height and width must be smaller than image height and width" | ||
| ) | ||
|
|
||
| self._height = height | ||
| self._width = width | ||
|
|
||
| self._image_height = image_height | ||
| self._image_width = image_width | ||
|
|
||
| # if the resulting size of grid is not an integer, we round it up. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This needs to go into the docstring, users need to know. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rewrote the docstring. |
||
| # The last grid cell per row and column might be out of the image size | ||
| # since we constrain the maximum pixel locations by the image size | ||
| self.c_h, self.c_w = ( | ||
| math.ceil(image_height / self._height), | ||
| math.ceil(image_width / self._width), | ||
| ) | ||
|
|
||
| if self.c_h <= 0 or self.c_w <= 0: | ||
| raise ValueError( | ||
| "cell height and width must be positive" | ||
| f"but the cell shape is {(self.c_h, self.c_w)}" | ||
| "please adjust image shape or grid shape accordingly" | ||
| ) | ||
|
|
||
| def get_grid_cell(self, img_arr: np.ndarray, h_idx: int, w_idx: int): | ||
| """ | ||
| Get the array data of a grid cell from image of the image payload index. | ||
hunterhector marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| The array is a masked version of the original image, and it has | ||
| the same size as the original image. The array entries that are not | ||
| within the grid cell will masked as zeros. The image array entries that | ||
| are within the grid cell will kept. | ||
|
|
||
| Note: all indices are zero-based and counted from top left corner of | ||
| the image. | ||
|
|
||
| Args: | ||
| img_arr: image data represented as a numpy array. | ||
| h_idx: the zero-based height(row) index of the grid cell in the | ||
| grid, the unit is one grid cell. | ||
| w_idx: the zero-based width(column) index of the grid cell in the | ||
| grid, the unit is one grid cell. | ||
|
|
||
| Raises: | ||
| ValueError: ``h_idx`` is out of the range specified by ``height``. | ||
| ValueError: ``w_idx`` is out of the range specified by ``width``. | ||
|
|
||
| Returns: | ||
| numpy array that represents the grid cell. | ||
| """ | ||
| if not 0 <= h_idx < self._height: | ||
| raise ValueError( | ||
| f"input parameter h_idx ({h_idx}) is" | ||
| "out of scope of h_idx range" | ||
| f" {(0, self._height)}" | ||
| ) | ||
| if not 0 <= w_idx < self._width: | ||
| raise ValueError( | ||
| f"input parameter w_idx ({w_idx}) is" | ||
| "out of scope of w_idx range" | ||
| f" {(0, self._width)}" | ||
| ) | ||
| # initialize a numpy zeros array | ||
| array = np.zeros((self._image_height, self._image_width)) | ||
| # set grid cell entry values to the values of the original image array | ||
| # (entry values outside of grid cell remain zeros) | ||
| # An example of computing grid height index range is | ||
| # index * cell height : min((index + 1) * cell height, image_height). | ||
| # It's similar for computing cell width index range | ||
| # Plus, we constrain the maximum pixel locations by the image size as | ||
| # the last grid cell per row and column might be out of the image size | ||
| array[ | ||
| h_idx * self.c_h : min((h_idx + 1) * self.c_h, self._image_height), | ||
| w_idx * self.c_w : min((w_idx + 1) * self.c_w, self._image_width), | ||
| ] = img_arr[ | ||
| h_idx * self.c_h : min((h_idx + 1) * self.c_h, self._image_height), | ||
| w_idx * self.c_w : min((w_idx + 1) * self.c_w, self._image_width), | ||
| ] | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would the behavior related to the residual (incomplete) grids on the edge? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the residual grid cell (index range) will be smaller than other regular ones. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This needed to be explained in the docstring clearly. |
||
| return array | ||
|
|
||
| def get_grid_cell_center(self, h_idx: int, w_idx: int) -> Tuple[int, int]: | ||
| """ | ||
| Get the center pixel position of the grid cell at the specific height | ||
| index and width index in the ``Grid``. | ||
| The computation of the center position of the grid cell is | ||
| dividing the grid cell height range (unit: pixel) and | ||
| width range (unit: pixel) by 2 (round down) | ||
|
|
||
| Suppose an extreme case that a grid cell has a height range | ||
| (unit: pixel) of (0, 3) | ||
| and a width range (unit: pixel) of (0, 3) the grid cell center | ||
| would be (1, 1). | ||
| Since the grid cell size is usually very large, | ||
| the offset of the grid cell center is minor. | ||
|
|
||
| Note: all indices are zero-based and counted from top left corner of | ||
| the grid. | ||
|
|
||
| Args: | ||
| h_idx: the height(row) index of the grid cell in the grid, | ||
| the unit is one grid cell. | ||
| w_idx: the width(column) index of the grid cell in the | ||
| grid, the unit is one grid cell. | ||
|
|
||
| Returns: | ||
| A tuple of (y index, x index) | ||
| """ | ||
|
|
||
| return ( | ||
mylibrar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| (h_idx * self.c_h + min((h_idx + 1) * self.c_h, self._image_height)) | ||
| // 2, | ||
| (w_idx * self.c_w + min((w_idx + 1) * self.c_w, self._image_width)) | ||
| // 2, | ||
| ) | ||
|
|
||
| @property | ||
| def num_grid_cells(self): | ||
| return self._height * self._width | ||
|
|
||
| @property | ||
| def height(self): | ||
| return self._height | ||
|
|
||
| @property | ||
| def width(self): | ||
| return self._width | ||
|
|
||
| def __repr__(self): | ||
| return str( | ||
| (self._height, self._width, self._image_height, self._image_width) | ||
| ) | ||
|
|
||
| def __eq__(self, other): | ||
| if other is None: | ||
| return False | ||
| return ( | ||
| self._height, | ||
| self._width, | ||
| self._image_height, | ||
| self._image_width, | ||
| ) == ( | ||
| other._height, | ||
| other._width, | ||
| other.image_height, | ||
| other.image_width, | ||
| ) | ||
|
|
||
| def __hash__(self): | ||
| return hash( | ||
| (self._height, self._width, self._image_height, self._image_width) | ||
| ) | ||
|
|
||
|
|
||
| GroupType = TypeVar("GroupType", bound=BaseGroup) | ||
| LinkType = TypeVar("LinkType", bound=BaseLink) | ||
Uh oh!
There was an error while loading. Please reload this page.