Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grids Ontology Improvement #827

Closed
wants to merge 57 commits into from
Closed

Grids Ontology Improvement #827

wants to merge 57 commits into from

Conversation

hepengfe
Copy link
Collaborator

@hepengfe hepengfe commented Jun 7, 2022

This PR fixes #817.

Description of changes

  • change Grids -> Grid and move it to core.py as a data structure rather than an ontology
    • dealing edge cases of residual grid cells(last grid cells per row/column). We use the image size as upper bound for these grid cells so they might be smaller than regular grid cells.
  • improve grid initialization in BoundingBox
  • add more properties and functions in Box based on three use cases.
    • A standalone box (not associated with grid)
    • Ground truth bounding box. (known absolute position and shape)
    • Predicted bounding box. (associated with a grid cell and known height/width offsets)
  • add complete docstring with use cases in Box
  • Box condition attributes
    • self._is_grid_associated
    • self._is_box_center_set()

Possible influences of this PR.

Describe what are the possible side-effects of the code change.

Test Conducted

Describe what test cases are included for the PR.

@hepengfe hepengfe added topic: ontology All issues related to ontologies in core, top and base_ontology topic:cv labels Jun 7, 2022
@hepengfe hepengfe self-assigned this Jun 7, 2022
@hepengfe
Copy link
Collaborator Author

hepengfe commented Jun 7, 2022

The implementation design considerations is:

  1. Regular shape CV objects should have a reference point. In our case, Box is an example and BoundingBox is even more special that its reference point needs to be center of grid cell.
  2. Due to the limitation of ontology object __init__, we can not let Grid be the parameter to initialize BoundingBox and associate BoundingBox and Grid.
  3. Based on the considerations above, I initialize both BoundingBox center and Box center (reference point) as the image center and record whether it's a default box center and whether it's associated with a grid. And it will give corresponding Warning/ValueError appropriately.
  4. I provide methods to set box center(reference point) based on user input point location (cy, cx) or Grid.

Several questions about the design:
I think Grid should be a class that is somehow independent from DataPack and specific image payload. As I think Grid is a just an interface that able to provides grid cell location given an image shape. As images in a dataset is basically the same, we might shouldn't create a Grid for every image. But it seems like Grid is bound to DataPack and DataPack contains few images. If we store grid inside DataPack, we might have to duplicate grid for each image.

@hepengfe hepengfe requested review from hunterhector and mylibrar June 7, 2022 23:49
@codecov
Copy link

codecov bot commented Jun 7, 2022

Codecov Report

Merging #827 (d48b693) into master (8e03abe) will increase coverage by 0.05%.
The diff coverage is 80.87%.

@@            Coverage Diff             @@
##           master     #827      +/-   ##
==========================================
+ Coverage   80.75%   80.81%   +0.05%     
==========================================
  Files         254      254              
  Lines       19364    19574     +210     
==========================================
+ Hits        15637    15818     +181     
- Misses       3727     3756      +29     
Impacted Files Coverage Δ
forte/data/data_store.py 93.17% <ø> (ø)
forte/data/entry_converter.py 82.89% <ø> (-0.44%) ⬇️
forte/data/ontology/top.py 76.45% <68.96%> (+0.70%) ⬆️
forte/data/ontology/core.py 76.13% <72.09%> (-0.79%) ⬇️
forte/data/data_pack.py 85.49% <100.00%> (-0.03%) ⬇️
tests/forte/grids_test.py 100.00% <100.00%> (ø)
tests/forte/image_annotation_test.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e03abe...d48b693. Read the comment docs.

@hepengfe hepengfe closed this Jun 8, 2022
@hepengfe hepengfe reopened this Jun 8, 2022
@hepengfe
Copy link
Collaborator Author

hepengfe commented Jun 8, 2022

IMG_F121392AF097-1

Usually, there are many regular ontologies(such as bounding box) and many images but only few grids. The number of grids actually needed only depends on the number of different image sizes and the number of grid configurations.

My proposal is that
Past design: Each datapack stores few images and store few grids. The number of grids in total is dependent the number of images. With a large datasets, there will be many duplicated grids distributed in many DataPack.

New design: we implement a Grid interface such that only a collection of grids (probably tens of them) is needed to query-related data given image and grid-related parameters/regular ontology such as BoundingBox. Grid itself just like other ontology doesn't store image data. In addition, it is not bound to any other regular ontology.

If this design is expected, we might not want it to inherit entry as it will create new ones for each DataPack.

@hunterhector
Copy link
Member

hunterhector commented Jun 8, 2022

Usually, there are many regular ontologies(such as bounding box) and many images but only few grids. The number of grids actually needed only depends on the number of different image sizes and the number of grid configurations.
My proposal is that Past design: Each datapack stores few images and store few grids. The number of grids in total is dependent the number of images. With a large datasets, there will be many duplicated grids distributed in many DataPack.
New design: we implement a Grid interface such that only a collection of grids (probably tens of them) is needed to query-related data given image

Good thoughts on limiting "number of grids", feels like a good idea, though I haven't considered everything about it.

and grid-related parameters/regular ontology such as BoundingBox. Grid itself just like other ontology doesn't store image data. In addition, it is not bound to any other regular ontology.

I have a few suggestions on this detail though. You can already see the problem of making Grid be like other ontology items. First, it cannot be stored as data along with each DataPack (since this will cause duplications). Second, it cannot inherit an Entry. I think this basically means we shouldn't consider Grid inside the ontology framework.

Instead, if you want to achieve this, we probably will make Grid a basic data structure used by the Forte DataPack system only. (i.e. Grid is more like a List, SortedList, TList or Adjacent Array, instead of an Entry). In DataPack, we can create some default Grids and users won't involve in this process, they are the underlying data structures that speed up certain computations only.

Having Grid as a new data structure does not mean that users won't be able to use it, for example, we can imagine supporting Grid as an attribute type for a certain ontology (i.e Grid is not used as an Entry type, but can be used as an attribute type, like TList). Consider the following specification:

{
  "entry_name": "User.Defined.Type",
  "entry_attributes":
   {
       "grid": "forte.data.core.grid",      
   }
}

We can allow users to invoke some functions like computing the position of a particular bounding box with any grids, even the one they defined. However, these should be advanced and rare use cases for now.

Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To continue this PR, let's make Grid a data structure, like Span.

@hepengfe hepengfe marked this pull request as ready for review June 22, 2022 22:43
Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You can store Grid in the data store, but don't store them like other entries, because they are not.
  2. I think you haven't handled (or at least haven't documented) the behavior of the edge cases well. How do we treat the pixels on the side where y > c_h * self._height. Let's make sure we have a common understanding of the behavior, test the edge cases and write the documentation for it

@@ -681,7 +679,7 @@ def _new_grid(
tid: int = self._new_tid() if tid is None else tid
entry: List[Any]

entry = [image_payload_idx, None, tid, type_name]
entry = [None, None, tid, type_name]
entry += self._default_attributes_for_type(type_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a Grid is no longer an entry, why are we still storing them like entries?

@@ -883,7 +881,7 @@ def num_entries(self, entry_type_name: str) -> int:

def _add_entry_raw(
self,
entry_type: Type[Entry],
entry_type: Union[Type[Entry], Type[Grid]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't do this here, we basically don't store them together like other entries anymore.

@@ -1082,7 +1080,6 @@ def add_image_annotation_raw(
def add_grid_raw(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't have add_grid_raw

image_height: int,
image_width: int,
):
if height <= 0 or width <= 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we also check for image_height

self.c_h, self.c_w = (
image_height // self._height,
image_width // self._width,
) # compute the height and width of grid cells
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We round up the division here, but what's the implication? Does it mean some grid cells are larger? Or do we have residual values? Which ones are larger and what's the behavior? We need full documentation of these

] = img_arr[
h_idx * self.c_h : (h_idx + 1) * self.c_h,
w_idx * self.c_w : (w_idx + 1) * self.c_w,
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the behavior related to the residual (incomplete) grids on the edge?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the residual grid cell (index range) will be smaller than other regular ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needed to be explained in the docstring clearly.

@@ -1010,40 +876,89 @@ class Box(Region):
pack: the container that this ``Box`` will be added to.
image_payload_idx: the index of the image payload. If it's not set,
it defaults to 0 which meaning it will load the first image payload.
height: the height of the box, the unit is one image array entry.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users have no idea what "one image array entry" means

cy: the row index of the box center in the image array,
the unit is one image array entry.
the unit is one image array entry. If not set, the box center
will be set to center of image (half height of the image).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really want to provide default values here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea I was thinking is that regular shape makes sense only when it has a reference point (box center here).
I am thinking two layers of conditions

  1. Whether it has a non-default reference point (maybe we don't need to set them to an actual value here).
  2. Whether it's associated with a grid. (It might or might not depending on the box use cases)

Copy link
Collaborator

@mylibrar mylibrar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we will need a comprehensive example to showcase the power of Grid (maybe in another PR).

grid cell center.
3. When we predict a box, we will have the predicted box shape (height,
width) and the offset between the box center and the grid cell center,
then we can compute the box center.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a standalone Box also for ground truth box or predicted box? Do we assume that the input parameters to a predicted box will be the box shape and gird center offset?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. In the first use case, only cy and cx are required to be set.

@mylibrar mylibrar requested a review from hunterhector July 1, 2022 21:12
Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing about this PR is the documentation. It is a bit hard for me to understand the purpose of classes and functions. We need to create a simple tutorial to explain how these things work (and also include some examples in the docstring itself).

@mylibrar I also haven't checked the implementation details, please check that part.

@@ -172,7 +170,9 @@ def __init__(self, pack_name: Optional[str] = None):
self._data_store: DataStore = DataStore()
self._entry_converter: EntryConverter = EntryConverter()
self.image_annotations: List[ImageAnnotation] = []
self.grids: List[Grids] = []
self.payloads: List[np.ndarray] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an unused variable. And more, this make a new assumption that we store a list of payload in the format as np.ndarray. Let's not do this now (at least not in this PR).

self.grids: List[Grids] = []
self.payloads: List[np.ndarray] = []

self.grids: List[Grid] = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also not used in this file.

"""
Regular grid with a grid configuration dependent on the image size.
It is a data structure used to retrieve grid-related objects such as grid
cells from the image. Grid itself doesn't store any data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the core functionality of a Grid is used to index the image area so that we can easily compute the overlap of any two areas. Let's don't forget this.

Btw, is there any use case you find that we need to retrieve the image inside the grid cell?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think bounding box prediction is usually based on only image on inside a grid cell (anchor box) in YOLO algorithm.

Copy link
Member

@hunterhector hunterhector Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Is it a convention or just a choice of a particular library (YOLO)?
  2. you will need to explain these things to the users. before that, can you make sure you explain this to the team so people understand what you are doing?

we compute the height and the width of grid cells.
For example, if the image size (image_height,image_width) is (640, 480)
and the grid shape (height, width) is (2, 3)
the size of grid cells (self.c_h, self.c_w) will be (320, 160).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you might want to discuss how we handle the cases where the numbers are indivisible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed

The grid can be totally "free-form" that we don't initialize it with any
image size and pass the image size directly into the method/operation on
the fly. However, since the number of different image shapes are limited,
we can bound one grid to one image size, and we can also do the image/grid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is still confusing, try to explain this to someone who never works on this and see if they understand. For example, why does one need to bound the grid to the image size? What are "size check"?

@property
def cy(self) -> int:
"""
Compute and return row index of the box center in the image array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

units?

return (self._cy, self._cx)
def box_center(self) -> Tuple[int, int]:
"""
Get the box center by using the property function cy() and cx().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't write implementation details in function docstring, users often won't care (and might get confused).

"""
Get corners of box.
Compute and return the corners of the box.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a corner? Which corners? Shape of the tuple?


@property
def area(self):
def area(self) -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like area should also exist in the upper classes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True if the offset is set, False otherwise.
"""
return self._cy_offset is not None and self._cx_offset is not None


class BoundingBox(Box):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what differences would a boundingbox have comparing to a box?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally I don't think a box need be associated with a grid but it can be associated with grid anyway. A bounding box are much more often associated with a grid due to the CV algorithm.

@hepengfe hepengfe mentioned this pull request Jul 8, 2022
@hepengfe
Copy link
Collaborator Author

Closed as created a new PR #876
Also, Box is kept as it is in master branch(no changes made).

@hepengfe hepengfe closed this Jul 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic:cv topic: ontology All issues related to ontologies in core, top and base_ontology
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New Grid Ontology Design
3 participants