@Sullivan-Patrick
Contributor

Description

Changes JTS polygon serde to:
- Always allow serialization/deserialization of single Polygons, accepting zero-area rings.
- Always fail on encountering zero-area rings in MultiPolygon serde.

See facebookincubator/velox#15558 for the parallel change in C++, which gives the context and motivation.

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

@Sullivan-Patrick requested a review from a team as a code owner on November 20, 2025 23:51
@sourcery-ai
Contributor

sourcery-ai bot commented Nov 20, 2025

Reviewer's Guide

Enhanced JTS polygon serialization/deserialization to properly handle zero-area rings and enforce canonical orientation, with MultiPolygon serde now rejecting zero-area rings and extensive tests added for degenerate polygons.

Sequence diagram for MultiPolygon deserialization with zero-area ring rejection

sequenceDiagram
    participant "readPolygon()"
    participant "isClockwise()"
    participant "PrestoException"
    "readPolygon()"->>"isClockwise()": Check ring orientation
    "isClockwise()"-->>"readPolygon()": Return ZERO_AREA
    "readPolygon()"->>"PrestoException": Throw error for zero-area ring
    "PrestoException"-->>"readPolygon()": Exception propagates

Sequence diagram for MultiPolygon serialization with zero-area ring rejection

sequenceDiagram
    participant "writePolygon()"
    participant "canonicalizePolygonCoordinates()"
    participant "PrestoException"
    "writePolygon()"->>"canonicalizePolygonCoordinates()": Canonicalize rings
    "canonicalizePolygonCoordinates()"-->>"writePolygon()": Return true if zero-area ring found
    "writePolygon()"->>"PrestoException": Throw error for zero-area ring
    "PrestoException"-->>"writePolygon()": Exception propagates

Class diagram for updated JtsGeometrySerde polygon serde logic

classDiagram
    class JtsGeometrySerde {
        +readPolygon(SliceInput input, boolean multitype) Geometry
        +writePolygon(Geometry geometry, SliceOutput output, boolean multitype)
        +canonicalizePolygonCoordinates(Coordinate[] coordinates, int[] partIndexes, boolean[] shellPart) boolean
        +reverse(Coordinate[] coordinates, int start, int end)
        +isClockwise(Coordinate[] coordinates) ClockwiseResult
        +isClockwise(Coordinate[] coordinates, int start, int end) ClockwiseResult
    }
    class ClockwiseResult {
        CW
        CCW
        ZERO_AREA
    }
    JtsGeometrySerde --> ClockwiseResult

Class diagram for ClockwiseResult enum

classDiagram
    class ClockwiseResult {
        <<enum>>
        CW
        CCW
        ZERO_AREA
    }
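
For readers unfamiliar with a three-way orientation result, here is a minimal sketch of how a ring could be classified with the shoelace formula. It is not the code from JtsGeometrySerde: the class name RingOrientationSketch, the classify method, the use of plain double arrays instead of JTS Coordinate objects, and the exact comparison against zero are all assumptions made for illustration.

```java
// Illustrative sketch only; names and signatures are hypothetical.
final class RingOrientationSketch {
    enum ClockwiseResult { CW, CCW, ZERO_AREA }

    // Classifies a ring given as parallel x/y arrays.
    // The ring may or may not repeat its first point at the end;
    // the wrap-around term is zero when it does.
    static ClockwiseResult classify(double[] xs, double[] ys) {
        double twiceSignedArea = 0.0;
        int n = xs.length;
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n;
            // Shoelace term: cross product of consecutive vertices.
            twiceSignedArea += xs[i] * ys[j] - xs[j] * ys[i];
        }
        if (twiceSignedArea == 0.0) {
            // Assumption: an exact zero marks a degenerate ring.
            return ClockwiseResult.ZERO_AREA;
        }
        // With the y axis pointing up, a positive signed area means counter-clockwise.
        return twiceSignedArea > 0 ? ClockwiseResult.CCW : ClockwiseResult.CW;
    }
}
```

Under these assumptions, the ring (1 1, 2 1, 2 2, 1 1) classifies as CCW, while the collinear ring (5 10, 25 30, 15 20, 5 10) used in the tests classifies as ZERO_AREA. A boolean return value could not express the third case, which is the reason the guide gives for introducing the enum.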

File-Level Changes

Change: Enhance polygon serde to detect zero-area rings and enforce orientation with errors for MultiPolygon
Details:
  • Replace boolean isClockwise method with ClockwiseResult enum to distinguish CW, CCW, and zero-area
  • Update deserialization to throw on zero-area rings in MultiPolygon and correctly segment polygons by orientation
  • Modify serialization to canonicalize coordinates, capture zero-area rings, and error for MultiPolygon
  • Refactor canonicalizePolygonCoordinates to return a zero-area flag and reverse rings based on shell/hole orientation
Files: presto-geospatial-toolkit/src/main/java/com/facebook/presto/geospatial/serde/JtsGeometrySerde.java

Change: Add comprehensive tests for degenerate polygon serde behavior
Details:
  • Introduce testDegeneratePolygons covering orientation normalization for single Polygon and MultiPolygon
  • Add cases allowing zero-area rings in single Polygon serde and expecting errors in MultiPolygon serde
  • Implement helper methods for validating correct output and expected failures
Files: presto-main-base/src/test/java/com/facebook/presto/geospatial/TestGeoFunctions.java

Change: Update expected error message for invalid polygon self-intersection test
Details:
  • Adjust assertion in testGeometryInvalidReason to expect 'Self-intersection' instead of previous message
Files: presto-main-base/src/test/java/com/facebook/presto/geospatial/TestGeoFunctions.java

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • Extract the zero-area ring detection and error handling in readPolygon and writePolygon into a shared helper to reduce duplication and consolidate behavior.
  • The testDegeneratePolygons method has grown very large—consider breaking it into smaller parameterized tests or grouping similar cases to improve readability and maintenance.
  • Resolve or clarify the TODO in readPolygon for distinguishing internal versus user errors to ensure consistent error classification and avoid leaking implementation details.
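
As a rough illustration of the first suggestion, the shared check could look something like the sketch below. The helper name checkRingAllowed, the error text, and the use of IllegalArgumentException as a stand-in for PrestoException are all hypothetical; the actual error code and message belong to the PR author.

```java
// Illustrative sketch only; not taken from the PR.
final class ZeroAreaCheckSketch {
    enum ClockwiseResult { CW, CCW, ZERO_AREA }

    // Hypothetical shared helper that both readPolygon and writePolygon could call.
    // IllegalArgumentException stands in for PrestoException so the sketch is self-contained.
    static void checkRingAllowed(ClockwiseResult result, boolean multitype) {
        if (multitype && result == ClockwiseResult.ZERO_AREA) {
            // Placeholder message; the real wording is up to the author.
            throw new IllegalArgumentException("Zero-area ring found in MultiPolygon");
        }
        // Single Polygons accept zero-area rings, so nothing to do otherwise.
    }
}
```

Keeping the rule in one place would make the read and write paths fail the same way, and it would give the TODO about internal versus user errors a single spot to be resolved.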
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Extract the zero-area ring detection and error handling in readPolygon and writePolygon into a shared helper to reduce duplication and consolidate behavior.
- The testDegeneratePolygons method has grown very large—consider breaking it into smaller parameterized tests or grouping similar cases to improve readability and maintenance.
- Resolve or clarify the TODO in readPolygon for distinguishing internal versus user errors to ensure consistent error classification and avoid leaking implementation details.

## Individual Comments

### Comment 1
<location> `presto-geospatial-toolkit/src/main/java/com/facebook/presto/geospatial/serde/JtsGeometrySerde.java:506` </location>
<code_context>
+        }

-        if ((isShell && !isClockwise) || (!isShell && isClockwise)) {
+        if ((isShell && clockwiseResult == ClockwiseResult.CCW) || (!isShell && clockwiseResult == ClockwiseResult.CW)) {
             // shell has to be counter clockwise
             reverse(coordinates, start, end);
</code_context>

<issue_to_address>
**nitpick:** The comment above this line incorrectly states the required orientation.

Update the comment to state that shells must be clockwise and holes must be counter-clockwise, matching the code's logic.
</issue_to_address>

### Comment 2
<location> `presto-main-base/src/test/java/com/facebook/presto/geospatial/TestGeoFunctions.java:1433-1444` </location>
<code_context>
+        // Second polygon is zero area
+        testDegeneratePolygonsFuncInvalid("MULTIPOLYGON (((1 1, 2 1, 2 2, 1 1)), ((1 1, 2 2, 3 3, 2 2, 1 1)))");
+
+        // Single polygon with zero area
+        testDegeneratePolygonsFuncInvalid(
+                        "MULTIPOLYGON (((5 10, 25 30, 15 20, 5 10)))");
+        testDegeneratePolygonsFuncInvalid(
</code_context>

<issue_to_address>
**suggestion (testing):** Missing test for single Polygon with zero-area ring (should succeed).

Please add a test for a single Polygon with a zero-area ring to verify that it passes, as required by the PR.

```suggestion
        // MultiPolygons with zero-area rings. These need to fail because our
        // serialization format holds MultiPolygons as single vectors that rely on
        // orientation for determining shell start points.

        // Second polygon is zero area
        testDegeneratePolygonsFuncInvalid("MULTIPOLYGON (((1 1, 2 1, 2 2, 1 1)), ((1 1, 2 2, 3 3, 2 2, 1 1)))");

        // Single polygon with zero area
        testDegeneratePolygonsFuncInvalid(
                        "MULTIPOLYGON (((5 10, 25 30, 15 20, 5 10)))");
        testDegeneratePolygonsFuncInvalid(
                "MULTIPOLYGON (((1 1, 1 2, 1 3, 1 1)))");

        // Single Polygon with a zero-area ring (should succeed)
        testDegeneratePolygonsFuncValid("POLYGON ((1 1, 2 2, 3 3, 1 1))");
```
</issue_to_address>

+        }

-        if ((isShell && !isClockwise) || (!isShell && isClockwise)) {
+        if ((isShell && clockwiseResult == ClockwiseResult.CCW) || (!isShell && clockwiseResult == ClockwiseResult.CW)) {

nitpick: The comment above this line incorrectly states the required orientation.

Update the comment to state that shells must be clockwise and holes must be counter-clockwise, matching the code's logic.
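
To make the orientation rule concrete, here is a minimal sketch of a canonicalization step that follows the condition in the diff: a counter-clockwise shell or a clockwise hole is reversed, so shells end up clockwise and holes counter-clockwise. The class name, the double[][] representation, and the single-ring signature are assumptions for illustration; the real canonicalizePolygonCoordinates works on Coordinate arrays with part indexes.

```java
// Illustrative sketch only; signatures differ from JtsGeometrySerde.
final class CanonicalizeRingSketch {
    // Twice the signed area of the ring stored in coords[start, end).
    // Assumes the ring is closed (its first point is repeated as its last).
    static double twiceSignedArea(double[][] coords, int start, int end) {
        double sum = 0.0;
        for (int i = start; i < end - 1; i++) {
            sum += coords[i][0] * coords[i + 1][1] - coords[i + 1][0] * coords[i][1];
        }
        return sum;
    }

    // Reverses coords[start, end) in place.
    static void reverse(double[][] coords, int start, int end) {
        for (int i = start, j = end - 1; i < j; i++, j--) {
            double[] tmp = coords[i];
            coords[i] = coords[j];
            coords[j] = tmp;
        }
    }

    // Returns true if the ring has zero area; otherwise orients it and returns false.
    static boolean canonicalize(double[][] coords, int start, int end, boolean isShell) {
        double area = twiceSignedArea(coords, start, end);
        if (area == 0.0) {
            return true; // orientation is undefined, leave the ring untouched
        }
        boolean counterClockwise = area > 0;
        if ((isShell && counterClockwise) || (!isShell && !counterClockwise)) {
            reverse(coords, start, end); // shells become clockwise, holes counter-clockwise
        }
        return false;
    }
}
```

In this sketch the boolean return value plays the role of the zero-area flag mentioned in the Reviewer's Guide, leaving the caller to decide whether a degenerate ring is an error.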

Comment on lines +1433 to +1444
// MultiPolygons with zero-area rings. These need to fail because our
// serialization format holds MultiPolygons as single vectors that rely on
// orientation for determining shell start points.

// Second polygon is zero area
testDegeneratePolygonsFuncInvalid("MULTIPOLYGON (((1 1, 2 1, 2 2, 1 1)), ((1 1, 2 2, 3 3, 2 2, 1 1)))");

// Single polygon with zero area
testDegeneratePolygonsFuncInvalid(
"MULTIPOLYGON (((5 10, 25 30, 15 20, 5 10)))");
testDegeneratePolygonsFuncInvalid(
"MULTIPOLYGON (((1 1, 1 2, 1 3, 1 1)))");
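
The code comment above says the format relies on orientation to find shell start points. A minimal sketch of that idea, assuming rings are stored back to back and that a clockwise ring starts a new polygon while a counter-clockwise ring is a hole of the current one, shows why a zero-area ring cannot be placed. The class and method names here are hypothetical, and IllegalArgumentException stands in for PrestoException.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the deserialization code from the PR.
final class ShellSegmentationSketch {
    enum ClockwiseResult { CW, CCW, ZERO_AREA }

    // Groups ring indexes into polygons; the first index in each group is the shell.
    static List<List<Integer>> groupRings(ClockwiseResult[] orientations) {
        List<List<Integer>> polygons = new ArrayList<>();
        for (int ring = 0; ring < orientations.length; ring++) {
            switch (orientations[ring]) {
                case CW:
                    // Assumption: a clockwise ring is a shell and opens a new polygon.
                    List<Integer> polygon = new ArrayList<>();
                    polygon.add(ring);
                    polygons.add(polygon);
                    break;
                case CCW:
                    if (polygons.isEmpty()) {
                        throw new IllegalArgumentException("Hole at ring " + ring + " has no preceding shell");
                    }
                    polygons.get(polygons.size() - 1).add(ring);
                    break;
                default:
                    // A zero-area ring has no orientation, so it is neither a shell nor a hole.
                    throw new IllegalArgumentException("Ring " + ring + " has zero area");
            }
        }
        return polygons;
    }
}
```

A single Polygon does not need this grouping step, since its first ring is the shell regardless of orientation. That is consistent with the PR allowing zero-area rings there while rejecting them in MultiPolygons.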

suggestion (testing): Missing test for single Polygon with zero-area ring (should succeed).

Please add a test for a single Polygon with a zero-area ring to verify that it passes, as required by the PR.

Suggested change
- // MultiPolygons with zero-area rings. These need to fail because our
- // serialization format holds MultiPolygons as single vectors that rely on
- // orientation for determining shell start points.
- // Second polygon is zero area
- testDegeneratePolygonsFuncInvalid("MULTIPOLYGON (((1 1, 2 1, 2 2, 1 1)), ((1 1, 2 2, 3 3, 2 2, 1 1)))");
- // Single polygon with zero area
- testDegeneratePolygonsFuncInvalid(
- "MULTIPOLYGON (((5 10, 25 30, 15 20, 5 10)))");
- testDegeneratePolygonsFuncInvalid(
- "MULTIPOLYGON (((1 1, 1 2, 1 3, 1 1)))");
+ // MultiPolygons with zero-area rings. These need to fail because our
+ // serialization format holds MultiPolygons as single vectors that rely on
+ // orientation for determining shell start points.
+ // Second polygon is zero area
+ testDegeneratePolygonsFuncInvalid("MULTIPOLYGON (((1 1, 2 1, 2 2, 1 1)), ((1 1, 2 2, 3 3, 2 2, 1 1)))");
+ // Single polygon with zero area
+ testDegeneratePolygonsFuncInvalid(
+ "MULTIPOLYGON (((5 10, 25 30, 15 20, 5 10)))");
+ testDegeneratePolygonsFuncInvalid(
+ "MULTIPOLYGON (((1 1, 1 2, 1 3, 1 1)))");
+ // Single Polygon with a zero-area ring (should succeed)
+ testDegeneratePolygonsFuncValid("POLYGON ((1 1, 2 2, 3 3, 1 1))");
