Skip to content

Conversation

@bretttully
Copy link

@bretttully bretttully commented Dec 9, 2025

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Add validation to detect when a polygon's interior is not simply connected, which violates the OGC Simple Feature Specification requirement that "the interior of every Surface is a connected point set."

What This Detects

The validation detects two types of interior disconnection:

  1. Multi-touch: Two rings sharing 2+ touch points at different coordinates
  2. Cycle: Rings forming a cycle through distinct single-touch coordinates

Algorithm

Uses the GeometryGraph edge intersection data to find where rings touch:

  1. Collect intersections: For each edge intersection, record the coordinate, which ring (edge) it belongs to, and whether it's a vertex of that ring
  2. Sort and group: Sort intersections by (x, y, edge_idx) using total_cmp for consistent ordering, then group by coordinate
  3. Build touch map: For coordinates where 2+ rings meet (and at least one has it as a vertex), record the touch for each ring pair
  4. Detect disconnection:
    • If any ring pair has 2+ touch coordinates → disconnected
    • Build a graph of single-touch ring pairs and DFS for cycles through distinct coordinates

A "touch" requires at least one ring to have the point as an actual vertex (not a mid-segment crossing).

Key Insight

Multiple rings meeting at a single coordinate does NOT disconnect the interior. For example, three triangular holes meeting at one point still have a connected interior around their perimeter. Disconnection only occurs when touches happen at distinct coordinates.

Error Reporting

The error message includes the problematic coordinates to aid debugging:

polygon interior is not simply connected at coordinate(s): (2, 2), (3, 3)

Performance

Uses PreparedGeometry to cache R-tree structures for interior/exterior containment checks. The sorting-based coordinate grouping provides better cache locality than hash-based approaches.

Copy link
Member

@michaelkirk michaelkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! Thank you for the pull request. This would be a great feature to incorporate, but I've got some questions to work through.

let edge = RefCell::borrow(edge);
edge.edge_intersections()
.iter()
.filter_map(|ei| coord_to_bits(&ei.coordinate()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: f64::to_bits

Is this in pursuit of getting the floating points to fit into a HashSet? There's a reason you can't put floats in a HashSet, and it's not clear to me that this usage is valid.

Can you talk more about this? Maybe it's OK, but I've just never used floats this way.

The only problematic examples I can think of off the top of my head are:

  1. NaN coords which have many representations, but we don't make any guarantees for NaN coords anyway, so that's probably fine

  2. -0.0 vs. +0.0 - topologically those are the same number, but they'd appear as distinct entries, which seems like it could confound this algorithm.

I'm not sure if there are any others.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this comment, the algo is now using ordered points via total_cmp -- gave a performance boost too.

RingRole::Exterior,
RingRole::Interior(0)
)]
let errors = polygon.validation_errors();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the assert_validation_errors macro. If you're having trouble I can help, but I'm not sure if the issue is you tried and it didn't work or if this is just what the AI spit out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just what the AI spit out.

I have to say I spent a long time deliberating whether or not to make this PR. I've never written a line of rust before, and I really didn't want to pollute the codebase with AI slop. Nor did I want to waste your time with overly verbose / poor code.

I find the solution readable, and the tests make sense to me (and are validated using shapely), so I hope I'm not too far off track!

geo/CHANGES.md Outdated
## 0.32.0 - 2025-12-05

- Add simply connected interior validation for polygons. Polygons with holes that touch at vertices in ways that disconnect the interior (e.g., two holes sharing 2+ vertices, or cycles of holes each sharing a vertex) are now detected as invalid via `Validation::is_valid()`. This aligns with OGC Simple Features and matches PostGIS behavior.
- Performance: Polygon validation now uses `PreparedGeometry` to cache R-tree structures, improving validation speed by 26-45% for polygons with many holes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PreparedGeometry is just used for this newly added validation right? So there's no world in which people validating before would see a speedup, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a let prepared_exterior = PreparedGeometry::from(&polygon_exterior); for the exterior-interior relate calls that has a nice improvement in the benchmarks for polygons with lots of holes.

bretttully and others added 7 commits December 19, 2025 11:55
These tests document the missing validation check for simply connected
polygon interiors, as noted in polygon.rs:13-14. The OGC Simple Feature
Specification requires that "the interior of every Surface is a
connected point set."

Test cases:
- Two L-shaped holes sharing 2 vertices (disconnects interior)
- Checkerboard level 0: 13 holes sharing vertices at grid points
- Checkerboard level 1: Nested checkerboard with 26 holes
- Valid cases: holes sharing 1 vertex, non-touching holes

The checkerboard pattern demonstrates that a simple "two holes sharing
2+ vertices" check is insufficient - four holes meeting at single
vertices can also disconnect the interior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add validation to detect when a polygon's interior is not simply connected,
which violates the OGC Simple Feature Specification requirement that
"the interior of every Surface is a connected point set."

The validation detects two types of disconnection:
- Multi-touch: Two rings sharing 2+ vertices at different coordinates
- Cycle detection: Rings forming a cycle through distinct single-touch points

Key implementation details:
- Build a touch graph where nodes are rings and edges connect touching rings
- Detect vertex-vertex touches (same coordinate on both rings)
- Detect vertex-on-edge touches (vertex lies on another ring's edge)
- Use DFS to find cycles through distinct coordinates

The error message now includes the problematic coordinates to aid debugging,
e.g., "polygon interior is not simply connected at coordinate(s): (2, 2), (3, 3)"

Tests cover:
- Two L-shaped holes sharing two vertices (invalid)
- Checkerboard patterns with shared vertices (invalid)
- Four holes forming a cycle of single touches (invalid)
- Three holes meeting at one vertex (valid - interior still connected)
- Holes with vertex-on-edge touches to exterior (invalid)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Remove the 1.2MB validate.geojson file and complex GeoJSON parsing
code. Use existing fixtures from geo_test_fixtures instead:
- east_baton_rouge polygon
- nl_zones MultiPolygon
- nl_plots_wgs84 MultiPolygon

This keeps the benchmark coverage while eliminating the custom
fixture and 50+ lines of parsing code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace the custom translate_ring() function with the existing
Translate::translate() method from geo. This avoids duplicating
functionality that already exists in the crate.

Added geo as a dependency to geo-test-fixtures to access the
Translate trait.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace the to_bits() hashing approach with total_cmp-based sorting for
grouping coordinates by equality. This approach:

- Uses GeoFloat::total_cmp for consistent ordering (handles -0.0 == +0.0)
- Groups consecutive equal coordinates after sorting
- Sorts by edge_idx within groups for efficient deduplication
- Maintains O(n log n) complexity with better cache locality

Benchmarks show 5-15% improvement over the previous hashing approach.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Replace manual error checking with the assert_validation_errors! macro
for InteriorNotSimplyConnected tests. Use exact coordinates for the
expected touch points instead of pattern matching.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Consolidate the performance note into the main validation bullet point
to clarify that PreparedGeometry is used for interior/exterior
containment checks, not just the new simply-connected check.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@bretttully bretttully force-pushed the fix-simply-connected-interior-validation branch from 935d0ad to 15efe63 Compare December 19, 2025 01:56
This test documents that iOverlay currently produces polygons with
disconnected interiors when boolean operations result in holes sharing
2+ vertices. The new simply-connected interior validation detects this.

When iOverlay fixes issue georust#27, it should split the result into multiple
valid polygons instead of producing one invalid polygon.

Related:
- iOverlay issue: iShape-Rust/iOverlay#27
- geo issue: georust#1470
- Supersedes PR georust#1471

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants