Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: replace geojson uploads with filegdb uploads #54

Merged
merged 6 commits into from
Sep 29, 2023
Merged

Feat: replace geojson uploads with filegdb uploads #54

merged 6 commits into from
Sep 29, 2023

Conversation

jacobdadams
Copy link
Member

I discovered a relatively new package called pyogrio that was built for the geopandas team to solve some of the dep problems with fiona/gdal. Turns out the version available on pypi has the new OpenFileGDB driver with write support built in, so now we've got all the pieces to upload gdbs to AGOL w/o arcpy.

GDB uploads are more efficient, and I've not found any documentation on file size limits. This avoids the geojson chunking requirements along with the projection limitations. Resulting operations should be much, much faster for large datasets.

I've waffled now on a couple releases on whether the FeatureServiceUpdater methods should be class methods or regular methods, but now it's making much more sense for them to be regular methods requiring instantiating the class first. This, plus the gdb upload change, plus the REST loader, will bump the version a full major number.

This PR strips out all the geojson upload cruft and replaces it with file gdb uploads. There may still be some residual stuff elsewhere that supported the goejson stuff, but I think I got the main stuff.

@jacobdadams jacobdadams self-assigned this Sep 27, 2023
@jacobdadams jacobdadams removed their assignment Sep 27, 2023
@codecov
Copy link

codecov bot commented Sep 27, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (96fab9b) 91.85% compared to head (32c5ade) 93.28%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #54      +/-   ##
==========================================
+ Coverage   91.85%   93.28%   +1.43%     
==========================================
  Files           7        7              
  Lines        1031     1058      +27     
  Branches      146      143       -3     
==========================================
+ Hits          947      987      +40     
+ Misses         74       62      -12     
+ Partials       10        9       -1     
Files Coverage Δ
src/palletjack/load.py 91.36% <100.00%> (+5.38%) ⬆️
src/palletjack/utils.py 95.04% <100.00%> (+0.19%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@steveoh steveoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perusing the diff and things look great. this will be a nice improvement for palletjack. The only caveat being

WARNING: Pyogrio is still at an early version and the API is subject to substantial change.

src/palletjack/load.py Outdated Show resolved Hide resolved
src/palletjack/load.py Outdated Show resolved Hide resolved
src/palletjack/load.py Outdated Show resolved Hide resolved
src/palletjack/load.py Outdated Show resolved Hide resolved
src/palletjack/load.py Outdated Show resolved Hide resolved
@jacobdadams
Copy link
Member Author

Ok, I think I've addressed and improved everything. I acknowledge the potential API instability, and given that I've run into a project where a single feature exceeds the GeoJSON upload limit, it's something we're going to have to live with and work around. I think the benefits outweigh the upkeep cost.

@jacobdadams jacobdadams merged commit 5041745 into main Sep 29, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants