[WIP] Multiprocessing interpolation and reprojection #661
base: main
Hi @vschaffn @adebardo, nice to see the progress! Summary: Upon reading the code in Why? Moving forward: geoutils/geoutils/raster/delayed.py Line 783 in b4da49b
So all the projected grid matching calculations to map the input chunks to output chunks written above that line (using the GeoGrid classes I mentioned) can simply stay the same 😉
Also note that Rasterio has reprojection errors that can appear (especially when comparing a tiled reprojection to a reprojection on the full raster), due to internal inconsistencies...
To clarify one more remark: In Due to shape deformations, the mapping of input-output tiles between any CRS requires more than an overlap value, and also depends on the input/output chunksizes (what the You can essentially turn this list comprehension here into your loop for geoutils/geoutils/raster/delayed.py Line 784 in b4da49b
And let reproject_block stick the pieces of input tiles in the right places (you might not have a perfect square with all the tiles you opened), and give you the output: geoutils/geoutils/raster/delayed.py Line 644 in b4da49b
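To illustrate why an overlap value alone is not enough, here is a minimal sketch of mapping one output tile to the input tiles it needs. It is not the code in delayed.py: tiles are plain `(left, bottom, right, top)` tuples, and `to_source` is a hypothetical callable standing in for the output-to-input CRS transform. The output tile's edges are densified before being transformed, because a deformed tile is no longer a rectangle in the input CRS.

```python
import math
from typing import Callable, List, Sequence, Tuple

Bounds = Tuple[float, float, float, float]  # (left, bottom, right, top)
PointMap = Callable[[float, float], Tuple[float, float]]


def densified_edge_points(bounds: Bounds, n: int = 10) -> List[Tuple[float, float]]:
    """Sample points along the four edges of a rectangle (n segments per edge)."""
    left, bottom, right, top = bounds
    pts: List[Tuple[float, float]] = []
    for i in range(n + 1):
        t = i / n
        x = left + t * (right - left)
        y = bottom + t * (top - bottom)
        pts += [(x, bottom), (x, top), (left, y), (right, y)]
    return pts


def footprint_in_source(out_bounds: Bounds, to_source: PointMap, n: int = 10) -> Bounds:
    """Bounding box, in the input CRS, of an output tile's transformed edges."""
    xs, ys = zip(*(to_source(x, y) for x, y in densified_edge_points(out_bounds, n)))
    return min(xs), min(ys), max(xs), max(ys)


def intersects(a: Bounds, b: Bounds) -> bool:
    """True if two rectangles share a non-empty interior."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def input_tiles_for_output_tile(
    out_bounds: Bounds, in_tiles: Sequence[Bounds], to_source: PointMap
) -> List[int]:
    """Indices of the input tiles that one output tile has to read from."""
    footprint = footprint_in_source(out_bounds, to_source)
    return [i for i, tile in enumerate(in_tiles) if intersects(footprint, tile)]


def identity(x: float, y: float) -> Tuple[float, float]:
    return x, y


def rotate45(x: float, y: float) -> Tuple[float, float]:
    """Toy stand-in for a CRS change: 45 degree rotation about (10, 10)."""
    c = s = math.sqrt(0.5)
    dx, dy = x - 10.0, y - 10.0
    return 10.0 + c * dx - s * dy, 10.0 + s * dx + c * dy


# A 2 x 2 grid of 10 x 10 input tiles covering [0, 20] x [0, 20].
in_tiles = [(0, 0, 10, 10), (10, 0, 20, 10), (0, 10, 10, 20), (10, 10, 20, 20)]

same_grid = input_tiles_for_output_tile((1, 1, 9, 9), in_tiles, identity)
deformed = input_tiles_for_output_tile((5, 5, 15, 15), in_tiles, rotate45)
```

With the identity mapping, the output tile needs a single input tile. With the rotation, an output tile of the same order of size needs all four, and which ones it needs depends on the input and output chunk sizes rather than on a fixed overlap.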
I should have thought about this last week (I was mostly off work and only popped for the comment above, so I didn't think about the big picture, sorry!): It is only
Resolves #648.
Description
This PR introduces raster interpolation and reprojection using multiprocessing to optimize memory usage and improve performance.
Code changes
multiproc_interp_points
leverages multiprocessing to create a separate process for each point to be interpolated. For each point, the function calculates the smallest raster window that the interpolation method needs around the point coordinates and opens that window with the raster.crop method. The raster.interpolate method is then applied to the cropped window, so interpolation runs without loading the full raster into memory.
multiproc_reproject
utilizes raster tiling (see Implement Tiling for Multiprocessing #652) to perform multiprocessing-based reprojection. A separate process is created for each tile of the raster. Each process opens the corresponding tile using the raster.icrop method, calculates the bounds of the reprojected tile, snaps the bounds to the target reprojected grid, and reprojects the tile using the raster.reproject method. The reprojected tiles are written separately to disk, with safeguards in place to prevent data overwriting.
Tests
Tests are added for multiproc_interp_points and multiproc_reproject. The test results are compared against the behavior of the raster.interpolate and raster.reproject methods to ensure consistency.
Note:
Currently, there are some tile alignment issues when performing reprojection operations that involve more than just translation. Further investigation is required to address these challenges.
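For reference, the bounds-snapping step listed under Code changes could look roughly like the sketch below. This is not the PR's implementation: the function names are hypothetical, and it assumes a north-up target grid with square pixels whose origin is the upper-left corner. Snapping outward keeps every reprojected tile on the same pixel lattice, which is necessary for alignment but, as noted above, not sufficient once the reprojection involves more than a translation.

```python
import math
from typing import Tuple

Bounds = Tuple[float, float, float, float]  # (left, bottom, right, top)


def snap_bounds_to_grid(bounds: Bounds, origin_x: float, origin_y: float, res: float) -> Bounds:
    """Expand bounds outward to the nearest pixel edges of a north-up grid.

    (origin_x, origin_y) is the upper-left corner of the target grid and
    res is the pixel size. All names here are illustrative assumptions.
    """
    left, bottom, right, top = bounds
    snapped_left = origin_x + math.floor((left - origin_x) / res) * res
    snapped_right = origin_x + math.ceil((right - origin_x) / res) * res
    # Rows are counted downward from the upper-left origin.
    snapped_top = origin_y - math.floor((origin_y - top) / res) * res
    snapped_bottom = origin_y - math.ceil((origin_y - bottom) / res) * res
    return snapped_left, snapped_bottom, snapped_right, snapped_top


def grid_shape(bounds: Bounds, res: float) -> Tuple[int, int]:
    """Number of (rows, cols) covered by bounds that are already snapped."""
    left, bottom, right, top = bounds
    return round((top - bottom) / res), round((right - left) / res)


# Target grid: upper-left corner at (0, 100), 10-unit pixels.
tile_bounds = (12.5, 33.0, 47.2, 71.9)
snapped = snap_bounds_to_grid(tile_bounds, origin_x=0.0, origin_y=100.0, res=10.0)
shape = grid_shape(snapped, res=10.0)
```

Here `snapped` is `(10.0, 30.0, 50.0, 80.0)` and `shape` is `(5, 4)`, so the tile can be written into the output raster at whole-pixel offsets.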
Difference between tiled reprojection and classic reprojection (exploradores_aster_dem):