Get region through array slice notation #1824

Open
manthey opened this issue Feb 13, 2025 · 5 comments

Comments

@manthey
Member

manthey commented Feb 13, 2025

It would be nice to expose data via array slice notation.

That is, source[y1:y2,x1:x2,:] would be the same as source.getRegion(format=TILE_FORMAT_NUMPY, region=dict(left=x1, right=x2, top=y1, bottom=y2)). And, source.shape would return (sizeY, sizeX, bandCount).
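As a rough sketch of the mapping described above, the helper below turns a slice key into the `region` dictionary that `getRegion` would receive. The name `key_to_region` and its arguments are hypothetical, not part of large_image; band slicing and non-unit steps are left out.

```python
def key_to_region(key, size_x, size_y):
    """Translate a (y_slice, x_slice[, band_slice]) key into a region dict.

    Hypothetical helper: size_x and size_y stand in for the source's
    sizeX and sizeY.  The band slice, if present, is ignored here.
    """
    if not isinstance(key, tuple) or len(key) not in (2, 3):
        raise IndexError('expected source[y1:y2, x1:x2] or source[y1:y2, x1:x2, :]')
    y_slice, x_slice = key[0], key[1]
    if not isinstance(y_slice, slice) or not isinstance(x_slice, slice):
        raise IndexError('Y and X must be slices in this sketch')
    # slice.indices resolves None and negative values and clamps the
    # result to the image bounds, as ordinary sequence slicing does.
    top, bottom, y_step = y_slice.indices(size_y)
    left, right, x_step = x_slice.indices(size_x)
    if y_step != 1 or x_step != 1:
        raise IndexError('this sketch only handles unit steps')
    return dict(left=left, right=right, top=top, bottom=bottom)


# What source[10:20, 30:50, :] would request on a 100 x 80 image.
region = key_to_region((slice(10, 20), slice(30, 50), slice(None)), 100, 80)
print(region)
```

A `__getitem__` method could pass this dictionary to `getRegion(format=TILE_FORMAT_NUMPY, region=...)`, and `shape` could be built from the same size values without reading any pixels.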

Further, using a value like source[y1:y2:2, x1:x2:2, :] or source[y1:y2:3, x1:x2:3, :] would automatically use the nearest neighbor at the smallest resolution that is sufficient for distinct pixels (resample would always be false).
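One way to read "the smallest resolution that is sufficient for distinct pixels" is sketched below. It assumes a power-of-two pyramid and uses floor division as the nearest-neighbor choice; both are assumptions, and the function names are hypothetical.

```python
def downsample_for_step(step):
    """Largest power-of-two downsample factor that does not exceed step."""
    if step < 1:
        raise ValueError('step must be a positive integer')
    return 1 << (step.bit_length() - 1)


def sample_positions(start, stop, step):
    """Map each full-resolution sample to a pixel at the reduced level.

    Because the factor never exceeds the step, consecutive samples land
    on different reduced-level pixels.
    """
    factor = downsample_for_step(step)
    return factor, [pos // factor for pos in range(start, stop, step)]


for step in (1, 2, 3, 4, 5):
    print(step, downsample_for_step(step))
```

Under these assumptions, a step of 2 or 3 reads from the half-resolution level and a step of 4 through 7 reads from the quarter-resolution level.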

For explicit magnifications and resolutions, there should be a pair of convenience functions: source.atMagnification(10)[...] and source.atScale(scale, units)[...]. For scale, specifying units via a string of m, mm, or um (or one of the projection values for geospatial) would be handy. The shape property would work on these arrays, too.

For array responses over a certain size, maybe we should consider returning a dask array rather than a numpy array so that it does not exhaust memory (e.g., asking for source[:, :, :] would be bad).
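The size check itself needs only the shape and the item size, so it can run before any pixels are read. A minimal sketch follows; the 1 GiB cutoff and the function names are made up for illustration and are not large_image settings.

```python
import math

# Hypothetical cutoff; the issue does not propose a specific value.
MAX_EAGER_BYTES = 2 ** 30


def estimated_bytes(shape, itemsize=1):
    """Bytes needed for a dense array of this shape."""
    return math.prod(shape) * itemsize


def should_return_lazy(shape, itemsize=1, limit=MAX_EAGER_BYTES):
    """True when the request is large enough to warrant a lazy array."""
    return estimated_bytes(shape, itemsize) > limit


# A full-resolution request such as source[:, :, :] on a large slide.
print(should_return_lazy((100000, 100000, 3)))
# A single small region.
print(should_return_lazy((512, 512, 3)))
```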

To handle multi-axis data, we could accept slices like source[frame,y1:y2,x1:x2,:]. I don't know if we could also slice via named tuples (e.g., source[c=4,t=slice(0, 5),y1:y2,x1:x2,:]).
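On the named-axis question: Python's grammar does not accept keyword arguments inside square brackets (PEP 637, which proposed them, was rejected), so the `c=4` form cannot be parsed. The sketch below checks that and shows one possible alternative, a leading dictionary of named axes. The alternative is only an illustration, and `split_named_axes` is a hypothetical name.

```python
def keyword_subscript_is_valid():
    """Report whether source[c=4, 0:5] parses as an expression."""
    try:
        compile('source[c=4, 0:5]', '<example>', 'eval')
    except SyntaxError:
        return False
    return True


def split_named_axes(key):
    """Separate a leading dict of named axes from the positional part."""
    if not isinstance(key, tuple):
        key = (key,)
    if key and isinstance(key[0], dict):
        return dict(key[0]), key[1:]
    return {}, key


print(keyword_subscript_is_valid())
# Equivalent of source[dict(c=4, t=slice(0, 5)), 0:10, 0:20].
named, positional = split_named_axes(
    (dict(c=4, t=slice(0, 5)), slice(0, 10), slice(0, 20)))
print(named, positional)
```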

We do want to support rolling axes (e.g., if we are natively Z, T, C, Y, X, S, we should be able to use numpy or some other syntax to conceptually make the axes in another order).

@manthey
Member Author

manthey commented Feb 13, 2025

I could see wanting different accessors: source.shape might always be Y, X, S, and source.frame.shape would be F, Y, X, S, and source.full.shape would be all conceptual axes. Or maybe source.axes(axes list)[...] would expose whatever axes order you want.
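For an accessor like source.axes(axes list), the reordering reduces to a permutation of axis indices. A small sketch, assuming axes are identified by single-letter names and that the requested list is a reordering of the native one; `axis_permutation` is a hypothetical helper.

```python
def axis_permutation(native, wanted):
    """Return the index order that rearranges native axes into wanted order.

    The result has the form that numpy.transpose accepts as its axes
    argument.
    """
    native = [axis.upper() for axis in native]
    wanted = [axis.upper() for axis in wanted]
    if sorted(native) != sorted(wanted):
        raise ValueError('wanted axes must be a reordering of the native axes')
    return tuple(native.index(axis) for axis in wanted)


# Native order Z, T, C, Y, X, S viewed as C, Z, T, Y, X, S.
print(axis_permutation(['Z', 'T', 'C', 'Y', 'X', 'S'],
                       ['C', 'Z', 'T', 'Y', 'X', 'S']))
```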

@manthey
Member Author

manthey commented Feb 13, 2025

@cooperlab, if you have additional thoughts on this, please comment.

@manthey
Member Author

manthey commented Feb 13, 2025

Using Python's __getitem__ method, I think that we reject any single-value keys (or should we return Y=, X=[:], S=[:]?). Two-component keys would be Y, X. Three-component keys would be Y, X, S. Four-component keys would be FRAME, Y, X, S. Keys with five or more components would be in the order of the axes followed by Y, X, S.
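The component-count rules above could be sketched as a function that labels each part of the key. `name_key_components` and its `axes` parameter are hypothetical, and single-value keys are rejected here, which is only one of the two options raised.

```python
def name_key_components(key, axes=()):
    """Label the components of a __getitem__ key by position.

    axes lists the source's extra axes in native order; it is only
    consulted for keys with five or more components.
    """
    if not isinstance(key, tuple) or len(key) < 2:
        raise IndexError('single-value keys are rejected in this sketch')
    if len(key) == 2:
        names = ['Y', 'X']
    elif len(key) == 3:
        names = ['Y', 'X', 'S']
    elif len(key) == 4:
        names = ['FRAME', 'Y', 'X', 'S']
    else:
        names = list(axes) + ['Y', 'X', 'S']
        if len(names) != len(key):
            raise IndexError('key length does not match the axes followed by Y, X, S')
    return dict(zip(names, key))


# Equivalent of source[3, 0:100, 0:200, :].
print(name_key_components((3, slice(0, 100), slice(0, 200), slice(None))))
```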

@cooperlab

It is desirable to check the shape without having to actually generate the array. This information is used for simple things like calculating the number of tiles of a given size in each dimension, and we might not end up reading the entire image at that magnification. So something like the following should do the shape arithmetic without actually reading pixels:

source.atMagnification(10).shape or source.atScale(scale, units).shape.
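A minimal sketch of such a lazy shape is shown below. It assumes the view only needs the full-resolution size, the band count, and the native and requested magnifications; the class name, its fields, and the rounding up of partial pixels are all assumptions rather than large_image behavior.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaledView:
    """Hypothetical stand-in for what source.atMagnification(...) might return."""
    size_x: int
    size_y: int
    bands: int
    native_magnification: float
    magnification: float

    @property
    def shape(self):
        # Pure arithmetic on metadata; no pixels are read.
        factor = self.magnification / self.native_magnification
        return (math.ceil(self.size_y * factor),
                math.ceil(self.size_x * factor),
                self.bands)


def tile_counts(shape, tile_size):
    """Number of tiles of a given size along Y and X."""
    return (math.ceil(shape[0] / tile_size), math.ceil(shape[1] / tile_size))


view = ScaledView(size_x=40000, size_y=30000, bands=3,
                  native_magnification=40, magnification=10)
print(view.shape, tile_counts(view.shape, 256))
```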

@amrosado

I made an implementation of this using the tile iterator in simple triton as a base, and it seems equally performant in my tests. Before iterating, I create a dictionary that maps the original image's pixels to output pixels based on the pixel size metadata. If that metadata isn't in the image, I assume pixel dimensions from other similar sources and do a manual rescale, since large_image doesn't seem to scale by mm unless that parameter is defined in the metadata (it would be nice to have this feature in case images don't have the necessary properties). In my use case, keeping track of these conversion values allows mapping various model outputs back to the original image, so some way of accessing the information used in conversions is essential. I created similar chunks using Euclidean distance between tiles because I couldn't use the study interface in my use case. I'm willing to share various elements of that code among our team if there is interest, but will avoid posting them here given that they are from the private simple triton project. I had to make some interesting changes to the tile iterator to get it to work properly in my use case.

3 participants