
WSI reader improvements #218

Open · wants to merge 24 commits into master

Conversation

stergioc (Collaborator)

Added

  • slideio backend for WSI is now available and allows loading, among others, the Olympus .vsi format
  • nn.DataParallel for WSI inference now allows using multiple GPUs and larger batch sizes
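As a rough illustration of the multi-GPU point above (a hedged sketch, not the PR's actual code; `TileClassifier` is a hypothetical stand-in for the real inference model):

```python
import torch
import torch.nn as nn

# Hypothetical tile-level model standing in for the actual WSI inference model.
class TileClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, 2),
        )

    def forward(self, x):
        return self.net(x)

model = TileClassifier()
if torch.cuda.device_count() > 1:
    # nn.DataParallel replicates the model on each GPU and scatters the
    # batch across them, so larger batch sizes become feasible.
    model = nn.DataParallel(model)
model.eval()

with torch.no_grad():
    tiles = torch.randn(16, 3, 64, 64)  # one batch of RGB tiles
    logits = model(tiles)
print(logits.shape)  # torch.Size([16, 2])
```

On a single-GPU or CPU machine the wrapper is skipped and the model runs unchanged, so the same code path works everywhere.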

Changed

  • Removed chunking for the io.wsi reader (leads to a ~10x speedup when reading tiles from the sdata)

@stergioc stergioc changed the title Wsi reader improvements WSI reader improvements Feb 25, 2025
@quentinblampey (Collaborator) left a comment


Nice @stergioc! I added two comments (questions, actually)

@@ -79,6 +79,16 @@ def __init__(self, path: str, tilesize: int = 512):
self._writeable = False
self._erasable = False

def __contains__(self, key: str):
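For context (my own illustration, not from the PR): zarr-style stores implement Python's mapping protocol, and `key in store` dispatches to `__contains__`; without it, membership tests fall back to iteration, which a read-only store may not support efficiently. A minimal sketch with a hypothetical `TileStore`:

```python
class TileStore:
    """Minimal mapping-like store sketch (hypothetical names)."""

    def __init__(self):
        # Keys a zarr-style consumer might probe for: array metadata
        # and a chunk coordinate.
        self._keys = {".zarray", "0.0"}

    def __contains__(self, key: str) -> bool:
        # Answers `key in store` directly, with no iteration fallback.
        return key in self._keys

store = TileStore()
print(".zarray" in store)  # True
print("1.1" in store)      # False
```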

Why is this needed here?

@@ -36,9 +35,9 @@ def wsi(
scale_image = DataArray(
img[key].transpose("S", f"Y{suffix}", f"X{suffix}"),
dims=("c", "y", "x"),
).chunk(chunks)
)
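For context on the hunk above (an illustrative sketch with toy shapes, not sopa code): dropping the `.chunk(chunks)` call determines whether the resulting DataArray is a lazy dask-backed array or a plain in-memory one:

```python
import numpy as np
import xarray as xr

# Toy image; real WSI levels are far larger (illustrative shapes only).
img = xr.DataArray(
    np.zeros((3, 256, 256), dtype=np.uint8),
    dims=("c", "y", "x"),
)

# With .chunk(...): dask-backed, reads are lazy and per-chunk.
chunked = img.chunk({"c": 3, "y": 128, "x": 128})

print(type(img.data).__name__)      # ndarray (in-memory numpy)
print(type(chunked.data).__name__)  # Array (lazy dask array)
```

This is why the reviewer's question matters: downstream code that assumes `.data` is a dask array (e.g. to rechunk or compute per-chunk) behaves differently when the array is plain numpy.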

Do we still have chunks (since you removed .chunk(chunks))? Do we sometimes have images without any chunks?

@stergioc (Collaborator, Author)

While trying to answer your questions, I noticed that even though this branch passes all tests, it fails when used on normal WSIs (not the small ones I used for testing). It seems that the Image2DModel.parse function that follows throws a memory error (CMU-3.svs):

numpy._core._exceptions._ArrayMemoryError: Unable to allocate 67.0 GiB for an array with shape (3, 45402, 66000) and data type int64

I will take a closer look at this, because there is also a ~7-10x difference in reading time between sopa and the native read_region of the different backends.
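A quick back-of-the-envelope check of the 67.0 GiB figure from the traceback above (the shape and int64 dtype come from the error message; the uint8 comparison is my own, since RGB WSI tiles are typically stored as uint8):

```python
# Shape and dtype taken from the reported numpy error.
shape = (3, 45402, 66000)
itemsize_int64 = 8  # bytes per int64 element

n_bytes = itemsize_int64 * shape[0] * shape[1] * shape[2]
gib = n_bytes / 2**30
print(round(gib, 1))      # 67.0 -> matches the error message

# The same array as uint8 would need 1/8 of that.
print(round(gib / 8, 1))  # 8.4
```

The match with the reported 67.0 GiB suggests the image was promoted to int64 somewhere along the way rather than kept as uint8.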
