I think it would be great if zarr-python could automatically pick a smart shard shape and chunk shape for users, based on an array shape and a dtype (i.e., the stuff that we will know if a user is coming in with a numpy array). Good defaults would make a lot of users happy.
Off the top of my head, the following constraints should factor into the automatic shard shape / chunk shape:
- min / max size (in bytes)
- min / max count
- shape constraints. Some examples:
  - chunks must tile the shard perfectly (non-configurable)
  - chunks should have one axis length fixed to a constant, while the other lengths can vary to satisfy other constraints

It might be useful to combine a size constraint on shards with a mixed size / shape constraint on chunks, e.g. "chunks should be ~isotropic, divisible by a power of 2 on each side, inside a shard that is at most 100 MB".
It's also possible that these constraints should be configurable, via the global config or via keyword arguments to array creation.
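To make the idea concrete, here is a rough sketch of one possible heuristic. It is not anything zarr-python does today: the function names `guess_chunk_shape` and `guess_shard_shape` and the byte limits are hypothetical. `itemsize` stands in for the dtype (for a numpy array it would be `arr.dtype.itemsize`). Chunks grow by doubling the shortest axis, which keeps them roughly isotropic with power-of-2 lengths; shards grow by doubling whole-chunk multiples, so chunks always tile the shard.

```python
import math


def guess_chunk_shape(shape, itemsize, max_bytes=1 << 20):
    """Hypothetical heuristic: power-of-2, roughly isotropic chunks of at most max_bytes."""
    chunk = [1] * len(shape)
    while True:
        # Axes that can double without exceeding the array extent.
        growable = [i for i, (c, s) in enumerate(zip(chunk, shape)) if c * 2 <= s]
        if not growable or math.prod(chunk) * 2 * itemsize > max_bytes:
            break
        # Double the currently shortest axis to stay close to isotropic.
        axis = min(growable, key=lambda i: chunk[i])
        chunk[axis] *= 2
    return tuple(chunk)


def guess_shard_shape(chunk, shape, itemsize, max_bytes=100 * 10**6):
    """Hypothetical heuristic: grow the shard in whole-chunk multiples up to max_bytes."""
    shard = list(chunk)
    while True:
        # Stop growing an axis once the shard already covers the array along it.
        growable = [i for i, (sh, s) in enumerate(zip(shard, shape)) if sh < s]
        if not growable or math.prod(shard) * 2 * itemsize > max_bytes:
            break
        axis = min(growable, key=lambda i: shard[i])
        # Doubling keeps each shard length an integer multiple of the chunk length.
        shard[axis] *= 2
    return tuple(shard)


shape, itemsize = (1000, 1000, 1000), 1  # e.g. a uint8 volume
chunk = guess_chunk_shape(shape, itemsize)
shard = guess_shard_shape(chunk, shape, itemsize)
print(chunk)  # (128, 128, 64)
print(shard)  # (512, 512, 256)
```

This only covers the max-size and tiling constraints; min size, min / max count, and a fixed axis length would need extra parameters.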
Any thoughts? @jbms if you have any tensorstore stories to share about this I would be very interested.