At the moment, our dataset plotting routines are inconsistent. While some plot methods stretch to the range of the image, others simply divide by a fixed constant (3,000 or 10,000) and clip to get images to the range of 0–1. I propose we convert the latter to the former and consistently stretch images for all datasets.
Rationale
While technically correct, many of our plotting methods make it difficult to visualize images. This is especially true for datamodules, where normalization has been applied to all images and images are no longer in the uint8 or float32 range of the original data.
Implementation
I propose we use one of the default visualization options used by QGIS:
Clip to 2% to 98% range (exact percentages TBD)
Clip to min/max
Clip to mean ± 2 std dev (exact std dev TBD)
We already have a torchgeo.datasets.utils.percentile_normalization method we could use or modify for this purpose.
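For illustration, here is a minimal sketch of the three stretches listed above, assuming images are NumPy arrays. The helper names `percentile_stretch` and `std_stretch` and their parameters are hypothetical and are not the existing torchgeo function; the 2/98 and 2 std dev defaults are placeholders, since the exact values are still TBD.

```python
import numpy as np


def percentile_stretch(img, lower=2.0, upper=98.0):
    """Clip img to the [lower, upper] percentiles and rescale to [0, 1].

    Hypothetical helper for illustration only.
    """
    img = np.asarray(img, dtype=np.float64)
    lo, hi = np.percentile(img, [lower, upper])
    if hi <= lo:
        # Constant image: nothing to stretch, avoid division by zero.
        return np.zeros_like(img)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


def std_stretch(img, num_std=2.0):
    """Clip img to mean +/- num_std standard deviations and rescale to [0, 1].

    Hypothetical helper for illustration only.
    """
    img = np.asarray(img, dtype=np.float64)
    mean, std = img.mean(), img.std()
    if std == 0:
        return np.zeros_like(img)
    lo, hi = mean - num_std * std, mean + num_std * std
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


# The min/max option is the percentile stretch with the 0 and 100 percentiles.
img = np.arange(101.0)
stretched = percentile_stretch(img)          # 2% to 98%
minmax = percentile_stretch(img, 0.0, 100.0)  # min/max
```

All three variants return values in the 0–1 range regardless of the input scale, which is what makes them usable on both raw and normalized images.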
Alternatives
We could apply an inverse normalization transform during datamodule plotting. This would fix datamodule plotting, but would still leave dataset plotting inconsistent. It would only take a couple of lines of code, though, which would make it much easier to implement.
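As a rough sketch of what this alternative could look like, assuming the datamodule normalized each band as `(x - mean) / std` and that images are channels-first `(C, H, W)` arrays; the function name `denormalize` is hypothetical:

```python
import numpy as np


def denormalize(img, mean, std):
    """Invert a per-channel (x - mean) / std normalization.

    Hypothetical helper: img is (C, H, W); mean and std have length C.
    """
    img = np.asarray(img, dtype=np.float64)
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return img * std + mean


# Round trip: normalize, then recover the original values.
raw = np.arange(24.0).reshape(2, 3, 4)
mean = np.array([1.0, 2.0])
std = np.array([2.0, 4.0])
normalized = (raw - mean[:, None, None]) / std[:, None, None]
restored = denormalize(normalized, mean, std)
```

This restores the original value range but does not by itself map images to 0–1, so each plot method would still need its own scaling.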
Additional information
@calebrob6 we discussed this on Slack or somewhere.
Note that this contradicts #496, so we should decide on one approach or another.
Before #476, we basically had this functionality for most of our GeoDatasets already. RasterDataset.plot would normalize all images to the 2% to 98% range using self.rgb_bands and self.all_bands, and would only need to be overridden for non-image datasets. We should consider bringing this back to save us time.
In my own plotting functions I have been applying percentile_normalization to 'undo' the effects of normalization, but the result is not always that great. I think an un-normalize step followed by percentile_normalization with dataset-specific percentiles would make sense; min/max can then be achieved with the 0 and 100 percentiles. Defaulting to 2 and 98 generally works pretty well, and if QGIS does this it is probably pretty sensible.
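A minimal sketch of that two-step idea, assuming per-channel `(x - mean) / std` normalization and channels-first NumPy arrays. The name `to_displayable`, its parameters, and the example statistics are all hypothetical; the stretch is written inline rather than calling the torchgeo function.

```python
import numpy as np


def to_displayable(img, mean, std, lower=2.0, upper=98.0):
    """Un-normalize a (C, H, W) image, then percentile-stretch it to [0, 1].

    Hypothetical helper; returns an (H, W, C) array suitable for imshow.
    """
    img = np.asarray(img, dtype=np.float64)
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)

    # Step 1: undo the per-channel normalization.
    raw = img * std + mean

    # Step 2: stretch using percentiles computed over the whole image.
    lo, hi = np.percentile(raw, [lower, upper])
    if hi <= lo:
        stretched = np.zeros_like(raw)
    else:
        stretched = np.clip((raw - lo) / (hi - lo), 0.0, 1.0)

    # Channels-last layout for plotting.
    return np.transpose(stretched, (1, 2, 0))


rng = np.random.default_rng(0)
normalized = rng.normal(size=(3, 8, 8))
# Made-up per-band statistics, for illustration only.
band_mean = [1000.0, 1200.0, 900.0]
band_std = [300.0, 250.0, 400.0]

default_view = to_displayable(normalized, band_mean, band_std)
minmax_view = to_displayable(normalized, band_mean, band_std, 0.0, 100.0)
```

Passing 0 and 100 gives the min/max stretch, so a dataset could pick its own percentiles without needing a separate code path.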
adamjstewart changed the title from "Proposal for better dataset plotting" to "Dataset plotting: normalization" on Oct 29, 2024