Dataset plotting: normalization #1263

adamjstewart · 2023-04-18T17:03:37Z

Summary

At the moment, our dataset plotting routines are inconsistent. Some plot methods stretch to the range of the image, while others simply divide by a fixed constant (3,000 or 10,000) and clip to get images into the range 0–1. I propose we convert the latter to the former and consistently stretch images for all datasets.

Rationale

While technically correct, many of our plotting methods make it difficult to visualize images. This is especially true for datamodules, where normalization has been applied to all images and pixel values no longer fall in the uint8 or float32 range of the original data.

Implementation

I propose we use one of the default visualization options used by QGIS:

Clip to 2% to 98% range (exact percentages TBD)
Clip to min/max
Clip to mean ± 2 std dev (exact std dev TBD)

We already have a torchgeo.datasets.utils.percentile_normalization method we could use or modify for this purpose.
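For illustration, here is a rough NumPy sketch of the three stretch options listed above. It is a standalone example and does not reproduce the existing torchgeo utility; the function names (`stretch_percentile`, `stretch_minmax`, `stretch_std`) and the defaults are placeholders, since the exact percentages and std dev are still TBD.

```python
import numpy as np


def _rescale(img, lo, hi):
    """Clip img to [lo, hi] and map that interval linearly onto [0, 1]."""
    img = np.asarray(img, dtype=np.float32)
    if hi <= lo:  # constant image: avoid dividing by zero
        return np.zeros_like(img)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


def stretch_percentile(img, lower=2.0, upper=98.0):
    """Option 1: clip to the [lower, upper] percentile range (placeholder defaults)."""
    lo, hi = np.percentile(np.asarray(img), [lower, upper])
    return _rescale(img, lo, hi)


def stretch_minmax(img):
    """Option 2: clip to the min/max of the image."""
    img = np.asarray(img)
    return _rescale(img, img.min(), img.max())


def stretch_std(img, n_std=2.0):
    """Option 3: clip to mean +/- n_std standard deviations (placeholder default)."""
    img = np.asarray(img)
    mean, std = img.mean(), img.std()
    return _rescale(img, mean - n_std * std, mean + n_std * std)
```

All three return float32 values in 0–1, so the result can be passed straight to an image display call regardless of the input dtype or whether normalization was applied.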

Alternatives

We could apply an inverse normalization transform during datamodule plotting. This would help with datamodule plotting, but would still suffer from inconsistent dataset plotting. It would only take a couple of lines of code, though, which would make it much easier to implement.
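As a rough sketch of what this alternative amounts to, assuming normalization was the usual per-channel `(x - mean) / std` on channels-first `(C, H, W)` images, the inverse looks like the following. The `denormalize` helper is hypothetical and written in plain NumPy; it is not an existing torchgeo function.

```python
import numpy as np


def denormalize(img, mean, std):
    """Invert per-channel (x - mean) / std for a channels-first (C, H, W) array."""
    img = np.asarray(img, dtype=np.float32)
    # Reshape the statistics to (C, 1, 1) so they broadcast over H and W.
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return img * std + mean
```

This recovers the original value range, but the result still needs some stretch or scaling before display, which is why it does not fix the dataset plotting inconsistency on its own.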

Additional information

@calebrob6 we discussed this on Slack or somewhere.

Note that this contradicts #496, so we should decide on one approach or another.


adamjstewart · 2024-04-01T09:12:22Z

Before #476, we basically had this functionality for most of our GeoDatasets already. RasterDataset.plot would normalize all images to the 2% to 98% range using self.rgb_bands and self.all_bands, and would only need to be overridden for non-image datasets. We should consider bringing this back to save us time.

robmarkcole · 2024-06-30T07:47:11Z

In my own plotting functions I have been applying percentile_normalization to 'undo' the effects of normalisation, but the result is not always that great. I think an un-normalize followed by percentile_normalization with dataset-specific percentiles would make sense; e.g., min/max can then be achieved with the 0 and 100 percentiles. Defaulting to 2 and 98 generally works pretty well, and if QGIS does this it is probably pretty sensible.
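A minimal sketch of that two-step idea, assuming per-channel `(x - mean) / std` normalization on `(C, H, W)` images; `to_displayable` is a made-up name and the percentile defaults are only the 2 and 98 suggested above, not settled values:

```python
import numpy as np


def to_displayable(img, mean, std, lower=2.0, upper=98.0):
    """Un-normalize, then percentile-stretch to [0, 1] for plotting."""
    img = np.asarray(img, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    raw = img * std + mean  # step 1: undo (x - mean) / std

    # Step 2: stretch between the chosen percentiles of the recovered values.
    lo, hi = np.percentile(raw, [lower, upper])
    if hi <= lo:  # constant image: avoid dividing by zero
        return np.zeros_like(raw)
    return np.clip((raw - lo) / (hi - lo), 0.0, 1.0)
```

Passing `lower=0, upper=100` reduces the second step to a plain min/max stretch, so a dataset could pick its own percentiles without needing a separate code path.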

adamjstewart added the datasets Geospatial or benchmark datasets label Apr 18, 2023

adamjstewart mentioned this issue Nov 5, 2023

Fix for Blank Images in plot due to Float Tensor Ranges #1712

Merged

adamjstewart added the good first issue A good issue for a new contributor to work on label Apr 1, 2024

adamjstewart mentioned this issue Apr 1, 2024

Add plot method to IntersectionDataset #1971

Open

adamjstewart mentioned this issue May 29, 2024

Update VHR-10 dataset plotting #2092

Merged

adamjstewart mentioned this issue Jun 29, 2024

Figure plotting with Eurosat & classification trainer gives blank images #2139

Closed

adamjstewart changed the title from "Proposal for better dataset plotting" to "Dataset plotting: normalization" Oct 29, 2024

adamjstewart mentioned this issue Oct 29, 2024

Better plotting of true color images #496

Closed

This was referenced Feb 3, 2025

EuroSAT: use percentile normalization during plotting #2557

Draft

Trainers: apply inverse augmentation before plotting #2560

Draft
