Add BRIGHT dataset #2520

Merged
merged 14 commits into from
Jan 28, 2025
@nilsleh (Collaborator) commented Jan 18, 2025

This PR adds the BRIGHT dataset.

Dataset Features:

* Pre-disaster optical images from MAXAR, NAIP, NOAA Digital Coast Raster Datasets, and the National Plan for Aerial Orthophotography Spain
* Post-disaster SAR images from Capella Space and Umbra
* High spatial resolution of 0.3-1 m

Dataset Format:

* Images are in GeoTIFF format with pixel dimensions of 1024x1024
* Pre-disaster optical images have three channels
* Post-disaster SAR images are single-channel, with the band repeated to give 3 channels
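
As an illustration of the last point, here is a minimal sketch of how a single-band array can be repeated into three identical channels. It uses NumPy on a randomly generated tile; the function name `repeat_to_three_channels` and the tile contents are made up for this example, and it is not necessarily how the BRIGHT files were produced.

```python
import numpy as np


def repeat_to_three_channels(sar: np.ndarray) -> np.ndarray:
    """Stack a single-channel (H, W) array into a (3, H, W) array."""
    if sar.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {sar.shape}")
    # Every output channel is an identical copy of the input band.
    return np.repeat(sar[np.newaxis, :, :], 3, axis=0)


# Hypothetical 1024x1024 single-band tile filled with random values.
rng = np.random.default_rng(0)
sar_tile = rng.integers(0, 256, size=(1024, 1024), dtype=np.uint8)
sar_rgb = repeat_to_three_channels(sar_tile)
print(sar_rgb.shape)  # (3, 1024, 1024)
```

Because the three channels carry the same values, a model that expects 3-channel input can consume the SAR image without any change to its first layer.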

[Figure: bright_example]

@ChenHongruixuan @olidietrich Thank you for the nice work and for making the dataset public. Could you include the split .txt files found here inside the zip file on Hugging Face, so that everything is in one place? And of course, if you have any other comments about the PR, feel free to let us know below.

@nilsleh marked this pull request as draft January 18, 2025 10:19
@github-actions bot added the documentation, datasets, testing, and datamodules labels Jan 18, 2025
@adamjstewart added this to the 0.7.0 milestone Jan 18, 2025
@github-actions bot removed the datamodules label Jan 20, 2025
@nilsleh marked this pull request as ready for review January 20, 2025 10:48
@nilsleh requested a review from isaaccorley January 22, 2025 17:27
@ChenHongruixuan commented Jan 25, 2025

Hi @nilsleh,

Thank you very much for your contribution and efforts in integrating BRIGHT into torchgeo! We are a little hesitant to add the txt files you mentioned to the dataset zip file for torchgeo, as they only describe the data division used for DFC25. That's why we provide these txt files on GitHub instead of in the original dataset zip file, especially considering that the current BRIGHT is not the final version.

Best,

@nilsleh (Collaborator, Author) commented Jan 27, 2025

Thanks for getting back to us, @ChenHongruixuan. No problem, we can also host them on our torchgeo Hugging Face, just to make downloading easier. We are also very much looking forward to the final version of the dataset. Feel free to let me know when it is openly available, so that we can integrate support for it here as well.
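
For readers unfamiliar with split files, the following is a rough sketch of how such a .txt file could be turned into a list of sample names. It assumes one sample name per line, which this thread does not confirm; the helper `read_split_file`, the file name `train_set.txt`, and the sample names are all hypothetical.

```python
import tempfile
from pathlib import Path


def read_split_file(path: Path) -> list[str]:
    """Return the non-empty, stripped lines of a split file."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demonstrate with a temporary file and made-up sample names.
with tempfile.TemporaryDirectory() as tmp:
    split_path = Path(tmp) / "train_set.txt"  # hypothetical file name
    split_path.write_text(
        "sample_0001\nsample_0002\n\nsample_0003\n", encoding="utf-8"
    )
    train_ids = read_split_file(split_path)

print(train_ids)  # ['sample_0001', 'sample_0002', 'sample_0003']
```

Keeping the splits in separate text files, as discussed above, means the image archive can stay unchanged while different divisions of the data are published alongside it.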

@ChenHongruixuan

@nilsleh, thank you so much for your support!

@adamjstewart merged commit e42c404 into microsoft:main Jan 28, 2025
19 checks passed
@adamjstewart (Collaborator) commented

Seeing these warnings in CI:

tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_getitem[train]
tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_getitem[val]
tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_getitem[test]
tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_plot[train]
tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_plot[val]
tests/datasets/test_bright.py::TestBRIGHTDFC2025::test_plot[test]
  /opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/rasterio/__init__.py:356: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
    dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)

If the images are not supposed to be georeferenced, we should use PIL instead of rasterio.
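
A minimal sketch of what reading with PIL could look like is shown below. It is only an illustration of the suggestion, not the change that was made in torchgeo: the helper `load_image` is hypothetical, the test image is generated in memory, and whether PIL handles every BRIGHT file correctly is an assumption that would need to be checked against the real data.

```python
import io

import numpy as np
from PIL import Image


def load_image(source) -> np.ndarray:
    """Read an image with PIL and return a channels-first array."""
    with Image.open(source) as img:
        array = np.array(img)
    if array.ndim == 2:
        # Single-band image: add a leading channel axis.
        array = array[np.newaxis, :, :]
    else:
        # PIL yields (H, W, C); move the channel axis to the front.
        array = array.transpose(2, 0, 1)
    return array


# Build a small in-memory RGB TIFF (8 px wide, 4 px high) as stand-in data.
buffer = io.BytesIO()
Image.new("RGB", (8, 4), color=(10, 20, 30)).save(buffer, format="TIFF")
buffer.seek(0)

image = load_image(buffer)
print(image.shape)  # (3, 4, 8)
```

Since PIL does not look for a geotransform at all, reading this way avoids the `NotGeoreferencedWarning` that rasterio emits, at the cost of discarding any georeferencing the files might contain.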
