Add BigEarthNet Version2 #2531

nilsleh · 2025-01-24T19:00:10Z

Supersedes #2371 after talking to Ando.

After taking a look, the new version comes with a metadata.parquet file, which makes data handling quite a bit more straightforward. With a version=2 argument, I felt there would be many nested if statements, so it seemed cleaner to do it this way. If there were a similar metadata.parquet file for V1, this could be condensed further.

nilsleh · 2025-01-27T16:55:37Z

torchgeo/datasets/bigearthnet.py

@@ -495,7 +504,7 @@ def _download(self, url: str, filename: Path, md5: str) -> None:
            filename: output filename to write downloaded file
            md5: md5 of downloaded file
        """
-        if not os.path.exists(filename):
+        if not os.path.exists(os.path.join(self.root, filename)):


Saw this, and I actually think it is required, because it should check whether the file already exists in root, right?

adamjstewart · 2025-01-27

Good catch. Actually, this line isn't necessary because we already check if the zipfile exists and extract it before we reach the download. Can you open a separate PR to fix this so we can backport it to 0.6.3?
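To make the path question in this thread concrete, here is a minimal sketch, not the torchgeo code. The helper `needs_download` and the archive name are hypothetical. It shows that a relative `filename` only resolves inside the dataset directory when it is joined with `root`; a bare `os.path.exists(filename)` checks against the current working directory instead.

```python
import os
import tempfile


def needs_download(root: str, filename: str) -> bool:
    """Return True if `filename` is not yet present inside `root`.

    Hypothetical helper for illustration only; not part of torchgeo.
    """
    return not os.path.exists(os.path.join(root, filename))


with tempfile.TemporaryDirectory() as root:
    archive = "example.tar.gz"  # hypothetical archive name
    print(needs_download(root, archive))  # True: nothing in root yet
    # Create an empty placeholder file inside root.
    open(os.path.join(root, archive), "wb").close()
    print(needs_download(root, archive))  # False: file now exists in root
```

Whether such a check is needed at all depends on what the caller has already verified, which is the point raised in the reply above.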

nilsleh · 2025-01-27T16:56:48Z

torchgeo/datasets/bigearthnet.py

+        Returns:
+            the target label
+        """
+        indices = self.metadata_df.iloc[index]['labels']


I also need to check the labels. In the V1 class, it seems that only the 19-label version is effectively available, because selecting 43 labels also maps them to 19, if I understand correctly.

In Version 2, there appear to be only 19 labels in the parquet file. I ran:

unique_labels = self.metadata_df['labels'].explode().unique().tolist()

to get these unique labels:

['Broad-leaved forest', 'Coniferous forest', 'Inland waters', 'Mixed forest', 'Pastures', 'Urban fabric', 'Arable land', 'Industrial or commercial units', 'Land principally occupied by agriculture, with significant areas of natural vegetation', 'Complex cultivation patterns', 'Transitional woodland, shrub', 'Inland wetlands', 'Natural grassland and sparsely vegetated areas', 'Moors, heathland and sclerophyllous vegetation', 'Marine waters', 'Coastal wetlands', 'Beaches, dunes, sands', 'Permanent crops', 'Agro-forestry areas']

So for V2, I will remove the option to specify 43 classes.
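As a rough sketch of how the list-valued `labels` entries could become a fixed-size target, the snippet below builds a 19-element multi-hot vector. It is an illustration under assumptions, not the code in this PR: `CLASSES`, `CLASS_TO_IDX`, and `multi_hot` are hypothetical names, and the index order simply follows the order of the list printed above, which may differ from the order torchgeo uses.

```python
# The 19 class names reported above, in the order they were printed.
# Assumption: this ordering is for illustration only.
CLASSES = [
    'Broad-leaved forest',
    'Coniferous forest',
    'Inland waters',
    'Mixed forest',
    'Pastures',
    'Urban fabric',
    'Arable land',
    'Industrial or commercial units',
    'Land principally occupied by agriculture, with significant areas of natural vegetation',
    'Complex cultivation patterns',
    'Transitional woodland, shrub',
    'Inland wetlands',
    'Natural grassland and sparsely vegetated areas',
    'Moors, heathland and sclerophyllous vegetation',
    'Marine waters',
    'Coastal wetlands',
    'Beaches, dunes, sands',
    'Permanent crops',
    'Agro-forestry areas',
]

# Map each class name to its position in CLASSES.
CLASS_TO_IDX = {name: idx for idx, name in enumerate(CLASSES)}


def multi_hot(labels: list[str]) -> list[int]:
    """Encode a list of class names as a 0/1 vector of length len(CLASSES)."""
    target = [0] * len(CLASSES)
    for name in labels:
        target[CLASS_TO_IDX[name]] = 1
    return target


# Example: a patch labelled with two classes.
vector = multi_hot(['Pastures', 'Arable land'])
print(len(vector), sum(vector))  # 19 2
```

An unknown class name raises a `KeyError` here, which makes a mismatch between the metadata and the class list visible immediately.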

nilsleh · 2025-01-28T17:45:05Z

@ando-shah in case you want to have a look and spot anything, since you already have experience with the dataset.

nilsleh added 2 commits January 24, 2025 17:36

v2

a50c9a8

v2

81e4df3

nilsleh marked this pull request as draft January 24, 2025 19:00

nilsleh added this to the 0.7.0 milestone Jan 24, 2025

github-actions bot added documentation Improvements or additions to documentation datasets Geospatial or benchmark datasets testing Continuous integration testing labels Jan 24, 2025

nilsleh added 4 commits January 27, 2025 08:41

v2

142f403

merge main

14d6735

ruff

a4eceee

tsts

9dfbb72

github-actions bot added the dependencies Packaging and dependencies label Jan 27, 2025

nilsleh added 3 commits January 27, 2025 10:35

tests

c40aff1

coverage

9c15478

tests

ef5da79

nilsleh commented Jan 27, 2025

nilsleh added 4 commits January 27, 2025 17:57

ruff

fa0f20b

tests

a1732e9

tests

3c3511d

more ruff

66f6536

nilsleh marked this pull request as ready for review January 27, 2025 18:49

nilsleh marked this pull request as draft January 27, 2025 19:11

nilsleh added 5 commits January 27, 2025 20:40

add mapping labels

df2cdba

map labels

df21841

style

67b3653

fix tests

029863d

update clc codes

0555a73

nilsleh marked this pull request as ready for review January 28, 2025 16:16

nilsleh mentioned this pull request Jan 29, 2025

BigEarthNet: Remove unnecessary line #2545

Merged
