t2e-soar-eu

SOAR-EU (Scalable Open Automatable Reproducible — European Urban) is a pedestrian-scale urban data model for the EU-funded TWIN2EXPAND project. It produces standardised, multi-scale spatial metrics at street-segment level for 626 urban centres across EU-27 + Norway, Liechtenstein, and Switzerland.

Installation

uv sync

Project configuration lives in pyproject.toml. Dependencies are managed with uv: uv sync installs them into a .venv folder.

Configuration

All scripts read the data root from the T2E_DATA_DIR environment variable. Set it in your .env file or export it in your shell:

# Option 1: add to .env (recommended)
T2E_DATA_DIR=/path/to/your/data

# Option 2: export in your shell
export T2E_DATA_DIR=/path/to/your/data

Copy .env.example to .env and fill in T2E_DATA_DIR and any Zenodo credentials you need:

cp .env.example .env
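As an illustration of the lookup described above, here is a minimal standard-library sketch of resolving T2E_DATA_DIR from either the shell environment or the contents of a .env file. The helper names parse_env_file and resolve_data_dir are hypothetical and are not taken from this repository, and the precedence rule (a shell export wins over .env) is an assumption that mirrors common dotenv behaviour.

```python
import os
from pathlib import Path


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_data_dir(env_text: str = "", environ=None) -> Path:
    """Return the data root (hypothetical helper, not the repository's code).

    Assumption: a variable exported in the shell takes precedence over .env.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("T2E_DATA_DIR") or parse_env_file(env_text).get("T2E_DATA_DIR")
    if not value:
        raise RuntimeError("T2E_DATA_DIR is not set in the environment or in .env")
    return Path(value)
```

In practice env_text would be the text of the .env file, for example Path(".env").read_text() when the file exists.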

Data Loading

The pipeline requires several external datasets to be downloaded before processing. Each step below produces a GeoPackage that feeds into the next. All commands should be run from the repository root.

All scripts resolve data paths from T2E_DATA_DIR automatically (loaded from .env). Paths can also be passed as positional arguments to override the defaults.
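The pattern of optional positional arguments overriding defaults derived from T2E_DATA_DIR could be sketched as follows. This is an assumption about how such a command-line interface might look, not the actual argument layout of the scripts: the argument names input_path and output_path and the default file names are hypothetical.

```python
import argparse
from pathlib import Path


def build_parser(data_dir: Path) -> argparse.ArgumentParser:
    """Build a parser whose positional paths default to locations under data_dir."""
    parser = argparse.ArgumentParser(description="Hypothetical loader CLI")
    # nargs="?" makes a positional optional, so the default applies when it is omitted.
    parser.add_argument(
        "input_path", nargs="?", type=Path, default=data_dir / "input.gpkg"
    )
    parser.add_argument(
        "output_path", nargs="?", type=Path, default=data_dir / "output.gpkg"
    )
    return parser
```

Calling build_parser(data_dir).parse_args() with no arguments yields the defaults under the data root, while passing a path on the command line replaces the corresponding default.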

Boundaries

Boundaries are extracted from the GHS Urban Centre Database (GHS-UCDB) R2024A produced by the European Commission Joint Research Centre. Urban centres are defined using the Degree of Urbanisation (DEGURBA) methodology: contiguous 1 km^2 cells with at least 1,500 residents per km^2 and cumulative population of at least 50,000. The dataset is available under the European Commission reuse policy (Decision 2011/833/EU).

Download the GHS-UCDB R2024A GeoPackage, then run:

python -m src.data.generate_boundary_polys
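To make the urban centre definition above concrete, the following sketch applies the two thresholds (at least 1,500 residents per km^2 per cell, at least 50,000 residents per cluster) to a small population grid. It is a simplified illustration, not the JRC implementation and not code from this repository: it assumes edge adjacency (4-connectivity) and omits the gap-filling and smoothing steps of the full DEGURBA method. The function name urban_centres is hypothetical.

```python
from collections import deque


def urban_centres(pop, min_density=1500, min_total=50_000):
    """Return lists of (row, col) cells that form candidate urban centres.

    `pop` is a 2D list of residents per 1 km^2 cell, so each value is also
    the cell's density in residents per km^2.
    """
    rows, cols = len(pop), len(pop[0])
    seen = set()
    centres = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or pop[r][c] < min_density:
                continue
            # Breadth-first flood fill over dense, edge-adjacent cells.
            cluster, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and (ny, nx) not in seen
                        and pop[ny][nx] >= min_density
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            # Keep the cluster only if its cumulative population is large enough.
            if sum(pop[y][x] for y, x in cluster) >= min_total:
                centres.append(cluster)
    return centres
```

A cluster of dense cells that sums to fewer than 50,000 residents is discarded, which is why small dense settlements do not appear as urban centres.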

Urban Atlas

Urban Atlas 2021 (~34 GB of FlatGeobuf vectors). Download via the Copernicus Data Space Ecosystem (CDSE) S3 endpoint (see the CDSE download instructions below).

python -m src.data.load_urban_atlas_blocks

Tree cover

Street Tree Layer 2021 (~4 GB FlatGeobuf vectors). Download via CDSE S3 alongside Urban Atlas.

python -m src.data.load_urban_atlas_trees

Building Heights

Copernicus Digital Height Model 2012 (~1 GB raster, 10 m resolution, EPSG:3035).

python -m src.data.load_bldg_hts_raster

Overture Maps data

Downloads and clips Overture layers (buildings, street edges/nodes, POI places, infrastructure) per city boundary. Each city is saved as a separate GeoPackage.

python -m src.data.load_overture --parallel_workers 6 --zip

The Overture POI schema is based on overture_categories.csv.
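The per-city processing with a configurable number of workers can be pictured with the sketch below. It is only an assumption about the general shape of such a loop: process_city is a hypothetical placeholder for the download, clip, and save steps, and the real script may use processes rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def process_city(city_id: str) -> str:
    """Hypothetical stand-in for downloading, clipping, and saving one city."""
    return f"{city_id}.gpkg"


def process_all(city_ids, parallel_workers=6):
    """Process cities concurrently; results keep the order of city_ids."""
    with ThreadPoolExecutor(max_workers=parallel_workers) as pool:
        return list(pool.map(process_city, city_ids))
```

Because each city is written to its own GeoPackage, the cities are independent units of work and can be processed in any order.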

Census Data (2021)

Eurostat Census Grid 2021 — population and demographic statistics aggregated to 1 km^2 cells. Download the Version 2021 ZIP dataset.

Metrics

Compute all street-segment metrics:

python -m src.processing.generate_metrics --zip

Downloading Urban Atlas and Street Tree Layer from CDSE

Both Copernicus datasets are distributed as FlatGeobuf files via the Copernicus Data Space Ecosystem S3 endpoint.

  1. Create an account at https://dataspace.copernicus.eu/
  2. Generate S3 credentials from your account dashboard (save the secret key immediately)
  3. Configure the AWS CLI:
aws configure
# Access Key ID: <your CDSE access key>
# Secret Access Key: <your CDSE secret key>
# Default region: (leave blank)
# Default output format: json

export AWS_ENDPOINT_URL=https://eodata.dataspace.copernicus.eu/

The CDSE S3 endpoint does not return files inside subdirectories in a flat listing, so aws s3 cp --recursive alone downloads nothing. Iterate over city directories:

# Urban Atlas 2021 (~34 GB)
S3_BASE="s3://EODATA/CLMS/land_cover_use_in_priority_areas/urban_atlas/clms_ua_land-cover-land-use_europe_V025ha_3yearly_v1/2021/01/01"
DEST="$T2E_DATA_DIR/UA_2021_3035_eu"
aws s3 ls "$S3_BASE/" | awk '{print $2}' | while read -r dir; do
    aws s3 cp "$S3_BASE/$dir" "$DEST/$dir" --recursive
done

# Street Tree Layer 2021 (~4 GB)
S3_BASE="s3://EODATA/CLMS/land_cover_use_in_priority_areas/urban_atlas/clms_ua_street-tree-layer_europe_V005ha_3yearly_v1/2021/01/01"
DEST="$T2E_DATA_DIR/STL_2021_3035_eu"
aws s3 ls "$S3_BASE/" | awk '{print $2}' | while read -r dir; do
    aws s3 cp "$S3_BASE/$dir" "$DEST/$dir" --recursive
done
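The same per-directory iteration can be sketched in Python by parsing the text that aws s3 ls prints and building one copy command per directory. This is an illustrative alternative to the shell loops above, not part of the repository. The function names are hypothetical, and the parser assumes directory names contain no spaces.

```python
def list_prefixes(listing: str) -> list[str]:
    """Extract directory prefixes from `aws s3 ls` text output."""
    prefixes = []
    for line in listing.splitlines():
        parts = line.split()
        # Directory entries print as "PRE <name>/"; file entries start with a date.
        if len(parts) == 2 and parts[0] == "PRE":
            prefixes.append(parts[1])
    return prefixes


def copy_commands(s3_base: str, dest: str, listing: str) -> list[list[str]]:
    """Build one `aws s3 cp --recursive` argument list per directory."""
    return [
        ["aws", "s3", "cp", f"{s3_base}/{prefix}", f"{dest}/{prefix}", "--recursive"]
        for prefix in list_prefixes(listing)
    ]
```

Each returned argument list can be passed to subprocess.run(cmd, check=True), with AWS_ENDPOINT_URL exported as shown earlier. Unlike the awk filter, this version skips file entries at the top level rather than treating their second column as a directory name.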

Reference: https://documentation.dataspace.copernicus.eu/APIs/S3.html

Zenodo Upload

The processed dataset can be uploaded to Zenodo using paper_data/zenodo_upload.py. The script bundles per-city GeoPackages by country (to stay within Zenodo's 100-file limit), sets deposit metadata, and supports resumable uploads.

Ensure ZENODO_TOKEN and ZENODO_RECORD_ID are set in your .env file, then:

# Preview what will be uploaded
uv run python paper_data/zenodo_upload.py --dry-run --bundle

# Bundle by country and upload (resumable)
uv run python paper_data/zenodo_upload.py --bundle --resume

# Update metadata only
uv run python paper_data/zenodo_upload.py --metadata-only

Bundles are saved to $T2E_DATA_DIR/zenodo_bundles/ by default (override with --bundle-dir).
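The bundling and resume logic described in this section might look roughly like the sketch below. It is not the code in paper_data/zenodo_upload.py. The file naming convention (a country code prefix followed by an underscore) and both function names are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path

ZENODO_MAX_FILES = 100  # per-record file limit mentioned above


def group_by_country(paths):
    """Map a country code to its city files.

    Assumption: file names look like '<CC>_<city>.gpkg' (hypothetical convention).
    """
    bundles = defaultdict(list)
    for path in map(Path, paths):
        country = path.stem.split("_", 1)[0].upper()
        bundles[country].append(path)
    if len(bundles) > ZENODO_MAX_FILES:
        raise ValueError("more bundles than Zenodo allows files per record")
    return dict(bundles)


def pending_uploads(bundle_names, already_uploaded):
    """Return bundles still to upload, so an interrupted run can resume."""
    done = set(already_uploaded)
    return [name for name in bundle_names if name not in done]
```

Grouping by country turns several hundred per-city files into a few dozen archives, and comparing against the files already present in the deposit lets a rerun skip completed uploads.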

Data sources

| Source | Content | Licence |
|---|---|---|
| GHS-UCDB R2024A | Urban centre boundary polygons | EC reuse policy (Decision 2011/833/EU) |
| Overture Maps (Transportation, Buildings) | Street networks, building footprints | ODbL |
| Overture Maps (Places) | POI places | CDLA-Permissive-2.0 |
| Overture Maps (Infrastructure) | Transit stops, street furniture, parking | ODbL |
| Copernicus Urban Atlas 2021 | Land-cover/land-use blocks | EEA reuse policy (Directive 2003/98/EC) |
| Copernicus Street Tree Layer 2021 | Tree canopy polygons | EEA reuse policy (Directive 2003/98/EC) |
| Copernicus Digital Height Model 2012 | Building height raster (10 m, EPSG:3035) | EEA reuse policy (Directive 2003/98/EC) |
| Eurostat Census Grid 2021 | Population/demographic cells (1 km^2) | EC reuse policy (Decision 2011/833/EU) |

Licence

This repository depends on copyleft open-source packages licensed under AGPLv3 and therefore adopts the same licence for its code. The dataset published on Zenodo is licensed under the Open Database License (ODbL 1.0) to comply with the share-alike requirements of the Overture Maps layers.

Papers

  • Data paper — SOAR-EU dataset description and POI validation (Data in Brief)
  • Atlas paper — Morphological typology of European cities (CEUS)

Citation

If you use this dataset or code, please cite:

Simons, G. (2026). SOAR-EU: Scalable Open Automatable Reproducible pedestrian-scale urban metrics for 626 European urban centres. Available at: https://github.com/UCL/t2e-soar-eu
