Reads the minutely replication files published by the OpenStreetMap planet server and queries Overpass for each changeset to create full representations of changesets. It also posts a summary of the tag changes to the OSMCha API.
When a changeset is pushed to OSM, this stack builds a representation of the exact change that happened:
- Changeset metadata - username, id, timestamp, comment etc.
- Elements - each feature that was added, modified, or deleted in the changeset.
- For each element, the current and previous version including geometry and metadata.
- New changesets are pushed to https://s3.amazonaws.com/mapbox/real-changesets/production/<changeset-id>.json
- Augmented diffs generated by Overpass are pushed to https://s3-ap-northeast-1.amazonaws.com/overpass-db-ap-northeast-1/augmented-diffs/<state-id.osc>
- The latest state id is published at https://s3-ap-northeast-1.amazonaws.com/overpass-db-ap-northeast-1/augmented-diffs/latest
// 20170309131154
// https://s3.amazonaws.com/mapbox/real-changesets/46700150.json
{
  "metadata": {
    "id": "46700150",
    "created_at": "2017-03-09T06:20:05Z",
    "closed_at": "2017-03-09T06:20:06Z",
    "open": "false",
    "num_changes": "1",
    "user": "johnparis",
    "uid": "2126146",
    "min_lat": "33.5335375",
    "max_lat": "33.5335375",
    "min_lon": "-7.6846717",
    "max_lon": "-7.6846717",
    "comments_count": "0",
    "tag": [
      {
        "k": "comment",
        "v": "Fix with Osmose"
      },
      {
        "k": "locale",
        "v": "en-US"
      },
      {
        "k": "host",
        "v": "http://www.openstreetmap.org/id"
      },
      {
        "k": "imagery_used",
        "v": "Bing aerial imagery"
      },
      {
        "k": "created_by",
        "v": "iD 2.1.3"
      }
    ]
  },
  "elements": [
    {
      "id": "4719430892",
      "lat": "33.5335375",
      "lon": "-7.6846717",
      "version": "2",
      "timestamp": "2017-03-09T06:20:06Z",
      "changeset": "46700150",
      "uid": "2126146",
      "user": "johnparis",
      "old": {
        "id": "4719430892",
        "lat": "33.5335375",
        "lon": "-7.6846717",
        "version": "1",
        "timestamp": "2017-03-05T23:46:50Z",
        "changeset": "46609213",
        "uid": "5435265",
        "user": "zakaria f",
        "action": "modify",
        "type": "node",
        "tags": {
          "name": "لساسفة",
          "highway": "bus_stop",
          "name:ar": "لساسفة",
          "name:en": "Lisassfa",
          "name:fr": "Lissasfa"
        }
      },
      "action": "modify",
      "type": "node",
      "tags": {
        "highway": "bus_stop",
        "name": "Lissasfa لساسفة",
        "name:ar": "لساسفة",
        "name:en": "Lisassfa",
        "name:fr": "Lissasfa"
      }
    }
  ]
}
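Because each element carries both its current tags and the tags of its "old" version, a tag-change summary can be derived by comparing the two objects. The following is a minimal sketch of that comparison, not the summary format this stack actually posts to OSMCha: the function name diffTags and the added/removed/modified output shape are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of this library): compare an element's
// current tags with the tags stored under its "old" version.
function diffTags(element) {
  const current = element.tags || {};
  const previous = (element.old && element.old.tags) || {};
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const changes = { added: {}, removed: {}, modified: {} };

  for (const [key, value] of Object.entries(current)) {
    if (!has(previous, key)) {
      changes.added[key] = value; // tag exists only in the new version
    } else if (previous[key] !== value) {
      changes.modified[key] = { from: previous[key], to: value };
    }
  }
  for (const [key, value] of Object.entries(previous)) {
    if (!has(current, key)) {
      changes.removed[key] = value; // tag was dropped in the new version
    }
  }
  return changes;
}

// Trimmed copy of the element from the example above.
const element = {
  id: '4719430892',
  action: 'modify',
  type: 'node',
  tags: { highway: 'bus_stop', name: 'Lissasfa لساسفة', 'name:fr': 'Lissasfa' },
  old: {
    tags: { highway: 'bus_stop', name: 'لساسفة', 'name:fr': 'Lissasfa' }
  }
};

console.log(diffTags(element));
```

For the example element, only the name tag lands under modified; added and removed stay empty.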
Many processes for inspecting and searching for potentially bad edits on OpenStreetMap depend on being able to view a changeset in its entirety. This helps in gauging the context of an edit, finding similar edits by the same user, and seeing edits in their "finished" state (i.e. not partway through a changeset).
Our primary tool for visualizing changesets has been changeset-map. We depend on augmented diffs generated by Overpass to generate these changeset representations and visualizations.
Augmented diffs contain complete representations of the changes made in OSM for every minute. One can also query for a custom time range, and filter by bounding box or other attributes. These queries can be extremely slow, especially for large changesets, and were a major bottleneck in scaling up changeset reviewing processes.
const run = require('./index');
// To process this file https://planet.openstreetmap.org/replication/minute/006/012/443.osc.gz,
// the value should be 6012443
const minuteReplication = 6012443;
run(minuteReplication);
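The mapping between the replication id and the file path follows from the comment above: 6012443 corresponds to 006/012/443.osc.gz, i.e. the id zero-padded to nine digits and split into groups of three. A small sketch of that mapping, assuming the nine-digit layout holds for all ids; replicationPath and replicationUrl are hypothetical helper names, not functions exported by this library.

```javascript
// Base URL taken from the comment in the snippet above.
const REPLICATION_BASE = 'https://planet.openstreetmap.org/replication/minute';

// Hypothetical helper: turn a minute replication id into its file path.
function replicationPath(sequence) {
  if (!Number.isInteger(sequence) || sequence < 0 || sequence > 999999999) {
    throw new RangeError(`invalid replication id: ${sequence}`);
  }
  // Zero-pad to nine digits, then split into three groups of three.
  const padded = String(sequence).padStart(9, '0');
  return `${padded.slice(0, 3)}/${padded.slice(3, 6)}/${padded.slice(6, 9)}.osc.gz`;
}

// Hypothetical helper: full URL of the replication file.
function replicationUrl(sequence) {
  return `${REPLICATION_BASE}/${replicationPath(sequence)}`;
}

console.log(replicationPath(6012443)); // prints 006/012/443.osc.gz
```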
To process a single replication file, pass the minute replication id to the CLI:
yarn process 6012443
To run a service that processes new replication files continuously, start a Redis service, set its URL in the RedisServer environment variable, and use the update-queue command.
yarn update-queue
To backfill a particular changeset:
- Make sure you have authorized via mbx auth <mfa_code>.
- Run node backfill <stack_name> <changeset_id> <?padding>
- It might take a while for the command to run.
Params
- stack: production | staging | etc
- changeset_id: Only accepts one changeset id
- padding: The range of minutely replication files to look for the changeset id in, e.g. [239.osc.gz, (239+padding).osc.gz]
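To illustrate the padding parameter, the sketch below lists the replication files that would fall inside the range [start, start+padding]. It assumes the starting id is already known; how the backfill script picks that starting file is not described here, and replicationRange is a hypothetical name rather than part of the script.

```javascript
// Hypothetical helper: ids of the replication files covered by a padding value.
function replicationRange(startSequence, padding) {
  if (!Number.isInteger(startSequence) || startSequence < 0) {
    throw new RangeError(`invalid start id: ${startSequence}`);
  }
  if (!Number.isInteger(padding) || padding < 0) {
    throw new RangeError(`invalid padding: ${padding}`);
  }
  const ids = [];
  // The range is inclusive on both ends: start, start+1, ..., start+padding.
  for (let id = startSequence; id <= startSequence + padding; id += 1) {
    ids.push(id);
  }
  return ids;
}

console.log(replicationRange(239, 3).map((id) => `${id}.osc.gz`));
// prints [ '239.osc.gz', '240.osc.gz', '241.osc.gz', '242.osc.gz' ]
```

A larger padding widens the search window, so the command takes longer but is more likely to find the changeset.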
This library requires some environment variables to be set, along with AWS credentials for uploading the files to S3.
Environment Variable | Default value | Purpose |
---|---|---|
ReplicationBucket | osm-planet-us-west-2 | S3 Bucket where the minute replication files are published. |
OsmchaAdminToken | null | OSMCha admin user token. It will enable posting the changeset Tag Changes to OSMCha. |
OutputBucket | real-changesets | S3 Bucket that will store the real-changesets files. |
OverpassPrimaryUrl | https://overpass.osmcha.org | Main overpass server. |
OverpassSecondaryUrl | https://overpass-api.de | Fallback overpass server. |
RedisServer | null | Redis service URL, in the format redis[s]://[[username][:password]@][host][:port][/db-number] |
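As a rough illustration of how the variables and defaults in the table fit together, the sketch below resolves them into a single configuration object. The variable names and default values come from the table; the loadConfig function and the camelCase property names are assumptions for this example and may not match how the library reads its settings internally.

```javascript
// Hypothetical helper: resolve settings from the environment, falling back
// to the defaults listed in the table above.
function loadConfig(env = process.env) {
  return {
    replicationBucket: env.ReplicationBucket || 'osm-planet-us-west-2',
    osmchaAdminToken: env.OsmchaAdminToken || null,
    outputBucket: env.OutputBucket || 'real-changesets',
    overpassPrimaryUrl: env.OverpassPrimaryUrl || 'https://overpass.osmcha.org',
    overpassSecondaryUrl: env.OverpassSecondaryUrl || 'https://overpass-api.de',
    redisServer: env.RedisServer || null
  };
}

// Example: override only the output bucket and the Redis URL.
const config = loadConfig({
  OutputBucket: 'my-test-bucket',
  RedisServer: 'redis://localhost:6379'
});
console.log(config.outputBucket, config.overpassPrimaryUrl);
```

Passing the environment as an argument keeps the function easy to exercise without touching process.env.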