add timestamp for extracts #209
Comments
@migurski where can we get this info? |
When we do this, also including the timestamp of the data itself should make it clearer whether changes you have made in OSM will be present in the download. The old site only showed the date we created the files, which is more confusing if no weekly planet file update was released and yet the extracts had a new date. |
I don't know that this information exists anywhere at the moment; would require info from @heffergm I think. |
@sleepylemur - I believe Grant is out. Are you able to help us find where this date would live in our system? |
As Rhonda mentioned, we have the time when the last batch of extracts was finished: One complication is that while the extracts are being generated it's hard to tell if a specific extract is this week's or last week's version, but that's probably something we could just gloss over. |
That's the reason LastUpdatedAt is enough detail: not all extracts get regenerated every week, but the ones that aren't regenerated still get checked for new changes (as far as I understand). @binx @migurski can we try to show this date for starters? https://s3.amazonaws.com/metro-extracts.mapzen.com/LastUpdatedAt |
Looks good, is that a good canonical URL to use, or might it be available someplace better? |
@migurski That URL is a good one to use. |
When might we expect to see that URL updated? |
That URL no longer exists now that we've switched to processing the fixed extract list as part of ODES. It's also largely irrelevant, given the fact that every object uploaded to S3 contains a timestamp. Is there a reason we're not just using those? |
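For readers unsure how to interpret an S3 timestamp: each object is served with a standard HTTP `Last-Modified` header. The following is a minimal sketch, not the site's actual code, of turning such a header value into a readable line. The function name `describe_last_modified` and the sample header value are made up for illustration. This timestamp records when the file was written to S3, not when the underlying data left OSM.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def describe_last_modified(header_value: str) -> str:
    """Format an HTTP Last-Modified header value as a short UTC date line.

    Hypothetical helper: the name and output wording are illustrative only.
    """
    parsed = parsedate_to_datetime(header_value)
    # Some header forms parse to a naive datetime; treat those as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    utc_time = parsed.astimezone(timezone.utc)
    return utc_time.strftime("Extract file written on %Y-%m-%d at %H:%M UTC")


# Sample value in the standard HTTP date format (not taken from a real object).
print(describe_last_modified("Mon, 22 Aug 2016 17:09:00 GMT"))
```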
Our users are curious about the freshness of the data, and many of them won't know how to interpret S3 timestamps. We'd like a way to reference the point when the data came from OSM. |
That timestamp (LastUpdatedAt) never indicated when the data came from OSM. It was only intended to indicate when the data was last processed on our end. With the current system, we process the cities.json extract list once a day, and the planet file that we use to cut the extracts is also updated daily. So generally, the extracts are cut from data that is ~24 hours old. If there's now a requirement that we provide an OSM date relevant to the planet with each upload, I can look into doing something that will work with both types of jobs (odes and the bulk processed list). |
Do we create a new planet file from a regularly updated database? They're normally weekly when pulled from planet.openstreetmap.org. If we get stuff every day and we know this, then we can just put a "fresh daily" message on the site. If there's a chance that it may be as old as a week due to cyclical planet file updates, then we should do something more sophisticated. |
In this implementation, the planet is downloaded on initial system setup (essentially from a local mirror) then updated to current with diffs (osmupdate) before being put into production. A cron job then runs daily to apply diffs to bring it up to date regularly. |
So, would you say it's safe for us to say "this data is refreshed from live OSM once daily" in all cases? That should be plenty of freshness message for our visitors. Exciting that we're doing it this frequently; it used to be weekly + weekly. |
Well, we only used to cut extracts once a week, but the data was essentially as fresh as however long the extract run took, since we were pulling a planet and applying diffs as part of the process. In any case, I think wording to the effect that the data used to create any given extract should be at most ~24 hours old is correct. |
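A rough sketch of how the front end could choose its freshness wording, assuming it had access to the time the planet was last brought up to date. Nothing in this thread confirms such a value is exposed; `planet_updated_at`, `freshness_message`, and the message strings are all hypothetical. The 24-hour threshold comes from the "at most ~24 hours old" estimate above, and the fallback covers the case where updates fall behind.

```python
from datetime import datetime, timedelta, timezone

# Assumption: extracts are normally cut from data at most about 24 hours old.
MAX_EXPECTED_AGE = timedelta(hours=24)


def freshness_message(planet_updated_at: datetime, now: datetime) -> str:
    """Pick a freshness label from the (hypothetical) planet update time."""
    age = now - planet_updated_at
    if age <= MAX_EXPECTED_AGE:
        return "Data refreshed from OSM within the last day"
    # Older than expected: show a concrete date instead of a general claim.
    return f"Data last refreshed from OSM on {planet_updated_at:%Y-%m-%d}"


now = datetime(2016, 8, 22, 17, 0, tzinfo=timezone.utc)
print(freshness_message(now - timedelta(hours=20), now))
print(freshness_message(now - timedelta(days=7), now))
```

Showing a concrete date in the fallback case keeps the message honest when the daily update has not run.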
Coincidentally, I've discovered a bug related to planet updates, so we're a bit further out of date. Resolving now, and opened https://github.com/mapzen/operations-engineering/issues/361. |
K, I’m going to assign @binx on this issue, and it’s now just a front-end copy change. |
Correct. Il Lun 22 Ago 2016, 5:09 PM Ekta Daryanani [email protected] ha |
fantastic! also, @heffergm Italian email? |
How about "Fresh data daily!" Sounds like a news item or a baked goods store 😉 |
Day-old data! Half off! |
Fresh data served daily, from server farm to data table... |
That will go down in history as Ingrid’s Greatest Pun. |
I was in the room when she came up with that ;) But @kkowalsky I like that. Can we use it @migurski? |
@souperneon @migurski: the wording exists in the original Metro Extracts blog announcement and might have been in the old documentation... |
Yes we totally should use it. |
We should continue having our weekly extracts date on the website. I can anticipate some users wanting something more concrete than the vague "once a week".