This repository has been archived by the owner on Sep 24, 2024. It is now read-only.

analyze historic addition/removal rates of boxes #3

Open
mikelmaron opened this issue Aug 16, 2020 · 2 comments


mikelmaron commented Aug 16, 2020

The live data is from Sept 2019. @nstory has also scraped new data (https://github.com/nstory/collection_boxes#2019-to-2020-added-and-removed), as has @iandees (https://github.com/iandees/usps-collection-boxes/).

@nstory compared the 2019 report with the 2020-08-15 scrape:

$ # mailboxes removed (id only appears in old report)
$ comm -23 <( xsv select OUTLETID coll_report.csv | sed -e 1d | sed -e 's/^[0 ]//' | sort -u ) <( xsv select OUTLETID collection_boxes_2020-08-15.csv | sed -e 1d | sed -e 's/^[0 ]//' | sort -u ) | wc -l
4221
$ # mailboxes added (id only appears in new report)
$ comm -13 <( xsv select OUTLETID coll_report.csv | sed -e 1d | sed -e 's/^[0 ]//' | sort -u ) <( xsv select OUTLETID collection_boxes_2020-08-15.csv | sed -e 1d | sed -e 's/^[0 ]//' | sort -u ) | wc -l
967

However, the removed spreadsheet has 642 entries for DC, which is almost every on-street blue box and cannot be accurate. There may be an issue with the DC data on the USPS site that the scrape came from.

Box removals don't appear to be new: https://www.uspsoig.gov/blog/disappearing-collection-boxes
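
One way to check whether DC is an outlier would be to tally the removed IDs by state. Below is a minimal sketch, not something either scraper provides. `removed_by_state` is a hypothetical helper, and it assumes (none of this is confirmed in the thread) that both files have a header row, that OUTLETID is the first column and a state code is the second, and that no field contains a quoted comma:

```shell
# Hypothetical helper: count boxes whose ID appears only in the old report,
# grouped by state. Column positions are assumptions; adjust $1 and $2 to
# match the real CSV layout.
removed_by_state() {
  old_csv=$1
  new_csv=$2
  awk -F, '
    FNR == 1 { next }                  # skip the header row of each file
    { sub(/^[0 ]/, "", $1) }           # same ID normalization as the sed step
    NR == FNR { seen[$1] = 1; next }   # first file (new report): remember its IDs
    # second file (old report): count each old-only ID once, under its state
    !($1 in seen) && !counted[$1]++ { removed[$2]++ }
    END { for (s in removed) print s "," removed[s] }
  ' "$new_csv" "$old_csv" | sort -t, -k2,2nr -k1,1
}
```

Running `removed_by_state coll_report.csv collection_boxes_2020-08-15.csv` would print one `state,count` line per state, largest first. A DC count far out of proportion to other states would point at a scrape problem rather than real removals. If the real files contain quoted commas, pulling out just the two needed columns with `xsv select` first would be safer.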

mikelmaron changed the title from "compare latest data scrape to sept 2019" to "analyze historic addition/removal rates of boxes" on Aug 18, 2020

mikelmaron commented

Even more questions: https://twitter.com/mikel/status/1295727907038470144


mikelmaron commented Aug 20, 2020
