Document hash field #341

Open
pnoll1 opened this issue May 8, 2023 · 0 comments

pnoll1 commented May 8, 2023

Is your feature request related to a problem? Please describe.
The hash field is critical to data users, but the only documentation is in a five-year-old issue thread and does not appear to be completely accurate. The hash does not stay the same even when the content does.

openaddresses/machine#683 says "The hash value is calculated as a content hash, and it can be used to determine that two addresses are identical between different runs of a single source."

pelias/openaddresses#442 says "It turns out that the existing HASH column generated by the OA team is seeded with a random number, so even if the underlying data remains the same, the hash value will change with each rebuild of the OA file."
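
The two quotes describe different behaviors, and the difference can be sketched in a few lines. This is only an illustration, not the actual OpenAddresses implementation: the `content_hash` helper, the salt, the SHA-1 choice, and the sample row are all hypothetical. It shows how mixing a per-build random value into an otherwise deterministic content hash makes the result change between builds even when the row is unchanged.

```python
import hashlib
import json
import secrets


def content_hash(row, salt=b""):
    """Hash a row's fields in a fixed key order, optionally mixed with a salt.

    Hypothetical helper for illustration; not the OpenAddresses algorithm.
    """
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha1(salt + payload).hexdigest()[:16]


# Made-up sample address row
row = {"number": "123", "street": "Main St", "city": "Springfield"}

# Unsalted: identical content yields an identical hash on every run
stable_a = content_hash(row)
stable_b = content_hash(dict(row))

# Salted with a fresh random value per "build": same content, different hash
build1 = content_hash(row, salt=secrets.token_bytes(8))
build2 = content_hash(row, salt=secrets.token_bytes(8))

print(stable_a == stable_b)  # unsalted hashes match
print(build1 == build2)      # salted hashes differ (barring an astronomically unlikely collision)
```

If the real hash behaves like the salted case, it can only distinguish rows within a single build and cannot serve as a key across rebuilds.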

Describe the solution you'd like
A description of how the hash is created, its gotchas, and an example use case.

Describe alternatives you've considered

  • use the information from the openaddresses/machine issue above
    • this makes it easy to misread the hash as a stable identifier that can be used as a unique key in a database
  • provide no documentation
    • the field will likely be ignored, so why provide it in the data download?
pnoll1 added the "question" (Further information is requested) label May 8, 2023