Add Validation of Addresses #26

Open
kkkmail opened this issue Nov 11, 2019 · 0 comments

There are many rules that a valid address should satisfy. However, some of the addresses in the data set are clearly not valid. What I suggest is as follows:

  1. Add an extra column or several columns to the data set to describe the data "quality" of the row.
  2. Build a post-processing engine, which would go over all addresses and apply various country / region / state / etc. specific rules to produce that quality score based on the given address and related data.
  3. I've done that using F# for similar 100M+ address-based data sets and I'd be glad to assist in setting that up here.
  4. Once the framework is set up, further rules can be added by experts who know the particular details of individual countries / regions / states / etc.
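
To make the proposal concrete, here is a minimal sketch of how steps 1, 2 and 4 could fit together. It is written in Python purely for illustration and is not the F# code mentioned above. The column names (`country`, `street`, `postcode`), the `Rule` type, and the two sample rules are all assumptions, since the actual schema of the data set is not described here.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# A row is modelled as a plain dict; the keys used below are hypothetical.
Row = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    name: str                      # identifier reported when the rule fails
    applies_to: Optional[str]      # country code, or None for "every row"
    check: Callable[[Row], bool]   # returns True when the row passes


def has_street(row: Row) -> bool:
    """Generic rule: the street field must not be empty."""
    return bool(row.get("street", "").strip())


def us_zip_ok(row: Row) -> bool:
    """US-specific rule: ZIP code is 5 digits, optionally followed by -4 digits."""
    return re.fullmatch(r"\d{5}(-\d{4})?", row.get("postcode", "").strip()) is not None


# Country experts would extend this list with their own rules (step 4).
RULES: List[Rule] = [
    Rule("street_present", None, has_street),
    Rule("us_zip_format", "US", us_zip_ok),
]


def score(row: Row, rules: List[Rule] = RULES) -> dict:
    """Apply every applicable rule and return the extra 'quality' columns."""
    applicable = [r for r in rules if r.applies_to in (None, row.get("country"))]
    failed = [r.name for r in applicable if not r.check(row)]
    # Quality is the fraction of applicable rules that passed.
    quality = 1.0 - len(failed) / len(applicable) if applicable else 1.0
    return {"quality": quality, "failed_rules": failed}
```

Calling `score(row)` on each row yields the values for the extra columns from step 1. Keeping the list of failed rule names next to the numeric score is one possible design choice: it lets consumers filter on specific defects rather than on a single number.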