There are many rules that a valid address should satisfy. However, some of the addresses in the data set are clearly not valid. What I suggest is as follows:
1. Add an extra column or several columns to the data set to describe the data "quality" of the row.
2. Build a post-processing engine, which would go over all addresses and apply various country / region / state / etc. specific rules to produce that quality score based on the given address and related data.
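To make the two points above concrete, here is a minimal sketch of what such a rule engine could look like. It is written in Python purely for illustration, and everything in it is an assumption rather than part of this project: the `Address` fields, the rule names, the `RULES` registry, and the scoring formula (the fraction of applicable rules that pass) are all hypothetical. The only real-world fact it relies on is the US ZIP format of five digits with an optional `-NNNN` suffix.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Address:
    # Hypothetical row shape; the real data set may use different columns.
    country: str
    postcode: Optional[str] = None
    city: Optional[str] = None
    street: Optional[str] = None


# A rule returns None when the address passes, or a short issue label when it fails.
Rule = Callable[[Address], Optional[str]]


def has_street(addr: Address) -> Optional[str]:
    return None if addr.street and addr.street.strip() else "missing_street"


def has_city(addr: Address) -> Optional[str]:
    return None if addr.city and addr.city.strip() else "missing_city"


def us_zip_format(addr: Address) -> Optional[str]:
    # US ZIP codes are 5 digits, optionally followed by "-" and 4 more digits.
    ok = addr.postcode is not None and re.fullmatch(r"\d{5}(-\d{4})?", addr.postcode)
    return None if ok else "bad_us_zip"


# Rules under "*" apply everywhere; country experts add entries under their own code.
RULES: Dict[str, List[Rule]] = {
    "*": [has_street, has_city],
    "US": [us_zip_format],
}


def assess(addr: Address) -> Tuple[float, List[str]]:
    """Return (quality score in [0, 1], list of issue labels) for one address."""
    rules = RULES["*"] + RULES.get(addr.country.upper(), [])
    issues = []
    for rule in rules:
        issue = rule(addr)
        if issue is not None:
            issues.append(issue)
    score = 1.0 - len(issues) / len(rules)
    return score, issues
```

The score and the issue list would become the extra "quality" columns. For example, `assess(Address("US", "9410", "San Francisco", ""))` flags both `missing_street` and `bad_us_zip`. Keeping the issue labels next to the numeric score lets consumers filter on specific problems instead of only on a threshold.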
I've done that using F# for similar 100M+ address-based data sets and I'd be glad to assist in setting that up here.
Once the framework is set up, further rules can be added by experts who know the particular details of individual countries, regions, states, etc.