Address data does not appear to be normalized. For instance, Virginia Engineers PAC's 09/07/2012 contribution says that their primary place of business is "Richmon, VA." (To be fair, that's not address data.) It appears that the software used by many committees is providing normalization, but each package normalizes differently. For instance, some software normalizes on long street suffixes ("Court," "Boulevard," "Road," etc.), while some software normalizes on short street suffixes ("Ct.," "Blvd.," "Rd.," etc.). So the good news is that individual reports are often internally consistent, which should make it easy to bring all of the reports into collective consistency.
Implement an address normalization system (presumably the USPS's API) to deal with this problem.
The only question is at what point this should be done. Is it appropriate to do this prior to saving the data and generating the JSON? Or is it wrong to alter the SBE's data? Wouldn't this mean making tens of thousands of API calls every time that the parser is run?
This might be an argument for standardizing addresses via a cruder, local function at the time of input, and saving the USPS API calls for use beyond the Saberva pipeline.
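To make the "cruder, local function" idea concrete, here is a minimal sketch in Python. It is not Saberva's actual code: the function names, the lookup tables, and the assumption that street and state arrive as separate fields are all hypothetical. It settles on the short suffix forms and only rewrites the final word of a street line, so it will not fix misspellings like "Richmon" and will miss suffixes followed by a unit number.

```python
# Hypothetical lookup tables; a real list would be much longer.
SUFFIXES = {
    "court": "Ct.", "ct": "Ct.",
    "boulevard": "Blvd.", "blvd": "Blvd.",
    "road": "Rd.", "rd": "Rd.",
    "street": "St.", "st": "St.",
}
STATES = {"virginia": "VA"}


def normalize_street(street):
    """Rewrite a trailing street suffix to its short form, if recognized."""
    tokens = street.split()
    if not tokens:
        return street.strip()
    # Only the last word is checked, so "Virginia Beach Boulevard"
    # keeps "Virginia" intact and "St. Paul Road" keeps "St.".
    key = tokens[-1].rstrip(".").lower()
    if key in SUFFIXES:
        tokens[-1] = SUFFIXES[key]
    return " ".join(tokens)


def normalize_state(state):
    """Map a full state name to its abbreviation; uppercase 2-letter codes."""
    cleaned = state.strip().rstrip(".")
    if cleaned.lower() in STATES:
        return STATES[cleaned.lower()]
    return cleaned.upper() if len(cleaned) == 2 else cleaned


if __name__ == "__main__":
    print(normalize_street("123 Main Street"))
    print(normalize_state("Virginia"))
```

Because this runs locally, it adds no API calls per parser run; the original SBE values could be kept alongside the normalized ones if altering the source data is a concern.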
Towards #17. This is not as good as interfacing with the USPS's API,
which can deal with common address errors (e.g., "Richmon" ->
"Richmond"), but instead merely standardizes "Virginia" as "VA,"
"Street" as "St.," etc.
I signed up for the USPS API months ago, getting as far as the part where you wait for approval. I never heard back. So that ain't gonna happen for people.