The data I’m talking about above are valid addresses with all the fields filled out. I didn’t expect this, because I thought addresses were deduped between files, which is why so many files are hashes only.
Describe the bug
There are multiple copies of the same records in the processed data.
To Reproduce
SELECT hash, COUNT(*) FROM us_ri_providence_addresses_city GROUP BY hash HAVING COUNT(*) > 1;
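In case it helps, here is a minimal sketch of the same check on a toy table, using Python's built-in sqlite3. The table name `addresses`, its columns, and the rows are made up for illustration and are not the real processed data. The last query shows roughly what I would expect instead (one row per hash); it is not a claim about how the pipeline actually dedupes.

```python
import sqlite3

# Hypothetical stand-in for the processed table; names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (hash TEXT, number TEXT, street TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [
        ("a1", "10", "MAIN ST"),
        ("a1", "10", "MAIN ST"),  # same hash appears twice
        ("b2", "22", "ELM ST"),
    ],
)

# Same shape as the reproduction query: hashes that occur more than once.
duplicates = conn.execute(
    "SELECT hash, COUNT(*) FROM addresses GROUP BY hash HAVING COUNT(*) > 1"
).fetchall()

# Expected shape after dedup: keep the first row seen for each hash.
deduped = conn.execute(
    "SELECT hash, number, street FROM addresses "
    "WHERE rowid IN (SELECT MIN(rowid) FROM addresses GROUP BY hash) "
    "ORDER BY hash"
).fetchall()

print(duplicates)
print(deduped)
```

On the real table, an empty result from the first query would mean every hash is unique.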
Expected behavior
Duplicates are removed after hashing, so each hash appears only once.
Additional context
It's unclear whether this is an intended limitation, since I haven't seen any documentation on what guarantees are given for the data.