You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
After the most recent round of fixes for null geometries, we still have about 10% of records with no geometries:
SELECTcount(cartodb_id) FROM crashes_all_prod
1,950,062 crashes in the whole universe.
SELECTcount(cartodb_id) FROM crashes_all_prod
where
latitude is null
187,456 without geometry. That's 9.6%
Of those, 40,557 have neither on_street nor off_street, so they could never be located. That's 2% of total.
SELECTcount(cartodb_id) FROM crashes_all_prod
where
latitude is nulland
length(on_street_name)=0and
length(off_street_name)=0
35,483 have both cross streets and could be candidates for geocoding. That's 1.8% of the total.
SELECTcount(cartodb_id) FROM crashes_all_prod
where
latitude is nulland
length(on_street_name)>0and
length(off_street_name)>0
111,416 have either on_street or off_street but not both. That's 5.7% of total.
SELECTcount(cartodb_id) FROM crashes_all_prod
where
latitude is nulland
(length(on_street_name)=0OR
length(off_street_name)=0) and not
(length(on_street_name)=0AND
length(off_street_name)=0)
That's the hardest set. In some cases, one could get to borough and maybe even smaller, but geocoding these would put a point at a location on the map that is almost certainly inaccurate. Most geocoders would pick the geospatial center of the length of a street or avenue. This point would then be included in all area calculations and fall into whatever boundaries happen to overlap with that center. This is a common problem where, for example, the center of the US, somewhere in Kansas, gets assigned all kinds of stuff that's clearly not there.
I recommended to Chrstine we not pursue this, but if she wants the 1.8% increase, next step is a manual try at geocoding those to see how the results look.
After the most recent round of fixes for
null
geometries, we still have about 10% of records with no geometries:1,950,062 crashes in the whole universe.
187,456 without geometry. That's 9.6%
Of those, 40,557 have neither on_street nor off_street, so they could never be located. That's 2% of total.
35,483 have both cross streets and could be candidates for geocoding. That's 1.8% of the total.
111,416 have either
on_street
oroff_street
but not both. That's 5.7% of total.That's the hardest set. In some cases, one could get to borough and maybe even smaller, but geocoding these would put a point at a location on the map that is almost certainly inaccurate. Most geocoders would pick the geospatial center of the length of a street or avenue. This point would then be included in all area calculations and fall into whatever boundaries happen to overlap with that center. This is a common problem where, for example, the center of the US, somewhere in Kansas, gets assigned all kinds of stuff that's clearly not there.
I recommended to Chrstine we not pursue this, but if she wants the 1.8% increase, next step is a manual try at geocoding those to see how the results look.
┆Issue is synchronized with this Asana task
The text was updated successfully, but these errors were encountered: