Hi, I'm trying to calculate the proportion of each census tract that received any grade. I grouped by geoid and summed tract_prop. Ideally, the sum should be no more than 1 (1 means 100% of the tract's area is graded), but many census tracts have values above 1.
I dug further into the data and found duplicated rows that share the same polygon_id and other attributes and differ only in the proportion columns. Do you have any insight into this issue?
Here is one example, geoid==34017012300. There are four rows in the dataset.
holc %>% filter(geoid=='34017012300')
holc_id holc_grade id polygon_id sheets name municipali holc_area year msamd state_code county_cod census_tra
1 C2 C 95 4678 1 Kearney 0.5950192 2019 35614 34 017 012300
2 B2 B 95 4682 1 Kearney (Arlington) 0.4377881 2019 35614 34 017 012300
3 C75 C 190 7344 0 0.4619436 2019 35614 34 017 012300
4 B2 B 95 4682 1 Kearney (Arlington) 0.4377881 2019 35614 34 017 012300
geoid tract_prop holc_prop map_id st_name state
1 34017012300 0.05566014 0.008786458 95 Hudson County NJ
2 34017012300 0.80507639 0.172732358 95 Hudson County NJ
3 34017012300 0.13926356 0.028317117 190 Bergen County NJ
4 34017012300 0.13926356 0.029879554 95 Hudson County NJ
The tract_prop values of these four rows sum to about 1.14, which exceeds 1.
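A minimal sketch of the calculation described above, using only the four example rows rather than the full shapefile. The data frame name `holc_example` and the output column names `total_tract_prop` and `n_rows` are made up for illustration; the values are copied from the printed output.

```r
library(dplyr)

# The four example rows, keeping only the columns needed for the check.
# `holc_example` is a hypothetical stand-in for the full `holc` data.
holc_example <- data.frame(
  geoid      = rep("34017012300", 4),
  holc_id    = c("C2", "B2", "C75", "B2"),
  polygon_id = c(4678, 4682, 7344, 4682),
  tract_prop = c(0.05566014, 0.80507639, 0.13926356, 0.13926356),
  holc_prop  = c(0.008786458, 0.172732358, 0.028317117, 0.029879554)
)

# Group by tract and sum the share of the tract covered by graded areas.
tract_totals <- holc_example %>%
  group_by(geoid) %>%
  summarise(
    total_tract_prop = sum(tract_prop),
    n_rows = n()
  )

print(tract_totals)
```

With these four rows the total comes to roughly 1.14, while the first three rows alone sum to almost exactly 1.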
Rows 2 and 4 are identical except for tract_prop and holc_prop.
Even if I wanted to drop one of the rows, I don't know which one it should be.
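One way to list the affected rows, sketched under the assumption that a polygon should appear at most once per tract: flag every geoid and polygon_id pair that occurs more than once. The names `holc_example` and `repeated_polygons` are hypothetical, and this only surfaces the repeats; it does not say which row is the correct one.

```r
library(dplyr)

# Same four example rows as in the printed output above (illustrative subset).
holc_example <- data.frame(
  geoid      = rep("34017012300", 4),
  holc_id    = c("C2", "B2", "C75", "B2"),
  polygon_id = c(4678, 4682, 7344, 4682),
  tract_prop = c(0.05566014, 0.80507639, 0.13926356, 0.13926356),
  holc_prop  = c(0.008786458, 0.172732358, 0.028317117, 0.029879554)
)

# Keep only rows whose (geoid, polygon_id) pair occurs more than once.
repeated_polygons <- holc_example %>%
  group_by(geoid, polygon_id) %>%
  filter(n() > 1) %>%
  ungroup()

print(repeated_polygons)
```

On the example tract this returns the two rows for polygon_id 4682 (holc_id B2), whose tract_prop and holc_prop values disagree.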
I'm using the shapefile from https://github.com/americanpanorama/Census_HOLC_Research/blob/main/2010_Census_Tracts/holc_census_tracts.zip