Duplicated Rows with only TractProp and HOLCProp differences #2

Open
ReichYang opened this issue Aug 2, 2023 · 0 comments

Hi, I'm trying to calculate the proportion of each census tract that received any HOLC grade. I simply grouped by geoid and summed tract_prop. The sum should never exceed 1 (1 meaning 100% of the tract's area is graded), but many census tracts have sums above 1.
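
A minimal sketch of that check, assuming the shapefile's attribute table is available as a data frame named holc. The toy data frame below is only a stand-in that copies a few columns from the four rows quoted in this issue for geoid 34017012300. The names graded, graded_prop, n_rows, and over_one and the 1e-6 tolerance are illustrative choices, not part of the dataset.

```r
library(dplyr)

# Stand-in for the real attribute table: a subset of columns for the
# four rows reported under geoid 34017012300 in this issue.
holc <- data.frame(
  holc_id    = c("C2", "B2", "C75", "B2"),
  polygon_id = c(4678, 4682, 7344, 4682),
  geoid      = "34017012300",
  tract_prop = c(0.05566014, 0.80507639, 0.13926356, 0.13926356),
  holc_prop  = c(0.008786458, 0.172732358, 0.028317117, 0.029879554)
)

# Graded share per tract: sum tract_prop within each geoid.
graded <- holc %>%
  group_by(geoid) %>%
  summarise(graded_prop = sum(tract_prop), n_rows = n())

# Flag tracts whose graded share exceeds 1. The small tolerance (an
# assumption) keeps rounding noise from flagging fully graded tracts.
over_one <- graded %>% filter(graded_prop > 1 + 1e-6)
print(over_one)
```

On the real data, the same pipeline would be run on the full table instead of the stand-in.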

Digging further into the data, I found duplicated rows that share the same polygon_id and other attributes and differ only in tract_prop and holc_prop. Do you have any insight into this issue?

Here is one example, geoid == 34017012300, which has four rows in the dataset.

holc %>% filter(geoid=='34017012300')
holc_id holc_grade id polygon_id sheets name municipali holc_area year msamd state_code county_cod census_tra
1 C2 C 95 4678 1 Kearney 0.5950192 2019 35614 34 017 012300
2 B2 B 95 4682 1 Kearney (Arlington) 0.4377881 2019 35614 34 017 012300
3 C75 C 190 7344 0 0.4619436 2019 35614 34 017 012300
4 B2 B 95 4682 1 Kearney (Arlington) 0.4377881 2019 35614 34 017 012300
geoid tract_prop holc_prop map_id st_name state
1 34017012300 0.05566014 0.008786458 95 Hudson County NJ
2 34017012300 0.80507639 0.172732358 95 Hudson County NJ
3 34017012300 0.13926356 0.028317117 190 Bergen County NJ
4 34017012300 0.13926356 0.029879554 95 Hudson County NJ

The tract_prop values of these four rows sum to about 1.14, which exceeds 1.
Rows 2 and 4 are identical except for tract_prop and holc_prop.

Even if I wanted to drop one of the two rows, I don't know which one should be dropped.
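
One way to list such rows across the whole dataset, sketched under the assumption that the attribute table is a plain data frame named holc (for an sf object, the geometry column would probably need to be dropped first, for example with sf::st_drop_geometry()). The toy data frame again copies only a few columns of the four rows above, and dupes is an illustrative name. This only flags the candidate rows; it does not decide which one is correct.

```r
library(dplyr)

# Stand-in for the real attribute table: a subset of columns for the
# four rows reported under geoid 34017012300 in this issue.
holc <- data.frame(
  holc_id    = c("C2", "B2", "C75", "B2"),
  polygon_id = c(4678, 4682, 7344, 4682),
  geoid      = "34017012300",
  tract_prop = c(0.05566014, 0.80507639, 0.13926356, 0.13926356),
  holc_prop  = c(0.008786458, 0.172732358, 0.028317117, 0.029879554)
)

# Group on every column except the two proportions, then keep the
# groups with more than one row: these rows match on all other
# attributes and differ at most in tract_prop and holc_prop.
dupes <- holc %>%
  group_by(across(-c(tract_prop, holc_prop))) %>%
  filter(n() > 1) %>%
  ungroup()

print(dupes)
```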

I'm using the shapefile from https://github.com/americanpanorama/Census_HOLC_Research/blob/main/2010_Census_Tracts/holc_census_tracts.zip
