Testing Data not loading correctly #45

Open
Makosak opened this issue Sep 12, 2023 · 2 comments
Labels: bug (Something isn't working)

Comments

@Makosak
Collaborator

Makosak commented Sep 12, 2023

7-day testing rates (both options) are only loading for some, not all, counties. It's consistent across all time points and testing variable options.
[Screenshot: Screen Shot 2023-09-12 at 1 35 20 PM]

@Makosak Makosak added this to the Final Project Archive milestone Sep 12, 2023
@Makosak Makosak assigned Makosak and mukeshchugani10 and unassigned Makosak Sep 12, 2023
@Makosak Makosak assigned Makosak and mukeshchugani10 and unassigned Makosak Oct 30, 2023
@Makosak Makosak added the bug Something isn't working label Nov 9, 2023
@Makosak Makosak assigned mradamcox and unassigned mukeshchugani10 Nov 9, 2023
@mradamcox
Collaborator

Checking into this: the covid_testing_cdc.csv file currently has only 266 rows, whereas an old version from Oct 2022, found through the git history in the archived repo, has 2,794 rows. So at some point the data stream began returning a much smaller file. This will require more investigation when I can get to it.
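
To pin down when the file started shrinking, the row-count comparison above could be repeated for every commit that touched the file. The following is a minimal sketch, not part of the existing data scripts: the function names are made up for illustration, it assumes git is on the PATH and a local checkout of the archived repo, and the CSV's path inside that repo has to be supplied by the caller (relative to the repo root).

```python
import csv
import io
import subprocess


def count_data_rows(csv_text: str) -> int:
    """Count CSV records, excluding the header row and blank lines."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return max(len(rows) - 1, 0)


def row_counts_over_history(repo_dir: str, file_path: str) -> list[tuple[str, str, int]]:
    """Return (commit date, short hash, row count) per commit, newest first.

    file_path must be relative to the repository root. Only commits that
    added or modified the file are listed, so `git show` always finds it.
    """
    log = subprocess.run(
        ["git", "log", "--diff-filter=AM", "--date=short",
         "--format=%h %ad", "--", file_path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout

    results = []
    for line in log.splitlines():
        commit, date = line.split()
        # Read the file's contents exactly as they were at this commit.
        blob = subprocess.run(
            ["git", "show", f"{commit}:{file_path}"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout
        results.append((date, commit, count_data_rows(blob)))
    return results
```

Sorting or scanning the returned list by row count would show both where the decline began and which commit holds the fullest version of the file.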

@mradamcox
Collaborator

@Makosak I'm tempted to give up on this, and remove the testing variables from the map interface, or something like that.

I have checked through more of the git history, and it seems like the data was declining over time. For example, on March 1st, 2023, covid_testing_cdc.csv had 800 rows, already well down from a few months before. I've been looking into the function that (as far as I can tell) downloads and parses this data, https://github.com/GeoDaCenter/covid/blob/master/data-scripts/cdc/getCdcData.py , and am pretty confused, because it looks to be downloading these CDC Community Reports but then looking for columns that I can't find a trace of in those source files. For example, the script looks for Total RT-PCR diagnostic tests - last 7 days in the Counties sheet, and that column isn't present in those xlsx files, going back to at least early 2022 (the closest column name is NAAT positivity rate - last 7 days (may be an underestimate due to delayed reporting)).

So, let me know what you think. We could potentially look further back in the history to the fullest CSV and then patch that into the production data, but we would have to document an exception for the date on that table and it would take a while for me to figure that all out. Not sure how high a priority this is...
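
For the column mismatch described above, a small check along these lines could make the problem visible whenever a report is parsed, instead of quietly producing fewer rows. This is only a sketch under assumptions: report_missing_columns is a hypothetical helper, not something in getCdcData.py, and it takes the header names as a plain list, however they end up being read from the Counties sheet.

```python
import difflib


def report_missing_columns(expected, actual, cutoff=0.4):
    """Map each expected column that is absent from `actual` to a list of
    the closest existing column names (possibly empty)."""
    actual_set = set(actual)
    report = {}
    for name in expected:
        if name not in actual_set:
            # Fuzzy-match against the real headers to suggest renamed columns.
            report[name] = difflib.get_close_matches(
                name, actual, n=3, cutoff=cutoff
            )
    return report


# Column names taken from the discussion above; "County" is a placeholder.
expected_columns = ["County", "Total RT-PCR diagnostic tests - last 7 days"]
sheet_columns = [
    "County",
    "NAAT positivity rate - last 7 days "
    "(may be an underestimate due to delayed reporting)",
]

for missing, candidates in report_missing_columns(expected_columns, sheet_columns).items():
    print(f"missing: {missing!r}; closest existing: {candidates}")
```

The cutoff of 0.4 is an arbitrary starting point; a lower value surfaces more candidate names at the cost of weaker matches.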


3 participants