-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #59 - Inflation Adjustment #157
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for getting this started @JennEYoon
a few things before we can get this merged
@@ -675,6 +675,31 @@ def store_ncdb(filepath): | |||
|
|||
df = df.set_index("geoid") | |||
|
|||
#### Beginning of New Code #### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove these comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you delete the temp directory?
year = df["year"] | ||
|
||
# Inflation logic for ncdb. | ||
inflate_cols = [ | ||
"MDVALHS", | ||
"MDGRENT", | ||
"MDHHY", | ||
] | ||
# Five rows have missing ncdb labels in variables.csv. | ||
# per capita income, missing | ||
# median household income white, missing | ||
# median household income black, missing | ||
# median household income hispanic, missing | ||
# median household income asian, missing | ||
|
||
inflate_available = list(set(df.columns).intersection(set(inflate_cols))) | ||
|
||
if len(inflate_available): | ||
df = adjust_inflation(df, inflate_available, year) | ||
return df | ||
|
||
#### End of New Code #### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will need to work a little bit differently from the ltdb example. Unlike in the store_ltdb
function, we've only got a single dataframe here, so we need to slice the ncdb dataframe by year and apply the adjust_inflation
to each slice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will take a look.
Closing as this is now covered in #216 |
Ready for your review of PR for Issue #59.
Verified that all ltdb variables that are affected by inflation adjustment are included in inflate_cols and the named strings match data from ltdb column of variables.csv. Function file data/data.py, line 494. I have checked every row of variables.csv file. There are no new variables for inflation adjustment. There are 8 variables that are affected by inflation adjustment.
Copied similar logic from ltdb inflate_cols into "store_ncdb" function definition. Use data from ncdb column in variables.csv. In data/data.py, see lines 678 - 701. Only 3 rows related to inflation adjustment have named strings in ncdb column. I am not sure if this logic works as desired. I added a definition for year, which is used in the last 2 lines.
Corrected several typos in documentation string part of data.py. ("dataframe" and "instantiation")
Thank you! Jennifer Yoon