Generate initial flat file from PolicyEngine Enhanced CPS #2
Comments
@nikhilwoodruff said in issue #2:
Why not 2021? That is the year for which the most recent IRS-SOI targeting statistics are available, as I mentioned in a comment on issue #5.
Can't we extrapolate the targets to 2023? If we're going to extrapolate the microdata to 2023 anyway...
@nikhilwoodruff said:
Doesn't make any sense to me. Only a few days ago we were planning to develop a 2015 data set from the 2015 PUF and the 2015 CPS, and then evaluate that 2015 data set against a number of 2015 validation targets. All of this was for the same year and involved no extrapolation of any kind. I don't see how the extrapolation of the validation targets (like a historical JCT EITC tax expenditure estimate) can be done in any sensible way. Any way of doing it will introduce differences between the tax expenditure estimates generated by the 2023 data set and the extrapolated tax expenditure estimates that have nothing to do with the quality of the constructed data set.
Ok, thanks for the clarification @martinholmer. Though, how were we planning to validate 2023 anyway? If tax expenditure statistics are uninformative for years after their data year, then even if we did start at 2021, surely that wouldn't have helped much to prove the validity of a 2023 dataset?
@nikhilwoodruff asked in issue #2:
We weren't planning on doing that in Phase 1, Phase 2, or Phase 3.
@martinholmer I guess I'm just struggling to understand the difference in value between:
They both seem to require the same level of assumption/uncertainty on our part and the same quality guarantee on the final 2023 data, but the former adds extra work in making 2021 data in addition to the other years, which won't be used for policy analysis.
@nikhilwoodruff, OK. Go ahead with your plan of creating a flattened file from the 2023 PEUS PUF-enhanced CPS data. We will not be able to compare 2023 aggregate tax liabilities with IRS-SOI data (the most recent of which is 2021), but we can compare 2023 aggregate tax liabilities from the 2023 flattened file with recent CBO projections.
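For illustration only, the comparison of aggregate tax liabilities against a projection could be sketched roughly as below. This is a minimal sketch, not anyone's agreed method: it assumes each flattened record carries a tax liability under `iitax` and a sampling weight under `s006`, the records are made up, and the benchmark value is a placeholder rather than an actual CBO figure.

```python
def weighted_total(records, value_key, weight_key="s006"):
    """Sum value * weight over all records to get a population aggregate."""
    return sum(r[value_key] * r[weight_key] for r in records)


def percent_gap(estimate, benchmark):
    """Percentage by which the estimate differs from the benchmark."""
    return 100.0 * (estimate - benchmark) / benchmark


# Made-up records; the column names are assumptions for this sketch.
records = [
    {"iitax": 5_000.0, "s006": 1_000.0},
    {"iitax": 12_000.0, "s006": 500.0},
    {"iitax": 0.0, "s006": 2_000.0},
]

estimate = weighted_total(records, "iitax")
benchmark = 10_000_000.0  # placeholder value, NOT a real CBO projection
gap = percent_gap(estimate, benchmark)
print(f"estimate={estimate:,.0f} benchmark={benchmark:,.0f} gap={gap:.1f}%")
```

A gap computed this way only flags a difference; it says nothing about whether the data or the projection is closer to the truth.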
@nikhilwoodruff @martinholmer I was away yesterday and most of today. Here's how I see this:
Agree/disagree?
@martinholmer The best word I have come up with so far for what we want to do regarding data constructed for future years is "examine" the data with an eye toward finding potential problems. "Validation" clearly is wrong because we don't know truth against which to compare the data. "Evaluation" seems wrong for the same reason. "Analyze" seems too open-ended. "Examine" to me implies a more targeted and limited look at the data - focusing on specific things (e.g., how different are our estimates of current law from estimates of CBO and JCT?, how different are our estimates of selected reforms from CBO and JCT?). I asked ChatGPT "What's the difference in meaning between examine and analyze?" and here are key parts of its response (FWIW):
I think @martinholmer is developing tools for examination. What we learn from those tools may lead @nikhilwoodruff and @donboyd5 to do further analysis. For example, we may learn that our data produces higher estimates of cost for certain kinds of reforms than JCT estimates. That might lead @nikhilwoodruff and @donboyd5 to conduct analysis to look into what could be causing that: Do we have more of one kind of income or filers than JCT? Do we have a different distribution of income or # of taxpayers (if that information is available)? We may not have a way to quantify this - we might even discuss on the phone or by email with JCT, and we might ask them to provide more information about their data. After that, we might conclude that our differences are sensible and we'll stand with what we have, or we might decide to make changes. So, I suggest that, with regard to data for future years, what @martinholmer is doing is developing tools for examination that may lead to deeper analysis.
@donboyd5, Thanks for your analysis of alternatives to the word validation.
All agreed on the above from me here.
As the first task in the project, we need to flatten PolicyEngine's Enhanced CPS (2023) file and enable it to be used in Tax-Calculator.
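To make "flatten" concrete, here is a rough sketch of collapsing person-level records into one row per tax unit. It is an assumption-laden illustration, not the project's implementation: the input field names (`tax_unit_id`, `employment_income`, `tax_unit_weight`) are hypothetical, and the output columns (`RECID`, `FLPDYR`, `e00200`, `XTOT`, `s006`) are assumed here to be the names the flat file would need.

```python
from collections import defaultdict


def flatten(persons, year=2023):
    """Collapse person-level records into one flat row per tax unit."""
    units = defaultdict(lambda: {"e00200": 0.0, "XTOT": 0})
    for p in persons:
        unit = units[p["tax_unit_id"]]
        unit["e00200"] += p["employment_income"]  # wages summed over members
        unit["XTOT"] += 1                         # count of people in the unit
        unit["s006"] = p["tax_unit_weight"]       # weight shared by the unit
    rows = []
    # Assign sequential record ids in a stable (sorted) order.
    for recid, (_, unit) in enumerate(sorted(units.items()), start=1):
        rows.append({"RECID": recid, "FLPDYR": year, **unit})
    return rows


# Made-up hierarchical input; field names are hypothetical.
persons = [
    {"tax_unit_id": 10, "employment_income": 50_000.0, "tax_unit_weight": 1_200.0},
    {"tax_unit_id": 10, "employment_income": 40_000.0, "tax_unit_weight": 1_200.0},
    {"tax_unit_id": 20, "employment_income": 30_000.0, "tax_unit_weight": 800.0},
]

flat_rows = flatten(persons)
for row in flat_rows:
    print(row)
```

The real task would involve many more variables and filing-status logic; the sketch only shows the one-row-per-tax-unit shape.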