Let's make a list of additional items to include in soi_from_puf_tmd_2021.csv (or underlying source) #108

Open
donboyd5 opened this issue Jun 25, 2024 · 0 comments

donboyd5 commented Jun 25, 2024

@nikhilwoodruff, let's make a list of additional items to include in soi_from_puf_tmd_2021.csv (or the underlying source), so that you can add them all at once. @martinholmer you may have thoughts about things I didn't think to mention below.

The primary purpose of this is data-quality diagnosis, but it also could affect future targeting.

The basic idea is to capture concepts that we think are important for (1) tax revenue analysis or (2) distributional analysis. How important an item is depends on the extent to which it would play a significant role in (a) an important policy or political issue likely to arise over the next year or two, (b) an issue that is important to project sponsors, or (c) analysis that we think will be important.

We need to make a prioritized list because we don't want this work to crowd out other, more important work. I've listed items below in roughly my sense of priority order. We can defer items we don't have time for, but many should be relatively easy because the mapping to IRS concepts is straightforward.

The broad categories of items that I do not see at present - and I could be wrong so please consider this subject to discussion - are:

  • Filers - we want to have a filers subset of the data for 2021, because that allows us to compare our data to many IRS-published aggregates
  • Total # of returns with adjusted gross income; I see total # of nonzero returns for many individual income components, but not for AGI as a whole
  • Payroll tax
  • Major itemized deduction components. See the crosswalk doc. I see that you have SALT and medical-uncapped.
    • I think SALT may need some refinement. Key variables are e18400 and e18500.
    • Interest paid e19200 is large and would be useful.
    • Cash contributions e19800.
  • Qualified business income deduction (we hit the total, of course, but we do want to see it by AGI range, eventually crossed with filing status)
  • We're going to want to delve into one measure of income tax liability where we are sure we can match the IRS concept with PUF concepts so that we have a definite apples-to-apples comparison. I think that is probably the IRS total income tax concept but another might do if we are 100% sure we can match PUF-IRS concepts perfectly.
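
To make the kind of tabulation discussed above concrete, here is a minimal sketch of how weighted amounts and weighted nonzero-return counts by AGI range might be computed for the deduction variables named in the list. It is illustrative only and is not the code that builds soi_from_puf_tmd_2021.csv. Assumptions: the data are in a pandas DataFrame; the weight column is `s006` and AGI is `c00100` (Tax-Calculator-style names that do not appear in this issue); the function name `summarize_by_agi` and the bin edges in `AGI_BINS` are hypothetical and would need to be replaced with the IRS-published ranges.

```python
import pandas as pd

# Hypothetical AGI bin edges; replace with the ranges used in the IRS tables.
AGI_BINS = [-float("inf"), 1, 50_000, 100_000, 500_000, float("inf")]


def summarize_by_agi(df, variables, weight="s006", agi="c00100", bins=AGI_BINS):
    """Weighted amount and weighted nonzero-return count by AGI range.

    The column names `s006` (weight) and `c00100` (AGI) are assumptions.
    """
    # Left-closed intervals: [lower, upper)
    agi_range = pd.cut(df[agi], bins=bins, right=False)
    pieces = []
    for var in variables:
        part = pd.DataFrame(
            {
                "agi_range": agi_range,
                # Weighted dollar amount of the variable
                "amount": df[var] * df[weight],
                # Weight counts only where the variable is nonzero
                "nonzero_count": df[weight].where(df[var] != 0, 0.0),
            }
        ).groupby("agi_range", observed=False).sum()
        part.insert(0, "variable", var)
        pieces.append(part.reset_index())
    return pd.concat(pieces, ignore_index=True)


# Tiny made-up example; the values are not real data.
demo = pd.DataFrame(
    {
        "s006": [100.0, 200.0, 50.0],
        "c00100": [30_000.0, 75_000.0, 600_000.0],
        "e19200": [0.0, 1_000.0, 20_000.0],
        "e19800": [500.0, 0.0, 10_000.0],
    }
)
out = summarize_by_agi(demo, ["e19200", "e19800"])
print(out)
```

To restrict the tabulation to filers, the DataFrame would be subset before the call, using whatever filer indicator the data provide. The same function could be applied to e18400 and e18500, and a filing-status column could be added to the `groupby` keys to cross the results with filing status.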

We will certainly want to examine universe totals (i.e., including non-filers), as we discussed on the phone @nikhilwoodruff, but let's make that a separate issue, both because it requires careful examination of the appropriate control totals and how they relate to the relevant tmd variables, and because it is of slightly lower priority.
