$121b of wages of millionaires dropped from puf stage 2 targets?? #399
Hi @donboyd5, thanks for digging into this! I think you're right that this is a bug, likely introduced when I tried automating the SOI estimates. I'll get a fix up and some checks to keep it from happening again.
Of possible interest to @andersonfrailey @MattHJensen @rickecon @jdebacker:
I've been trying to figure out why some of my numbers for 2017 differ substantially from official TaxData and Tax-Calculator, so I looked at wage data for 2017. As noted in the title, I think TaxData is dropping $121 billion of wages of millionaires, which would help explain why I get such different distributions of income and other items than TaxData does. Here are key facts and inferences:
I can see that taxdata puf stage 2 targets for wages by income range appear to come from the same source as mine: the IRS file "17in14ar.xls" (17=2017, in=individual income tax, 14=Table 1.4, ar=all returns -- landing page).
For example, here is a doctored screenshot of the stage 2 targets that just shows wages, just for 2015-2017:
And here is a screenshot of the wage portion of IRS Table 1.4:
If you look at the first 2017 wage target labeled as "Wages and Salaries: Zero or Less" it matches the IRS "no adjusted gross income" value of $20.869 billion. So far so good, although the target label implies that it is based on wage bins but the IRS data are based on AGI bins.
(Before going further, let me note that this raises a side issue: the IRS data list values by 2017 AGI range, for tax filers. When I retarget the puf, I first calculate AGI for the year in question, then determine filing status, and then apply targets to filers in the corresponding AGI bins. Unless TaxData calculates AGI before targeting, which I don't think it does although I need to read the code, it must be targeting all records (not just filers) and, more importantly, using some different definition of AGI. I seem to recall it might use AGI from the base year of 2011, which is far removed from 2017. If that's how it's done, it would seem to produce substantially incorrect wage distributions for 2017 once AGI is calculated. But that's not the question I'm writing about here.)
The second target ($1 < $10k) of $86.507 billion matches the sum of the next 2 IRS AGI ranges, which cover $1-10k, so that looks good. And the next few ranges that I looked at also look good. For example, the $500k-1m range target and corresponding IRS value both are $379.376 billion.
But when we get to millionaires, we have a problem, as far as I can tell. TaxData shows a target of $367.732 billion. Let me zoom in on that here:
As you can see, the puf targets appear to have dropped the $121 billion in wages for those with 2017 AGI of $1-1.5 million. A quick scan across the millionaire row suggests this happened in all years.
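The comparison above can be expressed as a small consistency check: each stage 2 target should equal the sum of the IRS AGI bins it is meant to cover. The following is only a sketch, not TaxData code. The function name, bin labels, and mapping are hypothetical, and the `1.5m_plus` value is an assumption (it simply takes the remaining millionaire bins to sum to the current target); only the figures marked "from the issue" are quoted above.

```python
# Sketch of a bin-aggregation check. All values are in $ billions.
# Labels, mapping, and function name are hypothetical, not TaxData's.

def check_targets(targets, irs_bins, mapping, tol=0.001):
    """Return {target_label: shortfall} for targets that do not equal
    the sum of the IRS bins mapped to them (within tol)."""
    mismatches = {}
    for label, bins in mapping.items():
        expected = sum(irs_bins[b] for b in bins)
        gap = expected - targets[label]
        if abs(gap) > tol:
            mismatches[label] = round(gap, 3)
    return mismatches

irs_bins = {
    "no_agi": 20.869,      # from the issue
    "500k_1m": 379.376,    # from the issue
    "1m_1.5m": 121.0,      # approximate, from the issue
    "1.5m_plus": 367.732,  # ASSUMPTION: remaining millionaire bins combined
}
targets = {
    "zero_or_less": 20.869,  # from the issue
    "500k_1m": 379.376,      # from the issue
    "1m_plus": 367.732,      # from the issue
}
mapping = {
    "zero_or_less": ["no_agi"],
    "500k_1m": ["500k_1m"],
    "1m_plus": ["1m_1.5m", "1.5m_plus"],
}

mismatches = check_targets(targets, irs_bins, mapping)
print(mismatches)
```

Under these assumed inputs, only the millionaire target is flagged, with a shortfall equal to the omitted $1-1.5 million bin. A check of this kind, run for every year and variable, is one way to catch a dropped bin automatically.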
Assuming I did my work correctly, this means that puf.csv as reweighted and grown generally will have too little wages for millionaires even if targeting controls to the overall level of wages, and some distributional oddities as records are reweighted in an effort to hit wage targets that are too low for millionaires.
In recent work I did to construct state weights for the puf, I created tables that compared the PUF to IRS values for many variables, at 2017 levels, just for records that pass my 2017 filing screen, which is based on IRS rules and some inferences. Here is the table for salaries and wages. The target column is the IRS value (in dollars, sorry about the specious precision), puf is the latest puf.csv using all weight and growth defaults, diff is puf minus target, and pdiff is diff as a percentage of target. As you can see, after default growth and weighting the puf has wages that are quite close to 2017 IRS total wages (-1.6%), but about $44.5 billion too little in wages for $10-millionaires. Furthermore, if you look across the ranges you can see a lot of maldistribution.
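For reference, the diff and pdiff columns described above can be computed as in the sketch below. The bin names and dollar amounts are made-up placeholders for illustration only; they are not the values from my table.

```python
# Sketch of the diff / pdiff columns. Inputs are HYPOTHETICAL placeholders,
# not actual IRS or puf values.

def compare(target, puf):
    """Return (diff, pdiff): puf minus target, and diff as a % of target."""
    diff = puf - target
    pdiff = 100.0 * diff / target
    return diff, pdiff

# (target, puf) in dollars for two made-up AGI bins
rows = {
    "bin_a": (100.0e9, 98.0e9),
    "bin_b": (50.0e9, 51.0e9),
}

table = {name: compare(t, p) for name, (t, p) in rows.items()}
for name, (diff, pdiff) in table.items():
    print(f"{name}: diff={diff:,.0f} pdiff={pdiff:.1f}%")
```

A negative pdiff means the puf has less than the IRS target in that bin.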
If I made a mistake in examining this I would much appreciate an early alert, but I have checked it over several times.