Add CFI and RMSEA goodness-of-fit metrics #99

Open
jimmybru opened this issue Mar 21, 2022 · 4 comments

Comments

@jimmybru

My (quite inexpert) understanding is that CFI and RMSEA are the goodness-of-fit measures par excellence when it comes to CFA. It seems that Chi-squared is also useful.

Would it be possible to add these metrics, like around here? https://github.com/EducationalTestingService/factor_analyzer/blob/main/factor_analyzer/confirmatory_factor_analyzer.py#L379

@jbiggsets
Collaborator

jbiggsets commented Apr 2, 2022

@desilinguist, I'll have to double-check some of the formulas below, but I think RMSEA is pretty easy and doable. (You can find a couple of references here.)

Here are some of the steps, as far as I understand:

  1. Implement the Chi-squared test statistic. I believe this is just chi2 = self.n_obs * res.fun, where fun comes directly from the minimize results object (see here). Some references multiply by (self.n_obs - 1) instead, so we should check which convention we want.

  2. Calculate the degrees of freedom. I believe this is the number of unique elements in the covariance matrix minus the number of free model parameters, i.e. dof = k * (k + 1) / 2 - n_params, where k is the number of observed variables (not self.n_obs).

  3. Calculate the p-value. Once we have the test statistic and the degrees of freedom, this should be as simple as 1 - scipy.stats.chi2.cdf(chi2, dof), or equivalently scipy.stats.chi2.sf(chi2, dof).

  4. Calculate the Root Mean Square Error of Approximation (RMSEA). There are a few different formulas I've seen, but I think they all reduce to rmsea = 0 if chi2 < dof else np.sqrt((chi2 / dof - 1) / (self.n_obs - 1)) or something like that. I'll have to double-check this, too. (We can look at psych to see how they implement it.)
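
The four steps above can be sketched roughly as follows. This is a standalone sketch, not factor_analyzer code: the function names `chi2_test` and `rmsea` and the arguments `f_min`, `n_variables`, and `n_free_params` are hypothetical, and it assumes `f_min` is the minimized ML discrepancy value (which may or may not be what `res.fun` holds). It uses the `n_obs` multiplier from step 1; some software uses `n_obs - 1` instead.

```python
import numpy as np
from scipy import stats


def chi2_test(f_min, n_obs, n_variables, n_free_params):
    """Chi-squared statistic, degrees of freedom, and p-value (steps 1-3).

    Assumes f_min is the minimized ML discrepancy function value.
    """
    # Step 1: scale the discrepancy by the sample size (assumed convention).
    chi2 = n_obs * f_min
    # Step 2: unique covariance elements minus free model parameters.
    dof = n_variables * (n_variables + 1) // 2 - n_free_params
    # Step 3: upper-tail probability; sf(x) equals 1 - cdf(x).
    p_value = stats.chi2.sf(chi2, dof)
    return chi2, dof, p_value


def rmsea(chi2, dof, n_obs):
    """RMSEA (step 4); returns 0 when chi2 does not exceed dof."""
    return float(np.sqrt(max(chi2 - dof, 0.0) / (dof * (n_obs - 1))))
```

For example, `rmsea(30.0, 20, 101)` evaluates sqrt(10 / (20 * 100)), which is about 0.0707, and any model with `chi2 <= dof` gets an RMSEA of 0.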

Getting the Comparative Fit Index (CFI) is a little more involved, since you have to calculate the Chi-squared test statistic for the baseline/null model, where all the variables are independent/uncorrelated. I'll read up on that a bit, but it does seem like a useful thing to add.
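
As a rough sketch of what that could look like (again standalone, with hypothetical function names `baseline_chi2` and `cfi`): under the independence baseline the implied covariance is just the diagonal of the sample covariance, so the ML discrepancy reduces to minus the log-determinant of the sample correlation matrix. The sketch assumes the same `n_obs` multiplier as the model Chi-squared; if the model statistic uses `n_obs - 1`, the baseline should too.

```python
import numpy as np


def baseline_chi2(sample_cov, n_obs):
    """Chi-squared and dof for the independence (null) model.

    With all variables uncorrelated, the ML discrepancy equals
    -log|R|, where R is the sample correlation matrix.
    """
    sample_cov = np.asarray(sample_cov, dtype=float)
    k = sample_cov.shape[0]
    sd = np.sqrt(np.diag(sample_cov))
    corr = sample_cov / np.outer(sd, sd)
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -n_obs * logdet
    # Only the k variances are estimated: k(k+1)/2 - k = k(k-1)/2.
    dof = k * (k - 1) // 2
    return chi2, dof


def cfi(chi2_model, dof_model, chi2_base, dof_base):
    """Comparative Fit Index from model and baseline statistics."""
    num = max(chi2_model - dof_model, 0.0)
    den = max(chi2_model - dof_model, chi2_base - dof_base, 0.0)
    # If neither model shows any misfit, treat the fit as perfect.
    return 1.0 if den == 0 else 1.0 - num / den
```

For instance, `cfi(30.0, 20, 130.0, 30)` gives 1 - 10/100 = 0.9.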

As an FYI, I believe that semopy has all of these implemented, but it's been a while since I've looked at that package. Not sure whether/how much we can borrow from that package. Looks like it's under the MIT license. (See here.)

@desilinguist
Member

This is amazingly helpful, @jbiggsets! I'll try to take a stab at RMSEA soon. It's funny you mention semopy since that's exactly what @jimmybru is using right now. I told him he should look into replacing that with factor_analyzer, but he needs these metrics to do that.

@jimmybru
Author

jimmybru commented Apr 2, 2022

You guys are Beautiful!

@jimmybru
Author

jimmybru commented Apr 2, 2022

Btw: this should probably be another issue, but just in case the answer is obvious, I’ll ask here: I don’t see a way in factor_analyzer to specify covariances in the model like you can in semopy. Is that true? If so, is the default that all factors are orthogonal, or is everything allowed to covary with everything?

3 participants