I am currently trying to impute a three-level dataset with 87 columns and 71,756 rows. The variables comprise 4 identifier columns, 15 continuous outcome variables without missing entries, and 68 predictors and covariates with missing entries:
On level 1 (lowest, represents an individual) there are 16 ordinal and 20 dichotomous variables,
I tried setting model to "binary" to run a logistic mixed-effects model for the dichotomous variables ("pmm" for the ordinal, "continuous" for the continuous).
I tried adding random slopes and interaction effects.
mice.impute.2lonly.pmm was used instead of mice.impute.2lonly.norm for the top-level imputation.
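To make the setup concrete, here is a minimal sketch of how such a specification might look on a toy dataset. All column names (school, class, y, x_bin, x_ord, z) are hypothetical placeholders and not my real variables, and passing model as a named vector is my assumption about the expected format. The mice call is wrapped in a function that is not executed here.

```r
# Toy three-level layout: individuals within classes within schools.
# Every column name here is a hypothetical placeholder.
set.seed(1)
dat <- data.frame(
  school = rep(1:5, each = 20),
  class  = rep(1:20, each = 5),
  y      = rnorm(100),                        # complete continuous outcome
  x_bin  = rbinom(100, 1, 0.5),               # level-1 dichotomous variable
  x_ord  = sample(1:4, 100, replace = TRUE),  # level-1 ordinal variable
  z      = rep(rnorm(5), each = 20)           # top-level (school) variable
)
dat$x_bin[sample(100, 10)] <- NA
dat$x_ord[sample(100, 10)] <- NA
dat$z[dat$school %in% c(2, 4)] <- NA          # missing for whole schools

# Imputation method per column ("" means: do not impute).
imp_method <- setNames(rep("", ncol(dat)), names(dat))
imp_method[c("x_bin", "x_ord")] <- "ml.lmer"
imp_method["z"] <- "2lonly.pmm"

# Predictor matrix: identifiers are not predictors; -2 marks the
# cluster variable for the top-level method.
pred <- matrix(1, ncol(dat), ncol(dat),
               dimnames = list(names(dat), names(dat)))
diag(pred) <- 0
pred[, c("school", "class")] <- 0
pred["z", "school"] <- -2

# Cluster identifiers for each lower-level target, and the level at
# which each higher-level variable is measured.
levels_id <- list(x_bin = c("class", "school"),
                  x_ord = c("class", "school"))
variables_levels <- setNames(rep("", ncol(dat)), names(dat))
variables_levels["z"] <- "school"

# Model type per target (the named-vector format is an assumption).
model <- c(x_bin = "binary", x_ord = "pmm")

# Not executed here; requires the mice and miceadds packages.
run_imputation <- function(data) {
  library(mice)
  library(miceadds)  # attached so mice can find mice.impute.ml.lmer
  mice(data, m = 1, maxit = 1,
       method = imp_method, predictorMatrix = pred,
       levels_id = levels_id, variables_levels = variables_levels,
       model = model)
}
```

A single-iteration test would then be run_imputation(dat).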
However, when running mice with some variables modeled as "binary" (without random slopes or interactions), I get the following warning:
Warning message in commonArgs(par, fn, control, environment()):
“maxfun < 10 * length(par)^2 is not recommended.”
Execution of mice hangs at this point.
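As far as I can tell, the warning is raised by the bobyqa optimizer (minqa package) that lme4 uses, and it only compares the evaluation budget maxfun with the number of parameters being optimized. The sketch below evaluates the condition stated in the message. The helper names min_maxfun and max_par_for are my own, and the parameter count of 40 is purely illustrative.

```r
# Smallest evaluation budget that avoids the warning for n_par parameters,
# taken directly from the condition in the message:
#   maxfun < 10 * length(par)^2
min_maxfun <- function(n_par) 10 * n_par^2

# Largest parameter count a given budget covers without the warning.
max_par_for <- function(maxfun) floor(sqrt(maxfun / 10))

min_maxfun(40)      # 16000
max_par_for(10000)  # 31

# In a direct lme4 call the budget can be raised through glmerControl().
if (requireNamespace("lme4", quietly = TRUE)) {
  ctrl <- lme4::glmerControl(optimizer = "bobyqa",
                             optCtrl = list(maxfun = min_maxfun(40)))
}
```

Whether mice.impute.ml.lmer forwards such a control object to glmer is something I have not been able to confirm.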
I ran a test with mice (1 iteration), this time with all dichotomous variables as "pmm", and this time the function completed the run. However, after adding variables to random_slopes, it seemingly gets stuck (running indefinitely) on the imputation of the first three variables. My assumption is that this is due to the relatively large dataset, which makes the process computationally very demanding.
I am wondering what exactly causes this warning message, and if there are ways to avoid it. Also, I would like to know if there are ways to improve the computational efficiency of such a large model.
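One way to tell a slow run from one that is truly stuck would be to time the same specification on a subsample of whole top-level clusters. The sketch below uses base R only. The helper subsample_clusters and the toy column names are hypothetical.

```r
# Keep all rows belonging to a random subset of top-level clusters, so the
# nesting structure inside each retained cluster stays intact.
subsample_clusters <- function(data, top_id, n_clusters, seed = 1) {
  set.seed(seed)
  ids <- unique(data[[top_id]])
  k <- min(n_clusters, length(ids))
  keep <- ids[sample.int(length(ids), k)]
  data[data[[top_id]] %in% keep, , drop = FALSE]
}

# Toy data: 10 schools with 5 rows each (hypothetical names).
toy <- data.frame(school = rep(1:10, each = 5), x = rnorm(50))
small <- subsample_clusters(toy, "school", n_clusters = 3)
nrow(small)  # 15
```

Wrapping the imputation call on the subsample in system.time() and repeating it for a few subsample sizes should show how the run time scales with the number of clusters.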
I am not very familiar with mice, but I have some thoughts regarding how the data is imputed:
I am planning to use the imputed data for a structural equation model I've built, where all the variables are grouped into indicators of latent constructs. It therefore seems natural that the indicator variables that belong to the same construct are imputed together.
In mice there is an argument called blocks, which allows for multivariate imputation of the variables grouped together as list elements. However, creating blocks that contain variables from different levels produced an error message saying that no top level was defined in the predictorMatrix (i.e. no entry set to -2). As an alternative, it seems the formulas argument can be used in place of a predictor matrix. This option seems ideal, as it allows user-defined formulas for each block. Also, if I understand the whole process correctly, the predictorMatrix is only passed on to mice.impute.2lonly.pmm and not to mice.impute.ml.lmer. The question then is whether the formulas argument can be used to define three-level models using lme4 syntax. And can these user-defined models in formulas be passed on to mice.impute.ml.lmer? As a more general question, why can't mice.impute.ml.lmer be used for imputation at the top level? (At least, it didn't work when I tried.)
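To be explicit about what I mean by three-level lme4 syntax, here is a minimal sketch with hypothetical variable names (x_bin, y, z, class, school). It only constructs and inspects the formulas. Whether mice accepts such formulas in the formulas argument is exactly what I am asking.

```r
# Nested random intercepts: individuals within classes within schools.
f_nested <- x_bin ~ y + z + (1 | school / class)

# Alternative form, equivalent when class IDs are unique across schools.
f_unique <- x_bin ~ y + z + (1 | class) + (1 | school)

# Variables referenced by the model, in order of appearance.
all.vars(f_nested)

# With lme4 installed, the random-effects terms can be listed explicitly.
if (requireNamespace("lme4", quietly = TRUE)) {
  print(lme4::findbars(f_nested))
}
```

A random slope for y at the class level would be written as (1 + y | class) in the same notation.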
Then there's also an argument group_index in mice.impute.ml.lmer, used to pass group identifiers to mice.impute.bygroup. From reading the documentation I am still unsure what this function actually does, as I can find little information on it. It seems to be designed for grouping variables together by level, but not for grouping variables across different levels, correct? If so, what would distinguish mice.impute.bygroup from creating blocks? And what would be the difference between doing this and specifying the models in mice.impute.ml.lmer?
As for computational efficiency, I have no idea if grouping variables together would increase computational efficiency. I could really use some advice on this part.
pehkawn changed the title on Jul 29, 2022, from 'Running mice.impute.ml.lmer with "binary" logistic model returns error' to 'mice.impute.ml.lmer on large three-level dataset: "binary" logistic model returns error, 'hangs' when adding random slopes or interactions'.
For context, I've been following Simon Grund's example for modeling three-level data using mice with the mice.impute.ml.lmer function, with the adaptations to my data described above.