7  Mutilevel

The files for this page’s example analysis are

All files are available at this link.

The following issues are discussed:

7.1 Example Model

The dataset for this example has 600 observations. 13.2% of x1i, 8.5% of x2i, and 23.5% of y are missing. y is MNAR; its missingness depends on its own unobserved values. x1i is MAR; its missingness depends on the values of x2i. x2i is MCAR; its missingness is unrelated to any observed or unobserved values.

This dataset has 30 clusters (j) at level-two, each with 20 observations at level-one, for a total sample size of 600.

Below is a table of estimates from a model predicting y with (1) the complete data and (2) the data after some values in x1i, x2i, and y were made missing, using a listwise deletion procedure. The true effects of each predictor are around 1.

  Complete Data Listwise Deletion
Parameter Estimate SE CI Estimate SE CI
Intercept -0.11 0.09 -0.29 – 0.07 -0.33 0.10 -0.53 – -0.13
x1i 1.01 0.09 0.84 – 1.19 0.83 0.09 0.66 – 1.00
x2i 0.99 0.05 0.90 – 1.08 0.91 0.07 0.78 – 1.04
x3i 1.00 0.05 0.91 – 1.09 1.01 0.07 0.88 – 1.14
x2i × x3i 1.03 0.04 0.95 – 1.10 1.02 0.07 0.89 – 1.15
Random Effects
σ2 0.95 0.90
τ00 0.20 j 0.20 j
τ11 0.18 j.x1i 0.11 j.x1i
ρ01 0.59 j 0.58 j
ICC 0.29 0.22
N 30 j 30 j
Observations 600 383

As expected, under MAR and MNAR, estimates are biased. In particular, the estimate for x1i differs by about two standard errors between the two models.

7.2 Blimp Input

The Blimp script for this analysis is as follows:

DATA: 04_multilevel.csv;
CLUSTERID: j;
FIXED: x3i;
MODEL: 
focal:
y ~ x1i x2i x3i x2i*x3i | x1i;
auxiliary:
x1i x2i ~ x3i;
y.missing ~ y x1i x2i x3i x2i*x3i;
SEED: 123;
BURN: 20000;
ITERATIONS: 20000;

Select sections of the script are now discussed.

7.2.1 Clustered Data

For a multilevel dataset, specify the cluster variable after the CLUSTERID command. All variables in the dataset must be numeric, including the cluster variable, so we may need to first encode a string variable as numeric before fitting our model in Blimp.

7.2.2 Random Effects

A random intercept is included by default if there is a cluster. To regress y on x and include a random intercept, we could use this syntax:

CLUSTERID: j;
MODEL: y ~ x;

To add random slopes, include variable names after the | symbol in a model. To regress y on x and include a random intercept and also a random slope for x across level-two units, our script would have these lines:

CLUSTERID: j;
MODEL: y ~ x | x;

7.2.3 Factored Regression Specification

Here we again see an interaction between two variables, at least one of which is missing. We need to specify our predictor model following the procedure discussed in the logistic regression example. x1i has a higher missing rate than x2i so it appears first in the formula, and x3i is fully observed so it appears on the right side:

x1i x2i ~ x3i;

7.2.4 Selection Model for MNAR

A second auxiliary model included in our script reads:

y.missing ~ y x1i x2i x3i x2i*x3i;

The .missing suffix creates an ordinal missingness indicator, which means this line fits a probit regression.

This is called a selection model, where variables in the dataset predict the probability of one variable being missing. Most noteworthy of these predictors is y; a variable predicts whether itself is missing! A key assumption of the selection model that makes this happen is that our errors are normally distributed.

In terms of specifying the selection model, overparameterizing the selection model (i.e., including a lot of variables on the right side) is not a problem, but it can lead to convergence issues.1

Here, we knew in advance that y was MNAR. Outside of simulations, we cannot know whether data is MAR or MNAR. These two missing mechanisms make different assumptions about the relationship of a missingness indicator with unobserved values, which is impossible to test without those missing values. Selection models should be used where there is good theoretical reason to do so. Where we are not sure or have competing hypotheses, we can fit models under different assumptions and report any uncertainty in the results.2

7.2.5 Iterations

This model is more complex than those in the previous chapters: over 50 paramerters, random effects, and a selection model with all variables in the dataset. As a result, our chains needs more iterations for our model to converge.

7.3 Blimp Output

Running the Blimp script produces this output file:


---------------------------------------------------------------------------

                                   Blimp
                                  3.2.10

             Blimp was developed with funding from Institute of
           Education Sciences awards R305D150056 and R305D190002.

            Craig K. Enders, P.I. Email: cenders@psych.ucla.edu
           Brian T. Keller, Co-P.I. Email: btkeller@missouri.edu
                 Han Du, Co-P.I. Email: hdu@psych.ucla.edu
                 Roy Levy, Co-P.I. Email: roy.levy@asu.edu

              Programming and Blimp Studio by Brian T. Keller

                   There is no expressed license given.

---------------------------------------------------------------------------


ALGORITHMIC OPTIONS SPECIFIED:

  Imputation method:                 Fully Bayesian model-based
  MCMC algorithm:                    Full conditional Metropolis sampler with
                                       Auto-Derived Conditional Distributions
  Between-cluster imputation model:  Latent cluster means (LATENT)
  Prior for random effect variances: Zero matrix, df = -(p + 1) (PRIOR2)
  Prior for residual variances:      Zero sum of squares, df = -2 (PRIOR2)
  Prior for predictor variances:     Unit sum of squares, df = 2 (XPRIOR1)
  Chain Starting Values:             Random starting values



BURN-IN POTENTIAL SCALE REDUCTION (PSR) OUTPUT:

  NOTE: Split chain PSR is being used. This splits each chain's
        iterations to create twice as many chains.

  Comparing iterations across 2 chains     Highest PSR   Parameter #  
                          501 to 1000            1.211            43  
                         1001 to 2000            1.206            43  
                         1501 to 3000            1.082            41  
                         2001 to 4000            1.106            43  
                         2501 to 5000            1.074            51  
                         3001 to 6000            1.055            43  
                         3501 to 7000            1.023            49  
                         4001 to 8000            1.014            26  
                         4501 to 9000            1.030            41  
                         5001 to 10000           1.017            43  
                         5501 to 11000           1.016            41  
                         6001 to 12000           1.017            41  
                         6501 to 13000           1.012            51  
                         7001 to 14000           1.011            43  
                         7501 to 15000           1.009            26  
                         8001 to 16000           1.008            51  
                         8501 to 17000           1.017            43  
                         9001 to 18000           1.025            43  
                         9501 to 19000           1.022            51  
                        10001 to 20000           1.013            38  


METROPOLIS-HASTINGS ACCEPTANCE RATES:

  Chain 1:

    Variable                            Type    Probability   Target Value 
    x1i                           imputation          0.498          0.500 

  NOTE: Suppressing printing of 1 chains.
        Use keyword 'tuneinfo' in options to override.


DATA INFORMATION:

  Level-2 identifier:       j
  Sample Size:              600
  Level-2 Clusters:         30
  Missing Data Rates:

                        y = 23.50
                      x1i = 13.17
                      x2i = 08.50



MODEL INFORMATION:

  NUMBER OF PARAMETERS
    Outcome Models:         25
    Predictor Models:       0

  PREDICTORS
    Fixed variables:        x3i

  MODELS

   focal:
    [1]  y ~ Intercept x1i x2i x3i x2i*x3i | Intercept x1i

   auxiliary:
    [2]  x1i ~ Intercept x2i x3i
    [3]  x2i ~ Intercept x3i
    [4]  y.missing ~ Intercept y x1i x2i x3i x2i*x3i


WARNING MESSAGES:

  No warning messages.


MODEL FIT:


  INFORMATION CRITERIA

    Conditional Likelihood
      DIC2                  6837.722
      WAIC                  7167.662


CORRELATIONS AMONG RESIDUALS:

  Summaries based on 20000 iterations using 2 chains.


   Level-1:

Correlations                         Median     StdDev       2.5%      97.5%        PSR      N_Eff 
                                -------------------------------------------------------------------

  y, x1i                              0.000      0.058     -0.112      0.114      1.000  20213.905 
  y, x2i                              0.001      0.058     -0.113      0.113      1.000  19890.143 
  y, y.missing                       -0.008      0.057     -0.120      0.104      1.000  17149.392 
  x1i, x2i                            0.001      0.058     -0.113      0.113      1.000  19253.786 
  x1i, y.missing                      0.001      0.058     -0.114      0.114      1.000  19683.078 
  x2i, y.missing                      0.001      0.057     -0.112      0.112      1.000  18028.872 

                                -------------------------------------------------------------------



OUTCOME MODEL ESTIMATES:

  Summaries based on 20000 iterations using 2 chains.

  focal block:

Outcome Variable:  y          

Parameters                           Median     StdDev       2.5%      97.5%        PSR      N_Eff 
                                -------------------------------------------------------------------
Variances:                      
  L2 : Var(Intercept)                 0.266      0.114      0.126      0.565      1.001   4787.825 
  L2 : Cov(x1i,Intercept)             0.145      0.085      0.029      0.359      1.000   4039.526 
  L2 : Var(x1i)                       0.213      0.101      0.090      0.479      1.000   3040.017 
  Residual Var.                       0.941      0.077      0.806      1.108      1.001   2182.799 

Coefficients:                   
  Intercept                          -0.142      0.121     -0.375      0.104      1.007    785.488 
  x1i                                 0.950      0.109      0.739      1.175      1.007   1080.673 
  x2i                                 0.995      0.064      0.869      1.121      1.001   1506.892 
  x3i                                 1.048      0.063      0.924      1.172      1.001   1570.699 
  x2i*x3i                             1.062      0.063      0.939      1.186      1.000    795.070 

Standardized Coefficients:      
  x1i                                 0.338      0.032      0.271      0.398      1.006   1427.634 
  x2i                                 0.344      0.020      0.304      0.383      1.002   3983.370 
  x3i                                 0.357      0.020      0.317      0.395      1.002   3750.748 
  x2i*x3i                             0.397      0.025      0.352      0.448      1.005    791.280 

Proportion Variance Explained   
  by Coefficients                     0.822      0.021      0.772      0.854      1.001   3263.953 
  by Level-2 Random Intercepts        0.033      0.013      0.016      0.068      1.001   4853.665 
  by Level-2 Random Slopes            0.027      0.012      0.012      0.058      1.000   3607.067 
  by Level-1 Residual Variation       0.115      0.011      0.096      0.138      1.001   1735.684 

                                -------------------------------------------------------------------


  auxiliary block:

Outcome Variable:  x1i        

Parameters                           Median     StdDev       2.5%      97.5%        PSR      N_Eff 
                                -------------------------------------------------------------------
Variances:                      
  L2 : Var(Intercept)                 0.008      0.010      0.000      0.038      1.003    744.775 
  Residual Var.                       0.782      0.049      0.693      0.886      1.000   9679.504 

Coefficients:                   
  Intercept                           0.025      0.043     -0.061      0.109      1.000   7721.251 
  x2i                                 0.330      0.043      0.245      0.417      1.000   6697.415 
  x3i                                 0.294      0.043      0.210      0.377      1.000   6863.699 

Standardized Coefficients:      
  x2i                                 0.319      0.039      0.241      0.394      1.000   6886.435 
  x3i                                 0.280      0.039      0.202      0.354      1.000   7003.486 

Proportion Variance Explained   
  by Coefficients                     0.242      0.031      0.181      0.303      1.000   7237.416 
  by Level-2 Random Intercepts        0.007      0.010      0.000      0.036      1.003    729.177 
  by Level-1 Residual Variation       0.748      0.032      0.685      0.810      1.000   4840.951 

                                -------------------------------------------------------------------



Outcome Variable:  x2i        

Parameters                           Median     StdDev       2.5%      97.5%        PSR      N_Eff 
                                -------------------------------------------------------------------
Variances:                      
  L2 : Var(Intercept)                 0.009      0.011      0.000      0.042      1.002    860.894 
  Residual Var.                       0.869      0.053      0.776      0.980      1.000  12574.055 

Coefficients:                   
  Intercept                          -0.040      0.044     -0.126      0.048      1.000  10571.698 
  x3i                                 0.340      0.041      0.259      0.419      1.001   4730.902 

Standardized Coefficients:      
  x3i                                 0.333      0.037      0.257      0.401      1.001   4387.780 

Proportion Variance Explained   
  by Coefficients                     0.111      0.024      0.066      0.161      1.001   4673.270 
  by Level-2 Random Intercepts        0.009      0.011      0.000      0.041      1.002    850.700 
  by Level-1 Residual Variation       0.878      0.026      0.824      0.925      1.000   2439.786 

                                -------------------------------------------------------------------



Latent Variable:   y.missing  

Parameters                           Median     StdDev       2.5%      97.5%        PSR      N_Eff 
                                -------------------------------------------------------------------
Variances:                      
  L2 : Var(Intercept)                 0.089      0.088      0.006      0.336      1.007    700.322 
  Residual Var.                       1.000      0.000      1.000      1.000        nan        nan 

Coefficients:                   
  Intercept                          -1.312      0.153     -1.657     -1.051      1.003    415.924 
  y                                   0.547      0.163      0.224      0.874      1.001    320.824 
  x1i                                 0.192      0.188     -0.178      0.557      1.000    543.809 
  x2i                                 0.016      0.206     -0.364      0.439      1.004    321.074 
  x3i                                -0.134      0.188     -0.509      0.240      1.001    385.638 
  x2i*x3i                            -0.339      0.227     -0.837      0.058      1.011    261.477 

Standardized Coefficients:      
  y                                   0.844      0.221      0.355      1.221      1.000    331.282 
  x1i                                 0.106      0.105     -0.091      0.316      1.000    508.420 
  x2i                                 0.009      0.108     -0.183      0.238      1.003    333.876 
  x3i                                -0.071      0.096     -0.245      0.131      1.001    403.082 
  x2i*x3i                            -0.199      0.115     -0.420      0.036      1.011    286.297 

Proportion Variance Explained   
  by Coefficients                     0.673      0.054      0.564      0.773      1.002    329.050 
  by Level-2 Random Intercepts        0.026      0.024      0.002      0.091      1.005    684.396 
  by Level-1 Residual Variation       0.295      0.050      0.204      0.396      1.007    326.634 

                                -------------------------------------------------------------------

Select sections of the output are now discussed.

7.3.1 Outcome Models

This model, with this simulated dataset, with this seed, did remarkably well at recovering the true estimates. Notice all those qualifiers. In preparing these materials, many iterations of these models were tested and reconfigured.

Did every model produce posterior parameter distributions with median 1? No.

Were the models almost always more accurate than the listwise deletion model? Yes.

It was not just the more obvious parameter estimates that moved closer to their true values, but also parameters like the random effect variances. In some cases, R had issues with singular fit where the random intercept and slope correlation was a perfect 1.00, but Blimp actually recovered the true correlation (~ 0.5).

7.4 Exercise

Fit a multilevel model with the dataset exercise_04.csv. (All files are available at this link.) Regress y on x1, x2, and x3. You have reason to believe x1i is missing not at random and that x2i is missing conditional on x3i. The groups are given as j. Allow the slope of x3i to vary across groups. The coefficients on all three predictors should be around 1.


  1. Du et al., 2022↩︎

  2. Du et al., 2022↩︎