Complete Data | Listwise Deletion | |||||
Parameter | Estimate | SE | CI | Estimate | SE | CI |
Intercept | -0.31 | 0.08 | -0.46 – -0.16 | -0.36 | 0.08 | -0.53 – -0.20 |
x1 | 0.76 | 0.10 | 0.57 – 0.95 | 0.66 | 0.12 | 0.43 – 0.90 |
x2 | 0.55 | 0.09 | 0.38 – 0.72 | 0.58 | 0.10 | 0.38 – 0.78 |
x1 × x2 | 0.65 | 0.09 | 0.47 – 0.83 | 0.64 | 0.11 | 0.42 – 0.86 |
Observations | 1000 | 800 |
5 Logistic
The files for this page’s example analysis are
02_logistic.R
: the R script that simulates the data and fits frequentist models02_logistic.csv
: the data file02_logistic.imp
: the Blimp script02_logistic.blimp-out
: the Blimp output file
All files are available at this link.
The following issues are discussed:
- Logistic regression
- Missing data mechanisms: MAR
5.1 Example Model
The dataset for this example has 1000 observations. 20% of x2 is missing. x2 is missing at random (MAR); its missingness is conditional on the values of x1.
Below is a table of estimates from a model predicting y with (1) the complete data and (2) the data after some values in x2 were made missing, using a listwise deletion procedure.
As expected under MAR, estimates are biased. The estimate for x1 is underestimated.
5.2 Blimp Input
The Blimp script for this analysis is as follows:
DATA: 02_logistic.csv;
ORDINAL: y;
MISSING: -999;
FIXED: x1;
MODEL:
logit(y) ~ x1 x2 x1*x2;
x2 ~ x1;
SEED: 123;
BURN: 10000;
ITERATIONS: 10000;
Select sections of the script are now discussed.
5.2.1 Ordinal Variables
y is a binary variable. To tell Blimp to interpret it as such, we list its name after the ORDINAL
command.
5.2.2 Missing Data
x1 is fully observed so it is listed after the FIXED
command, but x2 has missing values, which were coded as -999.
Only a single numeric code can be used for missing values across all variables, so when preparing your data be sure to pick one that is never an existing, legitimate value for any variable.
5.2.3 Model Syntax
If we fit a regression model that predicts an ordinal variable, Blimp will by default fit a probit regression:
ORDINAL: y;
MODEL: y ~ x1 x2 x1*x2;
Wrapping the outcome variable in logit()
fits a logistic regression instead.
The predictors in our model include an interaction. Blimp does not have a shortcut for interactions that automatically expands to include the simple effects, so we need to specify each term separately.
5.2.4 Factored Regression Specification
Blimp assumes multivariate normality among predictors. This works fine for continuous and categorical variables. (Blimp assumes normal distributions underlying binary, ordinal, and multicategorical variables.) However, if we have skewed, latent, or count predictors, we need to explicitly specify associations among our predictors.
Here, we model an interaction between x1 and x2, and the product of two distributions is generally not a normal distribution. For any variables with missing values that do not have a multivariate normal distribution with the other predictors, we need to specify a regression. The general structure of this formula is to put variables with missing values on the left, and fixed variables on the right. Here we have the simple formula:
x1 ~ x2;
If none of our variables were fixed, we would instead specify:
x1 x2 ~ 1;
The order of the variables on the left side is important. First list continuous variables, and then categorical variables. Within each variable type, order them from highest missing data rate to lowest missing data rate:
continuous_high_misssing continuous_low_missing categorical_high_misssing ~ fixed (or 1)
Correctly ordering the variables is very important. Failure to do so can bias the results.1
5.3 Blimp Output
Running the Blimp script produces this output file:
---------------------------------------------------------------------------
Blimp
3.2.10
Blimp was developed with funding from Institute of
Education Sciences awards R305D150056 and R305D190002.
Craig K. Enders, P.I. Email: cenders@psych.ucla.edu
Brian T. Keller, Co-P.I. Email: btkeller@missouri.edu
Han Du, Co-P.I. Email: hdu@psych.ucla.edu
Roy Levy, Co-P.I. Email: roy.levy@asu.edu
Programming and Blimp Studio by Brian T. Keller
There is no expressed license given.
---------------------------------------------------------------------------
ALGORITHMIC OPTIONS SPECIFIED:
Imputation method: Fully Bayesian model-based
MCMC algorithm: Full conditional Metropolis sampler with
Auto-Derived Conditional Distributions
Between-cluster imputation model: Not applicable, single-level imputation
Prior for random effect variances: Not applicable, single-level imputation
Prior for residual variances: Zero sum of squares, df = -2 (PRIOR2)
Prior for predictor variances: Unit sum of squares, df = 2 (XPRIOR1)
Chain Starting Values: Random starting values
NOTE: The default prior for regression coefficients
in categorical models is 'normal( 0.0, 5.0)'
BURN-IN POTENTIAL SCALE REDUCTION (PSR) OUTPUT:
NOTE: Split chain PSR is being used. This splits each chain's
iterations to create twice as many chains.
Comparing iterations across 2 chains Highest PSR Parameter #
251 to 500 1.023 4
501 to 1000 1.014 14
751 to 1500 1.011 2
1001 to 2000 1.010 2
1251 to 2500 1.005 14
1501 to 3000 1.013 14
1751 to 3500 1.005 1
2001 to 4000 1.004 1
2251 to 4500 1.004 14
2501 to 5000 1.003 2
2751 to 5500 1.001 14
3001 to 6000 1.001 10
3251 to 6500 1.003 13
3501 to 7000 1.002 13
3751 to 7500 1.001 14
4001 to 8000 1.001 6
4251 to 8500 1.001 6
4501 to 9000 1.002 14
4751 to 9500 1.002 6
5001 to 10000 1.001 3
METROPOLIS-HASTINGS ACCEPTANCE RATES:
Chain 1:
Variable Type Probability Target Value
x2 imputation 0.499 0.500
NOTE: Suppressing printing of 1 chains.
Use keyword 'tuneinfo' in options to override.
DATA INFORMATION:
Sample Size: 1000
Missing Data Rates:
x2 = 20.00
y = 00.00
MODEL INFORMATION:
NUMBER OF PARAMETERS
Outcome Models: 7
Predictor Models: 0
PREDICTORS
Fixed variables: x1
MODELS
[1] x2 ~ Intercept x1
[2] logit(y) ~ Intercept x1 x2 x1*x2
WARNING MESSAGES:
No warning messages.
MODEL FIT:
INFORMATION CRITERIA
Marginal Likelihood
DIC2 3824.193
WAIC 3901.919
Conditional Likelihood
DIC2 3824.193
WAIC 3901.919
CORRELATIONS AMONG RESIDUALS:
Summaries based on 10000 iterations using 2 chains.
Correlations Median StdDev 2.5% 97.5% PSR N_Eff
-------------------------------------------------------------------
x2, y -0.176 0.062 -0.295 -0.050 1.000 5325.489
-------------------------------------------------------------------
OUTCOME MODEL ESTIMATES:
Summaries based on 10000 iterations using 2 chains.
Outcome Variable: x2
Parameters Median StdDev 2.5% 97.5% PSR N_Eff
-------------------------------------------------------------------
Variances:
Residual Var. 0.793 0.039 0.720 0.875 1.001 3983.108
Coefficients:
Intercept 0.030 0.033 -0.034 0.094 1.000 2436.740
x1 0.514 0.037 0.442 0.586 1.000 1819.194
Standardized Coefficients:
x1 0.484 0.028 0.427 0.536 1.000 1889.033
Proportion Variance Explained
by Coefficients 0.235 0.027 0.182 0.287 1.000 1885.025
by Residual Variation 0.765 0.027 0.713 0.818 1.000 1885.025
-------------------------------------------------------------------
Outcome Variable: logit(y)
Parameters Median StdDev 2.5% 97.5% PSR N_Eff
-------------------------------------------------------------------
Coefficients:
Intercept -0.313 0.079 -0.470 -0.159 1.001 4618.388
x1 0.756 0.102 0.560 0.961 1.002 2921.593
x2 0.585 0.104 0.384 0.797 1.000 2328.084
x1*x2 0.675 0.110 0.466 0.896 1.001 1654.106
Odds Ratio:
Intercept 0.731 0.058 0.625 0.853 1.001 4664.919
x1 2.131 0.220 1.751 2.613 1.002 2913.850
x2 1.795 0.189 1.469 2.218 1.000 2312.561
x1*x2 1.965 0.219 1.594 2.450 1.001 1649.546
Proportion Variance Explained
by Coefficients 0.081 0.011 0.061 0.104 1.001 2054.962
by Residual Variation 0.919 0.011 0.896 0.939 1.001 2054.962
-------------------------------------------------------------------
Select sections of the output are now discussed.
The PSRs at the end of the burn-in are 1.001 or less, and the numbers of effective samples for all parameters are well over the cutoff of 100. We can consider the model as adequately converged.
5.3.1 Outcome Models
The Outcome Variable: logit(y)
section has the focal model estimates. It includes both log odds coefficients (called “coefficients”) and odds ratios.
This model did well in reproducing the results from our frequentist logistic regression fit to the complete data. x2 was MAR, conditional on x1. The predictor model regressing x2 on x1 helped impute plausible values for x2, correcting the bias in the model with listwise deletion.
Again, Blimp used all 1000 observations, even though x2
was missing values.
5.4 Exercise
Fit a logistic regression model with the dataset exercise_02.csv
. (All files are available at this link.) Regress y on x1, x2, x3, and the interaction of x2 and x3. You have reason to believe y is missing completely at random, and x1 is missing conditional on x2 and x3. Missing values are coded as 99.
The coefficients on the predictors from a frequentist logistic regression are as follows:
- x1: 0.54
- x2: 0.62
- x3: 0.85
- x2*x3: 1.09
Lüdtke, O., Robitzsch, A., & West, S. G. (2020). Regression models involving nonlinear effects with missing data: A sequential modeling approach using Bayesian estimation. Psychological Methods, 25(2), 157-181. https://doi.org/10.1037/met0000233↩︎