5 Generalized Responses
Structural equation models are not limited to equations with normal response variables, and MPlus can work with a variety of response variable types. These include
- binary responses (both logit and probit)
- ordered categorical responses
- multinomial (nominal) categorical responses
- count (Poisson) responses
- censored responses
This flexibility applies only to variables that appear as responses. In MPlus, exogenous variables are essentially untyped: they are assumed to be scaled appropriately for computation, either continuous or already encoded as a set of \(k-1\) indicators (0/1). An optional DEFINE section of code can be used to transform raw data as needed.
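As a rough illustration of what "already encoded as \(k-1\) indicators" means, the following Python sketch (not MPlus code) turns a categorical variable into 0/1 columns before the data file is written. The function name, the variable `region`, and the choice of the lowest level as the reference are assumptions made for the example.

```python
def indicator_columns(values):
    """Return (kept_levels, rows) with k-1 0/1 indicators per observation.

    The first level in sorted order is treated as the reference category
    and gets no column of its own (an illustrative choice).
    """
    levels = sorted(set(values))
    kept = levels[1:]  # drop the reference level
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Hypothetical raw variable with k = 3 categories
region = [1, 3, 2, 1, 3]
kept, rows = indicator_columns(region)
print(kept)  # levels that received an indicator column
print(rows)  # one row of k-1 indicators per observation
```

Observations in the reference category are the ones whose indicators are all zero.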
Except for multinomial variables (unordered categories), these responses may also be used as endogenous independent variables in equations.
For the most part, variable types are declared in the VARIABLE section of your code. The exception is that the distinction between logit and probit responses is made by the ESTIMATOR that is specified in the ANALYSIS section. (Keep in mind that there is always a default estimator, and that the default depends on several features of your model!)
5.1 Binary Outcomes
Binary outcomes can be modeled as either logit or probit variables. For both types of outcome, the variable is declared to be CATEGORICAL, and the distinction is made by specifying the correct ESTIMATOR.
- For logit outcomes, specify the ML or MLR estimator.
- For probit outcomes, specify the WLS estimator.
All categorical outcomes in a given model must be of the same type.
If you do not explicitly specify any estimator, you are modeling probit responses.
One of the WLS class of estimators is the default estimator if no explicit specification is made and there are any CATEGORICAL variables. This is in contrast to the ML estimator that is the default when all the response variables are normally distributed.
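The practical difference between the two response types is the link function that maps the linear predictor to a probability. The Python sketch below (not MPlus code) evaluates both links for the same linear predictor. The coefficient and covariate values are hypothetical, and the model is written with an intercept for simplicity.

```python
import math


def logit_prob(eta):
    """P(u = 1) under a logit link: the logistic CDF."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_prob(eta):
    """P(u = 1) under a probit link: the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


# Hypothetical intercept, slopes, and covariate values
b0, b1, b3 = -0.5, 1.0, 0.4
x1, x3 = 0.8, -1.2
eta = b0 + b1 * x1 + b3 * x3

print(logit_prob(eta), probit_prob(eta))
```

The same linear predictor gives different probabilities under the two links, which is why logit and probit coefficients are not directly comparable.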
5.1.1 Binary Logit
Use both a CATEGORICAL statement, and an ESTIMATOR statement.
TITLE: Logit (like ex 3.5 in the manual);
DATA: FILE = ex3.5.dat;
VARIABLE:
NAMES ARE u1 x1 x3;
CATEGORICAL IS u1;
ANALYSIS:
ESTIMATOR = ML; ! Do NOT use the default estimator, WLS;
MODEL: u1 ON x1 x3;
An alternative specification is discussed below, under Multinomial Responses.
5.1.2 Binary Probit
Use a CATEGORICAL statement, and the default estimator or another WLS estimator. In this example, we use the default.
TITLE: Probit (like ex 3.4 in the manual);
DATA: FILE = ex3.5.dat;
VARIABLE:
NAMES ARE u1 x1 x3;
CATEGORICAL IS u1;
MODEL: u1 ON x1 x3;
5.1.3 Output Differences
In the SUMMARY OF ANALYSIS portion of the output there will be a line that identifies the link function as either LOGIT or PROBIT. This can be difficult to spot if you are not looking carefully!
- An ML model (logit variables) gives you maximum likelihood fit statistics: Log-likelihood, AIC, and BIC. It also reports logit coefficients in both log-odds and odds ratio forms.
- A WLS model (probit variables) gives you least squares fit statistics: RMSEA, CFI/TLI. It also reports R-squared statistics for both probit and normally distributed variables.
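The two forms of a logit coefficient mentioned above are related by exponentiation. As a small Python check (not MPlus code), with hypothetical values for the intercept `b0` and slope `b1`, the ratio of the odds at two covariate values one unit apart equals the exponentiated log-odds coefficient:

```python
import math


def odds(b0, b1, x):
    """Odds of u = 1 under a one-predictor logit model."""
    return math.exp(b0 + b1 * x)


b0, b1 = -1.0, 0.7  # hypothetical log-odds estimates
ratio = odds(b0, b1, 2.0) / odds(b0, b1, 1.0)

print(ratio, math.exp(b1))  # the two values agree
```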
5.2 Ordered Responses
This is the default analysis when any CATEGORICAL variable in your data has more than two categories - your code looks the same as in the binary response models above.
With logit responses you get an additional test of the proportional odds assumption in your output.
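To make the proportional odds assumption concrete, here is a Python sketch (not MPlus code) of an ordered logit model in a common threshold parameterization, \(P(u \le k \mid x) = F(\tau_k - \beta x)\). The threshold and slope values are hypothetical. A single slope is shared by every cumulative split, which is exactly what the assumption requires.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(thresholds, slope, x):
    """Category probabilities under a proportional-odds model.

    thresholds must be increasing; there is one fewer threshold
    than there are categories.
    """
    cumulative = [logistic(tau - slope * x) for tau in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(u = k) = P(u <= k) - P(u <= k-1)
        previous = c
    return probs


# Hypothetical three-category outcome: two thresholds, one slope
probs = ordered_logit_probs([-1.0, 0.5], 0.8, 1.0)
print(probs)
```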
5.3 Multinomial (Nominal) Responses
Specify that a response variable represents unordered categories with a NOMINAL statement.
Here, the only link is LOGIT, and the ESTIMATOR must be in the ML family. A robust MLR estimator is now the default. (Attempting to use a WLS type of estimator returns ML output with a warning.)
Categories are renumbered from the data, in numerical order (as with any software). We then refer to individual categories by a combination of the variable name and the category number. The base category for odds ratio interpretation is the final category.
A simple model specification is:
TITLE: Multinomial Logit;
DATA: FILE = ex3.6.dat;
VARIABLE: NAMES ARE u1 x1 x3;
NOMINAL = u1;
MODEL: u1 ON x1 x3;
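For interpretation, it can help to see how a set of multinomial logit estimates turns into category probabilities when the last category is the base. The Python sketch below (not MPlus code) assumes a three-category outcome; all intercepts and slopes are hypothetical.

```python
import math


def multinomial_probs(linear_predictors):
    """Probabilities when the last category is the reference.

    linear_predictors holds one value per non-reference category;
    the reference category's linear predictor is fixed at 0.
    """
    exps = [math.exp(eta) for eta in linear_predictors] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical estimates for categories 1 and 2 (category 3 is the base)
x1, x3 = 0.5, -1.0
eta1 = 0.2 + 0.9 * x1 - 0.3 * x3
eta2 = -0.4 + 0.1 * x1 + 0.6 * x3
probs = multinomial_probs([eta1, eta2])

print(probs)
```

Each linear predictor is the log-odds of its category relative to the base category, so `math.log(probs[0] / probs[2])` recovers `eta1`.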
5.3.1 Binary Nominal Responses
It no doubt occurs to you that we could use a NOMINAL variable type rather than a CATEGORICAL variable type, and not have to pay so much attention to the ESTIMATOR. That is, our logit model above could also have been specified
TITLE: Logit (similar to above);
DATA: FILE = ex3.5.dat;
VARIABLE:
NAMES ARE u1 x1 x3;
NOMINAL IS u1;
! We can skip the estimator specification!;
MODEL: u1 ON x1 x3;
The difference in these models is the base category. With a CATEGORICAL variable the base is the first category, while with a NOMINAL variable the base is the last category.
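For a binary outcome, switching the base category reverses the sign of the log-odds, and hence of the slope. The following Python sketch (not MPlus code) uses hypothetical values for `b0` and `b1` to show this:

```python
import math


def log_odds_second_vs_first(b0, b1, x):
    """Log-odds of the second category with the first as base."""
    return b0 + b1 * x


def log_odds_first_vs_second(b0, b1, x):
    """Same model, re-expressed with the second (last) category as base."""
    p_second = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return math.log((1.0 - p_second) / p_second)


b0, b1 = -0.5, 1.2  # hypothetical values
slope_first_base = (log_odds_second_vs_first(b0, b1, 1.0)
                    - log_odds_second_vs_first(b0, b1, 0.0))
slope_last_base = (log_odds_first_vs_second(b0, b1, 1.0)
                   - log_odds_first_vs_second(b0, b1, 0.0))

print(slope_first_base, slope_last_base)  # equal magnitude, opposite sign
```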
5.3.2 Constraints in Multinomial Models
To have more control over model specification we have to introduce notation for the nominal categories.
In the model above the first category of u1 is u1#1, while the second category is u1#2. So a more verbose version of the simple model is:
TITLE: Multinomial Logit;
DATA: FILE = ex3.6.dat;
VARIABLE: NAMES ARE u1 x1 x3;
NOMINAL = u1;
MODEL: u1#1 u1#2 ON x1 x3;
[u1#1 u1#2];
Model constraints are explored in more depth later. But as a simple illustration of where this additional notation might be useful, suppose our theory suggested that x1 had no effect on the log-odds of category 2 relative to category 3. We could specify
TITLE: Multinomial Logit;
DATA: FILE = ex3.6.dat;
VARIABLE: NAMES ARE u1 x1 x3;
NOMINAL = u1;
MODEL:
u1#1 ON x1;
u1#2 ON x1@0; ! fix the effect of x1 on category 2 at zero;
u1#1 u1#2 ON x3;
[u1#1 u1#2];
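What this constraint implies can be checked numerically. In the Python sketch below (not MPlus code), the slope of x1 for category 2 is fixed at zero, and all other coefficient values are hypothetical. The ratio of the category 2 probability to the category 3 probability then stays the same as x1 changes, while the probability of category 1 still moves.

```python
import math


def multinomial_probs(linear_predictors):
    """Probabilities with the last category as reference (predictor 0)."""
    exps = [math.exp(eta) for eta in linear_predictors] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


def probs_at(x1, x3):
    eta1 = 0.2 + 0.9 * x1 - 0.3 * x3   # category 1: slope on x1 is free
    eta2 = -0.4 + 0.0 * x1 + 0.6 * x3  # category 2: slope on x1 fixed at 0
    return multinomial_probs([eta1, eta2])


low = probs_at(-1.0, 0.5)
high = probs_at(2.0, 0.5)
ratio_low = low[1] / low[2]
ratio_high = high[1] / high[2]

print(ratio_low, ratio_high)  # unchanged by x1
print(low[0], high[0])        # category 1 still responds to x1
```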
5.4 Count (Poisson) Responses
Working with count outcomes is as simple as declaring a variable’s type to be COUNT.
For example (from the MPlus manual):
TITLE: Poisson regression (ex 3.7);
DATA: FILE IS ex3.7.dat;
VARIABLE: NAMES ARE u1 x1 x3;
COUNT IS u1;
MODEL: u1 ON x1 x3;
The default estimator is MLR.
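Poisson regression coefficients are on the log scale, so exponentiating a slope gives a rate ratio. A short Python sketch (not MPlus code), with hypothetical values for the intercept and slopes, illustrates this:

```python
import math


def expected_count(b0, b1, b3, x1, x3):
    """Expected count under a log-link Poisson regression."""
    return math.exp(b0 + b1 * x1 + b3 * x3)


b0, b1, b3 = 0.3, 0.5, -0.2  # hypothetical estimates
base = expected_count(b0, b1, b3, 1.0, 0.0)
plus_one = expected_count(b0, b1, b3, 2.0, 0.0)
rate_ratio = plus_one / base

print(rate_ratio, math.exp(b1))  # the two values agree
```

A one-unit increase in x1 multiplies the expected count by `exp(b1)`, holding x3 fixed.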