getting started
Hi, I'm trying to figure out how to specify models with categorical outcomes. Can you get me started with a modification of your simple example below -- how would you change it to work for categorical outcomes?
require(OpenMx)
data(demoOneFactor)
manifests <- names(demoOneFactor)
latents <- c("G")
factorModel <- mxModel("One Factor",
    type="RAM",
    manifestVars=manifests,
    latentVars=latents,
    mxPath(from=latents, to=manifests),
    mxPath(from=manifests, arrows=2),
    mxPath(from=latents, arrows=2, free=FALSE, values=1.0),
    mxData(cov(demoOneFactor), type="cov", numObs=500))
summary(mxRun(factorModel))
Attached please find a sample
In reply to Attached please find a sample by Steve
Thanks Steve! But what is
But what is the goal for this code? The estimates seem to be correlations among manifest variables (polychorics?) and thresholds. I thought we might be fitting a one-factor model but there are no factor loadings (that I have found). Am I supposed to add code for the factor model part?
In reply to Thanks Steve! But what is by winston
Yes, but specifying a factor
I am trying to do the same,
Unfortunately, running the model results in the error:
Error: The job for model '1factor_model_RAM_thresh' exited abnormally with the error message: Objective function returned an infinite value.
I get the same error if I start from the mxFIMLObjective specification in the example 'OrdinalTest.R' (without the path-specification part), so I'm guessing I do not really understand how to specify thresholds...
In reply to I am trying to do the same, by mhermans
Hi You were very close to
You were very close to having it run;
values=cbind( (-2:1), (-2:1), (-2:1) ),
would have "worked" in that the model would be fitted (see ###1 below), although the model is not identified:
The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
free parameters:
         name matrix row    col    Estimate Std.Error
1         l21      A   2      4  1.34006721       NaN
2         l31      A   3      4  0.64320758       NaN
3          e1      S   1      1  0.49797100       NaN
4          e2      S   2      2  0.13839618       NaN
5          e3      S   3      3  0.06882259 0.3997287
6  varReligie      S   4      4  1.57808455       NaN
7  meanbelang      M   1      1  1.00129453 1.8267578
8  meanmoskee      M   1      2 -0.63725041       NaN
9   meankoran      M   1      3 -0.22759640       NaN
10            thresh   1 belang -0.68785859       NaN
11            thresh   2 belang -0.13691998 1.4253940
12            thresh   3 belang  0.89381646 1.8988307
13            thresh   4 belang  1.61439240 0.4731209
14            thresh   1 moskee  0.31658257       NaN
15            thresh   2 moskee  0.50944070       NaN
16            thresh   3 moskee  0.98157705       NaN
17            thresh   4 moskee  1.56038768       NaN
18            thresh   1  koran  0.35927457       NaN
19            thresh   2  koran  0.46067897 2.7499008
20            thresh   3  koran  0.65612570 4.7995829
21            thresh   4  koran  0.90837487 6.6301938
observed statistics: 8370
estimated parameters: 21
degrees of freedom: 8349
-2 log likelihood: 15275.88
To identify the model, it is necessary to fix the first two thresholds of each variable, because the mean and the variance of each variable are being estimated. I prefer to fix these thresholds to zero and one. The attached runs nicely and Std Errors are reported happily:
> summary(model.thresh.run)
data:
$`1factor_model_RAM_thresh.data`
belang moskee koran
1:329 1:1958 1:2072
2:274 2: 89 2: 98
3:733 3: 219 3: 181
4:514 4: 241 4: 189
5:940 5: 283 5: 250
free parameters:
         name matrix row    col  Estimate Std.Error
1         l21      A   2      4  3.828229 0.4445358
2         l31      A   3      4  3.494584 0.3962465
3          e1      S   1      1  1.640563 0.2130230
4          e2      S   2      2  3.721492 1.1765354
5          e3      S   3      3  6.692506 1.4842776
6  varReligie      S   4      4  5.199057 0.6049877
7  meanbelang      M   1      1  3.065956 0.1483182
8  meanmoskee      M   1      2 -4.945852 0.5853761
9   meankoran      M   1      3 -5.787413 0.6382787
10            thresh   3 belang  2.870869 0.1291761
11            thresh   4 belang  4.178778 0.2040375
12            thresh   3 moskee  3.448120 0.3028074
13            thresh   4 moskee  6.449431 0.6109066
14            thresh   3  koran  2.927398 0.2374341
15            thresh   4  koran  5.414895 0.4842495
observed statistics: 8370
estimated parameters: 15
degrees of freedom: 8355
-2 log likelihood: 15275.88
Note that the -2 log likelihoods are the same for the underidentified and the identified versions; when fitting the underidentified version, the optimizer found one of the infinitely many solutions that fit equally well. The standard errors were a mess, though, as would be expected.
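For readers without the attachment, a minimal sketch of such a threshold matrix (the variable names follow the output above; the starting values for the free thresholds are illustrative and must increase within each column):
# Sketch only, not the attached script: a 4 x 3 threshold matrix with the
# first two thresholds of each variable fixed to 0 and 1 for identification.
mxMatrix(type="Full", nrow=4, ncol=3,
         free=rbind(rep(FALSE,3), rep(FALSE,3), rep(TRUE,3), rep(TRUE,3)),
         values=rbind(rep(0,3), rep(1,3), rep(2,3), rep(3,3)),
         dimnames=list(NULL, c("belang","moskee","koran")),
         name="thresh")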
Paras Mehta describes this 0/1 specification of thresholds for ordinal models in the following article:
Mehta, P. D., Neale, M. C., & Flay, B. R. (2004). Squeezing interval change from ordinal panel data: Latent growth curves with ordinal outcomes. Psychological Methods, 9, 301-333. http://www.vipbg.vcu.edu/vipbg/Articles/psychmethods-squeezing-2004.pdf
One final comment: I prefer to fix the factor variance to 1.0 and estimate all the factor loadings rather than single out one factor loading to be fixed to 1.0. The solutions are equivalent, and the pros and cons of each have been discussed in the literature.
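A quick sketch of the two identifications in the mxPath style of the example at the top of the thread (starting values illustrative):
# (a) fix the factor variance to 1.0 and estimate all loadings
mxPath(from=latents, to=manifests, free=TRUE, values=0.8)
mxPath(from=latents, arrows=2, free=FALSE, values=1.0)
# (b) fix the first loading to 1.0 and estimate the factor variance
mxPath(from=latents, to=manifests,
       free=c(FALSE, rep(TRUE, length(manifests)-1)), values=1.0)
mxPath(from=latents, arrows=2, free=TRUE, values=1.0)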
In reply to Hi You were very close to by neale
Thanks for the correction.
... the 0:1 specification of thresholds for ordinal models ...
Is it possible to use the parametrization with all thresholds free, and the mean and variance of the (latent response) variable fixed? I tried the naive approach of just applying those changes to the M, S and thresh matrices, but the results are very different from those that Mplus gives for that parametrization...
Regarding estimation, I seem to remember from class that full or robust weighted least squares should be used when dealing with ordinal indicators. Is this possible in OpenMx, or is FIML adequate?
Regarding output: for the resulting object from mxRun(), does model$covariance@result hold the sample covariance matrix, and model@algebras$covariance@result the model-implied covariance matrix?
Apologies for all the basic questions; I promise in return to expand the "OpenMx for Mplus users" wiki page once I get the hang of this...
In reply to Thanks for the correction. by mhermans
model$covariance and
model$covariance and model@algebras$covariance reference the same entity. Try to avoid using the '@' operator, as slot names may change as OpenMx develops over time. The style guide has some tips on using the '$' operator. WRT your question, the entity model$covariance stores the expected covariance matrix. Its final value should be accessed with mxEval(covariance, model); this avoids using the '@' operator on the 'result' slot.
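For example (assuming an unrun MxModel named model.thresh that defines an entity named covariance):
fitted <- mxRun(model.thresh)
expCov <- mxEval(covariance, fitted)  # final value, no '@' or 'result' slot needed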
In reply to Thanks for the correction. by mhermans
Hi mhermans Is it possible
Is it possible to use the parametrization with all thresholds free, and the mean and variance of the (latent response) variable fixed? I tried the naive approach of just applying those changes to the M, S and thresh matrices, but the results are very different from those that Mplus gives for that parametrization...
Yes. However, you need to constrain the variances of the observed variables to equal 1.0. Your "naive" attempt did not do this; I attach a script in which these constraints were implemented. Good agreement with WLSMV estimates from Mplus. There is a problem with MxModels with non-linear constraints - the Std Errors are wrong. Therefore I made yet another version which uses mxAlgebra to calculate the residual variances (1-factorloading^2) instead. Std Errors are ok in this version (also part of attached script).
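A minimal sketch of that mxAlgebra approach for a one-factor model with three indicators (the names here are illustrative, not those of the attached script):
ordModel <- mxModel("ordinal_std",
    # free factor loadings for the three indicators
    mxMatrix(type="Full", nrow=3, ncol=1, free=TRUE, values=0.7, name="facLoad"),
    # residual variances computed as 1 - loading^2, so each latent response
    # variance equals 1.0 by construction; no non-linear mxConstraint is
    # needed, which keeps the standard errors usable
    mxAlgebra(vec2diag(1 - facLoad * facLoad), name="resid"),
    mxAlgebra(facLoad %*% t(facLoad) + resid, name="expCov"))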
On the FIML vs. WLSMV issue there are pros and cons of each method. FIML delivers maximum likelihood parameter estimates, which have the usual advantages of i) being asymptotically unbiased, and ii) having minimum variance among all estimators with property i). In addition, being a likelihood framework, model comparison via likelihood ratio test is possible. The disadvantage of FIML is that it takes a long time when there are many variables; in my experience around 15 or more gets painfully slow, due to the numerical integration over so many dimensions. It is possible to overcome this in part by use of a marginal maximum likelihood procedure in which one integrates over the factor space, but this is not very straightforward to specify. The good agreement between FIML and WLSMV in this case may or may not be general; there are likely some simulation studies out there. I note that there aren't any missing values in your dataset; I speculate that the divergence between FIML and WLSMV may increase when there are missing values, depending on the mechanism for missingness.
Update:
The following code, if run after the thresh.R script, will produce standardized thresholds, A and S matrices, but from the zero/one fixed threshold approach. The real beauty of the 0/1 fixed threshold approach is that the rest of the model is entirely equivalent to the continuous case, containing variances, covariances and means. Thus, e.g., a latent growth curve model is straightforward to modify for use with ordinal data.
#
# Attempt to standardize the 0/1 fixed threshold estimates,
# using an nthresholds x 1 Unit vector U to reproduce rows etc.
#
nthresh <- 4
model.thresh.run.mod <- mxModel(model.thresh.run,
    mxMatrix(type="Unit", nrow=nthresh, ncol=1, name="U"))
standardizedThresholds <- mxEval((thresh - (U %x% M[,1:3])) /
    (U %x% t(sqrt(diag2vec(F %&% (solve(I-A) %&% S))))), model.thresh.run.mod)
# Flushed with success at that, we now go after standardized A and S matrices
# (loadings & initial covariances)
D <- mxEval(vec2diag(sqrt(diag2vec(solve(I-A) %&% S))), model.thresh.run.mod)
standardizedLoadings <- mxEval(solve(D) %*% A %*% D, model.thresh.run.mod)
Ssd <- mxEval(solve(vec2diag(sqrt(diag2vec(S)))), model.thresh.run.mod)
standardizedSmatrix <- mxEval(Ssd %*% S %*% Ssd, model.thresh.run.mod)