# Specifying using paths with categorical manifests

Joined: 01/21/2011 - 13:24

I have been working my way through a number of examples from a couple of books, implementing them in OpenMx in order to learn how it works. So far so good. I now want to move on to using categorical/ordinal manifest variables. The examples I have found on this web site, either in the formal documentation or in the forums, all seem to do this using the matrix specification. Is there any fatal flaw in trying to do this using the path type specification (which I find more congenial)? I understand that I could partly specify the model using the path method and then dump out the matrices, but that seems a bit inelegant even if it is true. Of course, if the answer is 'You looked in the wrong place' then a pointer would be helpful.

Joined: 07/31/2009 - 15:12

Unfortunately, the answer is "not yet." I'll go over how to do everything, then explain why there isn't yet a path spec for thresholds, and hopefully say enough to get a discussion going.

You don't have to 'dump out' the model, but you do have to provide thresholds in a matrix. The steps will be:
- Specify paths the same way you always do.
- Define the thresholds in a matrix.
- Update the objective function with the name of your thresholds matrix.

I'll give you a choice of two ways of doing this. If you want to do everything in one step, write something like the following, which uses mxRAMObjective to rewrite the objective function so that it includes a thresholds matrix:

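<code>
model <- mxModel("OrdinalPathModel", type='RAM',
    mxData(mydata, "raw"),
    mxPath("x", "y", ... ),
    ...
    # thresholds live in an ordinary matrix, one column per ordinal manifest
    mxMatrix("Full", .... , name="thresholdsMatrix"),
    # rebuild the RAM objective so that it knows about the thresholds matrix
    mxRAMObjective("A", "S", "F", "M", thresholds="thresholdsMatrix")
)
</code>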

Alternatively, you could skip the mxRAMObjective line and just change the thresholds slot of the objective function like this:

<code>
model@objective@thresholds <- "thresholdsMatrix"
</code>

There are at least two reasons why you can't add an mxPath statement to specify thresholds. One is that we haven't agreed on a specification; it's not a path from anything, so we have to come up with an intuitive alternative function to specify them.

The bigger reason is that we don't know how many categories your data has, and thus don't know how big to make the thresholds matrix. Path models (specified with type='RAM') have clearly defined sizes and row/column names for the A, S, F and M matrices that are populated from the manifestVars and latentVars arguments. If you have manifest variables x, y and z and latent variables a and b, then we know that the A matrix will be 5 x 5 with rows and columns named x, y, z, a and b.
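To make the sizing argument concrete, here is a minimal base-R sketch (an illustration only, not OpenMx's internal code) of how the variable names alone determine the shape of A. The variable names are the ones used in the paragraph above; the single path filled in is a made-up example.

```r
# Illustration only: plain base R, not what OpenMx does internally.
manifestVars <- c("x", "y", "z")
latentVars   <- c("a", "b")
allVars      <- c(manifestVars, latentVars)

# A (asymmetric paths) is square over all manifest and latent variables.
A <- matrix(0, nrow = length(allVars), ncol = length(allVars),
            dimnames = list(allVars, allVars))

# In RAM notation a path from latent a to manifest x sits in row "x", column "a".
A["x", "a"] <- 1   # hypothetical loading, just to show the indexing

dim(A)
rownames(A)
```

Nothing about the data is needed to build this matrix, which is the point being made: the names fully determine the dimensions.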

That is not true of thresholds; while we can figure out that the number of columns is the number of manifest variables (for now; joint ordinal/continuous is coming soon), we don't know how many rows the threshold matrix has. We can't assume the number of categories from the data, because there are cases where you'd want to assign more thresholds than there are categories (think of multiple group models where one group doesn't use all of the categories), and we can't assume that the maximum number of categories you specify is the number of rows, else the code would break if you tried to update the model and add more thresholds after the initial model specification.
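As a rough sketch of why the row count is a modelling decision, the base-R snippet below lays out a thresholds matrix for three variables with different numbers of categories. The category counts and starting values are assumptions chosen for illustration; this is not OpenMx code.

```r
# Illustration only: hypothetical category counts, base R rather than OpenMx.
nCategories <- c(x = 2, y = 4, z = 3)   # assumed number of categories per variable
nThresholds <- nCategories - 1          # k categories need k - 1 thresholds

# One column per manifest variable; rows must cover the largest threshold count.
thresh <- matrix(NA_real_, nrow = max(nThresholds), ncol = length(nThresholds),
                 dimnames = list(NULL, names(nThresholds)))

# Fill only the thresholds each variable uses, in increasing order.
for (v in names(nThresholds)) {
  k <- nThresholds[[v]]
  thresh[seq_len(k), v] <- qnorm(seq_len(k) / (k + 1))  # evenly spaced starting values
}

thresh
```

The unused cells stay NA here. Had y been given five categories later on, the matrix would need another row, which is exactly the resizing problem described above.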

Hopefully this explanation helps convey that thresholds from a path spec are tough, though we'll have to figure out some path-like specification once the GUI is in place. Best of luck with the package, and keep the questions coming!
Joined: 01/21/2011 - 13:24

Thank you Ryne. You have answered the question I would have asked if I had understood things better rather than the clumsy effort I managed. Your crystal ball is obviously in good shape. I assumed I probably still had to specify the thresholds as a matrix but I wanted to know whether I could do the genuine paths with mxPath. As you confirm, I can.

Joined: 01/21/2011 - 13:24

After some delay I have now got started on this and present some feedback. I decided to use datasets from the package ltm, available from CRAN, to test my understanding. At least then I have an approximate idea of what I should find. Note that I am not intending ultimately to fit IRT models using OpenMx; ltm is a specialised tool for that.

The attached file shows code for fitting the Rasch model to one of the datasets from ltm. I fix the means of the manifests to zero, their variance to unity and constrain all the loadings to equality. Strictly speaking, for the Rasch model I should further constrain the loadings to unity, but they are often estimated. The difficulty parameters from the rasch function in ltm and the thresholds from OpenMx should correspond, and although they are different in value their correlation is as close to unity as makes no difference. So far so good, but I have some comments and questions:

1 - I have to revert the names of the variables after calling mxFactor, as it changes them by replacing each space with a "." (dot). As far as I can see this is undocumented, and it leads to names which the remainder of OpenMx rejects. Is this intentional?
2 - I do not understand what "one" is. I can find copious examples in the main manual of its use and it has the desired effect but I would like to understand what is going on under the bonnet (US=hood).

Of course if anyone has any other comments, improvements, or hints on how to map my OpenMx results onto the ones I get from rasch I would be very happy to know.

Joined: 07/31/2009 - 15:24

With regard to reverting the names of the variables after calling mxFactor: the mxFactor() function constructs a new data.frame object whenever the input is a data.frame object. Currently, the call to data.frame() uses the default setting, check.names = TRUE, so the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names and are not duplicated. If necessary they are adjusted (by make.names) so that they are. make.names replaces any spaces in the variable names with ".".
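The renaming can be reproduced in base R without OpenMx. The sketch below uses two made-up item names containing spaces and shows both the default behaviour and the check.names = FALSE alternative; the last line is the manual workaround of restoring the names afterwards.

```r
# Base R behaviour that mxFactor() picks up through its call to data.frame().
# The item names and values are invented for illustration.
items <- c("Item 1", "Item 2")

checked   <- data.frame("Item 1" = c(0, 1, 1), "Item 2" = c(1, 0, 1),
                        check.names = TRUE)    # the default
unchecked <- data.frame("Item 1" = c(0, 1, 1), "Item 2" = c(1, 0, 1),
                        check.names = FALSE)

names(checked)     # spaces replaced by "."
names(unchecked)   # original names kept

# Manual workaround: put the original names back after the fact.
names(checked) <- items
```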

Unfortunately the "." character has a special meaning in the OpenMx library so the two features are clashing. I'm going to check in a patch to the OpenMx development branch so that check.names = FALSE in the mxFactor() function. You'll see this change in the next OpenMx pre-release.

Joined: 07/31/2009 - 15:12

I'm glad you got this running. As your question about mxFactor has been answered, I'll deal with your other one. The reserved word "one" represents a vector of 1s which is used in SEM to add means/intercepts to a model. It is identical to defining the intercept in a regression or GLM-based technique by creating a new variable of constant 1s and using its regression coefficient as the intercept. It is typically represented as a triangle on path diagrams with a big "1" in the middle (sometimes with and sometimes without a variance, which is a point of much consternation. I believe it should have one, if that matters). We chose to add a reserved word to make path specification easier and to avoid the problem we discussed previously of manually specifying a means matrix, as we have to do for thresholds.
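The regression analogy can be checked directly in base R. This is a small sketch on simulated data (the coefficients 2 and 0.5 are arbitrary), comparing R's automatic intercept with the coefficient on an explicit column of 1s:

```r
# Simulated data; the true intercept and slope are arbitrary choices.
set.seed(1)
x   <- rnorm(50)
y   <- 2 + 0.5 * x + rnorm(50)
one <- rep(1, length(x))   # the constant that the reserved word "one" stands for

fit_builtin  <- lm(y ~ x)             # R adds the intercept automatically
fit_explicit <- lm(y ~ 0 + one + x)   # intercept suppressed; 'one' carries it instead

coef(fit_builtin)[["(Intercept)"]]
coef(fit_explicit)[["one"]]
```

The two printed values agree, which is the sense in which a path from "one" to a variable is that variable's mean or intercept.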

The differences in ltm and OpenMx likely have to do with scaling. Discrimination and difficulty should be the factor loading and threshold/factor loading respectively, so I wonder why we're seeing a difference.
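For reference, the conversion can be sketched as follows, assuming a probit link, a standard-normal latent trait and residual variances fixed to 1, so that the probability of a positive response is pnorm(lambda * theta - tau). The loading and thresholds below are hypothetical numbers, not results from the attached model.

```r
# Hypothetical estimates, for illustration only.
lambda <- 0.8                  # common factor loading
tau    <- c(-1.0, 0.2, 0.9)    # thresholds for three binary items

discrimination <- lambda        # IRT discrimination
difficulty     <- tau / lambda  # IRT difficulty

# Both parameterisations give the same response probabilities.
theta    <- seq(-3, 3, by = 0.5)
p_factor <- sapply(tau, function(t) pnorm(lambda * theta - t))
p_irt    <- sapply(difficulty, function(b) pnorm(discrimination * (theta - b)))

all.equal(p_factor, p_irt)
```

If ltm reports its parameters under a logistic link or a different parameterisation, the values will differ from these by a rescaling even though the two sets correlate almost perfectly.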

You can simplify your code by removing some of the 'rep' functions. When a single value is given for free, values or labels, it is repeated for all requested paths or elements of a matrix. mxPath(manifests, arrows=2, free=F, values=1) fixes all of your residual variances to unity, though they're not labeled. While you can shorten your code by omitting the paths that constrain the manifest means to zero, I think it's great that you include them.

Joined: 01/21/2011 - 13:24

May I suggest that a couple of sentences from your reply about "one" might usefully make it into the next edition of the manual? It is obvious when you explain it but ...

As for the scaling issue: (1) I set the variances of the manifest variables to (pi^2)/3, and (2) I re-read the ltm documentation and realised I had to ask for the other parameterisation to get the effect of a non-linear factor analysis. Agreement is good and as close as I think it reasonable to get given the slightly different underlying models. I deduce that OpenMx is doing the equivalent of a k-parameter probit model rather than a k-parameter logistic. I have subsequently got good agreement with the 2 parameter logistic (code available if anyone wants it).
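For anyone wondering where (pi^2)/3 comes from: it is the variance of the standard logistic distribution, so using it puts the normal (probit) model on roughly the same scale as a logistic one. A quick base-R check, offered as a sketch rather than as part of the model code:

```r
# Variance of the standard logistic distribution, by numerical integration.
logistic_var <- integrate(function(x) x^2 * dlogis(x), -Inf, Inf)$value
logistic_var
pi^2 / 3

# Compare the logistic curve with a normal curve of matching variance.
x        <- seq(-6, 6, by = 0.01)
scale_sd <- pi / sqrt(3)
max_gap  <- max(abs(plogis(x) - pnorm(x / scale_sd)))
max_gap   # largest difference between the two response curves
```

The two curves are close but not identical, which fits the "as close as reasonable given slightly different underlying models" observation.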

Next, ordinal manifests with three categories.