Currently one of the least polished functions, and at the same time most visible to end users is the summary() function applied to the results of a model optimization.

We need suggestions about what this output should look like.

Wed, 08/05/2009 - 10:11

#1
summary(myFitModel). What should be printed? What should be returned in a list()?


If you run a CFA, the summary() output doesn't tell you what any of the paths are - they are just numbered 1-12.

I would expect summary to generate path names like "from<->to", in this case

g<->g

g->X1

g->X2

...

(NB: no fit statistics are given, but there are placeholders. Is that a bug, or are fit statistics just not done yet?)

require(OpenMx)
data(myFADataRaw); myData = myFADataRaw[, 1:6]  # just the first 6 variables
manifests = names(myData)
# [1] "x1" "x2" "x3" "x4" "x5" "x6"
latents = c("G")
factorModel = mxModel(
    "One Factor", type = "RAM",
    manifestVars = manifests,
    latentVars = latents,
    mxPath(from = latents, to = manifests),
    mxPath(from = manifests, arrows = 2),
    mxPath(from = latents, arrows = 2, free = FALSE, values = 1.0),
    mxData(cov(myData), type = "cov", numObs = 500)
)
summary(mxRun(factorModel))

#Running One Factor
#      x1             x2             x3             x4             x5   x6
# Min. :0.6109   Min. :0.6081   Min. :0.6081   Min. :0.6333   Min.
#...
#   name  parameter estimate  error estimate
#1             0.8038825        47.75491
#2             0.8033414        45.51203
#3             0.7709466        44.89804
#4             0.8262442        45.50276
#5             0.8107035        40.79525
#6             0.8209529        37.36188
#7             0.3510382        52.97120
#8             0.3795362        49.66002
#9             0.3896020        49.53189
#10            0.3207061        56.40536
#11            0.3709618        49.80632
#12            0.3465167        51.79254
#Observed statistics: 21
#Estimated parameters: 12
#Degrees of freedom: 9
#AIC:
#BIC:
#adjusted BIC:
#RMSEA:

I like the idea of arrow notation when a parameter isn't labeled.

Arrow notation will only work for RAM models. I like the idea of making the summary function more informative, but the arrow notation could create disparities between RAM and non-RAM models. Would we tie the arrow notation to "type=RAM" models, or all models using mxRAMObjective?

Could we have unnamed parameters referenced by their matrix and position instead? I would be open to letting the "type" flag change this (i.e., create the paths that Tim was talking about). This referencing style could also pull variable names from "dimnames" arguments for additional clarity.

I think matrix position is a good idea. One reason I never implemented this in Mx1 was that the same parameter would often be in many locations, due to element or whole matrix equality constraints. Well, it was laziness really since it would be easy enough to just report the first occurrence...

The advantage in OpenMx is that there cannot be an equality constraint without there being a label. So any unlabeled parameter has to have a unique location in the model.

I don't see any alternative but to restrict the arrow notation to the RAM objective for now. We could try to be smart and deduce from a quadratic form what is going on (the middle matrix would be double headed, all others multiplying it would be single-headed and would go from column to row etc). This, IMO, would be a bit risky since it might be a completely different matrix specification, and it may not help a whole lot unless row and column labels exist for the matrices in question.

As of revision 696, the summary() function now reports matrix, row, and column information for each free parameter. If the free parameter is used in multiple locations, then only the "first" location is reported for some arbitrary definition of first location.
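To make that first-location rule concrete, here is a rough sketch in plain R (not the actual OpenMx internals — the matrix representation, a list with $free and $labels, is assumed for illustration): unlabeled free parameters each occupy a unique cell, while labeled (possibly equality-constrained) parameters are reported only at their first occurrence.

```r
# Sketch: report the first (matrix, row, col) location of each free parameter.
# Labeled parameters may recur via equality constraints, so only the first
# hit for a label is kept; unlabeled parameters are necessarily unique.
locateFreeParameters <- function(matrices) {
  seen <- character(0)
  out <- data.frame(matrix = character(0), row = integer(0), col = integer(0),
                    label = character(0), stringsAsFactors = FALSE)
  for (mname in names(matrices)) {
    m <- matrices[[mname]]
    idx <- which(m$free, arr.ind = TRUE)   # rows/cols of free cells
    for (k in seq_len(nrow(idx))) {
      r <- idx[k, 1]
      cl <- idx[k, 2]
      lab <- m$labels[r, cl]
      key <- if (is.na(lab)) paste(mname, r, cl) else lab
      if (key %in% seen) next              # equality-constrained: first hit only
      seen <- c(seen, key)
      out <- rbind(out, data.frame(matrix = mname, row = r, col = cl,
                                   label = ifelse(is.na(lab), "", lab),
                                   stringsAsFactors = FALSE))
    }
  }
  out
}
```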

Perhaps matrices should get default labels if the user provides none - that avoids the problem of unclear output either way.

Arrows would still be nice for RAM models if the parameter isn't labeled.

Seconded: Arrows for RAM make sense.

More broadly, I think some way to get a picture output of the path model should be a priority. Much easier to debug a complex script if you can see a picture of what you are requesting. Think of the hamsters saved :-)

If the output of summary is to be like

{label}{from}<-arrow->|{to}|{value}

Then a pretty trivial side-project could be writing a function to take that list, and draw up a path diagram in lattice or ggplot2. There may even be such a package.
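As a proof of concept, the extraction step itself is small. A sketch for RAM-style matrices: walk the asymmetric (A) and symmetric (S) matrices and emit "from->to" / "from<->to" strings for the free parameters. The function name and argument layout are made up for illustration, not the OpenMx API.

```r
# Sketch: derive "from->to" / "from<->to" names for free RAM parameters.
# A is the asymmetric (single-headed) matrix, S the symmetric (double-headed)
# matrix; A is assumed to carry dimnames naming the variables, and free_A /
# free_S are logical matrices marking the free cells.
ramPathNames <- function(A, S, free_A, free_S) {
  vars <- rownames(A)
  paths <- character(0)
  for (i in seq_along(vars)) for (j in seq_along(vars)) {
    if (free_A[i, j])            # single-headed: column (from) -> row (to)
      paths <- c(paths, paste0(vars[j], "->", vars[i]))
    if (j <= i && free_S[i, j])  # symmetric: report lower triangle once
      paths <- c(paths, paste0(vars[j], "<->", vars[i]))
  }
  paths
}
```

From that character vector, a diagram-drawing function would only need to place nodes and route the edges.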

If it only works for RAM that would be a pity. Perhaps we need to think of a way of allowing users to annotate matrix-format models in such a way that (if used) would allow extraction of the paths and objects in the diagram.

Otherwise anything the user ever wants to see drawn as a diagram will have to be done through the GUI, with it maintaining some solipsistic knowledge of what cells in various un-tagged matrices are intended to mean, diagrammatically.

Would it be possible to add a semantic tag to each user-defined parameter matrix (covariance, regression, mean, threshold, OTHER, etc.)? That way the idea of single vs. double arrow is tied to the intent of the user with respect to a specific matrix.

As importantly, it would make that specification available more generally to any model defined in terms of an arbitrary combination of covariance / regression matrices.

This isn't as easy as it might seem. The RAMpath algorithm does organize a path diagram from a set of RAM matrices.

Since many other forms can be (in principle) converted to RAM matrices, solving the RAM case would be a fairly general way to get diagrams for many other forms.

However, automatic diagram generation is not in our original grant proposal, so it is at present left for an enterprising third party, or a future follow-on grant. The RAMpath code is fairly easily implemented, but there is a surprisingly large set of problem conditions that must be accounted for before reasonable diagrams emerge in the general case.

It makes sense.

Given that RAM and LISREL notations are only marginally different, path-centric functions could be mapped onto LISREL matrices easily. We are just getting ready to begin specifying some of our models as LISREL models. At that point, we will try applying path-centric functions to LISREL.

mxModel("foo", type = "LISREL")

And right now, there is a seemingly unnecessary option to mxModel called "type". Currently only "type=RAM" is defined. But it is there so that a "type=LISREL" or whatever could be defined. Typed mxModels exist so that one can enforce rules about how matrices are structured. Untyped mxModels can have any free form matrices they like. Typed mxModels with rules enforced so that path diagrams can be created is one way we envisioned the "type=" argument would be used.

That use of model Type makes sense.

Consider following extension of mxMatrix:

mxMatrix("Lambda", pathType = "->")

mxMatrix("Psi", pathType = "<->")

For LISREL:

(a) {Lambda, Beta, Kappa, Gamma} would be "->" type matrices.

(b) {Psi, Theta} would be "<->" type matrices.

It seems that the "arrow-type" is a matrix-related attribute and could easily be decoupled from a specific model type.

If each model matrix is tagged as one of {->, <->, "other"}, then a new model type such as LISREL would inherit path-centric functions for each matrix type, regardless of how a specific matrix such as Lambda is used within the model. This would be useful at least for the purpose of model specification (frontend) and for printing the output (as per tbates' suggestion).

Creating path diagrams from model matrices would require model-specific implementation. Hopefully, tagging each matrix as one of {->, <->, "other"} would be a step toward a model-agnostic / generic translation of path-centric model specifications to path diagrams.

Actually, the way we had envisioned it, if you created a model

mymodel <- mxModel("foo", type = "LISREL")

then the lambda, beta, kappa, gamma, psi, and theta matrices would automatically be created for you. All you'd need to do is put elements into the matrices. And the pathTypes would be unnecessary because lambda would always be a "->" matrix, etc. The objective function would also automatically be specified.

From there, we could, when someone asked for a diagram, automatically convert to RAM matrices and from there to the diagram. That way there is only one diagram writing routine (a big plus), while all we need to do for the conversion is a little matrix algebra (and we've got an engine that does that!).
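That matrix algebra is small indeed. A hedged sketch of the LISREL-to-RAM assembly for a simplified all-y specification, ordering the combined variable vector as (manifest, latent); the function name and argument layout are made up for illustration:

```r
# Sketch: assemble RAM A (single-headed) and S (double-headed) matrices from
# a simplified all-y LISREL specification. Lambda: latent -> manifest
# loadings; Beta: latent -> latent regressions; Theta: manifest residual
# (co)variances; Psi: latent (co)variances. F filters out the manifests.
lisrelToRAM <- function(Lambda, Beta, Theta, Psi) {
  p <- nrow(Lambda)  # number of manifest variables
  q <- ncol(Lambda)  # number of latent variables
  A <- rbind(cbind(matrix(0, p, p), Lambda),
             cbind(matrix(0, q, p), Beta))
  S <- rbind(cbind(Theta, matrix(0, p, q)),
             cbind(matrix(0, q, p), Psi))
  Fm <- cbind(diag(p), matrix(0, p, q))  # filter selecting manifests
  list(A = A, S = S, F = Fm)
}
```

The implied covariance then follows the usual RAM expression, F (I - A)^-1 S (I - A)^-T F', so a single RAM-based diagram routine could serve both specifications.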

We should add:

#Non-linear equality constraints: 2

and make df reflect these (credit of 1 observed statistic per constraint).

Also, sometimes we want to let users change the df directly (with user-defined fit functions this is often necessary). So, if that option is used we should print:

#User-requested change to Degrees of Freedom (print as +x or -x)
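The proposed accounting reduces to one line; a sketch (the function name is hypothetical):

```r
# Sketch of the df accounting proposed above: each nonlinear equality
# constraint credits one observed statistic, and an optional user-requested
# adjustment is applied (and should be echoed as +x / -x in the output).
degreesOfFreedom <- function(observed, estimated, nConstraints = 0,
                             userAdjustment = 0) {
  (observed + nConstraints) - estimated + userAdjustment
}
```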

Hi.

One thing I think would be nice to print are the confidence limits for the RMSEA. My MBESS package easily takes the sufficient statistics and returns the exact confidence limits (based on the noncentral chi square distribution, which is also an MBESS function).

For example:

require(MBESS)

ci.rmsea(rmsea=.05, df=10, N=100, conf.level = 0.90)

which returns

$Lower.Conf.Limit

[1] 0

$RMSEA

[1] 0.05

$Upper.Conf.Limit

[1] 0.1259771

I'm interested in the confidence limits of the RMSEA (which is why I created the function), so this suggestion is a bit self-serving. However, I think many users would also want the confidence limits.
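For reference, the computation underneath is just an inversion of the noncentral chi-square CDF in the noncentrality parameter, which base R can do directly. A sketch (not the MBESS source; MBESS should be preferred for the exact limits):

```r
# Sketch of the RMSEA confidence-interval computation: recover the implied
# chi-square, invert the noncentral chi-square CDF for the noncentrality
# parameter at each tail, then rescale back to the RMSEA metric.
ciRmsea <- function(rmsea, df, N, conf.level = 0.90) {
  alpha <- 1 - conf.level
  chisq <- rmsea^2 * df * (N - 1) + df   # chi-square implied by the RMSEA
  ncpFor <- function(p) {                # ncp s.t. P(X <= chisq | df, ncp) = p
    if (pchisq(chisq, df, ncp = 0) < p) return(0)
    uniroot(function(ncp) pchisq(chisq, df, ncp = ncp) - p,
            lower = 0, upper = chisq + 100 * sqrt(2 * chisq))$root
  }
  toRmsea <- function(ncp) sqrt(max(ncp, 0) / (df * (N - 1)))
  list(lower = toRmsea(ncpFor(1 - alpha / 2)),
       upper = toRmsea(ncpFor(alpha / 2)))
}
```

ciRmsea(0.05, 10, 100) should reproduce the limits quoted above to within numerical tolerance.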

Thanks for your work on this thus far,

Ken

seconded: As well as knowing the RMSEA has passed some threshold, readers often want to know if they can have confidence in that threshold having been exceeded.

Good suggestion. I like it.

The fit statistics output at the moment are very rudimentary. I suggest at least adding the chi-square and its p-value to the output, as is standard in every SEM package. The SRMR would also be nice.

On a related note, the fitStatistics function is broken at the moment because the required variables in fitStatistics have different names than those in the @output object. I don't know which are the canonical variable names, but I have prepared a small patch that fixes the summary and fitStatistics functions and adds the above functionality.

The fitStatistics function is also the right place to add the CI for RMSEA.

Is there a simple way to send patches?

Things are happening fast on the summary function. For now, please email your patch to Michael Spiegel, as he is the one committing changes to that part of the code. We're not ready to open up the svn repository to outside commits yet. Thanks for your contribution!

Hi, Michael. To submit a patch, I would recommend creating an "Issue", under "Developer" on the left, and attaching the patch to that issue. Thanks and congrats; you're the first person outside the core team to offer a patch!

Should we be seeing fit statistics on models now?

The mxVersion() function is updated when a new binary release is made available to the users. In between binary releases, the function is not updated whenever a change is made to the source code repository. I'm running revision 720 (svn update) and I see the following output on your example:

Observed statistics: 15
Estimated parameters: 10
Degrees of freedom: 5
AIC: -3658.281
BIC: -1839.677
adjusted BIC:
RMSEA: 0.03088043

OK: working here now. But I had to quit R and reboot to get the new version to replace the old one.

If mxVersion() doesn't work, is there any way to tell what version is loaded?

Is there a recommended way to unload or force reload?

Probably belongs in installation forum.

I've been reading over the posts on fit statistics, but I've noticed that no one has mentioned any incremental fit indices. Would it be possible to see the TLI/NNFI, the CFI, or both as part of the summary() output? I do understand the programming issues that come with automatically calculating a default null model, so if that's not possible, providing a likelihood-ratio statistic as part of the output, so the user could calculate his or her own incremental fit statistics, would be nice.
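If a null-model chi-square were available (or supplied by the user), the indices themselves are only a few lines. A sketch using the standard formulas (the function name is made up):

```r
# Sketch: standard incremental fit indices, computed from the fitted model's
# chi-square/df and those of the independence (null) model.
incrementalFit <- function(chisq, df, nullChisq, nullDf) {
  d  <- max(chisq - df, 0)               # model noncentrality estimate
  d0 <- max(nullChisq - nullDf, d)       # null noncentrality estimate
  cfi <- if (d0 == 0) 1 else 1 - d / d0
  tli <- (nullChisq / nullDf - chisq / df) / (nullChisq / nullDf - 1)
  list(CFI = cfi, TLI = tli)
}
```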

There has been some discussion of an mxCompare function or something like that in order to calculate and present model comparisons. As of yet, that is still just a gleam in some of our eyes.