Hello all,

I am trying to get the correct standard errors for standardized regression coefficients. According to Bentler and Lee (1983) this can be done using nonlinear constraints. I have tried two different model specifications with nonlinear constraints, both of which yield the correct parameter estimates, but neither give correct standard errors (they are returning zeros). Any help would be appreciated. My R code and output are below.

Thanks,

Jeff

Bentler, P. M. & Lee, S. (1983). Covariance structures under polynomial constraints: Applications to correlation and alpha-type structural models. Journal of Educational Statistics, 8, 207-222.

#################

#################

R <- matrix(c( 1,.3,.4,

.3, 1,.5,

.4,.5, 1),3)

manifests <- c("y","x1","x2")

colnames(R) <- rownames(R) <- manifests

###########

# Model 1 #

###########

m1 <- mxModel("Regression of y on x1 and x2 (standardized)",

type="RAM",

manifestVars=manifests,

mxPath(from=c("x1","x2"), to="y",

arrows=1,

free=TRUE, values=.3, labels=c("b1", "b2")),

mxPath(from=c("x1","x2"),

arrows=2,

free=TRUE, values=1,

labels=c("VarX1", "VarX2")),

mxPath(from="y",

arrows=2,

free=TRUE,values=.8,labels="VarE"),

mxPath(from="x1", to="x2",

arrows=2,

free=TRUE, values=.3,

labels=c("CovX1X2")),

# nonlinear constraint: Y to unit variance

mxAlgebra(expression=b1^2 + b2^2 + 2*b1*b2*CovX1X2+VarE, name="VarY"),

mxMatrix( type="Unit", nrow=1, ncol=1, name="I"),

mxConstraint(I == VarY, name="VarYEqualOne"),

mxData(observed=R, type="cov", numObs=100)

)

m1 <- mxOption(model=m1,key="Standard Errors",value="Yes")

m1.run <- mxRun(m1)

Running Regression of y on x1 and x2 (standardized)

summary(m1.run)

data:

$`Regression of y on x1 and x2 (standardized).data`

$`Regression of y on x1 and x2 (standardized).data`$cov

y x1 x2

y 1.0 0.3 0.4

x1 0.3 1.0 0.5

x2 0.4 0.5 1.0

free parameters:

name matrix row col Estimate Std.Error

1 b1 A y x1 0.1333333 2.128634e-314

2 b2 A y x2 0.3333333 2.128634e-314

3 VarE S y y 0.8266667 2.128634e-314

4 VarX1 S x1 x1 0.9999993 2.128634e-314

5 CovX1X2 S x1 x2 0.4999993 2.128634e-314

6 VarX2 S x2 x2 0.9999993 2.130209e-314

observed statistics: 7

Constraint 'VarYEqualOne' contributes 1 observed statistic.

estimated parameters: 6

degrees of freedom: 1

-2 log likelihood: 249.6745

saturated -2 log likelihood: 249.6745

number of observations: 100

chi-square: 4.672529e-11

p: 0.9999945

AIC (Mx): -2

BIC (Mx): -2.302585

adjusted BIC:

RMSEA: NA

timestamp: 2010-11-05 11:22:50

frontend time: 0.1519690 secs

backend time: 0.001686096 secs

independent submodels time: 7.70092e-05 secs

wall clock time: 0.1537321 secs

cpu time: 0.1537321 secs

openmx version number: 1.0.1-1464

###########

# Model 2 #

###########

# covariance matrix that matches the correlation matrix

# in model 1

COV <- matrix(c(100,60 ,60,

60,400,150,

60,150,225),3)

colnames(COV) <- rownames(COV) <- manifests

D <- diag(sqrt(diag(COV)[2:3]))

sdy <- sqrt(COV[1,1])

m2 <- mxModel("Regression of y on x1 and x2 (standardized)",

type="RAM",

manifestVars=manifests,

mxPath(from=c("x1","x2"), to="y",

arrows=1,

free=TRUE, values=.2, labels=c("b1", "b2")),

mxPath(from=c("x1","x2"),

arrows=2,

free=TRUE, values=300,

labels=c("VarX1", "VarX2")),

mxPath(from="y",

arrows=2,

free=TRUE,values=.8,labels="VarE"),

mxPath(from="x1", to="x2",

arrows=2,

free=TRUE, values=100,

labels=c("CovX1X2")),

# nonlinear constraints

mxAlgebra(exp=sdy*(t(rbind(b1,b2))%*%R[2:3,2:3]%*%rbind(b1,b2)+VarE)*sdy,name="c1"),

mxAlgebra(exp=sdy*(b1+R[2,3]*b2)*D[1,1],name="c2"),

mxAlgebra(exp=sdy*(b2+R[2,3]*b1)*D[2,2],name="c3"),

mxConstraint(c1 == COV[1,1],name="nonlinear1"),

mxConstraint(c2 == COV[1,2],name="nonlinear2"),

mxConstraint(c3 == COV[1,3],name="nonlinear3"),

mxData(observed=COV, type="cov", numObs=100)

)

m2 <- mxOption(m2,"Standard Errors","Yes")

m2 <- mxOption(m2,"Major iterations","10000")

m2 <- mxRun(m2)

Running Regression of y on x1 and x2 (standardized)

summary(m2)

data:

$`Regression of y on x1 and x2 (standardized).data`

$`Regression of y on x1 and x2 (standardized).data`$cov

y x1 x2

y 100 60 60

x1 60 400 150

x2 60 150 225

free parameters:

name matrix row col Estimate Std.Error

1 b1 A y x1 0.1333333 2.121996e-314

2 b2 A y x2 0.3333333 2.121996e-314

3 VarE S y y 0.8266667 0.000000e+00

4 VarX1 S x1 x1 398.2965743 2.121996e-314

5 CovX1X2 S x1 x2 149.1088443 2.121996e-314

6 VarX2 S x2 x2 224.3940013 2.322230e-314

observed statistics: 9

Constraint 'nonlinear1' contributes 1 observed statistic.

Constraint 'nonlinear2' contributes 1 observed statistic.

Constraint 'nonlinear3' contributes 1 observed statistic.

estimated parameters: 6

degrees of freedom: 3

-2 log likelihood: 11991.72

saturated -2 log likelihood: 1834.935

number of observations: 100

chi-square: 10156.78

p: 0

AIC (Mx): 10150.78

BIC (Mx): 5071.484

adjusted BIC:

RMSEA: 5.817727

timestamp: 2010-11-05 11:22:51

frontend time: 0.3465281 secs

backend time: 1.01469 secs

independent submodels time: 6.29425e-05 secs

wall clock time: 1.361281 secs

cpu time: 1.361281 secs

openmx version number: 1.0.1-1464

The reason that standard errors come out as NAs when you include constraints is that the inverse Hessian does not, in general, provide correct standard errors in the presence of arbitrary constraints. Thus we do not report them.

When you included:

you overwrote the default, and sure enough, the standard errors aren't correct.

But there is a way to do this.

1. Remove the mxOption() statement.

2. Add the following to your model to calculate confidence intervals for "b1" and "b2". Of course, you could calculate confidence intervals for all your estimates, if you like.

3. Then turn on confidence interval calculation in the mxRun statement.

That's all there is to it.

From the mxCI version:

From inverse Hessian model of the correlation matrix without constraints:

In this case, the mxCI estimates are a little tighter than the inverse Hessian estimates, but that is not guaranteed to be the case. In general I would be more likely to trust the likelihood based (mxCI) estimates over inverse Hessian estimates of the parameter variability.

Hope this helps!

>The reason that standard errors come out as NAs when you include constraints is that the inverse

> Hessian does not, in general, provide correct standard errors in the presence of arbitrary

> constraints. Thus we do not report them.

Might be helpful to add a note in the summary when SE <- NA

Something like "Constraints preclude automatic calculation of Standard Errors. To determine CIs, rewrite to avoid constraints (perhaps using standardized data) or, more simply, add a request to mxCI() into your model. e.g.:

mxCI(c("b1", "b2"))

Thanks for the help!

I have one question. Do you happen to have a reference discussing why the inverse Hessian under arbitrary constraints does not yield correct standard errors?

Thanks again,

Jeff

This issue is more technical than scholarly. There are two 'flavors' of Hessian that OpenMx generates. One is estimated during optimization - it is used by the optimizer to find the minimum. The second is calculated after optimization by directly adjusting the parameter values and recording how much the objective changes. The reason for the second direct step (even though it can be costly in terms of CPU) is that the first method is sometimes inaccurate, whether there are non-linear constraints (specified with mxConstraint()) or not. Unfortunately, the second method does not take mxConstraint()s into consideration, so it too yields an inaccurate Hessian when there are constraints in the model. Hence no SE's for models with this type of constraint. This limitation may be lifted in a future version of OpenMx.

Surely the Hessian is a headless rider in Sleepy Hollow? :-)

I sometimes also need to have standard errors rather than confidence intervals. It is because standard errors of the summary statistics can be used in further analyses, e.g., meta-analysis. One possibility that I am thinking is parametric bootstrap. The following code estimates a bootstrapped SE for Jeff's example. It is not optimized for performance. For example, it is better to generate covariance samples directly instead of to generate raw data.

library(MASS)

set.seed(10001)

paraboot <- function(R, n, mx.fit) {

mx.fit@data@observed <- cov(mvrnorm(n=n, mu=c(0,0,0), Sigma=R))

mx.refit <- mxRun(mx.fit, silent=TRUE, suppressWarnings=TRUE)

return(mxEval(c(b1,b2), mx.refit))

}

bootCov <- cov(t(replicate(200, paraboot(R=R, n=100, mx.fit=m1.run))))

dimnames(bootCov) <- list(c("b1","b2"),c("b1","b2"))

bootCov

sqrt(diag(bootCov))

bootCov:

b1 b2

b1 0.010148215 -0.006035671

b2 -0.006035671 0.011095607

Standard errors:

b1 b2

0.1007383 0.1053357

Nice!