trying to get correct standard errors for standardized regression coefficients

Posted by jone1087 (Joined: 11/05/2010)

Hello all,

I am trying to get the correct standard errors for standardized regression coefficients. According to Bentler and Lee (1983), this can be done using nonlinear constraints. I have tried two different model specifications with nonlinear constraints, both of which yield the correct parameter estimates, but neither gives correct standard errors (they come back as essentially zero). Any help would be appreciated. My R code and output are below.

Thanks,

Jeff

Bentler, P. M. & Lee, S. (1983). Covariance structures under polynomial constraints: Applications to correlation and alpha-type structural models. Journal of Educational Statistics, 8, 207-222.

#################
#################

# population correlation matrix
R <- matrix(c( 1, .3, .4,
              .3,  1, .5,
              .4, .5,  1), nrow=3)

manifests <- c("y","x1","x2")
colnames(R) <- rownames(R) <- manifests

###########
# Model 1 #
###########

m1 <- mxModel("Regression of y on x1 and x2 (standardized)",
    type="RAM",
    manifestVars=manifests,

    mxPath(from=c("x1","x2"), to="y",
           arrows=1,
           free=TRUE, values=.3, labels=c("b1", "b2")),

    mxPath(from=c("x1","x2"),
           arrows=2,
           free=TRUE, values=1,
           labels=c("VarX1", "VarX2")),

    mxPath(from="y",
           arrows=2,
           free=TRUE, values=.8, labels="VarE"),

    mxPath(from="x1", to="x2",
           arrows=2,
           free=TRUE, values=.3,
           labels="CovX1X2"),

    # nonlinear constraint: fix the variance of y at 1
    # (the algebra omits VarX1 and VarX2, which is fine here because the
    # data are a correlation matrix, so both are estimated at 1)
    mxAlgebra(expression=b1^2 + b2^2 + 2*b1*b2*CovX1X2 + VarE, name="VarY"),
    mxMatrix(type="Unit", nrow=1, ncol=1, name="I"),
    mxConstraint(I == VarY, name="VarYEqualOne"),

    mxData(observed=R, type="cov", numObs=100)
)

m1 <- mxOption(model=m1, key="Standard Errors", value="Yes")

m1.run <- mxRun(m1)
Running Regression of y on x1 and x2 (standardized)
summary(m1.run)
data:
$`Regression of y on x1 and x2 (standardized).data`
$`Regression of y on x1 and x2 (standardized).data`$cov
     y  x1  x2
y  1.0 0.3 0.4
x1 0.3 1.0 0.5
x2 0.4 0.5 1.0

free parameters:
     name matrix row col  Estimate     Std.Error
1      b1      A   y  x1 0.1333333 2.128634e-314
2      b2      A   y  x2 0.3333333 2.128634e-314
3    VarE      S   y   y 0.8266667 2.128634e-314
4   VarX1      S  x1  x1 0.9999993 2.128634e-314
5 CovX1X2      S  x1  x2 0.4999993 2.128634e-314
6   VarX2      S  x2  x2 0.9999993 2.130209e-314

observed statistics: 7
Constraint 'VarYEqualOne' contributes 1 observed statistic.
estimated parameters: 6
degrees of freedom: 1
-2 log likelihood: 249.6745
saturated -2 log likelihood: 249.6745
number of observations: 100
chi-square: 4.672529e-11
p: 0.9999945
AIC (Mx): -2
BIC (Mx): -2.302585
adjusted BIC:
RMSEA: NA
timestamp: 2010-11-05 11:22:50
frontend time: 0.1519690 secs
backend time: 0.001686096 secs
independent submodels time: 7.70092e-05 secs
wall clock time: 0.1537321 secs
cpu time: 0.1537321 secs
openmx version number: 1.0.1-1464

###########
# Model 2 #
###########

# covariance matrix that matches the correlation matrix
# in model 1

COV <- matrix(c(100,  60,  60,
                 60, 400, 150,
                 60, 150, 225), nrow=3)

colnames(COV) <- rownames(COV) <- manifests

D <- diag(sqrt(diag(COV)[2:3]))   # SDs of x1 and x2 on the diagonal
sdy <- sqrt(COV[1,1])             # SD of y

m2 <- mxModel("Regression of y on x1 and x2 (standardized)",
    type="RAM",
    manifestVars=manifests,

    mxPath(from=c("x1","x2"), to="y",
           arrows=1,
           free=TRUE, values=.2, labels=c("b1", "b2")),

    mxPath(from=c("x1","x2"),
           arrows=2,
           free=TRUE, values=300,
           labels=c("VarX1", "VarX2")),

    mxPath(from="y",
           arrows=2,
           free=TRUE, values=.8, labels="VarE"),

    mxPath(from="x1", to="x2",
           arrows=2,
           free=TRUE, values=100,
           labels="CovX1X2"),

    # nonlinear constraints: the standardized parameters must reproduce
    # the first row of the observed covariance matrix
    mxAlgebra(expression=sdy*(t(rbind(b1,b2)) %*% R[2:3,2:3] %*% rbind(b1,b2) + VarE)*sdy, name="c1"),
    mxAlgebra(expression=sdy*(b1 + R[2,3]*b2)*D[1,1], name="c2"),
    mxAlgebra(expression=sdy*(b2 + R[2,3]*b1)*D[2,2], name="c3"),

    mxConstraint(c1 == COV[1,1], name="nonlinear1"),
    mxConstraint(c2 == COV[1,2], name="nonlinear2"),
    mxConstraint(c3 == COV[1,3], name="nonlinear3"),

    mxData(observed=COV, type="cov", numObs=100)
)

m2 <- mxOption(m2, "Standard Errors", "Yes")
m2 <- mxOption(m2, "Major iterations", "10000")

m2 <- mxRun(m2)
Running Regression of y on x1 and x2 (standardized)
summary(m2)
data:
$`Regression of y on x1 and x2 (standardized).data`
$`Regression of y on x1 and x2 (standardized).data`$cov
     y  x1  x2
y  100  60  60
x1  60 400 150
x2  60 150 225

free parameters:
     name matrix row col    Estimate     Std.Error
1      b1      A   y  x1   0.1333333 2.121996e-314
2      b2      A   y  x2   0.3333333 2.121996e-314
3    VarE      S   y   y   0.8266667  0.000000e+00
4   VarX1      S  x1  x1 398.2965743 2.121996e-314
5 CovX1X2      S  x1  x2 149.1088443 2.121996e-314
6   VarX2      S  x2  x2 224.3940013 2.322230e-314

observed statistics: 9
Constraint 'nonlinear1' contributes 1 observed statistic.
Constraint 'nonlinear2' contributes 1 observed statistic.
Constraint 'nonlinear3' contributes 1 observed statistic.
estimated parameters: 6
degrees of freedom: 3
-2 log likelihood: 11991.72
saturated -2 log likelihood: 1834.935
number of observations: 100
chi-square: 10156.78
p: 0
AIC (Mx): 10150.78
BIC (Mx): 5071.484
adjusted BIC:
RMSEA: 5.817727
timestamp: 2010-11-05 11:22:51
frontend time: 0.3465281 secs
backend time: 1.01469 secs
independent submodels time: 6.29425e-05 secs
wall clock time: 1.361281 secs
cpu time: 1.361281 secs
openmx version number: 1.0.1-1464

Replied on Sun, 11/07/2010 - 11:12 by Steve (Joined: Jul 30, 2009)

The reason that standard errors come out as NAs when you include constraints is that the inverse Hessian does not, in general, provide correct standard errors in the presence of arbitrary constraints. Thus we do not report them.

When you included:

m1 <- mxOption(model=m1,key="Standard Errors",value="Yes")

you overwrote the default, and sure enough, the standard errors aren't correct.

But there is a way to do this.

1. Remove the mxOption() statement.

2. Add the following to your model to calculate confidence intervals for "b1" and "b2". Of course, you could calculate confidence intervals for all your estimates, if you like.

mxCI(c("b1", "b2")),

3. Then turn on confidence interval calculation in the mxRun statement.

m1.run <- mxRun(m1, intervals=TRUE)

That's all there is to it.
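
Putting the three steps together (a minimal sketch; the model definition is unchanged from above):

# add the CI request to the existing model, then run with intervals on
m1 <- mxModel(m1, mxCI(c("b1", "b2")))
m1.run <- mxRun(m1, intervals=TRUE)
summary(m1.run)   # the confidence intervals appear in the summary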

From the mxCI version:

confidence intervals:
        lbound  estimate    ubound
b1 -0.07510258 0.1333333 0.3279935
b2  0.12635554 0.3333333 0.5064689

From inverse Hessian model of the correlation matrix without constraints:

# b1
> 0.1333332 - (1.96 * 0.1055161)
[1] -0.07347836
> 0.1333332 + (1.96 * 0.1055161)
[1] 0.3401448

# b2
> 0.3333333 - (1.96 * 0.1055160)
[1] 0.1265219
> 0.3333333 + (1.96 * 0.1055160)
[1] 0.5401447

In this case, the mxCI estimates are a little tighter than the inverse-Hessian estimates, but that is not guaranteed to be the case. In general, I would be more likely to trust the likelihood-based (mxCI) estimates over inverse-Hessian estimates of parameter variability.
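
As a rough check, the width of a likelihood-based interval divided by 2 * 1.96 gives an SE-like number. This is only a back-of-the-envelope approximation, and it assumes the profile likelihood is roughly symmetric:

# approximate SEs backed out of the mxCI limits (symmetry assumed)
(0.3279935 - (-0.07510258)) / (2 * 1.96)   # b1: ~0.103
(0.5064689 - 0.12635554)   / (2 * 1.96)    # b2: ~0.097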

Hope this helps!

Replied on Mon, 11/08/2010 - 03:49 by tbates (Joined: Jul 31, 2009), in reply to Steve

>The reason that standard errors come out as NAs when you include constraints is that the inverse
> Hessian does not, in general, provide correct standard errors in the presence of arbitrary
> constraints. Thus we do not report them.

It might be helpful to add a note in the summary when the SEs come back as NA.

Something like: "Constraints preclude automatic calculation of standard errors. To determine CIs, rewrite the model to avoid constraints (perhaps by using standardized data) or, more simply, add a request to mxCI() to your model," e.g.:

mxCI(c("b1", "b2"))

Replied on Thu, 11/11/2010 - 08:40 by neale (Joined: Jul 31, 2009), in reply to jone1087

This issue is more technical than scholarly. There are two 'flavors' of Hessian that OpenMx generates. One is estimated during optimization; it is used by the optimizer to find the minimum. The second is calculated after optimization by directly perturbing the parameter values and recording how much the objective changes. The reason for this second, direct step (even though it can be costly in terms of CPU) is that the first method is sometimes inaccurate, whether or not there are nonlinear constraints (specified with mxConstraint()) in the model. Unfortunately, the second method does not take mxConstraint()s into consideration either, so it too yields an inaccurate Hessian when there are constraints in the model. Hence no SEs for models with this type of constraint. This limitation may be lifted in a future version of OpenMx.
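
For what it's worth, the second flavor can be imitated from R with a numerical Hessian of the fit function. Below is a minimal sketch using the numDeriv package (minus2LL is a hypothetical helper, and the useOptimizer argument assumes a reasonably recent OpenMx). Exactly as described above, it ignores the constraint, so the resulting SEs are not correct for constrained models:

library(numDeriv)

labs <- names(omxGetParameters(m1.run))   # free-parameter labels
minus2LL <- function(p, model) {
  # evaluate -2 log likelihood at parameter vector p without re-optimizing
  model <- omxSetParameters(model, labels=labs, values=p)
  mxRun(model, useOptimizer=FALSE, silent=TRUE)@output$minimum
}

H  <- hessian(minus2LL, omxGetParameters(m1.run), model=m1)
se <- sqrt(2 * diag(solve(H)))   # SEs: 2 * inverse Hessian of -2LL on the diagonal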

Replied on Thu, 11/11/2010 - 21:42 by Mike Cheung (Joined: Oct 08, 2009), in reply to tbates

I sometimes also need standard errors rather than confidence intervals, because standard errors of summary statistics can be used in further analyses, e.g., meta-analysis. One possibility I am considering is a parametric bootstrap. The following code estimates bootstrapped SEs for Jeff's example. It is not optimized for performance; for example, it would be better to generate covariance samples directly instead of generating raw data (a sketch of that variant follows the output below).

library(MASS)
set.seed(10001)

paraboot <- function(R, n, mx.fit) {
  # simulate raw data from the population matrix, replace the model's
  # observed covariance matrix, refit, and return the two path estimates
  mx.fit@data@observed <- cov(mvrnorm(n=n, mu=c(0,0,0), Sigma=R))
  mx.refit <- mxRun(mx.fit, silent=TRUE, suppressWarnings=TRUE)
  return(mxEval(c(b1, b2), mx.refit))
}

bootCov <- cov(t(replicate(200, paraboot(R=R, n=100, mx.fit=m1.run))))
dimnames(bootCov) <- list(c("b1","b2"), c("b1","b2"))
bootCov
sqrt(diag(bootCov))

bootCov:
            b1           b2
b1  0.010148215 -0.006035671
b2 -0.006035671  0.011095607

Standard errors:
       b1        b2
0.1007383 0.1053357
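
And here is a sketch of the faster variant mentioned above, drawing sample covariance matrices directly rather than simulating raw data (it assumes stats::rWishart is available; paraboot2 is just an illustrative name):

# (n-1) * S has a Wishart(n-1, R) distribution, so sample S directly
paraboot2 <- function(R, n, mx.fit) {
  S <- rWishart(1, df=n-1, Sigma=R)[,,1] / (n-1)
  dimnames(S) <- dimnames(R)
  mx.fit@data@observed <- S
  mx.refit <- mxRun(mx.fit, silent=TRUE, suppressWarnings=TRUE)
  mxEval(c(b1, b2), mx.refit)
}

bootCov2 <- cov(t(replicate(200, paraboot2(R=R, n=100, mx.fit=m1.run))))
sqrt(diag(bootCov2))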