mxFitFunctionAlgebra question

carey
mxFitFunctionAlgebra question
Attachments: macOutput.txt (3.45 KB), linuxOutput.txt (3.55 KB)

Attached are two outputs, one from OpenMx 2.0.1.4157 using OS X on a Mac and the other from OpenMx 2.2.2 on Red Hat Linux. In both situations the models converge when I use mxExpectationNormal and mxFitFunctionML. But when I manually set the fit function to -log(likelihood) using mxFitFunctionAlgebra, mxRun crashes in both operating systems (albeit with different errors).
Am perplexed.
Greg

mhunter
Two issues

I can see at least two reasons why the algebra fit function wouldn't work. First, the algebra used is not the minus 2 log likelihood.

# What you have
fval <- mxAlgebra(.5*(N - 1)*(log(det(preCov)) +
         tr(obsCov %*% solve(preCov))), name="fval")

# What it should be: subtracting log(det(obsCov)) and p = 4 (the number of
# observed variables) makes this the full -2logL for covariance-matrix data
fval <- mxAlgebra((N - 1)*(log(det(preCov)) - log(det(obsCov)) +
         tr(obsCov %*% solve(preCov)) - 4), name="fval")

Yes, it's just a scale and location adjustment on the fit function, but when weird things are happening it's often best to eliminate unnecessary sources of variability.

Second, the model is not identified without an additional constraint (in the form of an mxAlgebra) on the uniquenesses.

mxMatrix("Unit", 4, 1, name="vectorofOnes")
mxAlgebra(vectorofOnes - (diag2vec(fpat %*% t(fpat))) , name="u")
# drop 
# umat <- mxMatrix(name="u", type="Diag", nrow=4, free=T,
#                  values=u)
mxAlgebra(fpat %*% t(fpat) + vec2diag(u), name="preCov"),
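
For concreteness, here is a hedged sketch of how those pieces might be assembled into a model (the object names uAlg and modC are mine; fpat, obsData, and fval are assumed from the full script later in this thread). Note that this parameterization fixes the diagonal of preCov at 1, so it treats the observed matrix as a correlation matrix:

# Hypothetical assembly; assumes fpat, obsData, and fval as defined below
modC <- mxModel("constrainedU",
                fpat, vectorofOnes, uAlg, preCov, obsData, fval,
                mxFitFunctionAlgebra("fval"))
modCResults <- mxRun(modC)
summary(modCResults)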
carey
More perplexed

Rob and Mike,

(1) Minimizing -log(L) and minimizing -2log(L) lead to the same solution, given that the first derivatives of both functions converge to values close to 0 at the minimum. If there is any advantage, it goes to -log(L), because its derivatives are half those of -2log(L).

(2) As I pointed out years ago in this forum (openmx.psyc.virginia.edu/thread/360), minimizing -2log(L) will give incorrect standard errors. In fact, in other circumstances mxFitFunctionAlgebra() gave me the wrong standard errors; that is why I started exploring the issue.

(3) The model is indeed identified. Otherwise, the MxModel called mod1 would not have converged to the correct solution. (I verified that with a completely different minimizer before I posted.)

AdminRobK
No safeguard on the parameter space

Greg, the major problem with your MxModel using the algebra fitfunction is that there is no safeguard that keeps the model-predicted covariance matrix within the parameter space, i.e., the set of all symmetric positive-definite matrices. The built-in ML fitfunction has a "soft" feasibility constraint, in that it returns a non-finite value if it is passed a non-PD covariance matrix; the optimizer then tries to work around that implicit boundary. In the script below, I show that, for this particular problem, ensuring that the unique variances are strictly positive appears to be sufficient to keep the predicted covariance matrix within the parameter space.

require(OpenMx)
set.seed(40850)
f <- c(.7, .9, .2, .8)
u <- diag(1 - f*f)
Sigma <- f %*% t(f) + u
N <- 1000
require(MASS)
X <- mvrnorm(N, rep(0, 4), Sigma)
obsCov <- cov(X)
vn <- paste("V", 1:4, sep="")
dn <- list(vn, vn)
dimnames(obsCov) <- dn
colnames(X) <- vn
 
 
fpat <- mxMatrix(name="fpat", type="Full", nrow=4, ncol=1, free=T, values=f)
umat <- mxMatrix(name="u", type="Diag", nrow=4, free=T, values=u)
preCov <- mxAlgebra(fpat %*% t(fpat) + u, name="preCov", dimnames=dn)
obsData <- mxData(obsCov, type="cov", numObs=N)
#This is -logL (up to an additive constant), after dropping terms that do not depend on the parameters:
fval <- mxAlgebra(.5*(1000 - 1)*(log(det(preCov)) + tr(obsCov %*% solve(preCov))), name="fval")
#Equivalent, retaining the saturated-model terms:
#fval <- mxAlgebra(0.5*(1000 - 1)*(log(det(preCov)) - log(det(obsCov)) + tr(obsCov %*% solve(preCov)) - 4), name="fval")
mod1 <- mxModel("model1_150528", fpat, umat, preCov, obsData, fval,
                mxExpectationNormal("preCov", dimnames=vn), mxFitFunctionML())
mod1Results <- mxRun(mod1)
( mod1Summ <- summary(mod1Results) )
 
 
#Enforce strictly positive unique variances:
umat <- mxMatrix(name="u", type="Diag", nrow=4, free=T, values=u, lbound=0.0001)
mod2 <- mxModel("model2_150528", fpat, umat, preCov, obsData, fval, mxFitFunctionAlgebra("fval"))
#OpenMx will assume that the algebra fitfunction is -2logL, so its automatic SEs will be wrong:
mod2 <- mxOption(mod2, "Standard Errors", "No")
mod2Results <- mxRun(mod2)
( mod2Summ <- summary(mod2Results) )
 
 
#Compare fitfunction minima:
mod1Results$output$minimum
mod2Results$output$minimum * 2
#Compare parameter-estimate tables:
mod1Summ$par
mod2Summ$par
#Compare standard errors:
mod1Results$output$standardErrors
matrix(sqrt(diag(solve(mod2Results$output$hessian))),dimnames = list(rownames(mod2Results$output$hessian),NULL))

You will further see that the fitfunction values at the solution, the point estimates, and the standard errors agree between mod1 (using built-in ML) and mod2 (using your algebra fitfunction):

> #Compare fitfunction minima:
> mod1Results$output$minimum
[1] 2560.021
> mod2Results$output$minimum * 2
[1] 2560.021
> #Compare parameter-estimate tables:
> mod1Summ$par
                     name matrix row col  Estimate  Std.Error lbound ubound lboundMet uboundMet
1 model1_150528.fpat[1,1]   fpat   1   1 0.6922554 0.02908638     NA     NA     FALSE     FALSE
2 model1_150528.fpat[2,1]   fpat   2   1 0.8559914 0.02785441     NA     NA     FALSE     FALSE
3 model1_150528.fpat[3,1]   fpat   3   1 0.1512036 0.03283377     NA     NA     FALSE     FALSE
4 model1_150528.fpat[4,1]   fpat   4   1 0.7908544 0.02825001     NA     NA     FALSE     FALSE
5    model1_150528.u[1,1]      u   1   1 0.4860006 0.02650040     NA     NA     FALSE     FALSE
6    model1_150528.u[2,2]      u   2   2 0.2283616 0.02517103     NA     NA     FALSE     FALSE
7    model1_150528.u[3,3]      u   3   3 0.9119656 0.04097672     NA     NA     FALSE     FALSE
8    model1_150528.u[4,4]      u   4   4 0.3298028 0.02460205     NA     NA     FALSE     FALSE
> mod2Summ$par
                     name matrix row col  Estimate Std.Error lbound ubound lboundMet uboundMet
1 model2_150528.fpat[1,1]   fpat   1   1 0.6922554        NA     NA     NA     FALSE     FALSE
2 model2_150528.fpat[2,1]   fpat   2   1 0.8559914        NA     NA     NA     FALSE     FALSE
3 model2_150528.fpat[3,1]   fpat   3   1 0.1512036        NA     NA     NA     FALSE     FALSE
4 model2_150528.fpat[4,1]   fpat   4   1 0.7908545        NA     NA     NA     FALSE     FALSE
5    model2_150528.u[1,1]      u   1   1 0.4860006        NA  1e-04     NA     FALSE     FALSE
6    model2_150528.u[2,2]      u   2   2 0.2283616        NA  1e-04     NA     FALSE     FALSE
7    model2_150528.u[3,3]      u   3   3 0.9119656        NA  1e-04     NA     FALSE     FALSE
8    model2_150528.u[4,4]      u   4   4 0.3298028        NA  1e-04     NA     FALSE     FALSE
> #Compare standard errors:
> mod1Results$output$standardErrors
                              [,1]
model1_150528.fpat[1,1] 0.02908638
model1_150528.fpat[2,1] 0.02785441
model1_150528.fpat[3,1] 0.03283377
model1_150528.fpat[4,1] 0.02825001
model1_150528.u[1,1]    0.02650040
model1_150528.u[2,2]    0.02517103
model1_150528.u[3,3]    0.04097672
model1_150528.u[4,4]    0.02460205
> matrix(sqrt(diag(solve(mod2Results$output$hessian))),dimnames = list(rownames(mod2Results$output$hessian),NULL))
                              [,1]
model2_150528.fpat[1,1] 0.02908629
model2_150528.fpat[2,1] 0.02785451
model2_150528.fpat[3,1] 0.03283363
model2_150528.fpat[4,1] 0.02824991
model2_150528.u[1,1]    0.02650037
model2_150528.u[2,2]    0.02517099
model2_150528.u[3,3]    0.04097642
model2_150528.u[4,4]    0.02460192

The lower bound on the unique variances might be what Mike Hunter had in mind when he mentioned a constraint on the uniquenesses, but his exact recommendation doesn't apply here, as you are not analyzing a correlation matrix. I agree that the model is identified; mxCheckIdentification(mod1) says that it is locally identified at its start values.
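
If you want to run that check yourself, a minimal sketch (assuming mod1 as defined in the script above):

# Check local identification at the model's start values
idCheck <- mxCheckIdentification(mod1)
idCheck$status                      # TRUE if locally identified
idCheck$non_identified_parameters   # empty when status is TRUE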

FWIW, I ran this code with an OpenMx revision built from the head of the git repository this morning:

> mxVersion()
OpenMx version: 2.2.4.8 [GIT v2.2.4-8-g4660d2a]
R version: R version 3.1.1 (2014-07-10)
Platform: i386-w64-mingw32
Default optimiser: SLSQP
carey
AHA!

" … the major problem with your MxModel using the algebra fitfunction is that there is no safeguard that keeps the model-predicted covariance matrix within the parameter space"

AH! That would explain matters! At the same time, it is problematic for anyone trying to do novel things with OpenMx. I've put these data through several optimizers and they all work fine. If the tuning of the optimizers is so sensitive with mxFitFunctionAlgebra, then I'll avoid using OpenMx with it.

Also, there is nothing in the documentation to the effect that "OpenMx will assume that the algebra fitfunction is -2logL, so its automatic SEs will be wrong." You definitely need to include that in the documentation so that people are aware of it.

And Mike: the simulations that I did were on an older version of OpenMx. In fact, those simulations, along with some done by Lindon Eaves, convinced the OpenMx folks at the time to change their way of calculating the standard errors to get the correct ones.

AdminRobK
Documentation

"Also, there is nothing in the documentation to the effect that 'OpenMx will assume that the algebra fitfunction is -2logL, so its automatic SEs will be wrong.' You definitely need to include that in the documentation so that people are aware of it."

Agreed. The help page has already been updated in the git repository. BTW, in recent versions of OpenMx (including 2.2), the user can specify the "units" of an algebra fitfunction, and can explicitly tell OpenMx whether the fitfunction is or is not -2logL.
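
For instance, a hedged sketch (I believe the argument is named units and that "-2lnL" is its default, but check your version's help page):

# Declare the units of the algebra fitfunction so OpenMx knows how to
# interpret it; "-2lnL" marks the algebra as a minus-two log-likelihood
fitFun <- mxFitFunctionAlgebra("fval", units="-2lnL")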

"I've put these data through several optimizers and they all work fine. If the tuning of the optimizers is so sensitive with mxFitFunctionAlgebra, then I'll avoid using OpenMx with it."

Bear in mind that the optimizer is just trying to minimize a loss function. It doesn't know what the parameter space is unless it is told, either explicitly (via bounds or constraints) or implicitly (by soft feasibility constraints built into the fitfunction). The only major thing your model was missing was a bound to keep the uniquenesses positive (i.e., to avoid the well-known problem of "Heywood cases").
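
For example, the same requirement could be stated explicitly; a hedged sketch (the matrix lowBound and the constraint name posU are mine, and "u" is the MxMatrix from the earlier script):

# Explicit inequality constraint keeping the uniquenesses positive
lowBound <- mxMatrix("Full", nrow=4, ncol=1, values=1e-4, name="lowBound")
posU <- mxConstraint(diag2vec(u) > lowBound, name="posU")

That said, simple lbounds on the MxMatrix, as in the script above, are generally cheaper for the optimizer than an MxConstraint.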

mhunter
Use the correct standard errors

Minimizing -2log(L) only gives incorrect standard errors when you use the wrong formula for them. Going back to your simulation (http://openmx.psyc.virginia.edu/thread/360), it appears you used the wrong formula for the standard errors.

  results[i,hStrt:hEnd] <- sqrt(diag(solve(thisMxModel@output$hessian)))

These are not the standard errors that OpenMx reports. Later in your script you correct the formula by multiplying by sqrt(2); those corrected values are what OpenMx reports. I verified this by running your simulation and then extracting the standard errors from summary(). Alternatively, you could get them from model$output$standardErrors. Here are the results from the last replication.

               IncorrectFormulaFromSim CorrectFormulaFromSim ReportedByOpenMxSummary
hessLambda1                 0.06223995            0.08802059              0.08802059
hessLambda2                 0.04557903            0.06445848              0.06445848
hessLambda3                 0.05496074            0.07772623              0.07772623
hessLambda4                 0.04216909            0.05963610              0.05963610
hessSpecifics1              0.09786250            0.13839847              0.13839847
hessSpecifics2              0.05542296            0.07837990              0.07837990
hessSpecifics3              0.07493568            0.10597506              0.10597506
hessSpecifics4              0.04937788            0.06983087              0.06983087
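
In code, the three columns correspond to the following (a sketch; thisMxModel is assumed to be a fitted model whose fitfunction is -2logL):

H <- thisMxModel$output$hessian               # Hessian of the -2logL fit
seWrong  <- sqrt(diag(solve(H)))              # missing the factor of 2
seRight  <- sqrt(2 * diag(solve(H)))          # correct for a -2logL fit
seOpenMx <- thisMxModel$output$standardErrors # what summary() reports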

As for the model, it is locally identified but not globally: as long as the optimizer doesn't wander too far from the true solution it behaves as identified, but there are infinitely many other solutions elsewhere in the parameter space that also fit.

Empirically, I found it sufficient to add a lower bound on the residual variances to identify the model.

umat <- mxMatrix(name="u", type="Diag", nrow=4, free=T,
                 values=u, lbound=1e-6)

As for the standard errors, the -2*log(L) model gives the correct standard errors in summary(). The standard errors reported for your mxFitFunctionAlgebra model are based on the Hessian of your algebra fit function, and may or may not correspond to anything meaningful, depending on the fit function used.

As Rob suggested, setting the likelihood scale

mxOption(NULL,"loglikelihoodScale",-1)

makes OpenMx use -log(L). It also gives identical (and correct) standard errors when compared to using the -2*log(L).
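
A quick way to verify, as a sketch (assuming mod1 from Rob's script; -2 is the default scale):

# Fit the same model under both likelihood scales and compare the SEs
mxOption(NULL, "loglikelihoodScale", -1)
fitScale1 <- mxRun(mod1)
mxOption(NULL, "loglikelihoodScale", -2)
fitScale2 <- mxRun(mod1)
cbind(fitScale1$output$standardErrors, fitScale2$output$standardErrors)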

AdminRobK
loglikelihood scale

If you really want the fit function to be -logL instead of -2logL, you can change the relevant option with the command

mxOption(NULL,"loglikelihoodScale",-1)

and then build your MxModel using an MxExpectationNormal and MxFitFunctionML.
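
Putting that together, a minimal sketch (reusing the objects from my earlier script; the name mod3 is mine):

# With the option set, the built-in ML fitfunction minimizes -logL
mxOption(NULL, "loglikelihoodScale", -1)
mod3 <- mxModel("model3_150528", fpat, umat, preCov, obsData,
                mxExpectationNormal("preCov", dimnames=vn),
                mxFitFunctionML())
mod3Results <- mxRun(mod3)
summary(mod3Results)   # the reported standard errors remain correct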