Automatically computing residual variances for a specified RAM matrix

Posted on Tue, 08/05/2014 - 13:30

fife Joined: 07/01/2010

Forums

It seems I'm frequently wanting to compute an implied correlation matrix given a SE model. I began writing a function to do just that:

#### create a RAM model for testing
RAM = data.frame(matrix(c(
"F1", "A1", 1, .4,
"F1", "A2", 1, .4,
"F1", "A3", 1, .4,
"F2", "A4", 1, .4,
"F2", "A5", 1, .4,
"F2", "A6", 1, .4,
"F3", "A7", 1, .4,
"F3", "A8", 1, .4,
"F3", "A9", 1, .4,
"A10", "A9", 2, .5), ncol=4, byrow=T
))
names(RAM) = c("From", "To", "Arrows", "Values")

####
observed = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10")

### begin function (currently omitted until done testing)
#ram.2.cor = function(RAM, observed){

all.vars = as.character(unique(unlist(c(RAM[,1:2]))))
unobserved = as.character(all.vars[which(!(all.vars%in%observed))])

#### create asymmetric matrix
mA = data.frame(matrix(0,nrow=length(all.vars), ncol=length(all.vars)), row.names=c(observed, unobserved))
names(mA) = c(observed, unobserved)
for (j in 1:nrow(RAM)){
col = which(names(mA) == RAM$From[j] & RAM$Arrows[j] == 1)
rw = which(names(mA) == RAM$To[j] & RAM$Arrows[j] == 1)
mA[rw, col] = as.numeric(RAM$Values[j])
}

##### create symmetric matrix (temporarily fill in endogenous variances)
end = names(mA)[which(rowSums(mA)>0)]
mS = data.frame(diag(length(all.vars)), row.names=c(observed, unobserved))
names(mS) = row.names(mS)
for (i in 1:nrow(RAM)){
if (RAM$Arrows[i] == 2){
col = which(names(mA) == RAM$To[i])
row = which(names(mA) == RAM$From[i])
mS[col,row] = as.numeric(as.character(RAM$Values[i]))
mS[row,col] = as.numeric(as.character(RAM$Values[i]))
}
}
#### I'M STUCK!

The code above produces the asymmetrical matrix and is almost there for the symmetric matrix. However, all the variances are set to one. I want all variables to have a total variance of one, which means that the endogenous variables will have a value < 1. In the past, I've just used covariance algebra to compute them, but this becomes unrealistic for large models and cannot be put into a function.

So....can you all think of any general equation that will tell me what value for the residual variance will give each variable a total variance of one? I was considering using optim to solve it through brute force, but I figured there had to be a more elegant way of doing it.

Thanks!

P.S....sorry if this is totally obvious. I reserve the right to overlook a very simple answer.
}

Replied on Tue, 08/05/2014 - 15:18

RobK Joined: Apr 19, 2011

Are you running OpenMx

EDIT: Hmm, I don't think I answered your question, if you're curious about writing a function for general-purpose SEM (i.e., not necessarily using OpenMx), and inputting a matrix of RAM paths. The below might still be useful to you, though.

Are you running OpenMx version 1.3/1.4, or the beta of 2.0? If it's the latter, there is a built-in function, mxStandardizeRAMpaths(), that will give you the values of the path coefficients when all variables, both latent and observable, have total variance of 1.

If you want the A and S matrices containing the values reported by mxStandardizeRAMpaths(), then assuming you've already made your model as myModel and all of its two-headed paths have nonzero start values, you could do this:
myModelZ <- mxModel(model=myModel, mxMatrix(type="Iden",nrow=nrow(myModel$A@values),name="I"), mxAlgebra( vec2diag(diag2vec( solve(I-A)%*%S%*%t(solve(I-A)) )%^%-0.5) , name="InvSD"), mxAlgebra( InvSD %*% A %*% solve(InvSD), name="Az",dimnames=dimnames(myModel$A@values)), mxAlgebra( InvSD %*% S %*% InvSD, name="Sz",dimnames=dimnames(myModel$S@values)) ) myFitZ <- mxRun(myModelZ)
Then, Az and Sz would be algebras contained in myFitZ. This should work in 1.3/1.4 as well as 2.0 beta.

If what you're really after is a model-implied correlation matrix for the manifest variables, then in 1.3/1.4, you could do cov2cor(myModel@objective@info$expCov). In the 2.0 beta, the equivalent command would be cov2cor(myModel$fitfunction$info$expCov).

Does any of this help?

Log in or register to post comments

Replied on Tue, 08/05/2014 - 16:29

fife Joined: Jul 01, 2010

My original intent was to

My original intent was to avoid openMx, but I'm open to trying that. Let me give it a shot.

Log in or register to post comments

Replied on Tue, 08/05/2014 - 17:31

RobK Joined: Apr 19, 2011

OpenMx might not be useful

OpenMx might not be useful for this purpose. It looks like you want to produce a model-expected correlation matrix without explicitly giving values for residual variances, instead relying on the constraint that variables have total variance of 1. The stuff I mentioned using OpenMx would require a complete S matrix to produce a correlation matrix.

Your idea of using optim() to get the diagonal elements of the S matrix isn't bad--it might be the only (practical) way to do what you want in the most general case. But in simple special cases, an analytic solution might be practical. For instance, in the example you have in your syntax, you would just set the residual variances of A1 thru A9 to 0.84. Those 9 variables all have communalities of 0.4^2 = 0.16, so unique variances of 0.84 would give them total variance of 1. Essentially the same calculation (albeit with matrix algebra) should be doable for any common-factor model.

Log in or register to post comments

Replied on Wed, 08/06/2014 - 09:24

fife Joined: Jul 01, 2010

I attempted the following

I attempted the following code (which uses OpenMx):

RAM = data.frame(matrix(c(
"F1", "A1", 1, runif(1, .38, .42),
"F1", "A2", 1, runif(1, .38, .42),
"F1", "A3", 1, runif(1, .38, .42),
"F2", "A4", 1, runif(1, .38, .42),
"F2", "A5", 1, runif(1, .38, .42),
"F2", "A6", 1, runif(1, .38, .42),
"F3", "A7", 1, runif(1, .38, .42),
"F3", "A8", 1, runif(1, .38, .42),
"F3", "A9", 1, runif(1, .38, .42),
"A10", "A9", 2, .5,
"F1", "F2", 2, .2,
"F2", "F3", 2, .25,
"F1", "F3", 2, .09,
"F1", "Y", 1, .2,
"F2", "Y", 1, .5,
"F3", "Y", 1, .05), ncol=4, byrow=T
), stringsAsFactors=F)
names(RAM) = c("From", "To", "Arrows", "Values")
RAM$Arrows <- as.integer(RAM$Arrows)
RAM$Values <- as.numeric(RAM$Values)

observed = c(intersperse("A", 1:10), "Y")
all.vars = as.character(unique(unlist(c(RAM[,1:2]))))
unobserved = as.character(all.vars[which(!(all.vars%in%observed))])

paths = mxPath(from = RAM[, 1], to = RAM[, 2], arrows = RAM[,3],
values = RAM[, 4], labels = paste(RAM[, 1], "to",
RAM[, 2], "_", sep = ""), free = T)

#### freely estimate observed variances
variances.obs = mxPath(from=observed, arrows=2, connect="single", free=T, values=1)

#### fix unobserved variances to one
variances.unobs = mxPath(from=unobserved, arrows=2, connect="single", free=F, values=1)

### put into a model
myModel = mxModel("Test Model", type="RAM", manifestVars=observed,
latentVars = unobserved, variances.obs, variances.unobs, paths)

myModelZ <- mxModel(model=myModel,
mxMatrix(type="Iden",nrow=nrow(myModel$A@values),name="I"),
mxAlgebra( vec2diag(diag2vec( solve(I-A)%*%S%*%t(solve(I-A)) )%^%-0.5) ,
name="InvSD"),
mxAlgebra( InvSD %*% A %*% solve(InvSD),
name="Az",dimnames=dimnames(myModel$A@values)),
mxAlgebra( InvSD %*% S %*% InvSD,
name="Sz",dimnames=dimnames(myModel$S@values))
)
myFitZ <- mxRun(myModelZ)

And got the following error:

Error: The RAM expectation function does not have a dataset associated with it in model 'Test Model'

I was hoping to avoid inputting a dataset. Should I just insert a random correlation matrix?

Log in or register to post comments

Replied on Wed, 08/06/2014 - 10:40

RobK Joined: Apr 19, 2011

Use mxEval()

Instead of running it, just use mxEval() to compute the algebras you want. For instance:
mxEval(Sz,myModelZ,compute=T) #<--post-standardization S matrix mxEval(F%*%solve(I-A)%*%S%*%solve(t(I-A))%*%t(F),myModelZ,compute = T) #<--Model-expected covariance matrix mxEval(F%*%solve(I-Az)%*%Sz%*%solve(t(I-Az))%*%t(F),myModelZ,compute = T) #<--Model-expected correlation matrix
The thing is, I'm not sure this does what you want. It gives you a model-expected correlation matrix, given some values inside an A and S matrix. It sounds like you would rather have a function that gives you a correlation matrix given some values insides an A matrix and some off-diagonal values inside an S matrix; the function would figure out the diagonal elements of S.

Log in or register to post comments

Replied on Wed, 08/06/2014 - 11:50

fife Joined: Jul 01, 2010

The end result of what I want

The end result of what I want is to obtain the correlation matrix, so what you have almost does it, but I noticed the S matrix is standardized. I'm hoping that what I do doesn't require standardization. For example, the above script returns a value of .4634 for the correlation between A9 and A10, when it should be exactly .5.

Any other ideas?

Log in or register to post comments

Replied on Fri, 08/15/2014 - 13:10

mhunter Joined: Jul 31, 2009

Two things

I should point out two things.

First, if you have a model that has been run, you can get the expected covariance matrix by
RanModel$fitfunction$info$expCov
You can then make this the "expected correlation" matrix with cov2cor
cov2cor(RanModel$fitfunction$info$expCov)

Second, the same strategy works with any RAM model that has not been run.
ecov <- mxEval(F%*%solve(I-A)%*%S%*%solve(t(I-A))%*%t(F), myModelZ, compute=TRUE) ecor <- cov2cor(ecov)

Log in or register to post comments

Replied on Tue, 08/05/2014 - 16:55

RobK Joined: Apr 19, 2011

Clarify a bit?

I've been thinking about the problem you've posed, and I'm a bit confused about what is assumed known versus what you want your function to find. It looks as though what is known is a matrix of RAM paths, and what you want the function to do is produce the model-expected correlation matrix. More specifically, what is known in from the matrix of RAM paths in the present case are the asymmetric paths (factor loadings) and a covariance between manifest variables 9 & 10. Is that right?

In any event, there are several parts of your syntax that don't do what I'm pretty sure you want it to. I've cleaned it up a little:
#### create a RAM model for testing RAM = data.frame(matrix(c( "F1", "A1", 1, .4, "F1", "A2", 1, .4, "F1", "A3", 1, .4, "F2", "A4", 1, .4, "F2", "A5", 1, .4, "F2", "A6", 1, .4, "F3", "A7", 1, .4, "F3", "A8", 1, .4, "F3", "A9", 1, .4, "A10", "A9", 2, .5), ncol=4, byrow=T ),stringsAsFactors=F) names(RAM) = c("From", "To", "Arrows", "Values") RAM$Arrows <- as.integer(RAM$Arrows) RAM$Values <- as.numeric(RAM$Values)


####

observed = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10")
### begin function (currently omitted until done testing)

#ram.2.cor = function(RAM, observed){
all.vars = as.character(unique(unlist(c(RAM[,1:2]))))

unobserved = as.character(all.vars[which(!(all.vars%in%observed))])
#### create asymmetric matrix

mA = matrix(0,nrow=length(all.vars), ncol=length(all.vars), dimnames=list(c(observed, unobserved),c(observed,unobserved)))

for (j in 1:nrow(RAM)){

  col = which(colnames(mA) == RAM$From[j] & RAM$Arrows[j] == 1)

  rw = which(rownames(mA) == RAM$To[j] & RAM$Arrows[j] == 1)

  mA[rw, col] = RAM$Values[j]

}

##### create symmetric matrix (temporarily fill in endogenous variances) end = rownames(mA)[which(rowSums(mA)>0)] mS <- diag(1,length(all.vars)) dimnames(mS) <- list(c(observed,unobserved),c(observed,unobserved)) for (i in 1:nrow(RAM)){ if (RAM$Arrows[i] == 2){ col = which(colnames(mS) == RAM$To[i]) row = which(rownames(mS) == RAM$From[i]) mS[col,row] = RAM$Values[i] mS[row,col] = RAM$Values[i] } } #### I'M STUCK!

Log in or register to post comments

Replied on Mon, 01/19/2015 - 16:54

fife Joined: Jul 01, 2010

Problem solved (albeit slowly)

I came up with a function to do what I wanted to do. It uses optim, so it's a bit slow, but it works. here it is!

ram.2.cor = function(RAM){

#### first create function that can be optimized
opfunc = function(diag=c(.5, .5, .5, .5), RAM, op=TRUE){

#### extract the names of all the variables
all.vars = as.character(unique(unlist(c(RAM[,1:2]))))

#### create asymmetric matrix
mA = data.frame(matrix(0,nrow=length(all.vars), ncol=length(all.vars)), row.names=all.vars)
names(mA) = c(all.vars)
for (j in 1:nrow(RAM)){
col = which(names(mA) == RAM$From[j] & RAM$Arrows[j] == 1)
rw = which(names(mA) == RAM$To[j] & RAM$Arrows[j] == 1)
mA[rw, col] = as.numeric(as.character(RAM$Values[j]))
}

##### create symmetric matrix (temporarily fill in endogenous variances)
end = names(mA)[which(rowSums(mA)>0)]
mS = data.frame(diag(x=diag, length(all.vars)), row.names=all.vars)
names(mS) = row.names(mS)
for (i in 1:nrow(RAM)){
if (RAM$Arrows[i] == 2){
col = which(names(mA) == RAM$To[i])
row = which(names(mA) == RAM$From[i])
mS[col,row] = as.numeric(as.character(RAM$Values[i]))
mS[row,col] = as.numeric(as.character(RAM$Values[i]))
}
}

#### create factor and identity matrices
mI = diag(1, nrow(mA))
mF = mI

#### compute implied covariance matrix
C = (solve(mI-mA) %*% data.matrix(mS) %*% t(solve(mI-mA)))

#### return the sum of squares of the diagonals
diags = sum((diag(C) - 1)^2)

if (op){
return(diags)
} else {
return(C)
}
}

### give a note
cat("Doing a brute-force search for the correct correlations. Give me a minute.\n\n")

#### now we optimize it
vars = as.character(unique(unlist(c(RAM[,1:2]))))
op.res = optim(rep(.5, times=length(vars)), opfunc, RAM=RAM)
RAM.returned = round(opfunc(diag=op.res$par, RAM, op=FALSE), digits=2)

#### and return the original matrix
RAM.returned

}

Usage:

RAM = data.frame(matrix(c(
"Age", "Energy", 1, .1,
"Age", "Self-Efficacy", 2, .1,
"Energy", "Exercise", 1, .6,
"Eating", "Energy", 1, .4,
"Eating", "Self-Efficacy", 1, .3,
"Age", "Eating", 2, .15,
"Self-Efficacy", "Exercise", 1, .5), ncol=4, byrow=T))
names(RAM) = c("From", "To", "Arrows", "Values")
ram.2.cor(RAM)

Log in or register to post comments

Log in or register to post comments

News

Recent Posts

Automatically computing residual variances for a specified RAM matrix

Are you running OpenMx

My original intent was to

OpenMx might not be useful

I attempted the following

Use mxEval()

The end result of what I want

Two things

Clarify a bit?

Problem solved (albeit slowly)

News

Recent Posts