You are here

the thresholds in SEM of ordinal variables

14 posts / 0 new
Last post
diana's picture
Offline
Joined: 04/17/2019 - 08:58
the thresholds in SEM of ordinal variables

Hello, I have got a question about the thresholds in SEM of ordinal variables and really need your help.

1) I found tow kinds of scripts of saturate SEM of one ordinal variable, one script constrains the free status of thresholds, and another one does not. I can't figure out which one to follow.
the first one:

nth <- 3  
frTh <- matrix(rep(c(F,F,(rep(T,nth-2)))),nrow=nth,ncol=nv) # free status for thresholds
thinMZ <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=frTh, values=svTh, lbound=lbTh, labels=labThMZ, name="thinMZ" )  

the second one:

nth <- 3 
thinMZ    <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=TRUE, values=svTh, lbound=lbTh, labels=labThMZ, name="thinMZ" ) 

2)
If I do need to constrain the free status of thresholds, then how many thresholds should I fix?
In the first script, the number of thresholds is 3 and two shresholds are fixed. If my ordinal variable is categorized into 5 groups, and then the number of thresholds should be 4, then how many thresholds should I fix? three or two? In fact, I don't konw the reason for fixing thresholds. So, please don't mind my kind of stupid question.

I would be really appreciated for your reply!
Thanks!

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
model identification

If your ordinal variable has two or more thresholds (three or more ordered categories), then for model identification, you need to choose one of the following: (1) fix the mean (or regression intercept) and variance, and free all thresholds; (2) free the mean (or regression intercept) and fix the variance and one of the thresholds; or (3) free the mean (or regression intercept) and the variance, and fix two of the thresholds. Most people do either (1) or (3). In the case of (1), usually the mean (or intercept) and variance are fixed to 0 and 1, respectively. In the case of (3), people usually fix the lowest two thresholds to 0 and 1, respectively.

The choices are arbitrary, but sometimes one choice is preferable to others because it makes results easier to interpret. For instance, option (3) is often used with longitudinal data, because it's easier to interpret the variance of the latent continuum changing with time, as opposed to interpreting the spread of the thresholds changing with time.

Note that in biometrical variance-components analysis (as with twin data, etc.), fixing the variance to 1 will often require an MxConstraint. An alternative to using an MxConstraint is to fix the nonshared-environmental variance component to 1, which does change the interpretation of the genetic and shared-environmental variance components..

diana's picture
Offline
Joined: 04/17/2019 - 08:58
Model is not locally identified

I'm really grateful for your detailed reply. However, even though I ran the totally same code as the script I found (intercept and variance are fixed to 0 and 1), the result showed 'Model is not locally identified'. The code is as follows and I hope you can help me to figure out where the problem is.

vars      <- 'tea_5g'                      # list of variables names
covars    <- c("age","region_type")
nv        <- 1                         # number of variables
ncv       <- 2                         # number of covariates
ntv       <- nv*2                      # number of total variables
nth       <- 4
selVars   <- paste(vars,c(rep(1,nv),rep(2,nv)),sep="")
covVars   <- paste(covars,c(rep(1,ncv),rep(2,ncv)),sep="")
# Select Data for Analysis
mzData <- subset(tea_5g, zygosity1==0, c(selVars,covVars)) 
dzData <- subset(tea_5g, zygosity1==1, c(selVars,covVars))
# tell mx the data is ordinal
mzData[,1:2] <- mxFactor(mzData[,1:2], levels= c(0:nth))
dzData[,1:2] <- mxFactor(dzData[,1:2], levels= c(0:nth))
# Set Starting Values
svBe <- 0.01                                       # start value for beta regressions
svLTh <- -1.5                                      # start value for first threshold
svITh <- 1                                         # start value for increments
svTh <- matrix(rep(c(svLTh,(rep(svITh,nth-1)))),nrow=nth,ncol=nv) # start value for thresholds
lbTh <- matrix(rep(c(-3,(rep(0.001,nth-1))),nv),nrow=nth,ncol=nv) # lower bounds for thresholds
svCor <- .5                                      # start value for correlations
lbCor <- -0.99                                   # lower bounds for correlations
ubCor <- 0.99                                    # upper bounds for correlations
 
labThMZ <- c(paste("t",1:nth,"MZ1",sep=""),paste("t",1:nth,"MZ2",sep=""))
labThDZ <- c(paste("t",1:nth,"DZ1",sep=""),paste("t",1:nth,"DZ2",sep=""))
 
modelMZ<- mxModel("MZ",
                  # Define definition variables
                  mxMatrix( type="Full", nrow=ncv, ncol=2, free=F, label=c("data.age1","data.region_type1","data.age2","data.region_type2"), name="DefVars"),
                  mxMatrix( type="Full", nrow=1, ncol=ncv, free=TRUE, values=svBe,labels=c("Bage","Bregion_type"), name="BageTH"),
                  # expected mean Matrices
                  mxMatrix( type="Zero", nrow=1, ncol=ntv, name="meanG" ), #intercepts#
                  mxAlgebra( expression= meanG + BageTH%*%DefVars, name="expMeanMZ" ), 
                  # Define threshold matrices : nth--number of thresholds
                  mxMatrix( type="Full", nrow=nth, ncol=ntv, free=T, values=svTh, lbound=lbTh, labels=labThMZ, name="thinMZ" ),
                  mxMatrix( type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="inc" ),
                  mxAlgebra( expression= inc %*% thinMZ, name="threMZ" ),
                  # expected variance–covariance Matrices
                  mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=svCor, lbound=lbCor, ubound=ubCor, labels="rMZ",name="corMZ" ),    
                  # Read in the data
                  mxData( observed=mzData, type="raw" ), 
                  mxExpectationNormal( covariance="corMZ", means="expMeanMZ", dimnames=selVars, thresholds="threMZ"),
                  mxFitFunctionML() 
)
modelDZ<- mxModel("DZ",
                  mxMatrix( type="Full", nrow=ncv, ncol=2, free=F, label=c("data.age1","data.region_type1","data.age2","data.region_type2"), name="DefVars"),
                  mxMatrix( type="Full", nrow=1, ncol=ncv, free=TRUE, values=svBe,labels=c("Bage","Bregion_type"), name="BageTH"),
                  mxMatrix( type="Zero", nrow=1, ncol=ntv, name="meanG" ), #two labels#
                  mxAlgebra( expression= meanG + BageTH%*%DefVars, name="expMeanDZ" ), 
                  mxMatrix( type="Full", nrow=nth, ncol=ntv, free=frTh, values=svTh, lbound=lbTh, labels=labThDZ, name="thinDZ" ),
                  mxMatrix( type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="inc" ),
                  mxAlgebra( expression= inc %*% thinDZ, name="threDZ" ),
                  mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=svCor, lbound=lbCor, ubound=ubCor, labels="rDZ",name="corDZ" ),
                  mxData( observed=dzData, type="raw"), 
                  mxExpectationNormal( covariance="corDZ", means="expMeanDZ", dimnames=selVars, thresholds="threDZ" ),
                  mxFitFunctionML()
)
Conf    <- mxCI (c ('MZ.corMZ','DZ.corDZ')) 
SatModel    <- mxModel( "Sat", modelMZ, modelDZ,mxFitFunctionMultigroup(c('MZ.fitfunction','DZ.fitfunction')), Conf )
# -----------------------------------------------------------------------------------------------
# RUN Saturated Model
SatFit  <-   mxTryHard(SatModel, intervals=F,extraTries = 41) #,bestInitsOutput=T,silent=F#
(SatSumm  <- summary(SatFit))
mxCheckIdentification(SatFit, details=F)   
# the result showed 'Model is not locally identified'

Thank you so much!

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
probably false alarm
However, even though I ran the totally same code as the script I found (intercept and variance are fixed to 0 and 1), the result showed 'Model is not locally identified'.

That's probably a false alarm. Try mxCheckIdentification() again, with details=TRUE, to see the function's guess as to which parameters are unidentified. Also, try running your model (perhaps with mxTryHardOrdinal()), and check to see if mxCheckIdentification() says it's locally unidentified at the solution, too.

diana's picture
Offline
Joined: 04/17/2019 - 08:58
I have one more question, how

I have one more question, how can I define the threshold matrix if the ordinal variable1 has 4 thresholds and ordinal variable2 has 3 threshold.

#select ordinal variable 1-------------------
table(tea5g_smk4g$tea_5g1,tea5g_smk4g$tea_5g2)
nth1 <- 4                    # number of thresholds
vars1 <- c('tea_5g')         # list of ordinal variables names
nvo1 <- 1                    # number of ordinal variables
ntvo1 <- nvo1*2               # number of total ordinal variables
ordVars1 <- paste(vars1,c(rep(1,nvo1),rep(2,nvo1)),sep="")
 
#select ordinal variable 2-------------------
table(tea5g_smk4g$smk_num_4g1,tea5g_smk4g$smk_num_4g2)
nth2 <- 3                    # number of thresholds
vars2 <- c('smk_num_4g')     # list of ordinal variables names
nvo2 <- 1                    # number of ordinal variables
ntvo2 <- nvo2*2               # number of total ordinal variables
ordVars2 <- paste(vars2,c(rep(1,nvo2),rep(2,nvo2)),sep="")
 
# Select Variables for Analysis---------------------
vars <- c('tea_5g','smk_num_4g')       # list of variables names
covars <- c("age","region_type")
nv <- nvo1+nvo2                         # number of variables
ncv <- 2                         # number of covariates
ntv <- nv*2                            # number of total variables
selVars <- paste(vars,c(rep(1,nv),rep(2,nv)),sep="")           #"tea_5g1"     "smk_num_4g1" "tea_5g2"     "smk_num_4g2"
covVars <- paste(covars,c(rep(1,ncv),rep(2,ncv)),sep="")
 
# Select Data for Analysis---------------------------
mzData <- subset(tea5g_smk4g, zygosity2==0, c(selVars,covVars))
dzData <- subset(tea5g_smk4g, zygosity2==1, c(selVars,covVars))
 
mzDataF <- cbind(mxFactor( x=mzData[,ordVars1], levels=c(0:nth1)),mxFactor( x=mzData[,ordVars2], levels=c(0:nth2)),mzData[,covVars] )
dzDataF <- cbind(mxFactor( x=dzData[,ordVars1], levels=c(0:nth1)),mxFactor( x=dzData[,ordVars2], levels=c(0:nth2)),dzData[,covVars] )

I don't know how to define threshold matrix.

Thanks so much!

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
see User Guide

See here. Specifically, under "Threshold Specification", it says "Rows of the threshold matrix beyond the number of thresholds in a particular variable should be fixed parameters with starting values of NA."

diana's picture
Offline
Joined: 04/17/2019 - 08:58
some results and new problems

Hello again! I'm sorry for turning to you for help again.
1) After I ran mxCheckIdentification(), details=TRUE, the result showed the coefficients of covatiates were not identified. Does that matters?

2) I followed your advice by fixing corresponding thresholds with starting values of NA. However, I got the waring as follows:

**Computing Hessian and/or standard errors and/or confidence intervals from imperfect solutionError : In model 'Sat' I was expecting 3 thresholds in column 'smk_num_4g1' of matrix/algebra 'MZ.threMZ' but I hit NA values after only 0 thresholds. You need to increase the number of thresholds for 'smk_num_4g1' and give them values other than NA

Retry limit reached; solution not found. Best fit=25943569000 (started at 47142898000) (16 attempt(s): 16 valid, 0 errors)**

Here is my script and I'm sorry to bother you to check what's wrong with it. Thank you so much!!!!

#select ordinal variable 1-------------------
table(tea5g_smk4g$tea_5g1,tea5g_smk4g$tea_5g2)
nth1 <- 4                    # number of thresholds
vars1 <- c('tea_5g')         # list of ordinal variables names
nvo1 <- 1                    # number of ordinal variables
ntvo1 <- nvo1*2               # number of total ordinal variables
ordVars1 <- paste(vars1,c(rep(1,nvo1),rep(2,nvo1)),sep="")
 
#select ordinal variable 2------------------
table(tea5g_smk4g$smk_num_4g1,tea5g_smk4g$smk_num_4g2)
nth2 <- 3                    # number of thresholds
vars2 <- c('smk_num_4g')     # list of ordinal variables names
nvo2 <- 1                    # number of ordinal variables
ntvo2 <- nvo2*2               # number of total ordinal variables
ordVars2 <- paste(vars2,c(rep(1,nvo2),rep(2,nvo2)),sep="")
 
# Select Variables for Analysis---------------------
vars <- c('tea_5g','smk_num_4g')       # list of variables names
covars <- c("age","region_type")
nv <- nvo1+nvo2                         # number of variables
ncv <- 2                         # number of covariates
ntv <- nv*2                            # number of total variables
selVars <- paste(vars,c(rep(1,nv),rep(2,nv)),sep="")           #"tea_5g1"     "smk_num_4g1" "tea_5g2"     "smk_num_4g2"
covVars <- paste(covars,c(rep(1,ncv),rep(2,ncv)),sep="")
 
# Select Data for Analysis---------------------------
mzData <- subset(tea5g_smk4g, zygosity2==0, c(selVars,covVars))
dzData <- subset(tea5g_smk4g, zygosity2==1, c(selVars,covVars))
 
mzDataF <- cbind(mxFactor( x=mzData[,ordVars1], levels=c(0:nth1)),mxFactor( x=mzData[,ordVars2], levels=c(0:nth2)),mzData[,covVars] )
dzDataF <- cbind(mxFactor( x=dzData[,ordVars1], levels=c(0:nth1)),mxFactor( x=dzData[,ordVars2], levels=c(0:nth2)),dzData[,covVars] )
# Set Starting Values
frMV <- c(FALSE,FALSE)                         # free status for meanG of variables
frTH <- matrix(c(rep(T,nth1),c(rep(T,nth2),F)),4,4)                         # free status for thresholds
svMe <- c(0,0)                               # start value for means
svBe <- 0.01                                       # start value for beta regressions
svLTh <- 0                                    # start value for first threshold
svITh <- 1                                    # start value for increments
svTh <- matrix(c(svLTh,rep(svITh,nth1-1),svLTh,rep(svITh,nth2-1),NA),nrow=nth1,ncol=ntv) # start value for thresholds
lbTh <- matrix(c(-3,rep(0.001,nth1-1),-3,rep(0.001,nth2-1),NA),nrow=nth1,ncol=ntv)        # lower bound for thresholds
svCor <- .5                                   # start value for correlations
 
# Create Labels
labdata  <- cbind(paste("data.",covars[1],c(1,2),sep = ""),paste("data.",covars[2],c(1,2),sep = ""))  
#       [,1]        [,2]               
# [1,] "data.age1" "data.region_type1"
# [2,] "data.age2" "data.region_type2"
labeta    <-cbind(paste("B",covars,"_",vars[1],sep = ""),paste("B",covars,"_",vars[2],sep = ""))
#      [,1]                  [,2]                     
# [1,] "Bage_tea_5g"         "Bage_smk_num_4g"        
# [2,] "Bregion_type_tea_5g" "Bregion_type_smk_num_4g"
labTh <- function(lab,vars,nth) { paste(paste("t",1:nth,lab,sep=""),rep(vars,each=nth),sep="") }
labThMZ <- matrix(c(labTh("MZ","tea_5g1",nth1),labTh("MZ","smk_num_4g1",nth2),NA,labTh("MZ","tea_5g2",nth1),labTh("MZ","smk_num_4g2",nth2),NA),nth1,ntv)
labThDZ <- matrix(c(labTh("DZ","tea_5g1",nth1),labTh("DZ","smk_num_4g1",nth2),NA,labTh("DZ","tea_5g2",nth1),labTh("DZ","smk_num_4g2",nth2),NA),nth1,ntv)
labThZ <- matrix(c(labTh("Z","tea_5g1",nth1),labTh("Z","smk_num_4g1",nth2),NA,labTh("Z","tea_5g2",nth1),labTh("Z","smk_num_4g2",nth2),NA),nth1,ntv)
 
labCvMZ  <- c("MZCVT1", "MZWP1", "MZCTT1T2","MZCTT2T1", "MZWP2", "MZCVT2")  # labels for (co)variances for MZ twins
labCvDZ  <- c("DZCVT1", "DZWP1", "DZCTT1T2","DZCTT2T1", "DZWP2", "DZCVT2") # labels for (co)variances for DZ twins
labCvZ   <- c("ZCVT1", "ZWP1", "ZCTT1T2","ZCTT2T1", "ZWP2", "ZCVT2") 
##########################################
#         wc1   smk1   |  wc2     smk2    
# wc1   VP1T1          |
# smk1  CVT1*   VP2T1  | 
#-----------------------------------------
# wc2   WP1*    CTT2T1 | VP1T2   
# smk2  CTT1T2* WP2*   | CVT2   VP2T2
##########################################
modelMZ <- mxModel( "MZ", 
      mxMatrix( type="Full", nrow=2, ncol=ncv, free=F, label=labdata, name="MZDefVars"),
      mxMatrix( type="Full", nrow=ncv, ncol=nv, free=T, values=svBe,labels=labeta, name="Beta"),
      mxMatrix( type="Full", nrow=1, ncol=nv, free=frMV, values=svMe, name="meanG" ),  
      mxAlgebra( expression= cbind(meanG + MZDefVars[1,] %*% Beta,meanG + MZDefVars[2,] %*% Beta), name="expMeanMZ" ),
      mxMatrix( type="Full", nrow=nth1, ncol=ntv, free=frTH, values=svTh, lbound=lbTh, labels=labThMZ, name="thinMZ" ),
      mxMatrix( type="Lower", nrow=nth1, ncol=nth1, free=FALSE, values=1, name="inc" ),
      mxAlgebra( expression= inc %*% thinMZ, name="threMZ" ),
      mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=T, values=svCor, labels=labCvMZ, name="expCovMZ"),  #lbound=-.99, ubound=.99
      mxData( observed=mzDataF, type="raw" ), 
      mxExpectationNormal( covariance="expCovMZ", means="expMeanMZ", dimnames=selVars, thresholds="threMZ" ),
      mxFitFunctionML()
)
modelDZ <- mxModel( "DZ",
      mxMatrix( type="Full", nrow=2, ncol=ncv, free=F, label=labdata, name="DZDefVars"),
      mxMatrix( type="Full", nrow=ncv, ncol=nv, free=T, values=svBe,labels=labeta, name="Beta"),
      mxMatrix( type="Full", nrow=1, ncol=nv, free=frMV, values=svMe, name="meanG" ),  
      mxAlgebra( expression= cbind(meanG + DZDefVars[1,] %*% Beta,meanG + DZDefVars[2,] %*% Beta), name="expMeanDZ" ),
      mxMatrix( type="Full", nrow=nth1, ncol=ntv, free=frTH, values=svTh, lbound=lbTh, labels=labThDZ, name="thinDZ" ),
      mxMatrix( type="Lower", nrow=nth1, ncol=nth1, free=FALSE, values=1, name="inc" ),
      mxAlgebra( expression= inc %*% thinDZ, name="threDZ" ),
      mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=T, values=svCor, labels=labCvDZ, name="expCovDZ"), # lbound=-.99, ubound=.99
      mxData( observed=dzDataF, type="raw" ), 
      mxExpectationNormal( covariance="expCovDZ", means="expMeanDZ", dimnames=selVars, thresholds="threDZ" ),
      mxFitFunctionML()
)
# Combine Groups
Conf        <- mxCI (c ('MZ.expCovMZ[3,1]','DZ.expCovDZ[3,1]','MZ.expCovMZ[4,2]','DZ.expCovDZ[4,2]','MZ.expCovMZ[2,1]','DZ.expCovDZ[2,1]','MZ.expCovMZ[4,1]','DZ.expCovDZ[4,1]') )#'MZ.expCovMZ[3,1]','MZ.expCovMZ[4,2]'
SatModel    <- mxModel( "Sat", modelMZ, modelDZ,mxFitFunctionMultigroup(c('MZ','DZ')), Conf )
SatFit  <- mxTryHardOrdinal(SatModel, intervals=F,extraTries = 15) 

Wish you have a good day!
Thanks♪(・ω・)ノ

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
Data objects

What do you get from str(mzDataF) and str(dzDataF)?

diana's picture
Offline
Joined: 04/17/2019 - 08:58
im sorry i dont have my data

im sorry i dont have my data at hand. it is 11 pm in my country. i will get up as early as possible tomorrow morning to check the result of it. but i define mzdataF and dzdataF according to the demo code i uploaded, so i think it would be right.
please leave more information since when i get up you would possibly get off work.
thank you very much!!!

File attachments: 
AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
unsure

Having taken a second look at your post, I'm not sure that there's anything wrong with your script. I can't find anything suspicious in your script, except for the possibility that your use of cbind() might have returned a matrix rather than a dataframe at some point.

Did mxTryHardOrdinal() return a fitted MxModel object? If so, does mxCheckIdentification() still say the model is unidentified?

diana's picture
Offline
Joined: 04/17/2019 - 08:58
Hi, Rob!

Hi, Rob!
1)After I ran str(mzDataF), it turned out that it was a dataframe. According to the warning, it seems that there is something wrong in the matrix of MZ.threMZ?
2) mxTryHardOrdinal() can return a fitted solution, but mxCheckIdentification() still say the model is not identified and the non-identified parameters are two coefficients of age and region_type.

Thanks again!

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
What's the actual output of

What's the actual output of str(mzDataF)?

diana's picture
Offline
Joined: 04/17/2019 - 08:58
Hi, the result was in the

Hi, the result was in the picture I uploaded.
Thanks!

File attachments: 
AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
definition variables

Per its documentation, mxCheckIdentification() doesn't necessarily give accurate results when the MxModel uses definition variables.