Help: TSSEM Mediation model

Kwabenaaaddo Joined: 02/01/2021
Dear community,

I trust you are doing well. Thank you very much for your immense work in structural equation modelling. I'm contacting you because I'm currently working on a research project that involves two-stage structural equation modelling (TSSEM).

I got an error when I ran the command for the first-stage model (I've attached the document), and I do not understand how to proceed from here. I have attached the code and the data to aid you. I would be grateful if you could help me identify the error and, if convenient, suggest the correct code.

Once again, thank you very much, and I look forward to your response.

Kindest Regards,

Replied on Sat, 04/13/2024 - 05:00
Mike Cheung Joined: 10/08/2009

There are errors in the data, such as correlation coefficients with minimum and maximum values of -550 and 1.188, respectively (see the end of the outputs), which are clearly invalid.


my.data <- read.csv("./data_7.csv")

#Load MetaSEM package
library(metaSEM)

# locate studies with information on at least 1 correlation, and no missing sample size
keepstudy <- rowSums(is.na(my.data[,2:37]))!=36 & is.na(my.data$NUMBER.OF.BANKS)==FALSE
# keep only the studies with information
data <- my.data[keepstudy,]

# check data
head(data)

## Study_id X1_X2 X1_X3 X1_X4 X1_X5 X1_X6 X1_X7 X1_X8 X1 X2_X3 X2_X4
## 1 2 0.075 0.1625 NA 0.05 0.1375 NA NA 0.0375 0.4125 NA
## 2 2 0.075 0.1625 NA 0.05 0.1375 NA NA 0.0375 0.4125 NA
## 3 2 0.075 0.1625 NA 0.05 0.1375 NA NA 0.0250 0.4125 NA
## 4 2 0.075 0.1625 NA 0.05 0.1375 NA NA 0.0375 0.4125 NA
## 5 2 0.075 0.1625 NA 0.05 0.1375 NA NA -0.1000 0.4125 NA
## 6 2 0.075 0.1625 NA 0.05 0.1375 NA NA -0.0125 0.4125 NA
## X2_X5 X2_X6 X2_X7 X2_X8 X2 X3_X4 X3_X5 X3_X6 X3_X7 X3_X8 X3 X4_X5
## 1 0.05 0.175 NA NA -0.2250 NA 0.0125 0.225 NA NA -0.1125 NA
## 2 0.05 0.175 NA NA -0.2500 NA 0.0125 0.225 NA NA -0.1750 NA
## 3 0.05 0.175 NA NA -0.2000 NA 0.0125 0.225 NA NA -0.1250 NA
## 4 0.05 0.175 NA NA -0.2625 NA 0.0125 0.225 NA NA -0.1625 NA
## 5 0.05 0.175 NA NA -0.2625 NA 0.0125 0.225 NA NA -0.1875 NA
## 6 0.05 0.175 NA NA -0.1875 NA 0.0125 0.225 NA NA -0.0500 NA
## X4_X6 X4_X7 X4_X8 X4 X5_X6 X5_X7 X5_X8 X5 X6_X7 X6_X8 X6 X7_X8 X7 X8
## 1 NA NA NA NA -0.05 NA NA 0.0750 NA NA -0.0750 NA NA NA
## 2 NA NA NA NA NA NA NA 0.4250 NA NA -0.0125 NA NA NA
## 3 NA NA NA NA NA NA NA 0.2375 NA NA 0.0000 NA NA NA
## 4 NA NA NA NA NA NA NA 0.3250 NA NA -0.0375 NA NA NA
## 5 NA NA NA NA NA NA NA 0.2500 NA NA -0.1625 NA NA NA
## 6 NA NA NA NA NA NA NA 0.0375 NA NA -0.0250 NA NA NA
## BANK.YEAR.OX1 NUMBER.OF.BANKS
## 1 2426 212
## 2 2426 212
## 3 2426 212
## 4 2426 212
## 5 2426 212
## 6 2426 212

length(data)

## [1] 39

## summary(data)

# varnames and labels
nvar <- 9
varnames <- c("X1","X2","X3","X4","X5","X6","X7","X8","X9")
labels <- list(varnames,varnames)
# create list with correlation matrices
cordat <- list()
for (i in 1:nrow(data)){
cordat[[i]] <- vec2symMat(as.matrix(data[i,2:37]),diag = FALSE)
dimnames(cordat[[i]]) <- labels
}
# put NA on diagonal of correlation matrix if variable is missing
for (i in 1:length(cordat)){
for (j in 1:nrow(cordat[[i]])){
if (sum(is.na(cordat[[i]][j,]))==nvar-1)
{cordat[[i]][j,j] <- NA}
}}

# show number of studies per correlation coefficient
pattern.na(cordat, show.na = FALSE)

## X1 X2 X3 X4 X5 X6 X7 X8 X9
## X1 219 173 66 69 147 179 42 78 189
## X2 173 198 51 64 125 150 36 60 168
## X3 66 51 69 25 50 61 17 34 62
## X4 69 64 25 86 55 72 18 28 66
## X5 147 125 50 55 196 154 29 60 163
## X6 179 150 61 72 154 236 40 78 204
## X7 42 36 17 18 29 40 48 9 44
## X8 78 60 34 28 60 78 9 93 81
## X9 189 168 62 66 163 204 44 81 249

# show total N per correlation coefficient
pattern.n(cordat, data$NUMBER.OF.BANKS)

## X1 X2 X3 X4 X5 X6 X7 X8 X9
## X1 20434 13161 11296 6655 15604 18361 2349 10004 16931
## X2 13161 15875 5859 7623 11027 12986 2196 5282 12268
## X3 11296 5859 11543 3248 9824 10696 1013 6478 10129
## X4 6655 7623 3248 29544 12872 27613 1374 22770 26661
## X5 15604 11027 9824 12872 44614 41298 1753 28733 40607
## X6 18361 12986 10696 27613 41298 47652 2338 30099 44030
## X7 2349 2196 1013 1374 1753 2338 2539 667 2160
## X8 10004 5282 6478 22770 28733 30099 667 30881 29667
## X9 16931 12268 10129 26661 40607 44030 2160 29667 45661

cordat <- lapply(cordat, function(x) {diag(x) <- 1; x})

summary(unlist(cordat))

## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -550.000 -0.025 0.178 0.217 1.000 1.188 14626
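A range check on the raw columns can show where such values sit before the matrices are built. The following is a minimal sketch in base R; the `toy` data frame and its values are made up for illustration and only mimic the column layout of the posted file.

```r
# Hypothetical toy data (NOT the thread's data); columns mimic the CSV layout
toy <- data.frame(Study_id = 1:4,
                  X1_X2 = c(0.10, -550, 0.30, NA),
                  X1_X3 = c(0.20, 0.25, 1.188, 0.40))

# Correlation columns are everything except the study identifier
cor_cols <- setdiff(names(toy), "Study_id")
m <- as.matrix(toy[cor_cols])

# Locate non-missing cells whose absolute value exceeds 1
bad <- which(!is.na(m) & abs(m) > 1, arr.ind = TRUE)

# Report row, column name, and offending value for each invalid cell
bad_cells <- data.frame(row    = unname(bad[, "row"]),
                        column = cor_cols[bad[, "col"]],
                        value  = m[bad])
print(bad_cells)
```

With the real file, the same idea would apply to `data[, 2:37]`, assuming those columns hold the 36 correlations as in the code above.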

Replied on Thu, 04/18/2024 - 04:40
Mike Cheung Joined: 10/08/2009

It is very strange that the CFI is so bad, especially for a mediation model. There is probably something wrong in your model or data.
Replied on Mon, 04/29/2024 - 09:48
Mike Cheung Joined: 10/08/2009

It is hard to tell. One common issue is forgetting to allow the predictors to be correlated, which makes the model fit very poorly.
Replied on Wed, 05/01/2024 - 05:56
Mike Cheung Joined: 10/08/2009

If you use the lavaan syntax to specify the models, you may check https://lavaan.ugent.be/tutorial/syntax1.html, especially the "~~" operator. You may also refer to some SEM books about model specifications.
Replied on Thu, 05/02/2024 - 03:50
Kwabenaaaddo Joined: 02/01/2021

In reply to Mike Cheung

Mike,

Thank you. I will have a look and revert. A quick scan of the code file, which I have attached, showed that the "~~" operator was used to specify the covariance between a variable and itself (i.e., its variance). Do I need to use the same operator to specify the covariance between different variables?

Replied on Thu, 05/02/2024 - 08:56
Mike Cheung Joined: 10/08/2009

Yes; see the example at https://lavaan.ugent.be/tutorial/syntax1.html. You may also try it and see if it works.
Replied on Thu, 05/02/2024 - 17:16
Kwabenaaaddo Joined: 02/01/2021

Hi Mike,

Thanks for the link. I have checked the mediation model section. It appears all the code I have is in order regarding the "tilde" sign. If convenient, could you cross-check the code I have already uploaded to see if it is indeed in order? Maybe I am missing something that is not obvious to me.
Thank you.

Replied on Thu, 05/02/2024 - 22:37
Mike Cheung Joined: 10/08/2009

The predictors are independent in your model. You need to include the correlations among ALL predictors, e.g., 'X1 ~~ X2'.


model10 <- "X9 ~ c*X1 +f*X2+i*X3+l*X4+s*X6+t*X7+x*X8
X5 ~ a*X1+d*X2+g*X3+j*X4+u*X6+v*X7+w*X8
X9 ~ b*X5
X1 ~~ 1*X1
X2 ~~ 1*X2
X3 ~~ 1*X3
X4 ~~ 1*X4
X6 ~~ 1*X6
X7 ~~ 1*X7
X8 ~~ 1*X8"
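Typing every pairwise covariance by hand is error-prone, so the lines can be generated instead. This is a small sketch in base R, assuming the seven exogenous predictors are X1 to X4 and X6 to X8 as in the model above; the helper names `cov_lines` and `var_lines` are made up for illustration.

```r
# Exogenous predictors taken from the model above (X5 and X9 are excluded
# because they are dependent variables in that model)
predictors <- c("X1", "X2", "X3", "X4", "X6", "X7", "X8")

# One "Xi ~~ Xj" line per unordered pair of predictors
cov_lines <- combn(predictors, 2,
                   FUN = function(p) paste(p[1], "~~", p[2]))

# Variances of the predictors fixed at 1, as in the model above
var_lines <- paste0(predictors, " ~~ 1*", predictors)

# Print the lines so they can be pasted into the model string
cat(c(cov_lines, var_lines), sep = "\n")
```

With seven predictors this yields choose(7, 2) = 21 covariance lines, from `X1 ~~ X2` to `X7 ~~ X8`.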

Replied on Fri, 05/03/2024 - 03:51
Kwabenaaaddo Joined: 02/01/2021

Dear Mike,

Thanks for your response so far. I did what you suggested in the earlier conversation. See the updated code sheet that I have attached. However, I get this error when I try to estimate the model:

Warning message:
In .solve(x = object$mx.fit@output$calculatedHessian, parameters = my.name) :
Error in solving the Hessian matrix. Generalized inverse is used. The standard errors may not be trustworthy.

> summary(tssem.fit)
Error in if (pchisq(chi.squared, df = df, ncp = 0) >= upper) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In pchisq(tT, df = dfT, lower.tail = FALSE) : NaNs produced
2: In sqrt(max((tT - dfT)/(n - 1), 0)/dfT) : NaNs produced
3: In pchisq(chi.squared, df = df, ncp = 0) : NaNs produced

What could actually be an issue now? Thank you in advance.

Replied on Fri, 05/03/2024 - 04:40
Mike Cheung Joined: 10/08/2009

Since the previous dataset contains errors, could you attach the corrected dataset? Moreover, please ensure that the errors are reproducible by attaching the complete R code to read the CSV file and run the analysis.
Replied on Fri, 05/03/2024 - 07:22
Kwabenaaaddo Joined: 02/01/2021

Hi Mike,

I have attached the corrected dataset and the complete R code. Thank you very much.

Regards

Replied on Fri, 05/03/2024 - 20:51
Mike Cheung Joined: 10/08/2009

Please see my first reply and ensure the data are valid. Your correlations vary from -1.216 to 1.050. You can check your data with the following syntax.


summary(unlist(cordat))
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-1.216 -0.055 0.109 0.235 0.475 1.050 15827
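The summary shows that invalid values exist but not which studies contain them. A minimal sketch of a per-study check, using a made-up list `toy_cordat` in place of the real `cordat` (the values are hypothetical and chosen only to reproduce the kind of problem described):

```r
# Hypothetical list of two 3x3 correlation matrices (NOT the thread's data)
vars <- c("X1", "X2", "X3")
toy_cordat <- list(
  matrix(c(1, 0.2, NA,
           0.2, 1, 0.4,
           NA, 0.4, 1), 3, 3, dimnames = list(vars, vars)),
  matrix(c(1, -1.216, 0.3,
           -1.216, 1, 1.05,
           0.3, 1.05, 1), 3, 3, dimnames = list(vars, vars))
)

# TRUE for every study with at least one non-missing |r| > 1
invalid <- vapply(toy_cordat,
                  function(m) any(abs(m) > 1, na.rm = TRUE),
                  logical(1))

# Indices of the studies that need to be corrected at the source
which(invalid)
```

Applied to the real list, the returned indices would point back to rows of the filtered data frame, since `cordat` was built one matrix per row.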

Replied on Sat, 05/04/2024 - 04:22
Kwabenaaaddo Joined: 02/01/2021

Dear Mike,
Apologies, I uploaded the wrong file. Kindly find the corrected file attached.

Many thanks.

Regards

Replied on Wed, 05/08/2024 - 21:35
Mike Cheung Joined: 10/08/2009

I noticed a few peculiarities in the data. First, the correlations range from -0.887 to 0.990; values as large as 0.8 or 0.9 could indicate a multicollinearity problem. Second, many correlations seem identical, such as X1_X2 and X1_X3 on rows 1 to 6. Overall, the dataset looks unusual to me.


summary(c(sapply(cordat, vechs)))
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-0.887 -0.106 0.044 0.054 0.204 0.990 7332

hist(c(sapply(cordat, vechs)))
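Both peculiarities can be quantified on the raw data. The sketch below uses base R on a made-up data frame `toy` (hypothetical values, with the same kind of layout as the posted file): it counts the distinct values each correlation takes within a study and computes the share of correlations with absolute value above 0.8.

```r
# Hypothetical toy data (NOT the thread's data); several rows share a Study_id
toy <- data.frame(Study_id = c(2, 2, 2, 3, 3),
                  X1_X2 = c(0.075, 0.075, 0.075, 0.20, 0.31),
                  X1_X9 = c(0.0375, 0.0375, 0.0250, 0.10, 0.85))
cor_cols <- setdiff(names(toy), "Study_id")

# Number of distinct non-missing values per correlation within each study;
# a count of 1 across many rows means the same value was repeated
n_distinct <- aggregate(toy[cor_cols],
                        by = list(Study_id = toy$Study_id),
                        FUN = function(v) length(unique(v[!is.na(v)])))
print(n_distinct)

# Share of non-missing correlations with |r| > 0.8
share_large <- mean(abs(unlist(toy[cor_cols])) > 0.8, na.rm = TRUE)
print(share_large)
```

Whether repeated values are an error depends on how the rows were coded, so a count of 1 is a prompt to check the source rather than proof of a mistake.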

Replied on Thu, 05/09/2024 - 09:17
Kwabenaaaddo Joined: 02/01/2021

Dear Prof Cheung,

Thank you for your answer. I have removed all the duplicated intercorrelations and the correlations greater than 0.8, hoping the model will fit. However, a new problem has arisen (see below). Do you think the problem results from the model specification, specifically from the covariances and variances I have specified? As you might remember, when the covariances and variances are not specified, the model can be fitted but has a low CFI.

name="Dir_AGE"),Dir_LIQ=mxAlgebra(x, name="Dir_LIQ")))
Warning messages:
1: In .solve(x = object$mx.fit@output$calculatedHessian, parameters = my.name) :
Error in solving the Hessian matrix. Generalized inverse is used. The standard errors may not be trustworthy.

2: In checkRAM(Amatrix = Amatrix, Smatrix = Smatrix, cor.analysis = cor.analysis) :
The variances of the dependent variables in 'Smatrix' should be free.
> summary(tssem.fit)
Error in if (pchisq(chi.squared, df = df, ncp = 0) >= upper) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In pchisq(tT, df = dfT, lower.tail = FALSE) : NaNs produced
2: In sqrt(max((tT - dfT)/(n - 1), 0)/dfT) : NaNs produced
3: In pchisq(chi.squared, df = df, ncp = 0) : NaNs produced

Regards

Replied on Fri, 05/10/2024 - 04:48
Kwabenaaaddo Joined: 02/01/2021

Dear Prof. Cheung,

I have attached the files. In an attempt to raise the CFI (the initial problem that prompted this thread), I have specified the covariances and variances as part of the model (e.g., X1 ~~ X2, X1 ~~ X1). Unexpectedly, this leads to the error. I cannot locate where this error comes from, so I would appreciate your guidance from here.

One more question: do I need to include the covariance between the mediator variable (X9) and the other variables, as well as its variance (X9 ~~ X9) in the model specification?
Thank you

Replied on Sat, 05/11/2024 - 05:49
Mike Cheung Joined: 10/08/2009

The following changes may help.
1) Rerun the model with the `autofixtau2=TRUE` argument.

stX71_rerun <- metaSEM::rerun(stX71random, autofixtau2 = TRUE)

2) Remove the covariances between the residual of X5 and the other variables, and free the residual variance of X5.

model10 <- "X9 ~ c*X1 +f*X2+i*X3+l*X4+s*X6+t*X7+x*X8
X5 ~ a*X1+d*X2+g*X3+j*X4+u*X6+v*X7+w*X8
X9 ~ b*X5
#Covariance
X1 ~~ X2
X1 ~~ X3
X1 ~~ X4
#X1 ~~ X5
X1 ~~ X6
X1 ~~ X7
X1 ~~ X8
X2 ~~ X3
X2 ~~ X4
#X2 ~~ X5
X2 ~~ X6
X2 ~~ X7
X2 ~~ X8
X3 ~~ X4
#X3 ~~ X5
X3 ~~ X6
X3 ~~ X7
X3 ~~ X8
#X4 ~~ X5
X4 ~~ X6
X4 ~~ X7
X4 ~~ X8
#X5 ~~ X6
#X5 ~~ X7
#X5 ~~ X8
X6 ~~ X7
X6 ~~ X8
X7 ~~ X8

#variance
X1 ~~ 1*X1
X2 ~~ 1*X2
X3 ~~ 1*X3
X4 ~~ 1*X4
# X5 ~~ 1*X5
X6 ~~ 1*X6
X7 ~~ 1*X7
X8 ~~ 1*X8"
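As a rough check on the model above, its degrees of freedom can be counted by hand. The sketch assumes a correlation-structure analysis in which only the regression paths and the predictor covariances are free parameters, and in which the residual variances of X5 and X9 are not counted as free; the variable names are made up for illustration.

```r
# Counts taken from the model above; the counting rule is an assumption
n_var   <- 9                          # X1 ... X9
n_cor   <- n_var * (n_var - 1) / 2    # 36 distinct correlations
n_pred  <- 7                          # X1-X4 and X6-X8
n_paths <- 2 * n_pred + 1             # 7 paths to X9, 7 to X5, plus b
n_cov   <- choose(n_pred, 2)          # 21 covariances among predictors

# Degrees of freedom = correlations minus free parameters
df_model <- n_cor - (n_paths + n_cov)
print(df_model)
```

Under these assumptions the count is zero, which would make the model saturated: it reproduces the pooled correlations exactly, so fit indices carry no information, and any index that divides by the model degrees of freedom is undefined.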

Replied on Mon, 05/13/2024 - 04:37
Kwabenaaaddo Joined: 02/01/2021

Dear Mike,

Thank you. This was helpful. My CFI improved to 1. The TLI is -inf. Is that normal?

Regards.