Attachment | Size
---|---
R script.R | 2.57 KB

Dear Mike and all,

First off, thanks for this great forum and for directing me to ask questions here!

I used the metaSEM package to test some simple mediation models (IV -> Med -> DV, with the direct effect included). The script is attached.

Results of stage 1 looked fine:

```
summary(stage1)
##
## Call:
## meta(y = ES, v = acovR, RE.constraints = Diag(paste0(RE.startvalues,
##     "*Tau2_", 1:no.es, "_", 1:no.es)), RE.lbound = RE.lbound,
##     I2 = I2, model.name = model.name, suppressWarnings = TRUE,
##     silent = silent, run = run)
##
## 95% confidence intervals: z statistic approximation (robust=FALSE)
## Coefficients:
##              Estimate  Std.Error     lbound     ubound z value  Pr(>|z|)
## Intercept1 0.40038970 0.02796976 0.34556997 0.45520943 14.3151 < 2.2e-16 ***
## Intercept2 0.19706054 0.02504600 0.14797128 0.24614980  7.8679 3.553e-15 ***
## Intercept3 0.22359242 0.03632058 0.15240539 0.29477944  6.1561 7.457e-10 ***
## Tau2_1_1   0.02692037 0.00683628 0.01352151 0.04031923  3.9379 8.221e-05 ***
## Tau2_2_2   0.02431705 0.00635049 0.01187031 0.03676379  3.8292 0.0001286 ***
## Tau2_3_3   0.01545611 0.00749157 0.00077291 0.03013932  2.0631 0.0390998 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Q statistic on the homogeneity of effect sizes: 998.5511
## Degrees of freedom of the Q statistic: 99
## P value of the Q statistic: 0
##
## Heterogeneity indices (based on the estimated Tau2):
##                              Estimate
## Intercept1: I2 (Q statistic)   0.9629
## Intercept2: I2 (Q statistic)   0.9413
## Intercept3: I2 (Q statistic)   0.9174
##
## Number of studies (or clusters): 102
## Number of observed statistics: 102
## Number of estimated parameters: 6
## Degrees of freedom: 96
## -2 log likelihood: -69.77645
## OpenMx status1: 0 ("0" or "1": The optimization is considered fine.
## Other values may indicate problems.)
```

Yet the results of the stage 2 model looked strange, with NA values for the standard errors, z values, and p values:

```
summary(stage2)
##
## Call:
## wls(Cov = pooledS, aCov = aCov, n = tssem1.obj$total.n, RAM = RAM,
##     Amatrix = Amatrix, Smatrix = Smatrix, Fmatrix = Fmatrix,
##     diag.constraints = diag.constraints, cor.analysis = cor.analysis,
##     intervals.type = intervals.type, mx.algebras = mx.algebras,
##     mxModel.Args = mxModel.Args, subset.variables = subset.variables,
##     model.name = model.name, suppressWarnings = suppressWarnings,
##     silent = silent, run = run)
##
## 95% confidence intervals: Likelihood-based statistic
## Coefficients:
##   Estimate Std.Error   lbound   ubound z value Pr(>|z|)
## c 0.128067        NA 0.058912 0.195726      NA       NA
## b 0.172316        NA 0.083865 0.260299      NA       NA
## a 0.400390        NA 0.345569 0.455256      NA       NA
##
## mxAlgebras objects (and their 95% likelihood-based CIs):
##                   lbound   Estimate    ubound
## Indirect[1,1] 0.03369740 0.06899341 0.1071547
## Direct[1,1]   0.05891188 0.12806712 0.1957264
##
## Goodness-of-fit indices:
##                                                Value
## Sample size                                 88066.00
## Chi-square of target model                      0.00
## DF of target model                              0.00
## p value of target model                         0.00
## Number of constraints imposed on "Smatrix"      0.00
## DF manually adjusted                            0.00
## Chi-square of independence model              304.72
## DF of independence model                        3.00
## RMSEA                                           0.00
## RMSEA lower 95% CI                              0.00
## RMSEA upper 95% CI                              0.00
## SRMR                                            0.00
## TLI                                             -Inf
## CFI                                             1.00
## AIC                                             0.00
## BIC                                             0.00
## OpenMx status1: 0 ("0" or "1": The optimization is considered fine.
## Other values indicate problems.)
```

My questions are:

(1) How could this happen? The model is simple. Could it result from too many missing values in my correlation matrices? Many of my studies have incomplete correlations, but given the relatively large number of studies (see my sample-size info below), I thought it wouldn't be an issue. I also tried imputing the missing values, but the results got worse, with an optimization problem now appearing in stage 1 (OpenMx status 5).

```
pattern.na(my.cor, show.na = FALSE)
##      DC  M1 DV1
## DC  102  38  49
## M1   38 102  15
## DV1  49  15 102
```

(2) Without any estimates of the standard errors, I suppose the reported parameter estimates wouldn't make much sense, right? How can I improve this model, if possible?

Any suggestions or advice to fix this? Thank you for your time!

Yingjie

Dear Yingjie,

Could you please include the data to reproduce the error?

Best,

Mike

Yes, please see the data attached. Thanks a lot, Mike!

You requested likelihood-based confidence intervals. The SEs, z values, and p values are removed to avoid confusion. If you want them, you may use the Wald statistic instead, e.g., by refitting the stage-2 model with intervals.type = "z".
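A minimal sketch of such a refit, assuming `stage2` was produced by `tssem2()` (the `RAM1` object stands in for the model specification in the attached script):

```r
library(metaSEM)

# Hedged sketch: refit stage 2 with Wald-type (z) intervals so that the
# summary reports SEs, z values, and p values alongside the estimates.
# 'stage1' and 'RAM1' are assumed to come from the attached script.
stage2.z <- tssem2(stage1, RAM = RAM1, intervals.type = "z")
summary(stage2.z)
```

Note that likelihood-based intervals are usually preferred for the indirect effect, since a*b is generally not normally distributed; the Wald refit is mainly a way to obtain SEs.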

Moreover, it is challenging to test the nonlinear constraint `c = a*b`. I suggest dropping it.

Thanks! Now I see the SEs...

Can you elaborate a bit on the estimate of the indirect effect? Why do you see it as a problem and suggest dropping it? The indirect effect is actually the core of my hypotheses.

But I did get some warning messages when fitting stage 2, and I am not sure what to do about them:

```
## Warning: In model 'TSSEM2 Correlation' Optimizer returned a non-zero status
## code 6. The model does not satisfy the first-order optimality conditions to the
## required accuracy, and no improved point for the merit function could be found
## during the final linesearch (Mx status RED)
```

a*b and c are the indirect and direct effects. There is no issue in estimating them. But `c = a*b` means that the direct and indirect effects are the same, and it is sometimes challenging to fit because it requires a nonlinear constraint.

Mx status RED means that the solution is not optimal. You should not report it. You may refit the model with rerun().
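A minimal sketch of that refit, assuming `stage2` is the fitted stage-2 object:

```r
# Hedged sketch: metaSEM's rerun() refits a model starting from its previous
# solution, which often clears a non-zero OpenMx status code.
stage2 <- rerun(stage2)
summary(stage2)  # check that "OpenMx status1" is now 0 or 1
```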

A couple of thoughts: the indirect effect equals a*b, and testing whether a*b = 0 (no evidence of mediation) seems a reasonable constraint; I can't see a good reason to test whether a*b = c. On the confidence-interval front, likelihood-based confidence intervals for a*b are easy to request, and this is probably better than Sobel or other methods of estimating the precision of the indirect effect. It is also easy to ask for standard errors of functions of parameters with mxSE().

One concern I had looking at the analysis is that the correlations among the DC, MED, and DV variables sometimes stem from the same study. These statistics would therefore not be independent (the same measures can contribute to different statistics). The lack of independence (positive correlation between the statistics) would normally result in estimates of standard errors that are too small. There are ways around this problem with an SEM specification - perhaps that was done; I didn't look too closely.
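For concreteness, a hedged sketch of both suggestions, assuming the parameter labels `a` and `b` from the script and a hypothetical `RAM1` model specification:

```r
library(metaSEM)
library(OpenMx)

# Likelihood-based 95% CI for the indirect effect a*b (no c = a*b constraint):
stage2 <- tssem2(stage1, RAM = RAM1, intervals.type = "LB",
                 mx.algebras = list(Ind = mxAlgebra(a * b, name = "Ind")))
summary(stage2)  # the CI on Ind[1,1] appears under "mxAlgebras objects"

# Wald-type SE for the same function of the estimated parameters via OpenMx:
mxSE("a*b", model = stage2$mx.fit)
```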

Yes indeed. The constraint a*b = c is not needed here; I overlooked it when I adapted the script from Mike's open example earlier. Thank you both! I have corrected it.

Re the point about multiple observations from the same study: do you mean that independence between the DC-MED and MED-DV correlations is required, or is it more about having multiple DC-MED-DV sets from the same study? For the latter, we'll use composite scores. If it's the former, do you have any recommendations? I didn't realize this would be an issue.

Hi Yingjie,

Thanks for posting your questions, which helped me as well, since I am also doing a meta-analysis of mediation.

I looked at your data and am interested in the missing values in the correlation matrices. I noticed that some matrices contain only a single correlation, and I wonder why you did not exclude studies that are missing too many values; some studies only reported the correlation for the DC-to-DV path, which makes the mediation analysis hard to carry out.

I am quite new to this field, so I am not sure whether my questions make sense.

Regarding the point about multiple observations from the same study, I am also trying to figure out this issue. If one study with the same sample contributes several correlation matrices, should we pool them together, or should we treat different matrices from the same sample as different effect sizes?

Best wishes,

Yixuan

Excluding studies with only one correlation ultimately depends on the inclusion/exclusion criteria that have been established to determine whether or not a study is eligible for inclusion.

There is no definitive solution that is considered the "best" approach to dealing with multiple effect sizes. While some researchers may choose to average them to remove non-independence, others are exploring the use of three-level models and robust standard errors. However, there is currently no concrete evidence to support any one particular practice as superior.

Hi Mike,

Thank you so much for your quick response and your significant contributions to this field!

I am not sure if my understanding is correct. At first, I didn't exclude any studies as long as they assessed all paths of the mediation; but when I started analyzing the data, I found that some studies had only one correlation coefficient (though they reported regression coefficients for all paths), so I used the pattern.na() and is.pd() functions in metaSEM to decide whether to include those data in the next step.

E.g., we see that one study has some missing values. To see the overall missing-data pattern, we can use the pattern.na() function.

We also have to check whether the matrices are positive definite, because this is a requirement for the later processing steps. We can do this with the is.pd() function. If we get TRUE for all studies, everything is fine and we can continue; but I got NA for one study, so I deleted it before continuing to the next step.
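The screening steps described above might look like this, assuming `my.cor` is the list of correlation matrices (as in the metaSEM examples):

```r
library(metaSEM)

# Overall missing-data pattern: counts of studies contributing to each cell.
pattern.na(my.cor, show.na = FALSE)

# Positive-definiteness check per study: TRUE, FALSE, or NA.
pd.check <- is.pd(my.cor)
table(pd.check, useNA = "ifany")

# Inspect the studies flagged NA before deciding whether to drop them.
my.cor[is.na(pd.check)]
```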

I have attached screenshots of my code and the results for your review. I am not sure whether this method of handling missing data is appropriate and would be very grateful for any advice or guidance.

Best wishes,

Yixuan

The correlation matrices are not required to be positive definite. However, non-positive definite matrices usually indicate data entry or other errors, so they must be checked before the analyses. `NA` in the check does not necessarily mean an error. You may refer to the following examples.

Thank you so much for pointing this out, Mike!

I checked my data and found that my matrix is similar to your x2, but is.pd() returned NA for it, and I think there are no data entry errors in this matrix. I have attached a screenshot of the matrix and would be very grateful for your advice.

If you want to check whether such matrices are positive definite, you need to change the format from x2 to x1; otherwise, is.pd() will return NA.
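To illustrate with a hypothetical 3x3 matrix (my reading of the x1/x2 formats from the examples referred to above; variable names follow the data earlier in the thread):

```r
library(metaSEM)

# x2-style: both variables were measured, but their correlation was not
# reported, so only the off-diagonal cells are NA while the diagonal stays 1.
x2 <- matrix(c( 1, .3, NA,
               .3,  1, NA,
               NA, NA,  1), nrow = 3,
             dimnames = list(c("DC", "M1", "DV1"), c("DC", "M1", "DV1")))
is.pd(x2)  # returns NA, as described above

# x1-style: mark the variable as entirely missing (NA on its diagonal too),
# so the check runs on the submatrix of non-missing variables.
x1 <- x2
x1["DV1", "DV1"] <- NA
is.pd(x1)
```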