How to handle missing data in multilevel meta-analysis & multilevel TSSEM?

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike and all,

I'm working on a meta-analysis examining the mediators between attachment and intimate partner aggression. I would like to use two methods to meta-analyze the data. The methods include 1) three-level meta-analytic models (Assink & Wibbelink, 2016) and 2) three-level TSSEM (Wilson et al., 2016). Both are random-effects models.

However, some correlations are missing from the included studies. In particular, a few studies did not report any correlations at all. My meta-analysis contains only a small number of studies because many mediators have been examined by only a handful of them: some mediators appear in only 3 studies, whereas others appear in 9.

Would you please advise on the best way to handle missing correlations in both analyses, especially in the context of a small number of studies?

For the three-level TSSEM, would you please confirm whether missing data can be handled automatically, as in traditional TSSEM? I saw that Mike's paper (Cheung, 2014) mentions this, but mainly for traditional TSSEM, not for the three-level one. Would you please clarify whether missing data are handled the same way in a three-level random-effects TSSEM as in traditional TSSEM?

For method 1 (three-level meta-analytic models), I've searched for papers on handling missing data in multilevel meta-analysis. However, I found very few relevant papers on this topic (i.e., for multilevel synthesis). Would you mind shedding light on it when you have time? E.g., would it be preferable to exclude those studies with missing data?

I look forward to hearing from you soon.

Thank you for your time and help in advance.

Best wishes,
Iana

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

Hi Iana,

The three-level meta-analysis implemented in the metaSEM uses an SEM approach. It handles missing effect sizes with the full-information maximum likelihood (FIML) approach (see Cheung, 2014).

I cannot speak for Wilson et al. (2016) as I am not one of the authors.

Suzanne Jak and I are extending several MASEM procedures to nested data. But they are not ready for production yet.

> In particular, a few studies did not report any correlations at all.
If there are no correlations, I cannot see how you include these studies.

> E.g., would it be preferable to exclude those studies with missing data?
It is better to include studies with missing data in general.

Cheung, M. W.-L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19(2), 211–229. https://doi.org/10.1037/a0032968

Mike

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Thank you for your response and advice. I greatly appreciate that.

May I please check with you whether the maximum likelihood (FIML) approach involves data imputation? If yes, would it be a problem with a small number of studies (i.e., three)?

Moreover, from my reading of the Cheung (2014) paper, it looks like the FIML approach mainly applies to handling missing data in covariates. For instance, there is a section 'Handling Missing Covariates With Full-Information Maximum Likelihood Estimation' on page 9. My meta-analysis will not examine covariates due to the small number of studies. I have quite a few missing correlations among the variables of interest. Can TSSEM handle them with the FIML approach as well?

In particular, do you think the multilevel approach to TSSEM proposed by Wilson et al 2016 (see below) handles missing data in the same way as described in Cheung 2014? I have to use the Wilson et al 2016 approach because it is the only method to handle dependent correlations in TSSEM meta-analysis at the moment. It's great to know that you and your colleagues are working to develop a more complex model. I look forward to learning about it.

Wilson, S. J., Polanin, J. R., & Lipsey, M. W. (2016). Fitting meta-analytic structural equation models with complex datasets. Research Synthesis Methods, 7(2), 121–139. https://doi.org/10.1002/jrsm.1199

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

Hi Iana,

FIML is a standard approach in SEM. You should be able to find plenty of literature on this topic (e.g., Enders, 2010).
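
On the imputation question: FIML does not impute anything. As a general sketch (the notation here is introduced only for illustration and is not taken from the metaSEM documentation), each study contributes to the log-likelihood through the effect sizes it actually reports:

```latex
\ell(\theta) = \sum_{i=1}^{k} -\frac{1}{2}\left[ p_i \log(2\pi)
  + \log\left|\Sigma_i(\theta)\right|
  + \bigl(y_i - \mu_i(\theta)\bigr)^{\top} \Sigma_i(\theta)^{-1} \bigl(y_i - \mu_i(\theta)\bigr) \right]
```

Here $y_i$ is the vector of the $p_i$ effect sizes observed in study $i$, and $\mu_i(\theta)$ and $\Sigma_i(\theta)$ are the elements of the model-implied mean vector and covariance matrix that correspond to those observed effect sizes. A study with missing effect sizes simply has a shorter $y_i$; no values are filled in.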

A small number of studies is a concern not only for missing data but for meta-analysis in general. Having only 3 data points is an issue in meta-analysis, as in any other data analysis.

meta3() uses FIML to handle missing data on the effect sizes, whereas meta3X() uses FIML to handle missing data on effect sizes and the covariates.

If you have questions regarding Wilson et al. (2016), it's better to contact the authors. I don't want to distort their messages.

If you are brave enough, you may try the developmental version at the metaSEM Github. It consists of TSSEM3L (three-level TSSEM), OSMASEM3L (three-level OSMASEM), and TSSEMRobust (TSSEM using robust statistics). However, there are likely bugs and limited documentation and support.

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

Mike

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Thank you for your advice and clarification again.

For TSSEM with a small number of studies, would you recommend excluding studies with missing data instead of including them and conducting the maximum likelihood estimation? E.g., among 4 available studies, 1 study has some missing variables. Do you think I should exclude that study, or include it and run the TSSEM with maximum likelihood estimation?

As for the R packages, thank you for the recommendation. I think I would prefer to wait for the fully developed version. I look forward to them.

Thank you again.

Best wishes,
Iana

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

Dear Iana,

Dropping the studies with missing data is known as listwise deletion in the literature. FIML is usually preferred over listwise deletion.
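
To make the difference concrete, here is a minimal base-R sketch. The four studies and their correlations are made up for illustration (they are not from any real dataset); the code only counts how much information each strategy keeps and does not fit any model.

```r
# Hypothetical data: 4 studies, 3 correlations each (columns are the
# lower-triangular elements r21, r31, r32). Study 4 did not report r32.
cors <- rbind(
  study1 = c(r21 = 0.30, r31 = 0.25, r32 = 0.40),
  study2 = c(r21 = 0.35, r31 = 0.20, r32 = 0.45),
  study3 = c(r21 = 0.28, r31 = 0.22, r32 = 0.38),
  study4 = c(r21 = 0.32, r31 = 0.27, r32 = NA)
)

# Listwise deletion keeps only the studies with no missing correlations.
complete <- cors[complete.cases(cors), , drop = FALSE]
n_listwise <- colSums(!is.na(complete))

# Using all available data (as FIML does) keeps every reported correlation.
n_available <- colSums(!is.na(cors))

print(rbind(listwise = n_listwise, available = n_available))
```

In this toy case listwise deletion throws away two reported correlations (r21 and r31 from study 4) just because a third one is missing.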

Mike

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Thank you for the advice. That's very helpful. I greatly appreciate that :)

Best wishes,
Iana

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Sorry to bother you again.

I'm running a TSSEM for my meta-analysis using R. I've extracted the correlations from the studies, but some of them are missing. I've searched for R code for handling the missing data, but I'm afraid the available examples are not suitable for my study. For instance, I've read your document titled 'Package ‘metaSEM’', but the example on handling missing data is about binary data (i.e., "Fellowship": 1; "Grant": 0). I'm also a bit unsure why the missingness has to be created manually.

The correlations that are missing involve the independent and dependent variables, and mediators (so do not include covariates). I'm running a random effect model.

Would you mind directing me to some papers or R code showing how to handle the missing data?

Many thanks for your help in advance.

Best wishes,
Iana

Package ‘metaSEM’
http://cran.nexr.com/web/packages/metaSEM/metaSEM.pdf

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

I am not sure if I follow the question.

If you have missing data in the correlation matrices, you may just use NA to represent the missing correlations. See an example in Hunter83.
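
As a rough sketch of what that looks like in practice: the variable names, correlations, and sample sizes below are hypothetical, and the helper make_cor() is introduced only for this illustration. The metaSEM part runs only if the package is installed; with so few studies a random-effects stage 1 may well produce optimizer warnings.

```r
# Hypothetical variable names and correlations, for illustration only.
vars <- c("attachment", "mediator", "aggression")

# Hypothetical helper: build a symmetric 3x3 correlation matrix from its
# lower-triangular elements; NA marks a correlation that was not reported.
make_cor <- function(r21, r31, r32) {
  m <- diag(1, 3)
  m[lower.tri(m)] <- c(r21, r31, r32)
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  dimnames(m) <- list(vars, vars)
  m
}

my_cors <- list(
  study1 = make_cor(0.30, 0.25, 0.40),
  study2 = make_cor(0.35, NA,   0.45),  # attachment-aggression not reported
  study3 = make_cor(0.28, 0.22, NA),    # mediator-aggression not reported
  study4 = make_cor(0.32, 0.27, 0.41),
  study5 = make_cor(0.26, 0.24, 0.37)
)
my_n <- c(120, 200, 150, 90, 310)

# Number of studies reporting each correlation (base R only).
n_reported <- Reduce(`+`, lapply(my_cors, function(m) !is.na(m)))
print(n_reported)

# Stage 1 of TSSEM with the NAs left in place (requires metaSEM).
if (requireNamespace("metaSEM", quietly = TRUE)) {
  library(metaSEM)
  print(pattern.na(my_cors, show.na = FALSE))
  stage1 <- tssem1(my_cors, my_n, method = "REM", RE.type = "Diag")
  print(summary(stage1))
}
```

The point is only that the list of matrices is passed to tssem1() as it is, with NA in the cells that were not reported; nothing has to be filled in beforehand.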

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Thank you for getting back to me.

I've used NA to represent the missing correlations in my data (in an Excel spreadsheet). When I imported the data file into R, I changed the data type of the column containing the missing data (named 'r') from 'character' to 'numeric'. But the R console returns an error message (see below) when I estimate the unadjusted pooled correlation matrix in the first stage of TSSEM. I've already tried replacing the na values with NA and using the new dataset to run the first stage of TSSEM, but I'm still getting the same error message. Please see the R code below.

Would you please advise what I can do to handle the missing correlation coefficients when running the first stage of TSSEM?

I've checked out Hunter83 in the PDF of Package ‘metaSEM’ (http://cran.nexr.com/web/packages/metaSEM/metaSEM.pdf). However, the most relevant info I could see is the use of the function 'pattern.na'. I'm afraid it is not what I'm looking for to run the first stage of TSSEM, as this function only lets us see the pattern of missing values.

Would you please shed some light on it or refer me to relevant materials when you have a moment?

Thank you for your help in advance!

Best wishes,
Iana

Error message in R:
Error: $ operator is invalid for atomic vectors
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion

R code:
# install.packages(c("readxl", "metafor", "metaSEM", "OpenMx"))  # run once if needed
library(readxl)   # provides read_excel()
library(metafor)  # provides rma.mv()
library(metaSEM)
library(OpenMx)

my_data <- read_excel("Jealousy, anger, trust, infidelity_anxious_perpetration.xlsx")
dataset <- as.data.frame(my_data)

# Stage 1: pooled correlations, one coefficient per cell (no intercept)
Step1 <- rma.mv(yi = r, V = inv_n,
                data = dataset,
                random = list(~ 1 | studyID, ~ 1 | ESID),
                method = "ML", mods = ~ factor(Cell) - 1)

# Attempt to replace the na values with NA
my_data_new <- my_data
my_data_new[is.na(my_data_new)] <- NA

Step1 <- rma.mv(yi = r, V = inv_n,
                data = my_data_new,
                random = list(~ 1 | studyID, ~ 1 | ESID),
                method = "ML", mods = ~ factor(Cell) - 1)

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

NA is an internal representation of missing data in R. Missing values are usually represented by a blank cell (nothing) in Excel.

I am confused by your analyses. Your stage 1 analysis uses rma.mv(), which is from the metafor package. Since you use the metafor package, your data structure should fit its requirement.
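
A small base-R sketch of the import issue follows. It assumes the spreadsheet cell contained literal text such as "na" (the exact spelling in the original file is not known) and uses an inline CSV with made-up rows, reusing the column names from the post, so that it runs without the Excel file.

```r
# Hypothetical long-format data, one row per correlation, with the same
# column names as in the post (studyID, ESID, Cell, r, inv_n).
csv_text <- "studyID,ESID,Cell,r,inv_n
1,1,A,0.30,0.010
1,2,B,na,0.010
2,3,A,0.25,0.020
2,4,B,0.40,0.020"

# Without telling R that "na" means missing, the text entry forces the
# whole column r to be read as character.
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(raw$r))

# Declaring the missing-value codes at import gives a numeric column
# with a proper NA in it.
dataset <- read.csv(text = csv_text, na.strings = c("", "NA", "na"),
                    stringsAsFactors = FALSE)
print(class(dataset$r))
print(sum(is.na(dataset$r)))
```

For Excel files, read_excel() in the readxl package has an na argument that plays the same role as na.strings here; leaving the cell blank in the spreadsheet is the other option.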

iana's picture
Joined: 10/16/2021 - 12:33

Hi Mike,

Thank you for resolving this mystery. After replacing the NA with a blank cell in my Excel spreadsheet, the code worked fine. Thank you again for your help! I greatly appreciate that :)

Best wishes,
Iana

Ting's picture
Joined: 10/13/2022 - 05:47
Some questions about three-level metaSEM

Dear Prof Cheung,

I am requesting your assistance with my questions regarding three-level metaSEM. I found my questions are related to the posts above, so I put my questions here.

I currently use the three-level metaSEM in my work and have found that the WPL approach is the only one available. Would it be possible to ask you some questions about it, as I have been unable to find the answers?

I would like to briefly introduce my work, which involves testing a model as shown in Picture 1. We have 271 effect sizes nested in 64 data sets. We grouped data into four categories (latent variables) and 16 subcategories (observed variables). Unfortunately, I haven’t collected correlations between observed variables belonging to the same latent variables, such as PS1 and PS2, and there are also many other missing values (such as DS3 and MF3). As a result, we decided to analyze each correlation separately and eliminate dependency by ensuring that each dataset only had one correlation between two types of variables (observed variables Picture 1).

We were interested in the entire model with nested effect sizes for subgroup analyses rather than just a path. However, when there were missing values of correlations between any two subcategories, such as DS1 and PS1, we could not conduct subgroup analyses for the whole model. To eliminate missing values, we conducted subgroup analyses for the structural equation model composed of six correlations from four broad categories (four latent variables in the above picture). We used the WPL approach to test the model in Picture 2 by three-level meta-analysis (Level 1: Sampling variance; Level 2: variation within data sets; Level 3: variances between data sets).

I have two questions that I would greatly appreciate your input on.
1. What do you think of the strategy I used in my work? Do you know any other better way to analyze my data? I am unsure if the reviewers would challenge it. For example, we used broader categories for the subgroup analyses, which may not be enough to reflect the variation of subcategories.
2. Do you know of any ways to impute missing values of multilevel metaSEM?

Thank you for your time and consideration.

I appreciate any advice you can provide and look forward to hearing from you soon.

Best,
Ting

Mike Cheung's picture
Joined: 10/08/2009 - 22:37

Dear Ting,

I don't know what the WPL approach is. Does it refer to Wilson, Polanin, and Lipsey (2016)? If this is the case, asking the authors about their approach is better.

From your description, it is difficult to fit the model in picture 1. Moreover, picture 1 is not an identified model.

It is more reasonable to fit the model in picture 2.

Regarding your specific questions,
1) The classification is a conceptual question rather than a statistical one. It would help if you persuaded the reviewers that your classification is reasonable.

2) No, I do not know how to impute missing data in a multilevel meta-analysis. There is almost no work on it.

Wilson, S. J., Polanin, J. R., & Lipsey, M. W. (2016). Fitting meta-analytic structural equation models with complex datasets. Research Synthesis Methods, 7(2), 121–139. https://doi.org/10.1002/jrsm.1199

Mike

Ting's picture
Joined: 10/13/2022 - 05:47
Thanks

Dear Prof Cheung,

Thank you very much for your suggestions. I appreciate your help again.

Best
Ting