Cross-lagged model issues

ZR
Joined: 07/26/2017 - 09:34

Hello,
I have a cross-lagged model that I have run using the code below:

CrossLaggedModel<-mxModel("cross_lagged_model",
        type="RAM",
        manifestVars=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
        mxData(observed=children_data, type="raw"),
        mxPath(from=c("f7_bmi"), to=c("tf4_cpg"), 
            arrows=1, free=TRUE, values=c(.5), labels="AtoB"),
        mxPath(from=c("f7_cpg"), to=c("tf4_bmi"), 
            arrows=1, free=TRUE, values=c(1.5), labels="BtoA"),
        mxPath(from=c("f7_bmi"), to=c("tf4_bmi"), 
            arrows=1, free=TRUE, values=c(.1), labels="AtoA"),
        mxPath(from=c("f7_cpg"), to=c("tf4_cpg"), 
            arrows=1, free=TRUE, values=c(0), labels="BtoB"),
        mxPath(from=c("f7_bmi","tf4_bmi"),
            arrows=2, free=TRUE, values=c(1, 1), labels=c("residualA1", "residualA2")),
        mxPath(from=c("f7_cpg","tf4_cpg"), 
            arrows=2, free=TRUE, values=c(.5, .5), labels=c("residualB1", "residualB2")),
        mxPath(from=c("f7_bmi","tf4_bmi"), to=c("f7_cpg","tf4_cpg"), 
            arrows=2, free=TRUE, values=c(.05,.05), labels=c("residCovAB1", "residCovAB2")),
        mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="m")
)
model1<-mxModel(CrossLaggedModel, mxCI(c("cross_lagged_model.A","cross_lagged_model.S")))
model<-mxTryHard(model1, intervals=T)

Here, f7 is the earlier time point and tf4 the later one, and BMI and CpG are the two variables measured at each time point. I'm concerned about the results I get (shown below) for two reasons: they suggest effects in the opposite direction to those from linear regression models, and the variance estimate residualB1 is very large, nowhere near the actual variance in the data. I've tried specifying different starting values, but I still get these large values. I have also tried adding a latent variable for BMI and for CpG, but this just makes the numbers even larger. Is there anything wrong with the code I'm using, or is the model perhaps just not working well for some other reason?

free parameters:
          name matrix     row     col     Estimate   Std.Error A
1         AtoA      A tf4_bmi  f7_bmi   0.09102847  0.10034782
2         AtoB      A tf4_cpg  f7_bmi   0.47616253  0.03039210
3         BtoA      A tf4_bmi  f7_cpg   1.34019586  0.01243257
4         BtoB      A tf4_cpg  f7_cpg  -0.05465275  0.01098387
5   residualA1      S  f7_bmi  f7_bmi   1.82067088  0.33356059
6   residualA2      S tf4_bmi tf4_bmi   7.49071007  0.38457389
7  residCovAB1      S  f7_bmi  f7_cpg -13.60428618  2.70038904
8   residualB1      S  f7_cpg  f7_cpg 238.60375066 12.54453274
9  residCovAB2      S tf4_bmi tf4_cpg   0.26706472  0.09050008
10  residualB2      S tf4_cpg tf4_cpg   0.82379525  0.04027825
11           m      M       1  f7_bmi   0.91112011  0.17838277

Thank you!

AdminRobK
Joined: 01/24/2014 - 12:15

I don't see anything amiss about your script. I'm surprised you're getting results you don't find trustworthy, because this looks as though it would be a pretty easy optimization problem.

If I'm counting correctly, your model should be just-identified. Does the model-expected covariance matrix at the solution look reasonable? Also, do you have many missing observations? A FIML solution can look different from results obtained by a method that deletes incomplete cases (as is typical in linear regression).
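
The point about missing data can be checked by counting rows: `lm()` drops incomplete rows by default, whereas FIML uses every row that has at least one observed value. A minimal sketch in base R, using made-up data (the data frame `toy`, its values, and its missingness pattern are assumptions for illustration, not the poster's data):

```r
# Toy data standing in for the real sample (hypothetical values).
set.seed(42)
n <- 500
toy <- data.frame(
  f7_bmi  = rnorm(n, mean = 16, sd = 2),
  f7_cpg  = rnorm(n),
  tf4_cpg = rnorm(n)
)

# Remove 100 outcome values to mimic attrition at the later time point.
toy$tf4_cpg[sample(n, 100)] <- NA

# lm() uses na.action = na.omit by default, so incomplete rows are dropped.
fit <- lm(tf4_cpg ~ f7_bmi + f7_cpg, data = toy)

n_total    <- nrow(toy)                 # rows a FIML fit could draw on
n_complete <- sum(complete.cases(toy))  # rows with no missing values
n_used_lm  <- nobs(fit)                 # rows the regression actually used

cat("rows in data:     ", n_total, "\n")
cat("complete rows:    ", n_complete, "\n")
cat("rows used by lm():", n_used_lm, "\n")
```

If the two row counts differ a lot, some disagreement between the regression and the FIML estimates would not be surprising in itself.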

ZR

Thanks Rob,

I'm mostly concerned by that large residual value, but if the direction of effect also differs, then I guess it's a question of which result to trust.
I've had a look at the expected covariance matrix (I believe this is the correct part, below). I'm not really sure what it should look like, but I'm a little surprised to see such large values in there. Perhaps this isn't unusual?

I also looked at missing cases, and yes, there are fewer people in the regression analyses, while the FIML model includes people with missing data, so this could explain the difference. The difference is only about 100 people, though, and I wouldn't necessarily expect that to change the results this much, but perhaps it does.

$algebras$cross_lagged_model.fitfunction
         [,1]
[1,] 16047.19
attr(,"expCov")
           [,1]      [,2]      [,3]       [,4]
[1,]   1.820671 -18.06668 -13.60429   1.610447
[2,] -18.066675 432.74860 318.53738 -25.744554
[3,] -13.604286 318.53738 238.60375 -19.518203
[4,]   1.610447 -25.74455 -19.51820   2.657353
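
One way to judge whether a matrix like this is reasonable is to convert it to standard deviations and correlations, which are easier to compare against what is known about the variables. A minimal sketch in base R, using the values printed above and assuming the rows and columns follow the `manifestVars` order of the script (that ordering is an assumption, though it agrees with the variance estimates in the parameter table):

```r
vars <- c("f7_bmi", "tf4_bmi", "f7_cpg", "tf4_cpg")

# Expected covariance matrix as printed in the post.
expCov <- matrix(c(
    1.820671, -18.06668, -13.60429,   1.610447,
  -18.066675, 432.74860, 318.53738, -25.744554,
  -13.604286, 318.53738, 238.60375, -19.518203,
    1.610447, -25.74455, -19.51820,   2.657353
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# The printout rounds columns differently, so average the two triangles
# to get an exactly symmetric matrix.
expCov <- (expCov + t(expCov)) / 2

implied_sd  <- sqrt(diag(expCov))  # model-implied standard deviations
implied_cor <- cov2cor(expCov)     # model-implied correlations

print(round(implied_sd, 2))
print(round(implied_cor, 3))
```

Comparing these against the observed values, for example from `cov(children_data, use = "pairwise.complete.obs")` on the real data, would show which variables the model is misrepresenting.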

I should also mention that I have this issue with another similar model but in a separate sample (above is children and the other sample is adults).

Thanks

Zoe

AdminRobK
intercepts

I think the issue is with the model for the intercepts. First of all, is zero a good choice of start value for them? Secondly, and more importantly, I just noticed that this line,

        mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="m")

assigns the same label "m" to all four of those paths, meaning that the intercepts of all four manifest variables are constrained to be equal to one another. I doubt that's what you want to do?
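
The effect of the shared label can be seen by counting. Paths that share a label are a single free parameter, so the script as posted has fewer parameters than a just-identified model needs. A minimal sketch in base R; the labels are copied from the script, while the per-variable intercept names built with `paste0()` are hypothetical:

```r
manifests <- c("f7_bmi", "tf4_bmi", "f7_cpg", "tf4_cpg")
p <- length(manifests)

# Observed statistics for raw data: variances/covariances plus means.
n_stats <- p * (p + 1) / 2 + p   # 10 + 4 = 14

# Labels of the regression, variance, and covariance paths in the script.
path_labels <- c("AtoB", "BtoA", "AtoA", "BtoB",
                 "residualA1", "residualA2",
                 "residualB1", "residualB2",
                 "residCovAB1", "residCovAB2")

# Intercept labels: one shared label versus one label per variable.
shared_means   <- rep("m", p)
separate_means <- paste0(manifests, "_int")   # hypothetical names

# Identical labels collapse into one free parameter.
n_free_shared   <- length(unique(c(path_labels, shared_means)))
n_free_separate <- length(unique(c(path_labels, separate_means)))

cat("statistics:", n_stats, "\n")
cat("free parameters, shared label:   ", n_free_shared, "\n")
cat("free parameters, separate labels:", n_free_separate, "\n")
```

With the shared label the count is 11 free parameters against 14 statistics, which matches the 11 rows in the posted parameter table. Only with separate intercept labels does the model become just-identified.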

If you're incorrectly estimating the phenotypic means, then it would be no surprise that the variance is being overestimated.
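
The link between a wrong mean and an inflated variance can be shown with one variable. The maximum-likelihood variance about a fixed mean `mu` is the average squared deviation from `mu`, which equals the usual variance plus the squared error in the mean. A minimal sketch in base R with simulated data (the distribution and the value of `wrong_mu` are assumptions for illustration; in the real model the shared mean is estimated jointly across all four variables):

```r
set.seed(1)
x <- rnorm(1000, mean = 16, sd = 2)   # a BMI-like variable (made up)

# ML variance estimate when the mean is held at mu.
ml_var <- function(x, mu) mean((x - mu)^2)

var_free  <- ml_var(x, mean(x))   # mean estimated freely
wrong_mu  <- 1                    # mean forced far from the data
var_wrong <- ml_var(x, wrong_mu)

# The inflation is exactly the squared error in the mean.
inflation <- (mean(x) - wrong_mu)^2

cat("variance, free mean: ", round(var_free, 2), "\n")
cat("variance, wrong mean:", round(var_wrong, 2), "\n")
cat("free + inflation:    ", round(var_free + inflation, 2), "\n")
```

A mean that is off by 15 units therefore adds about 225 to the variance estimate, which is the order of magnitude of the suspicious residual in the posted output.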

AdminNeale
Joined: 03/01/2013 - 14:09
What Rob said

The same expected mean for every variable is likely the culprit, IMO. Substituting manifestVars for m in this line

mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="m")

to get

mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="manifestVars")

should help a lot.

ZR

Thank you both,

So am I right in thinking that this will just use what I have already specified as manifestVars to estimate the expected means? If I make the change above but keep all four paths within the same mxPath() call, should this work?

ZR

Actually I've just tried making that change and I still get large residuals, even with different starting values. I've tried specifying different starting values for each but this still makes no difference.

AdminRobK
data?

Odd. Do you think you could share simulated data that's similar to your own, with mxGenerateData()?

ZR
Data

I've attached data generated as suggested, as a .csv file. I've not used that function before, and the generated data contains negative values for BMI, but I'm guessing that might not matter? If it does, let me know and I'll try to prevent it.

Thanks!

AdminRobK
data

What did you pass as value for argument model to mxGenerateData(): your MxModel object, or a dataframe containing your actual data? I should have specified that you ought to pass it your actual data.

If you did give it a dataframe of your data, then I guess negative BMI scores are OK, because the function generates new data from a multivariate-normal distribution.
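
A minimal sketch of passing a data frame rather than a model, assuming a reasonably recent OpenMx release is installed. The data frame `real_data` and its values are invented stand-ins for the actual data:

```r
library(OpenMx)

# Stand-in for the real data frame (hypothetical values).
set.seed(123)
real_data <- data.frame(
  f7_bmi  = rnorm(200, mean = 16, sd = 2),
  tf4_bmi = rnorm(200, mean = 22, sd = 3),
  f7_cpg  = rnorm(200),
  tf4_cpg = rnorm(200)
)

# Given a data frame, mxGenerateData() simulates new rows that resemble it,
# so the simulated values need not respect natural bounds such as BMI > 0.
fake_data <- mxGenerateData(real_data, nrows = nrow(real_data))

str(fake_data)
```

The simulated data frame can then be saved with `write.csv()` and shared in place of the real data.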

ZR
mxGenerateData

I just passed it the model, so that explains why. I'm probably missing something obvious, but when I try to pass a normal R data frame to the function it doesn't work, and I get the following error. Do you know why this is?

Error: you must specify 'nrow' and 'ncol' arguments in mxMatrix(values = wlsData$thresholds, name = "thresh")

Thanks

AdminRobK

That looks like a bug. What's your mxVersion() output?

ZR

Thanks, it was an older version. I tried with a newer one and it worked. I've attached the correct data now, so hopefully that is useful.

AdminRobK

Wait, hold it! Professor Neale's suggested change here,

The same expected mean for every variable is likely the culprit, IMO. Substituting manifestVars for m in this line

mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="m")

to get

mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels="manifestVars")

should help a lot.

won't resolve the issue. It just changes the label of the single intercept parameter: all four paths still share one label, so the four intercepts are still constrained to be equal. I can't believe I overlooked that! Each of the 4 variables needs its own intercept parameter, like so:

mxPath(from="one", to=c("f7_bmi","tf4_bmi","f7_cpg","tf4_cpg"),
            free=TRUE, values=0, labels=c("f7_bmi_int","tf4_bmi_int","f7_cpg_int","tf4_cpg_int"))
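
Whether a given `labels` argument yields one intercept or four can be checked before fitting by listing the free parameters of the model. A minimal sketch, assuming OpenMx is installed; the model names `shared` and `separate` are made up, and the models contain only the intercept paths:

```r
library(OpenMx)

vars <- c("f7_bmi", "tf4_bmi", "f7_cpg", "tf4_cpg")

# One label recycled across all four paths: a single shared intercept.
shared <- mxModel("shared", type = "RAM", manifestVars = vars,
  mxPath(from = "one", to = vars, free = TRUE, values = 0,
         labels = "manifestVars"))

# One distinct label per path: four separately estimated intercepts.
separate <- mxModel("separate", type = "RAM", manifestVars = vars,
  mxPath(from = "one", to = vars, free = TRUE, values = 0,
         labels = paste0(vars, "_int")))

# omxGetParameters() returns one entry per distinct free parameter.
n_shared   <- length(omxGetParameters(shared))
n_separate <- length(omxGetParameters(separate))

cat("free intercepts, single label:   ", n_shared, "\n")
cat("free intercepts, distinct labels:", n_separate, "\n")
```

Running the same check on the full model before calling `mxTryHard()` would show any unintended equality constraint as a shortfall in the parameter count.
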
ZR
That works

Thanks!

That solves the problem and I get much more realistic residuals, as shown here:

Summary of cross_lagged_model

free parameters:
          name matrix     row     col     Estimate   Std.Error A
1         AtoA      A tf4_bmi  f7_bmi  1.300011908  0.04774913
2         AtoB      A tf4_cpg  f7_bmi  0.053621172  0.01425319
3         BtoA      A tf4_bmi  f7_cpg  0.094762258  0.10041324
4         BtoB      A tf4_cpg  f7_cpg  0.452180754  0.02951133
5   residualA1      S  f7_bmi  f7_bmi  4.311406435  0.20415313
6   residualA2      S tf4_bmi tf4_bmi  7.475740688  0.38337381
7  residCovAB1      S  f7_bmi  f7_cpg  0.224644372  0.07007160
8   residualB1      S  f7_cpg  f7_cpg  1.004418842  0.04750762
9  residCovAB2      S tf4_bmi tf4_cpg  0.285473804  0.08713938
10  residualB2      S tf4_cpg tf4_cpg  0.772890949  0.03655699
11  f7_bmi_int      M       1  f7_bmi 16.219345821  0.06950395
12 tf4_bmi_int      M       1 tf4_bmi  1.566125549  0.77808383
13  f7_cpg_int      M       1  f7_cpg  0.007653321  0.03351886
14 tf4_cpg_int      M       1 tf4_cpg -0.873267552  0.23301683

Thank you for noticing that!

AdminRobK

Good to hear!