# More questions about the magic of LGM with use of age definition variables

Joined: 05/16/2018 - 17:05

Either SEMNET is not getting my emails or they don't care about my questions (probably the latter), so I'll ask them here specific to OpenMx.

I'm also not sure if these are stupid questions, so bear with me. For some reason the magic sorcery of definition variables, with their use in latent growth curve modeling, continues to intrigue me. I have some questions related to their use. For all of my questions, let's assume I'm working with a 3-wave study. Each person can be measured up to three times, no more, but there is of course missing data due to attrition. A cross-sequential design (I think I got that right, Todd Little) is used such that at wave 1 a cross section of ages ranging from 10y to 30y is measured on a single continuous measure. Between each wave is approx. five years. At wave 3, the age range is therefore 20y-40y, and across the entire study the age range is 10y to 40y.

1. Ryne if you're listening I'm going to plug your growth modeling book. I honestly open that book every day. Katherine Masyn, who was a thesis adviser of mine, said it's like one of her favorite books of all time...and she and I both like Stephen King a lot. Anyways, you note in your section on continuous time modeling that when using age definition variables you cannot obtain overall model fit statistics such as the chi-square test of overall fit. Again, as you note, this is because the factor loadings matrix can/will vary from person to person. I've read over and over again in other publications that with the use of age definition variables in LGM, to model individually-varying times of observation, there is no longer a single model-implied mean vector and variance-covariance matrix. There is a unique one for each person. It is for this reason that an overall chi-square test of model fit cannot be computed. The usual 'saturated' model would be a wave-based means, variances, and covariances model. In this case, though, the time metric is age, not wave, and it's individually varying. This is also what makes this approach essentially identical to MLM/mixed-effects longitudinal models. Mplus won't even give you a chi-square or any other absolute fit indices. You can obtain these from OpenMx, however. Are these to be trusted? Is there a single saturated model that can be used, legitimately, to obtain overall fit to the data? I'm a little confused because in the Mehta & West (2000) article there is some mention about obtaining overall model fit from these models, but I have read the opposite in more recent articles about unbalanced time in LGM.

2. Related to question #1. Since each person has their own model-implied mean vector and covariance-variance matrix, does this mean a separate model is estimated for each person and sort of averaged together? Is it right to say a 'different model' is estimated for each person because people have potentially unique time scores? And if so, is this why overall fit is not simple to obtain?

3. If it's the case that there is no single model-implied mean vector and variance-covariance structure, what does that mean for estimated residual variances? If I use my 3-wave panel study, but I use age as my time metric and not wave, through the use of age definition variables (age at each wave)...how do I interpret the residual variances being estimated? I obtain three residual variances, one for each variable at each wave, but shouldn't the number of residual variances equal the number of time points (i.e., number of unique ages), and not the number of waves? In MLM the residual variances are held constant across time so just one is estimated. If using age definition variables, should I also constrain the residual variances to equality across time? If not, how do I interpret those 3 residual variances, given that people of the same age (i.e., same time score) could very well be in different waves?

4. Related to question 3, if I do biometric growth modeling with age definition variables am I supposed to constrain those to equality across time and then decompose the one residual variance into ACE components? I have the same confusion about how to interpret those if not constrained.

5. I was not under the impression that decomposing the residual variances in a LGM would change the values estimated in the ACE decomposition of my latent slope and intercept. Might improve fit to estimate them, but should not impact the decomposition of the slope and intercept. Though, I've found that decomposing them (with age definition variables) does change the ACE decomps of the slope and intercept. Am I modeling incorrectly or confused or both?

6. This is potentially the most stupid question and I know I've asked similar questions before. But my 3-wave study has, of course, only a maximum of three observations a person, though with age as my time metric I have 40y-10y = 30 time points or more because the ages are individually varying. Is there any possible way of actually estimating the random effects in models that would otherwise require more time points? Quadratic and piecewise come to mind for me. Ryne, in your book you mention the use of time-windows as an alternative to the use of age definition variables if one wants to model continuous time (in this case age). What if I restructured my 3-wave data so that I had four time-window variables (say they're age-based windows). There would be four variables, but no one would have complete data (only up to three non missing observations). Instead of just modeling that data using fixed time scores, I use age definition variables. Merging the time-window and age definition variable approaches. I now have four variables, FIML handles the missing data, and with age as my time metric I have 30+ actual time points. Can I use that method to magically estimate a quadratic model (saturated with all random effects estimated) using a 3-wave study?

7. I was bugging Bengt Muthen with similar questions but related to a piecewise (bilinear spline) model. When I first learned about age definition variables, or well TSCORES in Mplus, I was like why can't I run a piecewise growth model with my 3-wave study when age is my time metric and is truly individually varying and I have lots of ages?? It should be theoretically possible, though maybe not to estimate the variance in each slope. Mplus won't easily let me, though. It limits the complexity of the growth modeling capability as if wave is used as the time metric when in fact I'm using age. One idea we had was to run a bivariate growth model. Each 'model' would actually have the same three variables. But in one model (representing the first piece of the piecewise spline), older aged folks would have their data set to missing with pseudo-missing time scores for their definition variables. So with the first model I estimate data on those maybe aged 10y to 20y. In the second model (second piece of piecewise), I do the opposite...set younger folks' data to missing with pseudo-missing def. vars. I can include overlapping non-missing data in both models, say the oldest people at wave 1 and the youngest people from wave 3. Then I run that bivariate growth model but center the intercepts at the same age, fix the intercept means, variances, and covariances with the slopes to equality and so forth....so it's like I run a model with only one intercept and two slopes on 6 variables. In reality I only have a 3-wave study, though :). In both 'models' the same knot point would be used. Perhaps I did this wrong, but I was able to get this to run in Mplus with the three random effects estimated (intercept and two slope variances). Does that sound legitimate to you, and is there a simpler way to do this in OpenMx?
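
For anyone trying to picture the "one intercept, two slopes, same knot" setup in question 7, here is a minimal sketch of one common way to code bilinear-spline loadings from an age definition variable. It is plain Python rather than Mplus or OpenMx syntax, the function name `piecewise_loadings` and the knot at age 20 are assumptions made up for illustration, and it is not necessarily the coding used in the Mplus run described above.

```python
def piecewise_loadings(age, knot):
    """Loadings on [intercept, slope before knot, slope after knot] for one occasion.

    The intercept is centered at the knot, so both slope loadings are 0 there.
    """
    return [1.0, min(age - knot, 0.0), max(age - knot, 0.0)]


# Hypothetical person measured at ages 12, 17, and 22, with an assumed knot at age 20.
knot = 20.0
for age in (12.0, 17.0, 22.0):
    print(age, piecewise_loadings(age, knot))
```

With this coding a person whose three ages all fall on one side of the knot contributes nothing directly to the other slope, which is one way to see why the second slope variance is hard to estimate from three occasions.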

Ok I think I've finally got all my definition variable LGM questions listed. Thanks for humoring me!

Eric

Joined: 01/24/2014 - 12:15
re questions
> Ryne if you're listening I'm going to plug your growth modeling book.

Wait, Ryne has a growth-modeling book?! How am I just now learning about that??

> Mplus won't even give you a chi-square or any other absolute fit indices. You can obtain these from OpenMx, however. Are these to be trusted? Is there a single saturated model that can be used, legitimately, to obtain overall fit to the data?

Could you provide some example summary() output that includes those absolute fit indices? I agree with you in that I don't see that the "saturated model" has a sensible general-case definition when the loadings are individually-varying age definition variables. So, I doubt those indices are trustworthy. I suspect that mxRefModels() is behaving as it normally does, which is not appropriate for the kind of growth models in which you're interested. The man page for mxRefModels() already advises that it doesn't do what the user may want in the case of twin models and their ilk.

> Related to question #1. Since each person has their own model-implied mean vector and covariance-variance matrix, does this mean a separate model is estimated for each person and sort of averaged together? Is it right to say a 'different model' is estimated for each person because people have potentially unique time scores? And if so, is this why overall fit is not simple to obtain?

Yes, that's more or less the idea (although isn't the family the independent unit of analysis in your model?). Specifically, the -2logL of potentially every row of the dataset is being evaluated given a different model-expected mean vector and covariance matrix, which themselves are functions of the free parameters and of the definition-variable values for that row.
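
The row-wise evaluation described above can be sketched in a few lines. This is a minimal illustration in plain Python, not OpenMx code: the function names, the parameter values, and the centering age of 10 are all assumptions invented for the example, and it assumes a linear growth model with a single residual variance and one person per row.

```python
import math


def implied_moments(ages, alpha, psi, theta, center=10.0):
    """Model-implied mean vector and covariance matrix for ONE row.

    ages  : this row's ages at its observed occasions (the definition variables)
    alpha : [intercept mean, slope mean]
    psi   : 2x2 covariance matrix of the random intercept and slope
    theta : one residual variance, held equal across occasions (an assumption)
    """
    # One loadings row per occasion: [1, age - center]
    lam = [[1.0, a - center] for a in ages]
    mu = [l[0] * alpha[0] + l[1] * alpha[1] for l in lam]
    k = len(ages)
    sigma = [[sum(lam[r][p] * psi[p][q] * lam[c][q]
                  for p in range(2) for q in range(2))
              + (theta if r == c else 0.0)
              for c in range(k)] for r in range(k)]
    return mu, sigma


def minus2_loglik_row(y, mu, sigma):
    """-2 log-likelihood of one row under a multivariate normal."""
    k = len(y)
    # Cholesky factor L with sigma = L L'
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sigma[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward-solve L z = (y - mu); z'z is then the Mahalanobis distance
    z = []
    for i in range(k):
        z.append((y[i] - mu[i] - sum(L[i][m] * z[m] for m in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(k))
    return k * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z)


# Two made-up rows; the second person dropped out before wave 3,
# so only the observed occasions enter that row's likelihood.
rows = [
    {"ages": [10.0, 15.0, 20.0], "y": [5.1, 6.0, 7.2]},
    {"ages": [12.5, 17.5], "y": [5.5, 6.4]},
]
alpha = [5.0, 0.2]
psi = [[1.0, 0.05], [0.05, 0.01]]
theta = 0.5

# Same free parameters for everyone, but different implied moments per row.
total = sum(
    minus2_loglik_row(r["y"], *implied_moments(r["ages"], alpha, psi, theta))
    for r in rows
)
print(total)
```

The point of the sketch is that `alpha`, `psi`, and `theta` are shared by all rows, while `mu` and `sigma` are rebuilt for each row from that row's ages.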

Concerning your questions #3 and #4, I agree that constraining the residual variance equal across the 3 waves makes for easier interpretation.

Concerning your question #5, I would not expect the biometrical decomposition of the intercept and slope variances to change much, but I don't see why it wouldn't change at all. When you decompose the residual variances as well, you are imposing a different covariance structure on the residuals. In particular, the cross-relative residual covariances will depend on those relatives' degree of biological relatedness. Thus, what ends up as residual variance vs. random-intercept or random-slope variance could change.

Concerning your question #6: it's possible I'm missing something, but I very much doubt that trick will work. It doesn't change the fact that no one in your sample has more than 3 timepoints. No statistical legerdemain can validly create information out of thin air.

As for the rest of your post, you pose some very deep questions that I cannot answer without at least giving them a great deal of thought, and indeed may be beyond my expertise! Hopefully some others will chime in on this thread, too.

Joined: 07/31/2009 - 15:12

Thanks for the kind words. It's really Kevin and Nilam's book, born out of their years of hard work on their growth curve modeling workshop. I'm just happy to have lent a hand and included some OpenMx. Onto your questions:

1.) You can't get a chi-square because that requires comparing to a saturated model that doesn't really exist. In your example, you could structure your data pretending there are three time points (t1, t2, t3), or five (age10 through age50) with missing data. OpenMx doesn't care how you structure it, and will return the same -2LL value. But you'll get different saturated models, and thus different chi-square values, depending on your structure.

2.) Not really. There is no averaging. The saturated model will ignore your definition variables. The LGC of interest will estimate the intercept and slope that take into account your definition variables.

3.) and 4.) As I think Rob said, I'd constrain to a single residual variance term. You could get fancy and put an algebra on it if you want to test for variance growth or decay over time.

5.) Modeling the residual ACE structures will give better/different estimates of associations across twins, which in turn affects estimates of intercepts/slopes, and thus their decompositions. Fundamentally, you're just decomposing the correlations between twins: if you ignore residuals, the intercept/slope have to do that job all by themselves. If you let the residuals help, they'll do so and change intercept/slope estimates.

6.) I'm not 100% sure. You can't get out more random effects than you have observations per person, so I don't think you could get out a quadratic trend while also estimating residual variances (though the piecewise trend may break this). You could probably get out non-linear mean structures, though, and probably as many mean-structure parameters as you have unique time points.

7.) This is a really cool idea. You'd have to do it with a fixed change point or transition point, but this may be possible. I've got an old paper on random change points that I've never finished that I may need to pick up again. The middle-aged people in your sample could be interesting. From there, you'd have a method to do essentially piecewise linear splines, and make whatever constraints you needed to make things estimable.

Joined: 01/24/2014 - 12:15
saturated model
> 2.) Not really. There is no averaging. The saturated model will ignore your definition variables.

I don't think Eric meant "averaging" to be taken literally, and he seemed to have approximately the right idea. At any rate, the saturated model to which you refer here is what would be created by mxRefModels(), correct?

Joined: 07/31/2009 - 15:12

Yeah, the concept is right, but it's good to be clear. mxRefModels just builds the saturated model based on the dimensions of the data, ignoring the model, correct?

Joined: 05/16/2018 - 17:05
these are less sorcerous now, but more questions

> I suspect that mxRefModels() is behaving as it normally does, which is not appropriate for the kind of growth models in which you're interested.


Rob- yes I was using mxRefModels(), so I assume the 'saturated' model being used by mxRefModels() is a wave-based means, variances, and covariances model?

> Yes, that's more or less the idea (although isn't the family the independent unit of analysis in your model?).


Rob- yes, thanks for clarifying. If not doing biometric modeling, then rows would be people. With biometric, yes rows would be families.

On to fit..

It's too bad I can't have a metric for absolute fit; that's one of the major advantages of SEM. I can use the individual-based fit metrics that those in the MLM/mixed-effects world use. But it seems like cheating to publish a latent growth curve model and get away without convincing readers that the overall fit to the data is reasonable. Sure, I've got several relative fit metrics I can draw on, but it's not the same. Perhaps the closest thing I could do is fit an age-based 'time-window' model with very small windows of time (several age-binned Y variables), with as many windows as I can run without the matrix getting too sparse. That model will only approximate the age definition variable model, but I can obtain a chi-square from it. So that would be an approximate chi-square.

So if it's the case that as you say Rob..

> the -2logL of potentially every row of the dataset is being evaluated


does that mean that in theory I could model some individuals with non-linear slope loadings matrices and other people as having linear ones? Forget about the 3 time point thing, say I have 5 time points and I use age definition variables. Could I structure some people's individual models to be quadratic and others to be linear (with a lot of programming)? I'm asking more to better understand this process than for some practical reason. It almost sounds like growth mixture modeling (I don't think Katherine is listening) when I describe it. Does allowing for person-centered differences in time scores (i.e., definition variables) imply, in theory, that person-centered differences in the function of the time scores could be allowed too? But then I don't know what the overall model would estimate: a linear model, a quadratic, both? I guess if you ran a GMM with known classes, assigning some people to a quadratic class and others to a linear class, and then used age definition variables, that would be the same thing. Not sure why you would do that though, just thinking out loud I suppose.

On to residual variances...

So I understand that, certainly, by estimating three separate residual variances the intercept and slope have less work to do and that could change things. And based on Rob's comment, I kind of understand why allowing those to be free vs. allowing those to be free + breaking them down into variance components would affect the ACE estimates from the intercept and slope. However, Ryne's comment about model fit actually spooked me a little when I thought about what that could mean for ACE results.

> ...you could structure your data pretending there are three time points (t1, t2, t3), or five (age10 through age50) with missing data. OpenMx doesn't care how you structure it, and will return the same -2LL value.


So let's say I'm still working with this 3-wave data. I understand that if I run a biometric LGM with age definition variables and structure my data based on wave (t1, t2, t3), the -2LL value will be the same as if I restructure the same data into 5 age-window variables and run the model. That makes brilliant sense to me re: why there is no sensible saturated model. The structure we set up becomes arbitrary when we use definition variables. The path diagrams of LGM with def. vars., even with those little triangles, are still misleading in my opinion for this very reason. Ok, so that all makes sense. But based on what you both are telling me, my ACE estimates from decomposing the slope and intercept will likely be affected by decomposing the residual variances. So in this example, I run two models with the same -2LL value, but because the implied error structures are different those models could produce different ACE results for the intercept and slope?? Assuming I decompose the residual variances freely, that is. There are 3 residual variances in one model and 5 in the other. It just seems spooky that depending on how I structure my data (even though the time metric stays the same and the fit to the data is the same) those ACE estimates might look different. Makes me really think that we should constrain those to equality across time or use some other sort of method.

On to the nonlinear models with 3 time points...

> Concerning your question #6: it's possible I'm missing something, but I very much doubt that trick will work. It doesn't change the fact that no one in your sample has more than 3 timepoints. No statistical legerdemain can validly create information out of thin air.


No, I don't think you're missing anything, I just think it was not a good idea :) Like my grandfather said, out of many stupid ideas is at least one good one. It seems that it's possible for me to fit many types of nonlinear models with 3 time points (using age def. vars.) but that I'll need to constrain some random effects (such as the variance of the quadratic curvature or the residual variances). While this is not helpful in trying to ACE decompose (I have no variance to decompose), it will contribute to better fit to the data (assuming there is a quadratic-like nature to the data) and I could decompose some other random effects that I can freely estimate (e.g., intercept) with 3 time points. Am I thinking about that correctly?

It seems to me that using definition variables gives SEM a huge advantage in this type of case. With limited time points, using LGM with definition variables allows me to run nonlinear models (even though I have to fix a few random effects). I can do that in MLM/mixed-effects too. So why use SEM still? In SEM I can easily add predictors and outcomes and build complicated models. For example, I can regress some outcome onto the intercept of a quadratic model (can't regress onto the random effects/variances that I've fixed, of course). Can't do that so easily in MLM.

> This is a really cool idea.


Thanks Ryne. Bengt had the idea to do it as a multiple-group, multiple-cohort type analysis, though that would not allow you to correlate the two slopes. This bivariate trick might allow for the correlation to be estimated. When I think more about it, perhaps I don't need to do it as a bivariate growth model. I simply run the 6 variables together in one growth model. Though in that situation no one would have complete data all the way across. But in the bivariate case, if I overlapped the data (old people at wave 1 and young people at wave 3 being used in both pieces), then potentially a fair number of people will have complete data (within each growth model, that is). I'll give it a try.

Thanks!

Joined: 01/24/2014 - 12:15
saturated model
> Rob- yes I was using mxRefModels(), so I assume the 'saturated' model being used by mxRefModels() is a wave-based means, variances, and covariances model?

I'm not sure exactly what you mean here. But if you have twin pairs and one phenotype measured at 3 timepoints, the saturated model would have an unstructured 6x6 covariance matrix and an unstructured vector of 6 means. It would ignore zygosity and definition variables.
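
As a quick arithmetic check on why the "saturated" model depends on how the data are laid out, the number of free parameters in an unstructured means-and-covariances model follows directly from the number of columns. The sketch below is plain Python, and the helper name `saturated_param_count` is made up for illustration; it is not an OpenMx function.

```python
def saturated_param_count(n_columns):
    """Free parameters in an unstructured mean vector plus covariance matrix."""
    means = n_columns
    covariances = n_columns * (n_columns + 1) // 2  # variances + unique covariances
    return means + covariances


# Layouts mentioned in this thread: 3 wave columns, 5 age columns,
# and a twin pair measured at 3 timepoints (6 columns).
for label, k in [("3 wave columns", 3),
                 ("5 age columns", 5),
                 ("twin pair x 3 timepoints", 6)]:
    print(label, saturated_param_count(k))
# prints 9, 20, and 27 parameters respectively
```

The growth model itself gives the same -2LL under any of these layouts, but the reference model it would be compared against has a different number of parameters each time.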

BTW, to tell the truth I don't actually know what you mean by "time-windows", either.

Joined: 05/16/2018 - 17:05
clarification

> I'm not sure exactly what you mean here. But if you have twin pairs and one phenotype measured at 3 timepoints, the saturated model would have an unstructured 6x6 covariance matrix and an unstructured vector of 6 means. It would ignore zygosity and definition variables.

Yes, sorry, that is what I mean. In the non-biometric example, it's a model with as many means, variances, and covariances as possible. So 3 time points would be a 3x3 variance-covariance matrix and a vector of 3 means.

> BTW, to tell the truth I don't actually know what you mean by "time-windows", either.


The time-window approach, which Ryne can explain better I'm sure, approximates truly individually varying times of observation. It approximates the definition variable approach. Say I form 6 variables and bin people according to age. The first variable would be for those aged 10y-15y, the second would be 16y-20y, the third would be 21y-25y, the fourth would be 26y-30y, the fifth would be 31y-35y, and the sixth would be 36y-40y. Each person has up to 3 non-missing observations across the 6 variables. They have missing data if they do not have data for that age window. Fixed time scores are used. The tighter you make the windows (and thus the more variables you have), the closer the approximation gets to the definition variable approach. But as Ryne points out in his book, the sparseness of the data matrix can lead to convergence issues. So in general the age def. approach is favored, but the window approach gives a chi-square test of overall fit.
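
To make the restructuring concrete, here is a minimal sketch of binning one person's observations into six age-window columns. It is plain Python with made-up data; the name `to_window_row` is hypothetical, ages are assumed to be recorded in whole years, and the fourth window is assumed to be 26y-30y so that the windows do not overlap.

```python
# Age windows in whole years, inclusive on both ends (an assumed layout).
WINDOWS = [(10, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]


def to_window_row(observations):
    """Place one person's (age, score) pairs into the window columns.

    Columns with no observation stay None (missing), to be handled by FIML.
    """
    row = [None] * len(WINDOWS)
    for age, score in observations:
        for idx, (lo, hi) in enumerate(WINDOWS):
            if lo <= age <= hi:
                if row[idx] is not None:
                    raise ValueError("two observations fall in the same window")
                row[idx] = score
                break
        else:
            raise ValueError("age outside all windows")
    return row


# Hypothetical person measured at ages 12, 17, and 22.
print(to_window_row([(12, 5.1), (17, 6.0), (22, 7.2)]))
# prints [5.1, 6.0, 7.2, None, None, None]
```

One thing the sketch makes visible: with waves about five years apart and a first window running from 10y to 15y, someone measured at exactly 10 and 15 would land in the same column twice, so window boundaries may need some care.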

Joined: 01/24/2014 - 12:15

BTW, when you posted this thread, I remembered that there had previously been a thread about the problem of the "saturated model" when definition variables are involved. I just found it: https://openmx.ssri.psu.edu/thread/3966.

Joined: 05/16/2018 - 17:05
number of time points = # of unique ages?

Another point of confusion. Ok, so I have a 3-wave study, but as Ryne pointed out I could restructure that differently so there would be more than 3 Y variables. It's my understanding that if I stretched the data out across enough time windows, as in I had one variable for each unique age (and I could convince OpenMx to run such a sparse model)...I would obtain the same results, or very close to the same results, as a model with age definition variables.

In an age definition model I model time as age. Time = age.

So does this imply that the number of time points, as far as my model is concerned, is the number of unique ages? Which of course is much more than 3. It just so happens that no one person was measured at all of those time points, in fact, only a maximum of 3. The within person # of observations will never be more than 3, but I have many more than 3 between-person time points. Am I thinking about that correctly?

Joined: 01/24/2014 - 12:15

Yes, correct.

Joined: 05/16/2018 - 17:05
and another question on likelihood estimation

Is there a difference between the likelihood estimation of a traditional, fixed time score LGM and one with definition variables? Do both approaches use FIML but in different ways? Does the traditional approach not use individual-based likelihood estimation? Thanks

Joined: 01/24/2014 - 12:15

Assuming you're analyzing raw data in both cases, then yes, both would use FIML, and the loglikelihood of the whole sample would be the sum of the row-wise loglikelihoods. The difference is that with definition variables, OpenMx has to calculate a different model-expected mean vector and covariance matrix for possibly every row of the dataset.
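
Written out, the sum of row-wise terms looks roughly like the following. This is a sketch under assumed notation that does not appear in the thread: $k_i$ is the number of non-missing observations in row $i$, $\Lambda_i$ is that row's loadings matrix, $\alpha$ and $\Psi$ are the latent means and covariance matrix, and $\Theta_i$ is the residual covariance matrix for the observed occasions.

```latex
-2\log L = \sum_{i=1}^{N} \left[ k_i \log(2\pi) + \log\lvert\Sigma_i\rvert
           + (y_i - \mu_i)^{\top} \Sigma_i^{-1} (y_i - \mu_i) \right],
\qquad
\mu_i = \Lambda_i \alpha, \quad
\Sigma_i = \Lambda_i \Psi \Lambda_i^{\top} + \Theta_i
```

With fixed time scores, $\Lambda_i$ is the same matrix for every row apart from dropping the rows for missing occasions. With definition variables, the slope column of $\Lambda_i$ holds that row's own ages, so $\mu_i$ and $\Sigma_i$ can differ for every row even though the free parameters are shared.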

Joined: 05/16/2018 - 17:05
right, and this relates to the saturated model issue?...

So there is no single model-expected mean vector and covariance matrix, there are N of them. Without a single, overall model-expected mean vector and covariance matrix, there is no single sensible saturated model. One person's expected mean vector and covariance matrix may be specific to Y measured at ages 14, 15, and 16... and another person's expected mean vector and covariance matrix may be specific to Y measured at ages 16, 17, and 18. Two different people have Y measured at the same time (time as we define it, based on age), that is 16y, yet these observations can come from two different waves. So a traditional saturated model with a #wave mean vector and a #wave-by-#wave variance-covariance matrix makes little sense to compare to. Am I thinking about this correctly?

Joined: 01/24/2014 - 12:15

Yes, you have the right idea.

Joined: 05/16/2018 - 17:05

thanks!