Either SEMNET is not getting my emails or they don't care about my questions (probably the latter), so I'll ask them here, specific to OpenMx.
I'm also not sure if these are stupid questions, so bear with me. For some reason the magic sorcery of definition variables, with their use in latent growth curve modeling, continues to intrigue me. I have some questions related to their use. For all of my questions, let's assume I'm working with a 3-wave study. Each person can be measured up to three times, no more, but there is of course missing data due to attrition. A cross-sequential design (I think I got that right, Todd Little) is used such that at wave 1 a cross section of ages ranging from 10y to 30y is measured on a single continuous measure. The waves are approx. five years apart. At wave 3 the age range is therefore 20y-40y, and across the entire study the age range is 10y to 40y.
-
Ryne, if you're listening, I'm going to plug your growth modeling book. I honestly open that book every day. Katherine Masyn, who was a thesis adviser of mine, said it's like one of her favorite books of all time...and she and I both like Stephen King a lot. Anyway, you note in your section on continuous time modeling that when using age definition variables you cannot obtain overall model fit statistics such as the chi-square test of overall fit. As you note, this is because the factor loadings matrix can/will vary from person to person. I've read over and over again in other publications that when age definition variables are used in LGM to model individually-varying times of observation, there is no longer a single model-implied mean vector and variance-covariance matrix. There is a unique one for each person. It is for this reason that an overall chi-square test of model fit cannot be computed. The usual 'saturated' model would be a wave-based means, variances, and covariances model. In this case, though, the time metric is age, not wave, and it's individually varying. This is also what makes this approach essentially identical to MLM/mixed-effects longitudinal models. Mplus won't even give you a chi-square or any other absolute fit indices. You can obtain these from OpenMx, however. Are these to be trusted? Is there a single saturated model that can be used, legitimately, to obtain overall fit to the data? I'm a little confused because the Mehta & West (2000) article makes some mention of obtaining overall model fit from these models, but I have read the opposite in more recent articles about unbalanced time in LGM.
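To make the "unique mean vector and covariance matrix for each person" idea concrete, here is a minimal sketch in plain Python (not OpenMx code). It assumes a linear growth model with age centered at 10y; the function name `implied_moments` and every parameter value are made up for illustration.

```python
# Sketch only: linear growth with age as the time metric.
# All names and numeric values here are hypothetical.

def implied_moments(ages, alpha, psi, theta, center=10.0):
    """Return (mu_i, Sigma_i) for one person, given that person's ages.

    ages  : ages at each wave (the definition variables)
    alpha : [intercept mean, slope mean]
    psi   : 2x2 covariance matrix of intercept and slope
    theta : residual variances, one per wave
    """
    # Person-specific loading matrix: a column of 1s and a column of centered ages.
    lam = [[1.0, a - center] for a in ages]
    mu = [row[0] * alpha[0] + row[1] * alpha[1] for row in lam]
    n = len(ages)
    sigma = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            # Element (r, c) of Lambda_i * Psi * Lambda_i'
            sigma[r][c] = sum(
                lam[r][j] * psi[j][k] * lam[c][k]
                for j in range(2) for k in range(2)
            )
        sigma[r][r] += theta[r]  # add the residual variance on the diagonal
    return mu, sigma

alpha = [50.0, 0.5]
psi = [[4.0, 0.1], [0.1, 0.04]]
theta = [2.0, 2.0, 2.0]

# Same parameters, two people measured at different ages.
mu_a, sigma_a = implied_moments([10.0, 15.0, 20.0], alpha, psi, theta)
mu_b, sigma_b = implied_moments([22.0, 27.0, 32.0], alpha, psi, theta)
```

The parameters (`alpha`, `psi`, `theta`) are shared; only the loading matrix changes with each person's ages, so `mu_a` and `sigma_a` differ from `mu_b` and `sigma_b`.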
-
Related to question #1. Since each person has their own model-implied mean vector and variance-covariance matrix, does this mean a separate model is estimated for each person and the results are sort of averaged together? Is it right to say a 'different model' is estimated for each person because people have potentially unique time scores? And if so, is this why overall fit is not simple to obtain?
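One way to picture the row-wise (FIML-style) computation is sketched below in plain Python. It is an illustration under stated assumptions, not OpenMx's actual implementation: linear growth, age centered at 10y, a single residual variance, and hypothetical names and values throughout. Each person contributes a -2 log-likelihood term based on their own ages and observed waves, and the terms are summed under one shared set of parameters.

```python
import math

def implied_moments(ages, alpha, psi, theta_var, center=10.0):
    # Person-specific loadings: 1 for the intercept, centered age for the slope.
    lam = [[1.0, a - center] for a in ages]
    mu = [l[0] * alpha[0] + l[1] * alpha[1] for l in lam]
    n = len(ages)
    sigma = [[sum(lam[r][j] * psi[j][k] * lam[c][k]
                  for j in range(2) for k in range(2))
              + (theta_var if r == c else 0.0)
              for c in range(n)] for r in range(n)]
    return mu, sigma

def cholesky(a):
    # Lower-triangular factor L with L * L' = a (a must be positive definite).
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def minus2_loglik_person(y, ages, alpha, psi, theta_var):
    # Keep only the waves at which this person was actually observed.
    obs = [(yi, ai) for yi, ai in zip(y, ages) if yi is not None]
    if not obs:
        return 0.0
    mu, sigma = implied_moments([a for _, a in obs], alpha, psi, theta_var)
    L = cholesky(sigma)
    resid = [yi - mi for (yi, _), mi in zip(obs, mu)]
    z = []  # forward substitution: solve L z = resid
    for i in range(len(resid)):
        z.append((resid[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(len(z)))
    return len(z) * math.log(2.0 * math.pi) + logdet + sum(v * v for v in z)

alpha = [50.0, 0.5]
psi = [[4.0, 0.1], [0.1, 0.04]]
theta_var = 2.0
people = [  # made-up data; None marks a missed wave
    {"y": [50.0, None, None], "age": [10.0, None, None]},
    {"y": [55.0, 58.0, 61.5], "age": [22.0, 27.0, 32.0]},
]
total = sum(minus2_loglik_person(p["y"], p["age"], alpha, psi, theta_var)
            for p in people)
```

In this picture there is one model and one parameter vector; what differs by person is only the mean vector and covariance matrix that the shared parameters imply at that person's ages.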
-
If it's the case that there is no single model-implied mean vector and variance-covariance structure, what does that mean for the estimated residual variances? Say I use my 3-wave panel study, but with age as my time metric rather than wave, through the use of age definition variables (age at each wave). How do I interpret the residual variances being estimated? I obtain three residual variances, one for the variable at each wave, but shouldn't the number of residual variances equal the number of time points (i.e., number of unique ages), not the number of waves? In MLM the residual variances are held constant across time, so just one is estimated. If using age definition variables, should I also constrain the residual variances to equality across time? If not, how do I interpret those 3 residual variances? After all, people of the same age (i.e., same time score) could very well be in different waves.
-
Related to question 3, if I do biometric growth modeling with age definition variables, am I supposed to constrain the residual variances to equality across time and then decompose the one residual variance into ACE components? I have the same confusion about how to interpret them if they are not constrained.
-
I was not under the impression that decomposing the residual variances in an LGM would change the values estimated in the ACE decomposition of my latent slope and intercept. Estimating them might improve fit, but it should not impact the decomposition of the slope and intercept. Yet I've found that decomposing them (with age definition variables) does change the ACE decomps of the slope and intercept. Am I modeling incorrectly or confused or both?
-
This is potentially the most stupid question, and I know I've asked similar questions before. My 3-wave study has, of course, a maximum of three observations per person, though with age as my time metric I have 40y-10y = 30 time points or more because the ages are individually varying. Is there any possible way of actually estimating the random effects in models that would otherwise require more time points? Quadratic and piecewise come to mind for me. Ryne, in your book you mention the use of time-windows as an alternative to age definition variables if one wants to model continuous time (in this case age). What if I restructured my 3-wave data so that I had four time-window variables (say they're age-based windows)? There would be four variables, but no one would have complete data (only up to three non-missing observations). Then, instead of modeling that data with fixed time scores, I would use age definition variables, merging the time-window and age definition variable approaches. I now have four variables, FIML handles the missing data, and with age as my time metric I have 30+ actual time points. Can I use that method to magically estimate a quadratic model (saturated, with all random effects estimated) using a 3-wave study?
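The restructuring step described above could be sketched as follows in plain Python. The function `to_age_windows`, the window edges, and the data are all hypothetical. One assumption worth flagging: with four equal windows across 10y-40y (7.5y wide) and waves about five years apart, two of one person's observations can fall in the same window, so this sketch raises an error in that case rather than silently dropping a score.

```python
def to_age_windows(records, edges):
    """Spread long-format records into age-window variables.

    records : list of (person_id, age, y) tuples, one per observation
    edges   : ascending window boundaries; len(edges) - 1 windows
    Returns {person_id: {"y": [...], "age": [...]}}, with None for empty windows.
    The kept ages are the definition variables for the non-missing windows.
    """
    n_win = len(edges) - 1
    wide = {}
    for pid, age, y in records:
        if not edges[0] <= age <= edges[-1]:
            raise ValueError(f"age {age} is outside the window range")
        # Find the window containing this age; the top edge belongs to the last window.
        w = n_win - 1
        for k in range(n_win):
            if age < edges[k + 1]:
                w = k
                break
        row = wide.setdefault(pid, {"y": [None] * n_win, "age": [None] * n_win})
        if row["y"][w] is not None:
            raise ValueError(f"person {pid} has two observations in window {w}")
        row["y"][w] = y
        row["age"][w] = age
    return wide

# Hypothetical equal-width windows over 10y-40y and made-up scores.
edges = [10.0, 17.5, 25.0, 32.5, 40.0]
records = [
    (1, 16.0, 48.2), (1, 21.0, 51.0), (1, 26.0, 53.9),
    (2, 24.0, 55.1), (2, 29.0, 57.4), (2, 34.0, 58.0),
]
wide = to_age_windows(records, edges)
```

Here person 1 fills windows 1-3 and person 2 fills windows 2-4, so each has exactly one missing window. The sketch only shows the data layout; whether the layout identifies the extra random effects is the open question.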
-
I was bugging Bengt Muthen with similar questions, but related to a piecewise (bilinear spline) model. When I first learned about age definition variables, or well, TSCORES in Mplus, I was like, why can't I run a piecewise growth model with my 3-wave study when age is my time metric, is truly individually varying, and I have lots of ages?? It should be theoretically possible, though maybe not to estimate the variance in each slope. Mplus won't easily let me, though. It limits the complexity of the growth model as if wave were the time metric when in fact I'm using age. One idea we had was to run a bivariate growth model. Each 'model' would actually have the same three variables. But in one model (representing the first piece of the piecewise spline), older folks would have their data set to missing, with pseudo-missing time scores for their definition variables. So with the first model I model data on those aged maybe 10y to 20y. In the second model (second piece of the piecewise), I do the opposite...set younger folks' data to missing, with pseudo-missing def. vars. I can include overlapping non-missing data in both models, say the oldest people at wave 1 and the youngest people from wave 3. Then I run that bivariate growth model but center the intercepts at the same age, fix the intercept means, variances, and covariances with the slopes to equality, and so forth....so it's like I run a model with only one intercept and two slopes on 6 variables. In reality I only have a 3-wave study, though :). In both 'models' the same knot point would be used. Perhaps I did this wrong, but I was able to get this to run in Mplus with the three random effects estimated (intercept and two slope variances). Does that sound legitimate to you, and is there a simpler way to do this in OpenMx?
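For comparison with the bivariate workaround, the usual direct coding of a bilinear spline with individually-varying ages can be sketched like this in plain Python. The function name `bilinear_basis` and the knot at 20y are assumptions for illustration, and this says nothing about whether three waves are enough to identify all three random effects.

```python
def bilinear_basis(age, knot):
    """Loadings (intercept, slope 1, slope 2) for one observation.

    The intercept is centered at the knot. Slope 1 carries change before
    the knot and slope 2 carries change after it.
    """
    return (1.0, min(age - knot, 0.0), max(age - knot, 0.0))

knot = 20.0  # hypothetical knot point

# One made-up person observed at 14y, 19y, and 24y: their three rows of loadings.
person_ages = [14.0, 19.0, 24.0]
loadings = [bilinear_basis(a, knot) for a in person_ages]
# loadings == [(1.0, -6.0, 0.0), (1.0, -1.0, 0.0), (1.0, 0.0, 4.0)]
```

With this coding each person has a single set of three variables, and the two slope columns would be built from the age definition variables row by row instead of splitting the data into two 'models'.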
Ok I think I've finally got all my definition variable LGM questions listed. Thanks for humoring me!
Eric