Attachment | Size |
---|---|
data_wide.csv | 9.36 KB |
I am trying to fit a nonlinear latent growth curve model (as part of simulations I am conducting) using the three-parameter logistic (s-shaped change) function below
$$y_{obs} = \frac{diff}{1+e^{\frac{\beta - time_i}{\gamma}}}, $$
where $diff$ represents the difference between the starting and ending values (i.e., how much change occurs over time), $\beta$ represents the number of days needed to reach the midpoint of the logistic curve, and $\gamma$ represents the number of days needed to go from the midpoint to approximately 73% of the observed change (i.e., a satiation point). Because the SEM framework only allows any given outcome variable to be predicted as a weighted sum of the predictor variables (i.e., SEM framework is inherently linear), the logistic function cannot simply be inserted into a latent growth curve model. As per the structured latent curve approach outlined in several articles (e.g., Browne & du Toit, 1991; Grimm, Ram, & Estabrook 2010; Preacher & Hancock 2015), I linearized the logistic function using a first-order Taylor series expansion.
The code below outlines the nonlinear latent growth curve model that I have fit to a generated data set (data_wide.csv
). The generated data set contains scores from 100 people measured at five points over a year period (specifically on days 1, 150, 180, 210, and 360) and is generated from a population model where change is logistic. I have also attached a set of starting values (generated by largely copying the starting value procedure in MPlus) to initialize parameter estimation.
Given that my model contains latent variables (i.e., diff, beta[$\beta$], and gamma [$\gamma$]), I would like to use a parameter estimation algorithm that is well suited to dealing with latent variables. From what I understand, the mxFitFunctionML()
function uses full information maximum likelihood (FIML) estimation to estimate parameter values when raw data are used (as in my case). Unfortunately, (and from what I understand about estimation algorithms), FIML estimation is not well suited to dealing with latent variables since it does not iteratively compute lower bounds of the marginal log-likelihood and maximize these lower bounds (i.e., the e- and m-steps of the EM algorithm). Given the iterative nature of the EM algorithm, it makes sense that I have realized considerable performance improvements (both in time and parameter estimation accuracy) when I have used the EM algorithm (versus FIML) to fit growth mixture models.
Although I was successful in implementing the EM algorithm for growth mixture models, it was largely as a result of copying over the example code from the documentation of mxComputeEM()
. I have looked at other posts and have not been able to work out how to implement the EM algorithm for my nonlinear latent growth curve model. Just wondering if anyone might be able to help me out.
data_wide <- read_csv('data_wide.csv') #data table of randomly generated starting values (procedure largely mirrors process followed in MPlus) starting_values <- data.table(beta_fixed = 65.57385, gamma_fixed = 20, diff_fixed = 80, beta_rand = 21.75747, gamma_rand = 0.5, diff_rand = 0.8290784, epsilon = 0.9832023 ) #setup variables manifest_vars <- c("t1_1", "t2_150", "t3_180", "t4_210", "t5_360") latent_vars <- c('diff', 'beta', 'gamma') measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]), pattern = '[^_]*$')) #1 150 180 210 360; sets values for 'time' object #code adapted from Grimm, Ram, & Estabrook (2016; Chapter 12) model <- mxModel(model = 'nonlin_lgc', type = 'RAM', mxData(observed = data_wide, type = 'raw'), manifestVars = manifest_vars, latentVars = latent_vars, #Manifest means manMeans = mxPath(from = 'one', to = manifest_vars, free = FALSE, arrows = 1, values = 0), #Latent variable means (linear parameters). ##Note that the nonlinear parameters (i.e., beta, gamma) do not have estimated means mxPath(from = 'one', to = c('diff'),free = T, arrows = 1, values = starting_values$diff_fixed, labels = 'diff'), #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance) mxPath(from = manifest_vars, arrows=2, free=TRUE, values = starting_values$epsilon, labels='epsilon'), #Latent variable covariances and variances mxPath(from = latent_vars, connect='unique.pairs', arrows=2, #aa(var_diff), ab(cov_diff_beta), ac(cov_diff_gamma), bb(var_beta), bc(var_beta_gamma), cc(var_gamma) free = c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE), values=c(starting_values$diff_rand, NA,NA, starting_values$beta_rand, NA, starting_values$gamma_rand), labels=c('var_diff', 'NA(cov_diff_beta)','NA(cov_diff_gamma)', 'var_beta', 'NA(var_beta_gamma)', 'var_gamma')), #Functional constraints mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$diff_fixed, labels = 'diff', name = 'd'), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$beta_fixed, labels = 'beta', name = 'b'), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$gamma_fixed, labels = 'gamma', name = 'g'), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, values = measurement_days, name = 'time'), #partial derivative of logistic function with respect to each parameter (diff, beta, gamma, respectively) mxAlgebra(expression = 1/(1 + exp((beta - time)/gamma)), name="Dl"), mxAlgebra(expression = -(diff*(exp((beta - time)/gamma) * (1/gamma))/(1 + exp((beta - time)/gamma))^2), name = 'Bl'), mxAlgebra(expression = diff * (exp((beta - time)/gamma) * ((beta - time)/gamma^2))/(1 + exp((beta - time)/gamma))^2, name = 'Gl'), #Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE, labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Gl[5,1]")), mxPath(from='beta', to = manifest_vars, arrows=1, free=FALSE, labels = c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Gl[5,1]")), mxPath(from='gamma', to = manifest_vars, arrows=1, free=FALSE, labels = c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]")), #Enable likelihood vector; returns vector of likelihoods for each data point (i.e., p(X|parameter_values)) mxFitFunctionML(vector = TRUE) #implement EM algorithm to estimate parameter values?? ) mxRun(model) #it appears no errors (hopefully) were made in defining the model, but the model does not converge (can I use EM algorithm?)
#not sure how to repurpose this code for the nonlinear latent growth curve model mxModel( "Mixture", data4mx, class1, class2, mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"), mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"), mxAlgebra(PL1 + PL2, name="PL"), mxAlgebra(PL2 / PL, recompute='onDemand', initial=matrix(runif(N,.4,.6), nrow=N, ncol = 1), name="Posteriors"), mxAlgebra(-2*sum(log(PL)), name="FF"), mxFitFunctionAlgebra(algebra="FF"), mxComputeEM( estep=mxComputeOnce("Mixture.Posteriors"), mstep=mxComputeGradientDescent(fitfunction="Mixture.fitfunction")))
References
Browne, M. W., & du Toit, S. H. C. (1991). Models for learning data. In L. M. Collins & J. L. Horn (Eds.), Best methods for the analysis of change (pp. 47–68). Washington, DC: American Psychological Association.
Grimm, K. J., Ram, N., & Estabrook, R. (2010). Nonlinear structured growth mixture models in Mplus and OpenMx. Multivariate Behavioral Research, 45(6), 887–909. https://doi.org/10.1080/00273171.2010.531230
Preacher, K. J., & Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20(1), 84–101. http://dx.doi.org/10.1037/met0000028
"FIML estimation is not well suited to dealing with latent variables" -- Not sure about this reasoning. How you tried it?
You aren't trying to fit a mixture model. From what I can tell, you have a growth curve model with non-standard basis functions. I'd expect the default optimization strategy to work reasonably well.
If you want to re-frame the optimization as an E-M algorithm then you need to divide the model into two separate phases (E and M). Can you describe how you envision this? What would happen in phase E and phase M?
When I run your model, I get
Can you put some box constraints on the parameters so the optimizer doesn't explore crazy solutions?
gamma cannot be greater than 300, yes? beta should be between -200 and 200? Try the attached.
Thank you for that insight! Didn't know I could set constraints on the parameter estimation process. With respect to the fixed-effect parameters, I would set the following constraints:
For variances and residuals, I suppose nothing below zero.
I did realize, however, that the code in your response (
lgc.R
) used a different starting value for epsilon (0.9832023 vs 5). than I had initially posted (not sure if that was an error in copying the code or intentional to point out the importance of setting box constraints).With respect to the model itself, I am indeed trying to implement the EM algorithm in a latent growth curve model with a linearized nonlinear function (I just realized that my mentioning of growth mixture models might have introduced confusion). Although I understand how the EM algorithm works in the common coin flipping example (e.g., one of two coins, each with its own probability of heads [$h_1$, $h_2$], is selected with a given probability [q] and $h_1$, $h_2$, and q must be estimated ), I am unsure how to precisely compute the expectation (i.e., a function that lower-bounds the marginal log-likelihood) and then maximize the lower-bounding function.
Given that I am currently integrating the parallel method of running a model with multiple sets of starting values (using
snowfall
), I will include the tip on box constraints and see if my model fitting procedure is robust. With respect to FIML, is there any resource you know of that discusses its ability to deal with latent variables? I could only find articles discussing FIML with missing data.FIML, a flavor of maximum likelihood, is commonly used with latent variable models. Check out "In All Likelihood: Statistical Modelling And Inference Using Likelihood" by Pawitan. I think you wont find any articles because it is not remotely controversial.
> used a different starting value for epsilon (0.9832023 vs 5).
Yeah, I edited that. It's usually a good idea to start with a larger residual variance than you expect. It helps the optimizer not get locked into a small search space too early.
> Given that I am currently integrating the parallel method of running a model with multiple sets of starting values
Yeah, that sounds like a sensible approach. Report back here on your experience!
I tried fitting the nonlinear latent growth curve model to the data set (
data_wide.csv
) with 300 sets of starting values (starting_values.csv
) and have not been able to find a solution (only status codes of 5 and 6 were obtained). Note that I generated starting values forepsilon
and also updated upper and lower boundaries in the parameter estimation process as per your advice. I am unsure where to go from here and so figured I would provide an update on what I have done (hopefully this problem can be solved).Because I generated the data set, I 'know' the population values and, unfortunately, have not been able to obtain remotely close estimates. The population values that were used to generate the data are listed below:
diff_fixed
= 0.48beta_fixed
= 60gamma_fixed
= 20diff_rand
= $0.30^2$beta_rand
= $5^2$gamma_rand
= $5^2$epsilon
= $0.25^2$The parameter values were used with the logistic function below to generate observed scores for each i individual at each t time point:
$$ y_{it} = \frac{diff_i}{1+e^{\frac{\beta_i - time_t}{\gamma_i}}} + \epsilon_{it}$$
Note that all covariances have been set to zero. Also note that I have nested code within functions to allow a more efficient (and readable) model fitting procedure. The necessary functions are provided in the code chunk below.
Code attempts to find a solution by fitting a nonlinear latent growth curve model with 300 sets of starting values (i.e.,
nrow(starting_values)
.I also tried fitting the above model fitting procedure with larger samples sizes and a more ideal set of measurement days (e.g., 1, 90, 280, 270, 360) but had no success. I have not included code for these tests largely because the data in
data_wide.csv
represent a fairly non-optimal condition in my simulations and I assume that if I can devise a model fitting procedure that succeeds under these conditions, then the procedure will succeed under more ideal conditions. I hope I have not provided a confusing summary of my efforts.Code below fits model with ideal set of starting values (not sure if this is a useful test; I assume convergence should occur in this context?).
I asked for some help on the openmx mailing list. Mike Neale wrote,
"if the derivatives of the non-linear function with respect to its
parameters are known, these can be used for the factor loadings. This was the
approach McArdle and I used (based on Michael Browne’s work) to specify LGMs
that used logistic, exponential and Gompertz curves." See the attached manuscript.
Thank you for helping me out! Hmm, I thought the factor loadings in my model were already specified as derivatives of the three-parameter logistic function with respect to the corresponding parameter? That is, doesn't the code below implement these factor loading specifications?
Are you fixing the paths going into your manifest variables, or estimating them?
Mikes point is that you can fix the paths going into the manifests across different times, and then the task is to create inputs into the latent variable hub driving those paths, and it’s those estimates the parameters, that are inputs to a function like Gompertz.
Phone with web interface - may contain typos :-)
The paths going into the manifest variables are fixed (from my understanding, that is implemented by setting
free=FALSE
) and my goal is to only estimate values for the logistic function parameters (diff,
beta
, andgamma
, in my case). In looking at the output of any of my models, I have 7 estimated parameters (fixed and random effects fordiff,
beta
, andgamma
and a residual termepsilon
) and, if I am correct, the fixed-effect estimates represent the inputs into the latent variable hub that you are talking about.Minor edit to the code I posted. Because I renamed the fixed-effects variables, the
from
arguments change (the_fixed
suffix has been added; see below code snippet).Important edit to code:
I realized that my definitions of the partial derivatives indexed the incorrect variables (I confused the
name
andlabels
arguments). I have made corresponding changes to my code such that the paths going into the manifest variables are now fixed to the (correctly defined) partial derivatives. Here is the relevant code that was changed:I also reran my model fitting procedure (see
nonlin_lgc_fitting.R
) using 300 sets of starting values but no model converged (seenonlinear_lgc_procedure.R
).so you want the path values to be controlled by algebras which are working out the value (function derivatives) based on the current function control parameters.
Relevant piece from Mike Neale
> Modeling growth curves such as those that arise from differential equations is a more complex task than is
> usual for structural equation modeling. Effectively, the factor loadings are neither constants nor
> free parameters, but complex functions obtained as the partial derivatives of the growth curve function
> with respect to its free parameters. The expected means are obtained directly from the growth
> curve function itself, but these too may involve non-trivial algebra. In our 2000 paper, Jack and
> I tabulated these derivatives for the Logistic, Gompertz and Exponential growth curve forms
> (Neale & McArdle, 2000). Here, for illustration, I reproduce the equations for the Gompertz curve in Table 14.1.
> Figures 14.4 and 14.5 respectively show path diagrams for the conventional level, slope,
> and quadratic polynomial component LGC model, and a parametric structured curve model. The
> primary difference between the figures is that in the LGC model the factor loadings are fixed to constant
> values, whereas in the PGC model the loadings are complicated algebraic functions of the free parameters
> of the growth curve being fitted.
Yes, in OpenMx vernacular, my goal is to fix the loadings from the latent variables using the algebras. If I am correct, the code below (taken from
nonlin_lgc_procedure.R
) does this:Note that, in my model-creating function (
create_logistic_growth_model.R
), I have automated the label creation process usingsprintf()
. Also note that I changed the names of the latent variables to avoid any confusion with the labels used for the functional constraints.