How to use the EM algorithm with a nonlinear latent growth curve model?

Posted on Wed, 09/29/2021 - 14:23

ssciarra Joined: 01/08/2021

Forums

Longitudinal SEM and Latent Growth Curves

I am trying to fit a nonlinear latent growth curve model (as part of simulations I am conducting) using the three-parameter logistic (s-shaped change) function below

$$y_{obs} = \frac{diff}{1+e^{\frac{\beta - time_i}{\gamma}}}, $$

where $diff$ represents the difference between the starting and ending values (i.e., how much change occurs over time), $\beta$ represents the number of days needed to reach the midpoint of the logistic curve, and $\gamma$ represents the number of days needed to go from the midpoint to approximately 73% of the observed change (i.e., a satiation point). Because the SEM framework only allows any given outcome variable to be predicted as a weighted sum of the predictor variables (i.e., SEM framework is inherently linear), the logistic function cannot simply be inserted into a latent growth curve model. As per the structured latent curve approach outlined in several articles (e.g., Browne & du Toit, 1991; Grimm, Ram, & Estabrook 2010; Preacher & Hancock 2015), I linearized the logistic function using a first-order Taylor series expansion.

The code below outlines the nonlinear latent growth curve model that I have fit to a generated data set (`data_wide.csv`). The generated data set contains scores from 100 people measured at five points over a year period (specifically on days 1, 150, 180, 210, and 360) and is generated from a population model where change is logistic. I have also attached a set of starting values (generated by largely copying the starting value procedure in MPlus) to initialize parameter estimation.

Given that my model contains latent variables (i.e., diff, beta[$\beta$], and gamma [$\gamma$]), I would like to use a parameter estimation algorithm that is well suited to dealing with latent variables. From what I understand, the `mxFitFunctionML()` function uses full information maximum likelihood (FIML) estimation to estimate parameter values when raw data are used (as in my case). Unfortunately, (and from what I understand about estimation algorithms), FIML estimation is not well suited to dealing with latent variables since it does not iteratively compute lower bounds of the marginal log-likelihood and maximize these lower bounds (i.e., the e- and m-steps of the EM algorithm). Given the iterative nature of the EM algorithm, it makes sense that I have realized considerable performance improvements (both in time and parameter estimation accuracy) when I have used the EM algorithm (versus FIML) to fit growth mixture models.

Although I was successful in implementing the EM algorithm for growth mixture models, it was largely as a result of copying over the example code from the documentation of `mxComputeEM()`. I have looked at other posts and have not been able to work out how to implement the EM algorithm for my nonlinear latent growth curve model. Just wondering if anyone might be able to help me out.

data_wide <- read_csv('data_wide.csv')


#data table of randomly generated starting values (procedure largely mirrors process followed in MPlus)

starting_values <- data.table(beta_fixed = 65.57385,

                        gamma_fixed = 20,

                        diff_fixed = 80,

                        beta_rand = 21.75747,

                        gamma_rand = 0.5,

                        diff_rand = 0.8290784,

                        epsilon = 0.9832023

                        )
#setup variables

manifest_vars <- c("t1_1", "t2_150", "t3_180", "t4_210", "t5_360")

latent_vars <- c('diff', 'beta', 'gamma')
measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]),

                                          pattern = '[^_]*$')) #1 150 180 210 360; sets values for 'time' object
#code adapted from Grimm, Ram, & Estabrook (2016; Chapter 12)

model <- mxModel(model = 'nonlin_lgc',

                 type = 'RAM',

                 mxData(observed = data_wide, type = 'raw'),
                 manifestVars = manifest_vars,

                 latentVars = latent_vars,
                 #Manifest means

                 manMeans = mxPath(from = 'one', to = manifest_vars, free = FALSE, arrows = 1, values = 0),
                 #Latent variable means (linear parameters).

                 ##Note that the nonlinear parameters (i.e., beta, gamma) do not have estimated means

                 mxPath(from = 'one', to = c('diff'),free = T, arrows = 1, values = starting_values$diff_fixed,

                        labels = 'diff'),
                 #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance)

                 mxPath(from = manifest_vars,

                        arrows=2, free=TRUE, values = starting_values$epsilon, labels='epsilon'),
                 #Latent variable covariances and variances

                 mxPath(from = latent_vars,

                        connect='unique.pairs', arrows=2,

                        #aa(var_diff), ab(cov_diff_beta), ac(cov_diff_gamma), bb(var_beta), bc(var_beta_gamma), cc(var_gamma)

                        free = c(TRUE,

                                 FALSE, FALSE,

                                 TRUE, FALSE, TRUE),

                        values=c(starting_values$diff_rand,

                                 NA,NA,

                                 starting_values$beta_rand, NA, starting_values$gamma_rand),

                        labels=c('var_diff',

                                 'NA(cov_diff_beta)','NA(cov_diff_gamma)',

                                 'var_beta', 'NA(var_beta_gamma)', 'var_gamma')),
                 #Functional constraints

                  mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                          values = starting_values$diff_fixed, labels = 'diff', name = 'd'),

                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                          values = starting_values$beta_fixed, labels = 'beta', name = 'b'),

                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                          values = starting_values$gamma_fixed, labels = 'gamma', name = 'g'),

                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE,

                          values = measurement_days, name = 'time'),
                 #partial derivative of logistic function with respect to each parameter (diff, beta, gamma, respectively)

                 mxAlgebra(expression = 1/(1 + exp((beta - time)/gamma)), name="Dl"),

                 mxAlgebra(expression =  -(diff*(exp((beta - time)/gamma) * (1/gamma))/(1 + exp((beta - time)/gamma))^2), name = 'Bl'),

                 mxAlgebra(expression =  diff * (exp((beta - time)/gamma) * ((beta - time)/gamma^2))/(1 + exp((beta - time)/gamma))^2, name = 'Gl'),

                 #Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives

                 mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE,

                        labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Gl[5,1]")),

                 mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,

                        labels =  c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Gl[5,1]")),

                 mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,

                        labels =  c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]")),
                 #Enable likelihood vector; returns vector of likelihoods for each data point (i.e., p(X|parameter_values))

                 mxFitFunctionML(vector = TRUE)
                 #implement EM algorithm to estimate parameter values??
)

mxRun(model) #it appears no errors (hopefully) were made in defining the model, but the model does not converge (can I use EM algorithm?)

#not sure how to repurpose this code for the nonlinear latent growth curve model mxModel( "Mixture", data4mx, class1, class2, mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"), mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"), mxAlgebra(PL1 + PL2, name="PL"), mxAlgebra(PL2 / PL, recompute='onDemand', initial=matrix(runif(N,.4,.6), nrow=N, ncol = 1), name="Posteriors"), mxAlgebra(-2*sum(log(PL)), name="FF"), mxFitFunctionAlgebra(algebra="FF"), mxComputeEM( estep=mxComputeOnce("Mixture.Posteriors"), mstep=mxComputeGradientDescent(fitfunction="Mixture.fitfunction")))

References

Browne, M. W., & du Toit, S. H. C. (1991). Models for learning data. In L. M. Collins & J. L. Horn (Eds.), Best methods for the analysis of change (pp. 47–68). Washington, DC: American Psychological Association.
Grimm, K. J., Ram, N., & Estabrook, R. (2010). Nonlinear structured growth mixture models in Mplus and OpenMx. Multivariate Behavioral Research, 45(6), 887–909. https://doi.org/10.1080/00273171.2010.531230
Preacher, K. J., & Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20(1), 84–101. http://dx.doi.org/10.1037/met0000028

File attachments

data_wide.csv

Replied on Wed, 09/29/2021 - 16:44

AdminJosh Joined: Dec 12, 2012

more clarity

"FIML estimation is not well suited to dealing with latent variables" -- Not sure about this reasoning. How you tried it?

You aren't trying to fit a mixture model. From what I can tell, you have a growth curve model with non-standard basis functions. I'd expect the default optimization strategy to work reasonably well.

If you want to re-frame the optimization as an E-M algorithm then you need to divide the model into two separate phases (E and M). Can you describe how you envision this? What would happen in phase E and phase M?

Replied on Wed, 09/29/2021 - 17:00

AdminJosh Joined: Dec 12, 2012

results

When I run your model, I get

name matrix row col Estimate Std.Error A 1 epsilon S t1_1 t1_1 3.134425 6.390361e+01 2 var_diff S diff diff -4.420305 4.166044e+01 3 var_beta S beta beta 345.381668 NA ! 4 var_gamma S gamma gamma 346.345101 NA ! 5 diff M 1 diff 8.317164 3.432677e+01 6 beta b 1 1 52604.749831 2.013878e-03 ! 7 gamma g 1 1 137508.614087 7.684322e-04 !

Can you put some box constraints on the parameters so the optimizer doesn't explore crazy solutions?

gamma cannot be greater than 300, yes? beta should be between -200 and 200? Try the attached.

File attachments

lgc.R

Replied on Wed, 09/29/2021 - 18:01

ssciarra Joined: Jan 08, 2021

Thank you for that insight!

Thank you for that insight! Didn't know I could set constraints on the parameter estimation process. With respect to the fixed-effect parameters, I would set the following constraints:

* beta: 0-360
* gamma: 0-360
* diff: 0:1 (diff is in standardized effect size units)

For variances and residuals, I suppose nothing below zero.

I did realize, however, that the code in your response (`lgc.R`) used a different starting value for epsilon (0.9832023 vs 5). than I had initially posted (not sure if that was an error in copying the code or intentional to point out the importance of setting box constraints).

With respect to the model itself, I am indeed trying to implement the EM algorithm in a latent growth curve model with a linearized nonlinear function (I just realized that my mentioning of growth mixture models might have introduced confusion). Although I understand how the EM algorithm works in the common coin flipping example (e.g., one of two coins, each with its own probability of heads [$h_1$, $h_2$], is selected with a given probability [q] and $h_1$, $h_2$, and q must be estimated ), I am unsure how to precisely compute the expectation (i.e., a function that lower-bounds the marginal log-likelihood) and then maximize the lower-bounding function.

Given that I am currently integrating the parallel method of running a model with multiple sets of starting values (using `snowfall`), I will include the tip on box constraints and see if my model fitting procedure is robust. With respect to FIML, is there any resource you know of that discusses its ability to deal with latent variables? I could only find articles discussing FIML with missing data.

Replied on Wed, 09/29/2021 - 18:58

AdminJosh Joined: Dec 12, 2012

FIML

FIML, a flavor of maximum likelihood, is commonly used with latent variable models. Check out "In All Likelihood: Statistical Modelling And Inference Using Likelihood" by Pawitan. I think you wont find any articles because it is not remotely controversial.

> used a different starting value for epsilon (0.9832023 vs 5).

Yeah, I edited that. It's usually a good idea to start with a larger residual variance than you expect. It helps the optimizer not get locked into a small search space too early.

> Given that I am currently integrating the parallel method of running a model with multiple sets of starting values

Yeah, that sounds like a sensible approach. Report back here on your experience!

Replied on Fri, 10/01/2021 - 15:13

ssciarra Joined: Jan 08, 2021

I tried fitting the nonlinear

I tried fitting the nonlinear latent growth curve model to the data set (`data_wide.csv`) with 300 sets of starting values (`starting_values.csv`) and have not been able to find a solution (only status codes of 5 and 6 were obtained). Note that I generated starting values for `epsilon` and also updated upper and lower boundaries in the parameter estimation process as per your advice. I am unsure where to go from here and so figured I would provide an update on what I have done (hopefully this problem can be solved).

Because I generated the data set, I 'know' the population values and, unfortunately, have not been able to obtain remotely close estimates. The population values that were used to generate the data are listed below:

* `diff_fixed` = 0.48
* `beta_fixed` = 60
* `gamma_fixed`= 20
* `diff_rand` = $0.30^2$
* `beta_rand` = $5^2$
* `gamma_rand`= $5^2$
* `epsilon` = $0.25^2$

The parameter values were used with the logistic function below to generate observed scores for each *i* individual at each *t* time point:

$$ y_{it} = \frac{diff_i}{1+e^{\frac{\beta_i - time_t}{\gamma_i}}} + \epsilon_{it}$$

Note that all covariances have been set to zero. Also note that I have nested code within functions to allow a more efficient (and readable) model fitting procedure. The necessary functions are provided in the code chunk below.

#generates names for manifest variables (this is nested within create_logistic_growth_model) generate_manifest_var_names <- function(data_wide) {


  manifest_names <- names(data_wide[ , 2:ncol(data_wide)])

  return(manifest_names)

}
#create nonlinear latent growth curve model

create_logistic_growth_model <- function(data_wide, model_name, starting_values) {
  manifest_vars <- generate_manifest_var_names(data = data_wide)

  latent_vars <- c('diff_fixed', 'beta_fixed', 'gamma_fixed')

  measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]), pattern = '[^_]*$'))
  model <- mxModel(model = model_name,

                   type = 'RAM', independent = T,

                   mxData(observed = data_wide, type = 'raw'),
                   manifestVars = manifest_vars,

                   latentVars = latent_vars,
                   #Manifest means

                   manMeans = mxPath(from = 'one', to = manifest_vars, free = FALSE, arrows = 1),
                   #Latent variable means (linear parameters). Note that the nonlinear parameters of beta and gamma do not have estimated means

                   mxPath(from = 'one', to = c('diff_fixed'),free = TRUE, arrows = 1, values = starting_values$diff_fixed,

                          labels = 'diff_fixed', lbound = 0, ubound = 2),
                   #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance)

                   mxPath(from = manifest_vars,

                          arrows=2, free=TRUE, values = starting_values$epsilon, labels='epsilon', lbound = 1e-3),
                   #Latent variable covariances and variances

                   mxPath(from = latent_vars,

                          connect='unique.pairs', arrows=2,

                          #aa(diff_rand), ab(cov_diff_beta), ac(cov_diff_gamma), bb(beta_rand), bc(var_beta_gamma), cc(gamma_rand)

                          free = c(TRUE,

                                   FALSE, FALSE,

                                   TRUE, FALSE, TRUE),

                          values=c(starting_values$diff_rand,

                                   NA,NA,

                                   starting_values$beta_rand, NA, starting_values$gamma_rand),

                          labels=c('diff_rand',

                                   'NA(cov_diff_beta)','NA(cov_diff_gamma)',

                                   'beta_rand', 'NA(var_beta_gamma)', 'gamma_rand'),

                          lbound = c(1e-3,

                                     NA, NA,

                                     1, NA, 1)),
                   #Functional constraints

                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                            values = starting_values$diff_fixed, labels = 'diff_fixed', name = 'd',  lbound = 0, ubound = 2),

                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                            values = starting_values$beta_fixed, labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360),

                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE,

                            values = starting_values$gamma_fixed, labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360),

                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, values = measurement_days, name = 'time'),
                   mxAlgebra(expression = 1/(1 + exp((beta_fixed - time)/gamma_fixed)), name="Dl"),

                   mxAlgebra(expression =  -(diff_fixed*(exp((beta_fixed - time)/gamma_fixed) *

                                                           (1/gamma_fixed))/(1 + exp((beta_fixed - time)/gamma_fixed))^2), name = 'Bl'),

                   mxAlgebra(expression =  diff_fixed * (exp((beta_fixed - time)/gamma_fixed) * ((beta_fixed - time)/gamma_fixed^2))/

                               (1 + exp((beta_fixed - time)/gamma_fixed))^2, name = 'Gl'),
                     #Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives

                 mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE,

                        labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")),

                 mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,

                        labels =  c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")),

                 mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,

                        labels =  c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]")),
                   mxFitFunctionML()

                   )
  return(model)

}
#changes starting values of model created from create_logistic_growth_model

update_starting_values <- function(model_number, model, starting_values){
  param_name_order <- names(omxGetParameters(model))
    #set model with new starting values

    model_name <- paste("Iteration", model_number, sep="")

    new_model <- omxSetParameters(model = model,

                                  name = model_name,

                                  labels = param_name_order,

                                  values = as.numeric(setcolorder(x = starting_values[model_number, ],

                                                                  neworder = param_name_order)))
    #return model with new starting values

    return(new_model)

}
#iterates update_starting_values to create a parent model where each submodel has different starting values

assemble_model_starts <- function(model, starting_values) {
  parent_model <- mxModel(name = "parent_model")

  sub_models <- lapply(X = 1:nrow(starting_values), FUN = update_starting_values, model = model, starting_values = starting_values)

  merged_model <- mxModel(parent_model, sub_models)
  return(merged_model)

}
#extract necessary information about fit from each submodel

extract_fit_info <- function(model){

retval <- c( model$output$fit, model$output$status$code, model$output$iterations) #round(model$expectation$output$weights[1], 4)) #temporarily omitted return(retval) }

Code attempts to find a solution by fitting a nonlinear latent growth curve model with 300 sets of starting values (i.e., `nrow(starting_values)`.

library(OpenMx) library(data.table) library(stringr) library(snowfall)


#read in starting values and raw data sets

starting_values <- read_csv('starting_values.csv')

data_wide <- read_csv('data_wide.csv')
#create nonlinear latent growth curve model

latent_growth_model <- create_logistic_growth_model(data_wide = data_wide,

                                                    model_name = 'lg_nonlinear', starting_values = starting_values[1, ])
#run submodel creation and model fitting in parallel mode

sfInit(parallel = TRUE, cpus = detectCores() - 1) #save 1 core to prevent computer from crashing

sfLibrary(OpenMx)
#create submodels that each have a different set of starting values

merged_models <- assemble_model_starts(model = latent_growth_model, starting_values = starting_values)
#run all models

output <- mxRun(merged_models, unsafe = T)
resultsFit  <- data.table(t(omxSapply(output$submodels, fitStats)))

names(resultsFit) <- c("fit", "status_code", "num_iterations")

sum(resultsFit$status_code == 0) #0

sfStop() #close parallel model

I also tried fitting the above model fitting procedure with larger samples sizes and a more ideal set of measurement days (e.g., 1, 90, 280, 270, 360) but had no success. I have not included code for these tests largely because the data in `data_wide.csv` represent a fairly non-optimal condition in my simulations and I assume that if I can devise a model fitting procedure that succeeds under these conditions, then the procedure will succeed under more ideal conditions. I hope I have not provided a confusing summary of my efforts.

Code below fits model with ideal set of starting values (not sure if this is a useful test; I assume convergence should occur in this context?).

ideal_starting_values <- data.table('beta_fixed' = 67, 'gamma_fixed' = 15, 'beta_rand' = 25, 'gamma_rand' = 20, 'diff_fixed' = 0.7, 'diff_rand' = 0.6, 'epsilon' = 0.9)


latent_growth_model_ideal <- create_logistic_growth_model(data_wide = data_wide,

                                                    model_name = 'lg_nonlinear', starting_values = ideal_starting_values)
first_run <- mxRun(latent_growth_model_ideal) #status code 6
mxRun(first_run) #status code 5

mxTryHard(latent_growth_model_ideal) #All fit attempts resulted in errors

File attachments

starting_values.csv

Replied on Mon, 10/04/2021 - 11:13

AdminJosh Joined: Dec 12, 2012

Structured latent growth curves for twin data

I asked for some help on the openmx mailing list. Mike Neale wrote,
"if the derivatives of the non-linear function with respect to its
parameters are known, these can be used for the factor loadings. This was the
approach McArdle and I used (based on Michael Browne’s work) to specify LGMs
that used logistic, exponential and Gompertz curves." See the attached manuscript.

File attachments

9781138064225c14_301-314.pdf

Replied on Mon, 10/04/2021 - 11:30

ssciarra Joined: Jan 08, 2021

Thank you for helping me out!

Thank you for helping me out! Hmm, I thought the factor loadings in my model were already specified as derivatives of the three-parameter logistic function with respect to the corresponding parameter? That is, doesn't the code below implement these factor loading specifications?

mxAlgebra(expression = 1/(1 + exp((beta_fixed - time)/gamma_fixed)), name="Dl"), mxAlgebra(expression = -(diff_fixed*(exp((beta_fixed - time)/gamma_fixed) * (1/gamma_fixed))/(1 + exp((beta_fixed - time)/gamma_fixed))^2), name = 'Bl'), mxAlgebra(expression = diff_fixed * (exp((beta_fixed - time)/gamma_fixed) * ((beta_fixed - time)/gamma_fixed^2))/ (1 + exp((beta_fixed - time)/gamma_fixed))^2, name = 'Gl'),

#Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE, labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")), mxPath(from='beta', to = manifest_vars, arrows=1, free=FALSE, labels = c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")), mxPath(from='gamma', to = manifest_vars, arrows=1, free=FALSE, labels = c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]"))

Replied on Mon, 10/04/2021 - 14:41

tbates Joined: Jul 31, 2009

Are you fixing the parameters?

Are you fixing the paths going into your manifest variables, or estimating them?

Mikes point is that you can fix the paths going into the manifests across different times, and then the task is to create inputs into the latent variable hub driving those paths, and it’s those estimates the parameters, that are inputs to a function like Gompertz.
Phone with web interface - may contain typos :-)

Replied on Mon, 10/04/2021 - 17:56

ssciarra Joined: Jan 08, 2021

The paths going into the

The paths going into the manifest variables are fixed (from my understanding, that is implemented by setting free=FALSE) and my goal is to only estimate values for the logistic function parameters (`diff,` `beta`, and `gamma`, in my case). In looking at the output of any of my models, I have 7 estimated parameters (fixed and random effects for `diff,` `beta`, and `gamma` and a residual term `epsilon`) and, if I am correct, the fixed-effect estimates represent the inputs into the latent variable hub that you are talking about.

Replied on Mon, 10/04/2021 - 18:24

ssciarra Joined: Jan 08, 2021

Minor edit to the code I

Minor edit to the code I posted. Because I renamed the fixed-effects variables, the `from` arguments change (the `_fixed` suffix has been added; see below code snippet).
#Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives mxPath(from = 'diff_fixed', to = manifest_vars, arrows=1, free=FALSE, labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")), mxPath(from='beta_fixed', to = manifest_vars, arrows=1, free=FALSE, labels = c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")), mxPath(from='gamma_fixed', to = manifest_vars, arrows=1, free=FALSE, labels = c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]"))

Replied on Tue, 10/05/2021 - 00:19

ssciarra Joined: Jan 08, 2021

Important edit to code:

I realized that my definitions of the partial derivatives indexed the incorrect variables (I confused the `name` and `labels` arguments). I have made corresponding changes to my code such that the paths going into the manifest variables are now fixed to the (correctly defined) partial derivatives. Here is the relevant code that was changed:
#Functional constraints mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$diff_fixed[1], labels = 'diff_fixed', name = 'd', lbound = 0, ubound = 2), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$beta_fixed[1], labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$gamma_fixed[1], labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, values = measurement_days, name = 'time'),

#partial derivatives mxAlgebra(expression = 1/(1 + exp((b - time)/g)), name="Dl"), mxAlgebra(expression = -(d * (exp((b - time)/g) * (1/g))/(1 + exp((b - time)/g))^2), name = 'Bl'), mxAlgebra(expression = d * (exp((b - time)/g) * ((b - time)/g^2))/(1 + exp((b - time)/g))^2, name = 'Gl')

I also reran my model fitting procedure (see `nonlin_lgc_fitting.R`) using 300 sets of starting values but no model converged (see `nonlinear_lgc_procedure.R`).

File attachments

nonlin_lgc._procedure.R

Replied on Tue, 10/05/2021 - 08:22

tbates Joined: Jul 31, 2009

set path label= "algebra[1,1]"

so you want the path values to be controlled by algebras which are working out the value (function derivatives) based on the current function control parameters.

Relevant piece from Mike Neale

> Modeling growth curves such as those that arise from differential equations is a more complex task than is
> usual for structural equation modeling. Effectively, the factor loadings are neither constants nor
> free parameters, but complex functions obtained as the partial derivatives of the growth curve function
> with respect to its free parameters. The expected means are obtained directly from the growth
> curve function itself, but these too may involve non-trivial algebra. In our 2000 paper, Jack and
> I tabulated these derivatives for the Logistic, Gompertz and Exponential growth curve forms
> (Neale & McArdle, 2000). Here, for illustration, I reproduce the equations for the Gompertz curve in Table 14.1.
> Figures 14.4 and 14.5 respectively show path diagrams for the conventional level, slope,
> and quadratic polynomial component LGC model, and a parametric structured curve model. The
> primary difference between the figures is that in the LGC model the factor loadings are fixed to constant
> values, whereas in the PGC model the loadings are complicated algebraic functions of the free parameters
> of the growth curve being fitted.

Replied on Tue, 10/05/2021 - 10:05

ssciarra Joined: Jan 08, 2021

Correct algebra implementation?

Yes, in OpenMx vernacular, my goal is to fix the loadings from the latent variables using the algebras. If I am correct, the code below (taken from `nonlin_lgc_procedure.R`) does this:
latent_vars <- c('diff', 'beta', 'gamma') #Functional constraints mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$diff_fixed[1], labels = 'diff_fixed', name = 'd', lbound = 0, ubound = 2), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$beta_fixed[1], labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, values = starting_values$gamma_fixed[1], labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360), mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, values = measurement_days, name = 'time'),


#Algebras (corresponding partial derivatives)

mxAlgebra(expression = 1/(1 + exp((b - time)/g)), name="Dl"),

mxAlgebra(expression =  -(d * (exp((b - time)/g) * (1/g))/(1 + exp((b - time)/g))^2), name = 'Bl'),

mxAlgebra(expression =  d * (exp((b - time)/g) * ((b - time)/g^2))/(1 + exp((b - time)/g))^2, name = 'Gl'),

#Factor loadings; all fixed and, importantly, constrained to change according to their partial derivatives mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE, values = 1, labels = c('Dl[1,1]', 'Dl[2,1]', 'Dl[3,1]', 'Dl[4,1]', 'Dl[5,1])')), mxPath(from='beta', to = manifest_vars, arrows=1, free=FALSE, labels = c('Bl[1,1]', 'Bl[2,1]', 'Bl[3,1]', 'Bl[4,1]', 'Bl[5,1])')), mxPath(from='gamma', to = manifest_vars, arrows=1, free=FALSE, labels = c('Gl[1,1]', 'Gl[2,1]', 'Gl[3,1]', 'Gl[4,1]', 'Gl[5,1])'))

Note that, in my model-creating function (`create_logistic_growth_model.R`), I have automated the label creation process using `sprintf()`. Also note that I changed the names of the latent variables to avoid any confusion with the labels used for the functional constraints.

File attachments

create_logistic_growth_model.R

News

Recent Posts

How to use the EM algorithm with a nonlinear latent growth curve model?

References

more clarity

results

Thank you for that insight!

FIML

I tried fitting the nonlinear

Structured latent growth curves for twin data

Thank you for helping me out!

Are you fixing the parameters?

The paths going into the

Minor edit to the code I

Important edit to code:

set path label= "algebra[1,1]"

Correct algebra implementation?