
How to use the EM algorithm with a nonlinear latent growth curve model?

ssciarra
How to use the EM algorithm with a nonlinear latent growth curve model?
Attachment: data_wide.csv (9.36 KB)

I am trying to fit a nonlinear latent growth curve model (as part of simulations I am conducting) using the three-parameter logistic (s-shaped change) function below

$$y_{t} = \frac{diff}{1+e^{\frac{\beta - time_t}{\gamma}}}, $$

where $diff$ represents the difference between the starting and ending values (i.e., how much change occurs over time), $\beta$ represents the number of days needed to reach the midpoint of the logistic curve, and $\gamma$ represents the number of days needed to go from the midpoint to approximately 73% of the observed change (i.e., a satiation point). Because the SEM framework only allows any given outcome variable to be predicted as a weighted sum of the predictor variables (i.e., the SEM framework is inherently linear), the logistic function cannot simply be inserted into a latent growth curve model. Following the structured latent curve approach outlined in several articles (e.g., Browne & du Toit, 1991; Grimm, Ram, & Estabrook, 2010; Preacher & Hancock, 2015), I linearized the logistic function using a first-order Taylor series expansion.
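
For reference, the first-order expansion around the fixed effects ($diff_0$, $\beta_0$, $\gamma_0$) is

$$ y_{it} \approx \frac{diff_0}{1+e^{(\beta_0 - time_t)/\gamma_0}} + \frac{\partial y_t}{\partial diff}(diff_i - diff_0) + \frac{\partial y_t}{\partial \beta}(\beta_i - \beta_0) + \frac{\partial y_t}{\partial \gamma}(\gamma_i - \gamma_0), $$

where the partial derivatives, evaluated at the fixed effects, serve as the factor loadings:

$$ \frac{\partial y_t}{\partial diff} = \frac{1}{1+e^{(\beta - time_t)/\gamma}}, \qquad \frac{\partial y_t}{\partial \beta} = -\frac{diff \, e^{(\beta - time_t)/\gamma}}{\gamma \left(1+e^{(\beta - time_t)/\gamma}\right)^2}, \qquad \frac{\partial y_t}{\partial \gamma} = \frac{diff \, (\beta - time_t) \, e^{(\beta - time_t)/\gamma}}{\gamma^2 \left(1+e^{(\beta - time_t)/\gamma}\right)^2}. $$

These are exactly the expressions that appear as mxAlgebra() objects in the code below.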

The code below outlines the nonlinear latent growth curve model that I have fit to a generated data set (data_wide.csv). The generated data set contains scores from 100 people measured at five time points over a one-year period (specifically on days 1, 150, 180, 210, and 360) and was generated from a population model where change is logistic. I also include a set of starting values (generated by largely copying the starting-value procedure in Mplus) to initialize parameter estimation.

Given that my model contains latent variables (i.e., diff, beta [$\beta$], and gamma [$\gamma$]), I would like to use a parameter estimation algorithm that is well suited to dealing with latent variables. From what I understand, the mxFitFunctionML() function uses full information maximum likelihood (FIML) estimation to estimate parameter values when raw data are used (as in my case). Unfortunately (and from what I understand about estimation algorithms), FIML estimation is not well suited to dealing with latent variables since it does not iteratively compute lower bounds of the marginal log-likelihood and maximize these lower bounds (i.e., the e- and m-steps of the EM algorithm). Given the iterative nature of the EM algorithm, it makes sense to me that I have seen considerable performance improvements (both in speed and in parameter estimation accuracy) when using the EM algorithm (versus FIML) to fit growth mixture models.
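
(For reference, the lower bound I am referring to is the standard EM decomposition: for any density $q(\eta)$ over the latent variables $\eta$,

$$ \log p(y \mid \theta) \;\geq\; \int q(\eta)\, \log \frac{p(y, \eta \mid \theta)}{q(\eta)}\, d\eta, $$

where the e-step sets $q(\eta) = p(\eta \mid y, \theta^{(k)})$ so that the bound touches the marginal log-likelihood at the current estimate $\theta^{(k)}$, and the m-step maximizes the bound over $\theta$.)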

Although I was successful in implementing the EM algorithm for growth mixture models, it was largely as a result of copying over the example code from the documentation of mxComputeEM(). I have looked at other posts and have not been able to work out how to implement the EM algorithm for my nonlinear latent growth curve model. Just wondering if anyone might be able to help me out.

library(OpenMx)
library(readr)      #read_csv()
library(data.table) #data.table()
library(stringr)    #str_extract()
 
data_wide <- read_csv('data_wide.csv') 
 
#data table of randomly generated starting values (procedure largely mirrors process followed in MPlus)
starting_values <- data.table(beta_fixed = 65.57385, 
                        gamma_fixed = 20, 
                        diff_fixed = 80, 
                        beta_rand = 21.75747, 
                        gamma_rand = 0.5, 
                        diff_rand = 0.8290784, 
                        epsilon = 0.9832023
                        )
 
#setup variables 
manifest_vars <- c("t1_1", "t2_150", "t3_180", "t4_210", "t5_360")
latent_vars <- c('diff', 'beta', 'gamma')
 
measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]), 
                                          pattern = '[^_]*$')) #1 150 180 210 360; sets values for 'time' object
 
#code adapted from Grimm, Ram, & Estabrook (2016; Chapter 12) 
model <- mxModel(model = 'nonlin_lgc',
                 type = 'RAM',
                 mxData(observed = data_wide, type = 'raw'),
 
                 manifestVars = manifest_vars,
                 latentVars = latent_vars,
 
                 #Manifest means
                 manMeans = mxPath(from = 'one', to = manifest_vars, free = FALSE, arrows = 1, values = 0),
 
                 #Latent variable means (linear parameters). 
                 ##Note that the nonlinear parameters (i.e., beta, gamma) do not have estimated means
                 mxPath(from = 'one', to = c('diff'),free = T, arrows = 1, values = starting_values$diff_fixed,
                        labels = 'diff'),
 
                 #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance)
                 mxPath(from = manifest_vars,
                        arrows=2, free=TRUE, values = starting_values$epsilon, labels='epsilon'),
 
                 #Latent variable covariances and variances
                 mxPath(from = latent_vars,
                        connect='unique.pairs', arrows=2,
                        #aa(var_diff), ab(cov_diff_beta), ac(cov_diff_gamma), bb(var_beta), bc(var_beta_gamma), cc(var_gamma)
                        free = c(TRUE,
                                 FALSE, FALSE,
                                 TRUE, FALSE, TRUE),
                        values=c(starting_values$diff_rand,
                                 NA,NA,
                                 starting_values$beta_rand, NA, starting_values$gamma_rand),
                        labels=c('var_diff',
                                 'NA(cov_diff_beta)','NA(cov_diff_gamma)',
                                 'var_beta', 'NA(var_beta_gamma)', 'var_gamma')),
 
 
                 #Functional constraints
                  mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                          values = starting_values$diff_fixed, labels = 'diff', name = 'd'),
                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                          values = starting_values$beta_fixed, labels = 'beta', name = 'b'),
                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                          values = starting_values$gamma_fixed, labels = 'gamma', name = 'g'),
                 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, 
                          values = measurement_days, name = 'time'),
 
                 #partial derivative of logistic function with respect to each parameter (diff, beta, gamma, respectively)
                 mxAlgebra(expression = 1/(1 + exp((beta - time)/gamma)), name="Dl"), 
                 mxAlgebra(expression =  -(diff*(exp((beta - time)/gamma) * (1/gamma))/(1 + exp((beta - time)/gamma))^2), name = 'Bl'),
                 mxAlgebra(expression =  diff * (exp((beta - time)/gamma) * ((beta - time)/gamma^2))/(1 + exp((beta - time)/gamma))^2, name = 'Gl'),
                 #Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives 
                 mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE,
                        labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Gl[5,1]")),
                 mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,
                        labels =  c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Gl[5,1]")),
                 mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,
                        labels =  c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]")),
 
                 #Enable likelihood vector; returns vector of likelihoods for each data point (i.e., p(X|parameter_values))
                 mxFitFunctionML(vector = TRUE)
 
                 #implement EM algorithm to estimate parameter values??
 
)
 
mxRun(model) #it appears no errors (hopefully) were made in defining the model, but the model does not converge (can I use EM algorithm?)
#not sure how to repurpose this code (the mixture example from the mxComputeEM() documentation) for the nonlinear latent growth curve model 
mxModel(
  "Mixture", data4mx, class1, class2,
  #per-row likelihood of each class, weighted by the current posterior class probabilities
  mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"),
  mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"),
  mxAlgebra(PL1 + PL2, name="PL"),
  #posterior probability of class 2 for each row; recomputed only when the E-step asks for it
  mxAlgebra(PL2 / PL,  recompute='onDemand',
            initial=matrix(runif(N,.4,.6), nrow=N, ncol = 1), name="Posteriors"),
  mxAlgebra(-2*sum(log(PL)), name="FF"),
  mxFitFunctionAlgebra(algebra="FF"),
  mxComputeEM(
    estep=mxComputeOnce("Mixture.Posteriors"),                           #E-step: update posteriors
    mstep=mxComputeGradientDescent(fitfunction="Mixture.fitfunction")))  #M-step: optimize free parameters

References

Browne, M. W., & du Toit, S. H. C. (1991). Models for learning data. In L. M. Collins & J. L. Horn (Eds.), Best methods for the analysis of change (pp. 47–68). Washington, DC: American Psychological Association.
Grimm, K. J., Ram, N., & Estabrook, R. (2010). Nonlinear structured growth mixture models in Mplus and OpenMx. Multivariate Behavioral Research, 45(6), 887–909. https://doi.org/10.1080/00273171.2010.531230
Preacher, K. J., & Hancock, G. R. (2015). Meaningful aspects of change as novel random coefficients: A general method for reparameterizing longitudinal models. Psychological Methods, 20(1), 84–101. http://dx.doi.org/10.1037/met0000028

AdminJosh
more clarity

"FIML estimation is not well suited to dealing with latent variables" -- Not sure about this reasoning. How you tried it?

You aren't trying to fit a mixture model. From what I can tell, you have a growth curve model with non-standard basis functions. I'd expect the default optimization strategy to work reasonably well.

If you want to re-frame the optimization as an E-M algorithm then you need to divide the model into two separate phases (E and M). Can you describe how you envision this? What would happen in phase E and phase M?

AdminJosh
results

When I run your model, I get

       name matrix   row   col      Estimate    Std.Error A
1   epsilon      S  t1_1  t1_1      3.134425 6.390361e+01  
2  var_diff      S  diff  diff     -4.420305 4.166044e+01  
3  var_beta      S  beta  beta    345.381668           NA !
4 var_gamma      S gamma gamma    346.345101           NA !
5      diff      M     1  diff      8.317164 3.432677e+01  
6      beta      b     1     1  52604.749831 2.013878e-03 !
7     gamma      g     1     1 137508.614087 7.684322e-04 !

Can you put some box constraints on the parameters so the optimizer doesn't explore crazy solutions?

gamma cannot be greater than 300, yes? beta should be between -200 and 200? Try the attached.
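
For example, something along these lines (a sketch using the parameter labels from your model; the bound values are just the ones suggested above, so adjust as you see fit):

#bounds can also be given directly in the mxPath()/mxMatrix() calls via their lbound/ubound arguments
bounded <- omxSetParameters(model,   labels = 'beta',  lbound = -200, ubound = 200)
bounded <- omxSetParameters(bounded, labels = 'gamma', lbound = 1e-3, ubound = 300)
fit <- mxRun(bounded)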

Attachment: lgc.R
ssciarra
Thank you for that insight!

Thank you for that insight! I didn't know I could set constraints on the parameter estimation process. With respect to the fixed-effect parameters, I would set the following constraints:

  • beta: 0-360
  • gamma: 0-360
  • diff: 0-1 (diff is in standardized effect size units)

For variances and residuals, I suppose nothing below zero.

I did realize, however, that the code in your response (lgc.R) used a different starting value for epsilon (0.9832023 vs 5) than I had initially posted (I am not sure if that was an error in copying the code or intentional, to point out the importance of setting box constraints).

With respect to the model itself, I am indeed trying to implement the EM algorithm in a latent growth curve model with a linearized nonlinear function (I just realized that my mentioning of growth mixture models might have introduced confusion). Although I understand how the EM algorithm works in the common coin-flipping example (e.g., one of two coins, each with its own probability of heads [$h_1$, $h_2$], is selected with a given probability [$q$], and $h_1$, $h_2$, and $q$ must be estimated), I am unsure how to precisely compute the expectation (i.e., a function that lower-bounds the marginal log-likelihood) and then maximize the lower-bounding function.
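
(To make that concrete, here is a minimal sketch of the coin-flip EM in R; the heads counts and starting values are made up purely for illustration.)

heads <- c(5, 9, 8, 4, 7)       #heads observed in each of five trials
flips <- 10                     #flips per trial
h1 <- 0.6; h2 <- 0.5; q <- 0.5  #initial guesses for p(heads | coin 1), p(heads | coin 2), p(coin 1)
 
for (iter in 1:50) {
  #e-step: posterior probability that each trial used coin 1
  lik1 <- q       * dbinom(heads, flips, h1)
  lik2 <- (1 - q) * dbinom(heads, flips, h2)
  w1   <- lik1 / (lik1 + lik2)
 
  #m-step: maximize the expected complete-data log-likelihood
  h1 <- sum(w1 * heads) / sum(w1 * flips)
  h2 <- sum((1 - w1) * heads) / sum((1 - w1) * flips)
  q  <- mean(w1)
}
 
c(h1 = h1, h2 = h2, q = q)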

Given that I am currently putting together a procedure that runs the model in parallel with multiple sets of starting values (using snowfall), I will incorporate the tip on box constraints and see whether my model fitting procedure is robust. With respect to FIML, is there any resource you know of that discusses its ability to deal with latent variables? I could only find articles discussing FIML with missing data.

AdminJosh
FIML

FIML, a flavor of maximum likelihood, is commonly used with latent variable models. Check out "In All Likelihood: Statistical Modelling and Inference Using Likelihood" by Pawitan. I think you won't find any articles because it is not remotely controversial.

> used a different starting value for epsilon (0.9832023 vs 5).

Yeah, I edited that. It's usually a good idea to start with a larger residual variance than you expect. It helps the optimizer not get locked into a small search space too early.

> Given that I am currently integrating the parallel method of running a model with multiple sets of starting values

Yeah, that sounds like a sensible approach. Report back here on your experience!

ssciarra
I tried fitting the nonlinear

I tried fitting the nonlinear latent growth curve model to the data set (data_wide.csv) with 300 sets of starting values (starting_values.csv) and have not been able to find a solution (only status codes of 5 and 6 were obtained). Note that I generated starting values for epsilon and also updated the upper and lower bounds in the parameter estimation process, as per your advice. I am unsure where to go from here, so I figured I would provide an update on what I have done (hopefully this problem can be solved).

Because I generated the data set, I 'know' the population values and, unfortunately, have not been able to obtain remotely close estimates. The population values that were used to generate the data are listed below:

  • diff_fixed = 0.48
  • beta_fixed = 60
  • gamma_fixed= 20
  • diff_rand = $0.30^2$
  • beta_rand = $5^2$
  • gamma_rand= $5^2$
  • epsilon = $0.25^2$

The parameter values were used with the logistic function below to generate an observed score for each individual $i$ at each time point $t$:

$$ y_{it} = \frac{diff_i}{1+e^{\frac{\beta_i - time_t}{\gamma_i}}} + \epsilon_{it}$$
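
(For clarity, here is a minimal sketch of this data-generating process in R; this is illustrative code, not my actual simulation script, and the object names are made up.)

library(MASS)        #mvrnorm()
library(data.table)
 
set.seed(123)
n <- 100
measurement_days <- c(1, 150, 180, 210, 360)
 
#person-specific parameters diff_i, beta_i, gamma_i (all covariances set to zero)
person_params <- mvrnorm(n, mu = c(0.48, 60, 20), Sigma = diag(c(0.30^2, 5^2, 5^2)))
 
#observed score for person i at time t: logistic curve plus residual (sd = 0.25)
scores <- sapply(measurement_days, function(t) {
  mu_it <- person_params[, 1] / (1 + exp((person_params[, 2] - t) / person_params[, 3]))
  mu_it + rnorm(n, sd = 0.25)
})
 
data_wide_sim <- data.table(id = 1:n, scores)
setnames(data_wide_sim, c('id', paste0('t', 1:5, '_', measurement_days)))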

Note that all covariances have been set to zero. Also note that I have nested code within functions to allow a more efficient (and readable) model fitting procedure. The necessary functions are provided in the code chunk below.

#generates names for manifest variables (this is nested within create_logistic_growth_model)
generate_manifest_var_names <- function(data_wide) {
 
  manifest_names <- names(data_wide[ , 2:ncol(data_wide)])
  return(manifest_names)
}
 
#create nonlinear latent growth curve model 
create_logistic_growth_model <- function(data_wide, model_name, starting_values) {
 
  manifest_vars <- generate_manifest_var_names(data_wide = data_wide)
  latent_vars <- c('diff_fixed', 'beta_fixed', 'gamma_fixed')
  measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]), pattern = '[^_]*$'))
 
  model <- mxModel(model = model_name,
                   type = 'RAM', independent = T,
                   mxData(observed = data_wide, type = 'raw'),
 
                   manifestVars = manifest_vars,
                   latentVars = latent_vars,
 
                   #Manifest means
                   manMeans = mxPath(from = 'one', to = manifest_vars, free = FALSE, arrows = 1),
 
                   #Latent variable means (linear parameters). Note that the nonlinear parameters of beta and gamma do not have estimated means
                   mxPath(from = 'one', to = c('diff_fixed'),free = TRUE, arrows = 1, values = starting_values$diff_fixed,
                          labels = 'diff_fixed', lbound = 0, ubound = 2),
 
                   #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance)
                   mxPath(from = manifest_vars,
                          arrows=2, free=TRUE, values = starting_values$epsilon, labels='epsilon', lbound = 1e-3),
 
                   #Latent variable covariances and variances
                   mxPath(from = latent_vars,
                          connect='unique.pairs', arrows=2,
                          #aa(diff_rand), ab(cov_diff_beta), ac(cov_diff_gamma), bb(beta_rand), bc(var_beta_gamma), cc(gamma_rand)
                          free = c(TRUE,
                                   FALSE, FALSE,
                                   TRUE, FALSE, TRUE),
                          values=c(starting_values$diff_rand,
                                   NA,NA,
                                   starting_values$beta_rand, NA, starting_values$gamma_rand),
                          labels=c('diff_rand',
                                   'NA(cov_diff_beta)','NA(cov_diff_gamma)',
                                   'beta_rand', 'NA(var_beta_gamma)', 'gamma_rand'), 
                          lbound = c(1e-3, 
                                     NA, NA, 
                                     1, NA, 1)),
 
                   #Functional constraints
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            values = starting_values$diff_fixed, labels = 'diff_fixed', name = 'd',  lbound = 0, ubound = 2),
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            values = starting_values$beta_fixed, labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360),
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            values = starting_values$gamma_fixed, labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360),
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, values = measurement_days, name = 'time'),
 
                   mxAlgebra(expression = 1/(1 + exp((beta_fixed - time)/gamma_fixed)), name="Dl"),
                   mxAlgebra(expression =  -(diff_fixed*(exp((beta_fixed - time)/gamma_fixed) * 
                                                           (1/gamma_fixed))/(1 + exp((beta_fixed - time)/gamma_fixed))^2), name = 'Bl'),
                   mxAlgebra(expression =  diff_fixed * (exp((beta_fixed - time)/gamma_fixed) * ((beta_fixed - time)/gamma_fixed^2))/
                               (1 + exp((beta_fixed - time)/gamma_fixed))^2, name = 'Gl'),
 
                     #Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives 
                 mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE,
                        labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")),
                 mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,
                        labels =  c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")),
                 mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,
                        labels =  c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]")),
 
                   mxFitFunctionML()
                   )
 
  return(model)
}
 
 
#changes starting values of model created from create_logistic_growth_model
update_starting_values <- function(model_number, model, starting_values){
 
  param_name_order <- names(omxGetParameters(model))
 
    #set model with new starting values  
    model_name <- paste("Iteration", model_number, sep="")
    new_model <- omxSetParameters(model = model, 
                                  name = model_name,
                                  labels = param_name_order, 
                                  values = as.numeric(setcolorder(x = starting_values[model_number, ], 
                                                                  neworder = param_name_order)))
 
    #return model with new starting values  
    return(new_model)
}
 
#iterates update_starting_values to create a parent model where each submodel has different starting values 
assemble_model_starts <- function(model, starting_values) {
 
  parent_model <- mxModel(name = "parent_model")
  sub_models <- lapply(X = 1:nrow(starting_values), FUN = update_starting_values, model = model, starting_values = starting_values)
  merged_model <- mxModel(parent_model, sub_models)
 
  return(merged_model)
}
 
#extract necessary information about fit from each submodel 
extract_fit_info <- function(model){
 
  retval  <- c(
      model$output$fit,
      model$output$status$code,
      model$output$iterations)
      #round(model$expectation$output$weights[1], 4)) #temporarily omitted 
  return(retval)
}

The code below attempts to find a solution by fitting the nonlinear latent growth curve model with 300 sets of starting values (i.e., nrow(starting_values)).

library(OpenMx)
library(readr)      #read_csv()
library(data.table)
library(stringr)
library(snowfall)
library(parallel)   #detectCores()
 
#read in starting values and raw data sets
starting_values <- read_csv('starting_values.csv')
data_wide <- read_csv('data_wide.csv')
 
#create nonlinear latent growth curve model 
latent_growth_model <- create_logistic_growth_model(data_wide = data_wide, 
                                                    model_name = 'lg_nonlinear', starting_values = starting_values[1, ])
 
#run submodel creation and model fitting in parallel mode
sfInit(parallel = TRUE, cpus = detectCores() - 1) #save 1 core to prevent computer from crashing
sfLibrary(OpenMx)
 
#create submodels that each have a different set of starting values 
merged_models <- assemble_model_starts(model = latent_growth_model, starting_values = starting_values)
 
#run all models 
output <- mxRun(merged_models, unsafe = T)
 
resultsFit  <- data.table(t(omxSapply(output$submodels, extract_fit_info)))
names(resultsFit) <- c("fit", "status_code", "num_iterations")
sum(resultsFit$status_code == 0) #0
 
sfStop() #close parallel mode

I also tried the above model fitting procedure with larger sample sizes and a more ideal set of measurement days (e.g., 1, 90, 280, 270, 360) but had no success. I have not included code for these tests, largely because the data in data_wide.csv represent a fairly non-optimal condition in my simulations and I assume that if I can devise a model fitting procedure that succeeds under these conditions, it will also succeed under more ideal conditions. I hope I have not provided a confusing summary of my efforts.

The code below fits the model with an ideal set of starting values (not sure whether this is a useful test; I assume convergence should occur in this context?).

ideal_starting_values <- data.table('beta_fixed' = 67, 
                                    'gamma_fixed' = 15, 
                                    'beta_rand' = 25, 
                                    'gamma_rand' = 20, 
                                    'diff_fixed' = 0.7,
                                    'diff_rand' = 0.6, 
                                    'epsilon' = 0.9)
 
latent_growth_model_ideal <- create_logistic_growth_model(data_wide = data_wide, 
                                                    model_name = 'lg_nonlinear', starting_values = ideal_starting_values)
 
first_run <- mxRun(latent_growth_model_ideal) #status code 6
 
mxRun(first_run) #status code 5 
 
mxTryHard(latent_growth_model_ideal) #All fit attempts resulted in errors
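
(The mxTryHard() call above uses its defaults; its extraTries and scale arguments allow more attempts with more widely perturbed start values, e.g.:)

#values here are arbitrary: more restarts, with larger random perturbations of the start values
mxTryHard(latent_growth_model_ideal, extraTries = 50, scale = 0.5)
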
AdminJosh
Structured latent growth curves for twin data

I asked for some help on the OpenMx mailing list. Mike Neale wrote, "if the derivatives of the non-linear function with respect to its parameters are known, these can be used for the factor loadings. This was the approach McArdle and I used (based on Michael Browne’s work) to specify LGMs that used logistic, exponential and Gompertz curves." See the attached manuscript.

ssciarra
Thank you for helping me out!

Thank you for helping me out! Hmm, I thought the factor loadings in my model were already specified as the derivatives of the three-parameter logistic function with respect to the corresponding parameters? That is, doesn't the code below implement these factor loading specifications?

mxAlgebra(expression = 1/(1 + exp((beta_fixed - time)/gamma_fixed)), name = "Dl"),
mxAlgebra(expression = -(diff_fixed * (exp((beta_fixed - time)/gamma_fixed) * (1/gamma_fixed)) /
                           (1 + exp((beta_fixed - time)/gamma_fixed))^2), name = 'Bl'),
mxAlgebra(expression = diff_fixed * (exp((beta_fixed - time)/gamma_fixed) * ((beta_fixed - time)/gamma_fixed^2)) /
            (1 + exp((beta_fixed - time)/gamma_fixed))^2, name = 'Gl'),
 
#Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives 
mxPath(from = 'diff', to = manifest_vars, arrows = 1, free = FALSE,
       labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")),
mxPath(from = 'beta', to = manifest_vars, arrows = 1, free = FALSE,
       labels = c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")),
mxPath(from = 'gamma', to = manifest_vars, arrows = 1, free = FALSE,
       labels = c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]"))
tbates
Are you fixing the parameters?

Are you fixing the paths going into your manifest variables, or estimating them?

Mike's point is that you can fix the paths going into the manifests across the different times; the task is then to create inputs into the latent variable hub driving those paths, and it's those estimates, the parameters, that are inputs to a function like the Gompertz.
Phone with web interface - may contain typos :-)

ssciarra
The paths going into the

The paths going into the manifest variables are fixed (from my understanding, that is implemented by setting free=FALSE) and my goal is to estimate values only for the logistic function parameters (diff, beta, and gamma, in my case). Looking at the output of any of my models, I have 7 estimated parameters (fixed and random effects for diff, beta, and gamma, plus a residual term epsilon) and, if I am correct, the fixed-effect estimates represent the inputs into the latent variable hub that you are talking about.
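
(A quick way to double-check which parameters are actually free in the model built by create_logistic_growth_model():)

#lists the free parameters; with the specification above this should return the seven labels
#diff_fixed, beta_fixed, gamma_fixed, diff_rand, beta_rand, gamma_rand, and epsilon
omxGetParameters(latent_growth_model)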

ssciarra
Minor edit to the code I

Minor edit to the code I posted: because I renamed the fixed-effect variables, the from arguments change (the _fixed suffix has been added; see the code snippet below).

#Factor loadings; all fixed and, importantly, constrained to equal their partial derivatives 
 mxPath(from = 'diff_fixed', to = manifest_vars, arrows=1, free=FALSE,
    labels = c("Dl[1,1]", "Dl[2,1]", "Dl[3,1]", "Dl[4,1]", "Dl[5,1]")),
mxPath(from='beta_fixed', to = manifest_vars, arrows=1,  free=FALSE,
    labels =  c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]")),
mxPath(from='gamma_fixed', to = manifest_vars, arrows=1,  free=FALSE,
    labels =  c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]"))
ssciarra
Important edit to code:

Important edit to code:

I realized that my definitions of the partial derivatives indexed the incorrect variables (I confused the name and labels arguments). I have made corresponding changes to my code such that the paths going into the manifest variables are now fixed to the (correctly defined) partial derivatives. Here is the relevant code that was changed:

#Functional constraints
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$diff_fixed[1], labels = 'diff_fixed', name = 'd',  lbound = 0, ubound = 2),
 mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$beta_fixed[1], labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360),
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$gamma_fixed[1], labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360),
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, 
    values = measurement_days, name = 'time'),
 
#partial derivatives
mxAlgebra(expression = 1/(1 + exp((b - time)/g)), name="Dl"),
mxAlgebra(expression =  -(d * (exp((b - time)/g) * (1/g))/(1 + exp((b - time)/g))^2), name = 'Bl'),
mxAlgebra(expression =  d * (exp((b - time)/g) * ((b - time)/g^2))/(1 + exp((b - time)/g))^2, name = 'Gl')

I also reran my model fitting procedure (see nonlin_lgc_fitting.R) using 300 sets of starting values but no model converged (see nonlinear_lgc_procedure.R).

tbates
set path label= "algebra[1,1]"

So you want the path values to be controlled by algebras that work out the values (the function derivatives) based on the current function control parameters.

Relevant piece from Mike Neale

> Modeling growth curves such as those that arise from differential equations is a more complex task than is
> usual for structural equation modeling. Effectively, the factor loadings are neither constants nor
> free parameters, but complex functions obtained as the partial derivatives of the growth curve function
> with respect to its free parameters. The expected means are obtained directly from the growth
> curve function itself, but these too may involve non-trivial algebra. In our 2000 paper, Jack and
> I tabulated these derivatives for the Logistic, Gompertz and Exponential growth curve forms
> (Neale & McArdle, 2000). Here, for illustration, I reproduce the equations for the Gompertz curve in Table 14.1.
> Figures 14.4 and 14.5 respectively show path diagrams for the conventional level, slope,
> and quadratic polynomial component LGC model, and a parametric structured curve model. The
> primary difference between the figures is that in the LGC model the factor loadings are fixed to constant
> values, whereas in the PGC model the loadings are complicated algebraic functions of the free parameters
> of the growth curve being fitted.

ssciarra
Correct algebra implementation?

Yes, in OpenMx vernacular, my goal is to fix the loadings from the latent variables using the algebras. If I am correct, the code below (taken from nonlin_lgc_procedure.R) does this:

latent_vars <- c('diff', 'beta', 'gamma')
#Functional constraints
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$diff_fixed[1], labels = 'diff_fixed', name = 'd',  lbound = 0, ubound = 2),
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$beta_fixed[1], labels = 'beta_fixed', name = 'b', lbound = 0, ubound = 360),
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
    values = starting_values$gamma_fixed[1], labels = 'gamma_fixed', name = 'g', lbound = 0, ubound = 360),
mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, 
    values = measurement_days, name = 'time'),
 
#Algebras (corresponding partial derivatives)
mxAlgebra(expression = 1/(1 + exp((b - time)/g)), name="Dl"),
mxAlgebra(expression =  -(d * (exp((b - time)/g) * (1/g))/(1 + exp((b - time)/g))^2), name = 'Bl'),
mxAlgebra(expression =  d * (exp((b - time)/g) * ((b - time)/g^2))/(1 + exp((b - time)/g))^2, name = 'Gl'),
 
#Factor loadings; all fixed and, importantly, constrained to change according to their partial derivatives
mxPath(from = 'diff', to = manifest_vars, arrows=1, free=FALSE, values = 1,
    labels = c('Dl[1,1]', 'Dl[2,1]', 'Dl[3,1]', 'Dl[4,1]', 'Dl[5,1]')), 
mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,
    labels =  c('Bl[1,1]', 'Bl[2,1]', 'Bl[3,1]', 'Bl[4,1]', 'Bl[5,1]')), 
mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,
    labels =  c('Gl[1,1]', 'Gl[2,1]', 'Gl[3,1]', 'Gl[4,1]', 'Gl[5,1]'))

Note that, in my model-creating function (create_logistic_growth_model.R), I have automated the label creation process using sprintf(). Also note that I changed the names of the latent variables to avoid any confusion with the labels used for the functional constraints.
