Structured latent growth curve with definition variables

Mon, 11/14/2022 - 14:03

ssciarra

Joined: 01/08/2021 - 12:50

Attachment	Size
data.csv	43.75 KB

I am trying to fit a structured latent growth curve model where the actual days of observation are inserted into the latent variable loadings using definition variables. Unfortunately, I run into the following error:

 
Error in value[rows[[i]], cols[[i]]] <- startValue : 
incorrect number of subscripts on matrix

The model I am trying to fit is provided below (with the data file attached in the post). Note that I am not estimating covariances and only estimating fixed- and random-effect parameters along with a residual parameter (nine parameters in total).

 
library(OpenMx)
library(readr)    #read_csv()
library(stringr)  #str_detect(), str_extract()
 
data_wide <- read_csv(file = 'data.csv')
 
#variable names 
manifest_vars <- names(data_wide)[str_detect(string = names(data_wide), pattern = 'obs_score')]
time_vars <- names(data_wide)[str_detect(string = names(data_wide), pattern = 'actual')]
latent_vars <- c('theta', 'alpha', 'beta', 'gamma')
measurement_days <- as.numeric(str_extract(string = names(data_wide[ 2:ncol(data_wide)]), pattern = '[^_]*$'))[1:length(time_vars)]
 
model <- mxModel(model = 'Definition model',
                   type = 'RAM', independent = T,
                   mxData(observed = data_wide, type = 'raw'),
                 manifestVars = manifest_vars, 
                 latentVars = latent_vars,
 
 
 
                   #Residual variances; by using one label, they are assumed to all be equal (homogeneity of variance)
                   mxPath(from = manifest_vars,
                          arrows=2, free=TRUE,  labels='epsilon', values = 0.04, lbound = 0),
 
                   #Latent variable covariances and variances
                   mxPath(from = latent_vars,
                          connect='unique.pairs', arrows=2,
                          #Order of unique pairs: theta-theta, theta-alpha, theta-beta, theta-gamma, alpha-alpha, alpha-beta, alpha-gamma, beta-beta, beta-gamma, gamma-gamma
                          free = c(TRUE,FALSE, FALSE, FALSE, 
                                   TRUE, FALSE, FALSE, 
                                   TRUE, FALSE, 
                                   TRUE), 
                          values=c(0.04, NA, NA, NA, 
                                   0.07, NA, NA, 
                                   15, NA,
                                   5.6),
                          labels=c('theta_rand', 'NA(cov_theta_alpha)', 'NA(cov_theta_beta)', 'NA(cov_theta_gamma)',
                                   'alpha_rand','NA(cov_alpha_beta)', 'NA(cov_alpha_gamma)', 
                                   'beta_rand', 'NA(cov_beta_gamma)', 
                                   'gamma_rand'), 
                          lbound = c(1e-3, NA, NA, NA, 
                                     1e-3, NA, NA, 
                                     1, NA,
                                     1), 
                          ubound = c(2, NA, NA, NA, 
                                     2, NA, NA, 
                                     90^2, NA, 45^2)),
 
                   #Latent variable means (linear parameters). Note that the nonlinear parameters of beta and gamma do not have estimated means
                   mxPath(from = 'one', to = c('theta', 'alpha'), free = c(TRUE, TRUE), arrows = 1,
                          labels = c('theta_fixed', 'alpha_fixed'), lbound = 0, ubound = 7, 
                          values = c(2.9, 3.3)),
 
                   #Functional constraints
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            labels = 'theta_fixed', name = 't',  lbound = 0,  ubound = 7, values = 2.9),
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            labels = 'alpha_fixed', name = 'a',  lbound = 0,  ubound = 7, values = 3.3), 
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            labels = 'beta_fixed', name = 'b', lbound = 1, ubound = 360, values = 171), 
                   mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = TRUE, 
                            labels = 'gamma_fixed', name = 'g', lbound = 1, ubound = 360,  values = 15), 
 
                  ##Specifies time matrix that assumes time-structured data (model runs with this time matrix)
                  #mxMatrix(type = 'Full', nrow = length(time_vars), ncol = 1, free = FALSE, 
                  #           values = measurement_days[1:7], 
                  #           name = 'time'),
 
                 #Specifies time matrix with definition variables that can account for varying measurements points (i.e., time-unstructured data)
                 ##Error occurs with this time matrix 
                  mxMatrix(type = 'Full', nrow = length(manifest_vars), ncol = 1, free = FALSE, 
                           labels = c("data.actual_measurement_day_0", "data.actual_measurement_day_60", "data.actual_measurement_day_120", 
                                      "data.actual_measurement_day_180", "data.actual_measurement_day_240", "data.actual_measurement_day_300",
                                      "data.actual_measurement_day_360"), name = 'time'),
 
                   #Algebra specifying first partial derivatives; 
                   mxAlgebra(expression = 1 - 1/(1 + exp((b - time)/g)), name="Tl"),
                   mxAlgebra(expression = 1/(1 + exp((b - time)/g)), name = 'Al'), 
                   mxAlgebra(expression = -((a - t) * (exp((b - time)/g) * (1/g))/(1 + exp((b - time)/g))^2), name = 'Bl'),
                   mxAlgebra(expression =  (a - t) * (exp((b - time)/g) * ((b - time)/g^2))/(1 + exp((b -time)/g))^2, name = 'Gl'),
 
                   #Factor loadings; all fixed and, importantly, constrained to change according to their partial derivatives (i.e., nonlinear functions) 
                   mxPath(from = 'theta', to = manifest_vars, arrows=1, free=FALSE,  
                          labels = c("Tl[1,1]", "Tl[2,1]", "Tl[3,1]", "Tl[4,1]", "Tl[5,1]", "Tl[6,1]", "Tl[7,1]")),
                   mxPath(from = 'alpha', to = manifest_vars, arrows=1, free=FALSE,  
                          labels = c("Al[1,1]", "Al[2,1]", "Al[3,1]", "Al[4,1]", "Al[5,1]", "Al[6,1]", "Al[7,1]")),
                   mxPath(from='beta', to = manifest_vars, arrows=1,  free=FALSE,
                         labels = c("Bl[1,1]", "Bl[2,1]", "Bl[3,1]", "Bl[4,1]", "Bl[5,1]", "Bl[6,1]", "Bl[7,1]")),
                   mxPath(from='gamma', to = manifest_vars, arrows=1,  free=FALSE,
                           labels = c("Gl[1,1]", "Gl[2,1]", "Gl[3,1]", "Gl[4,1]", "Gl[5,1]", "Gl[6,1]", "Gl[7,1]")),
 
                   mxFitFunctionML(vector = FALSE)
)
 
model_results <- mxRun(model) #only get error when fitting time matrix specified with definition variables
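
Read back from the code, the four loading algebras match the first partial derivatives of a four-parameter logistic curve. This is inferred from the algebras rather than stated in the post. Here $d$ stands for a person's measurement day (the `time` matrix), and $\theta, \alpha, \beta, \gamma$ are the fixed-effect values held in the matrices `t`, `a`, `b`, and `g`:

```latex
\begin{aligned}
f(d) &= \theta + \frac{\alpha - \theta}{1 + e^{(\beta - d)/\gamma}} \\[4pt]
\frac{\partial f}{\partial \theta} &= 1 - \frac{1}{1 + e^{(\beta - d)/\gamma}} && \text{(Tl)} \\[4pt]
\frac{\partial f}{\partial \alpha} &= \frac{1}{1 + e^{(\beta - d)/\gamma}} && \text{(Al)} \\[4pt]
\frac{\partial f}{\partial \beta} &= -\frac{(\alpha - \theta)\, e^{(\beta - d)/\gamma}}{\gamma \left(1 + e^{(\beta - d)/\gamma}\right)^{2}} && \text{(Bl)} \\[4pt]
\frac{\partial f}{\partial \gamma} &= \frac{(\alpha - \theta)\,(\beta - d)\, e^{(\beta - d)/\gamma}}{\gamma^{2} \left(1 + e^{(\beta - d)/\gamma}\right)^{2}} && \text{(Gl)}
\end{aligned}
```

Because the curve is linear in $\theta$ and $\alpha$, the mean paths on those two latent variables reproduce it exactly: $\theta \cdot \mathrm{Tl} + \alpha \cdot \mathrm{Al} = f(d)$.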

Mon, 11/14/2022 - 16:58

Ryne

Joined: 07/31/2009 - 15:12

I was unable to replicate your error: the model ran fine! I've pasted the results from running your model below.

-Ryne

> model_results <- mxRun(model) #only get error when fitting time matrix specified with definition variables
Running Definition model with 9 parameters
 
> summary(model_results)
Summary of Definition model 
 
free parameters:
         name matrix         row         col     Estimate    Std.Error A lbound ubound
1     epsilon      S obs_score_0 obs_score_0 2.506122e-03 1.185207e-04 !     0!       
2  theta_rand      S       theta       theta 2.603298e-03 3.306708e-04 ! 0.001!      2
3  alpha_rand      S       alpha       alpha 2.812481e-03 3.443755e-04 ! 0.001!      2
4   beta_rand      S        beta        beta 1.160674e+02 3.253442e+01        1   8100
5  gamma_rand      S       gamma       gamma 1.000000e+00 1.309539e+01 !     1!   2025
6 theta_fixed      M           1       theta 3.000642e+00 4.141833e-03        0      7
7 alpha_fixed      M           1       alpha 3.318915e+00 4.123807e-03        0      7
8  beta_fixed      b           1           1 1.797509e+02 1.229329e+00        1    360
9 gamma_fixed      g           1           1 1.996665e+01 1.117314e+00        1    360
 
Model Statistics: 
               |  Parameters  |  Degrees of Freedom  |  Fit (-2lnL units)
       Model:              9                   1566             -4193.892
   Saturated:             35                   1540                    NA
Independence:             14                   1561                    NA
Number of observations/statistics: 225/1575
 
Information Criteria: 
      |  df Penalty  |  Parameters Penalty  |  Sample-Size Adjusted
AIC:      -7325.892              -4175.892                -4175.054
BIC:     -12675.505              -4145.147                -4173.670
To get additional fit indices, see help(mxRefModels)
timestamp: 2022-11-14 14:54:28 
Wall clock time: 4.132559 secs 
optimizer:  SLSQP 
OpenMx version number: 2.20.6 
Need help?  See help(mxSummary)

Mon, 11/14/2022 - 17:03

mhunter

Joined: 07/31/2009 - 15:26

Fixed

Hi!

There was a bug in OpenMx around using tibbles with definition variables.

This forum post had the same issue: https://openmx.ssri.psu.edu/comment/9559#comment-9559

The temporary fix on your end is to hand OpenMx a data.frame instead of a tibble. For example, use

mxData(observed = as.data.frame(data_wide), type = 'raw'),
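
A minimal sketch of that workaround, runnable without the attached file. The small table is a made-up stand-in for data_wide, and its class vector is set by hand to mimic what read_csv() returns; both are assumptions for illustration, chosen so the example does not need the tibble package.

```r
# Stand-in for data_wide (hypothetical values); the class vector mimics a tibble.
data_wide <- data.frame(obs_score_0 = c(3.01, 2.98),
                        actual_measurement_day_0 = c(0, 2))
class(data_wide) <- c("tbl_df", "tbl", "data.frame")

# Coerce to a plain data.frame before handing the data to mxData().
data_plain <- as.data.frame(data_wide)

# Guard: stop early if any tibble classes survive the conversion.
stopifnot(identical(class(data_plain), "data.frame"))
```

The converted object can then be passed as mxData(observed = data_plain, type = 'raw').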

The permanent fix is already in the OpenMx development version on GitHub (see this issue: https://github.com/OpenMx/OpenMx/issues/345) and will be part of the next release of OpenMx.

Cheers!

Mon, 11/14/2022 - 18:39

(Reply to #3) #4

ssciarra

Joined: 01/08/2021 - 12:50

Is convergence more difficult with definition variables?

Thank you very much! I forgot about the data.frame requirement. I just have one more question: Is convergence more difficult to obtain with definition variables? My understanding is that, because definition variables use the actual time at which each measurement was recorded for each person, an individual model has to be fit for each person. Thus, although definition variables allow a model to account for individually varying measurement times, convergence becomes more difficult because the optimization problem of fitting a model to each person's data is inherently more complex.

I'm asking because I have repeatedly generated data sets and noticed that a convergence code of 0 can only be obtained by using mxTryHard(), and that convergence time has also increased considerably (from almost instant without definition variables to roughly 8 seconds with them).

Mon, 11/14/2022 - 20:27

mhunter

Joined: 07/31/2009 - 15:26

Convergence

> Is convergence more difficult to obtain with definition variables?

In general, no. Models with definition variables do not fit a separate model for each person. Rather, these models allow the expected means and covariances to differ in some systematic way for every person. In the case of a growth model with definition variables on the loadings that allows different times of measurement across people, every person has the same model with the same parameters; however, the expected means and covariances differ for each person depending on their particular times of measurement. If the model is "true" and the times of measurement differ across people, then not accounting for this difference amounts to model misspecification. In this case of misspecification, the model should fit more poorly, converge less frequently, and have biased estimates (particularly of the residual error variance).
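
To make this concrete, here is a minimal base-R sketch of how one shared parameter set yields person-specific expected means and covariances. The parameter values are rounded from the summary output earlier in the thread. The two sets of measurement days are invented for illustration, and the helper names (logistic_loadings, expected_moments) are hypothetical, not OpenMx functions.

```r
# Parameter values rounded from the summary output earlier in the thread.
fixed    <- c(theta = 3.00, alpha = 3.32, beta = 179.75, gamma = 19.97)
rand_var <- c(theta = 0.0026, alpha = 0.0028, beta = 116.07, gamma = 1.00)
epsilon  <- 0.0025

# Loadings: the same expressions as the Tl, Al, Bl, and Gl algebras.
logistic_loadings <- function(days, p) {
  e <- exp((p[["beta"]] - days) / p[["gamma"]])
  d <- p[["alpha"]] - p[["theta"]]
  cbind(theta = 1 - 1 / (1 + e),
        alpha = 1 / (1 + e),
        beta  = -d * e / (p[["gamma"]] * (1 + e)^2),
        gamma = d * e * (p[["beta"]] - days) / (p[["gamma"]]^2 * (1 + e)^2))
}

# Model-implied mean vector and covariance matrix for one person's days.
expected_moments <- function(days, p, v, eps) {
  L <- logistic_loadings(days, p)
  # Only theta and alpha have mean paths; beta and gamma have latent mean 0.
  latent_means <- c(p[["theta"]], p[["alpha"]], 0, 0)
  list(mean = drop(L %*% latent_means),
       cov  = L %*% diag(v) %*% t(L) + diag(eps, length(days)))
}

# Two hypothetical people measured on slightly different days.
days_a <- c(0, 60, 120, 180, 240, 300, 360)
days_b <- c(3, 58, 125, 176, 247, 301, 355)
person_a <- expected_moments(days_a, fixed, rand_var, epsilon)
person_b <- expected_moments(days_b, fixed, rand_var, epsilon)

# Same parameters, different expected means because the days differ.
person_a$mean - person_b$mean
```

Both people share the nine parameters; only the loading matrix changes with the measurement days, which is why the likelihood has to be evaluated row by row.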

As far as the problems you're encountering, I have two thoughts. First, models with definition variables are generally slower to estimate. Allowing the means and covariances to differ across people is more computationally expensive, so it takes a bit longer. What you reported is not unusual. Second, for the model convergence, there are a lot of possible factors, but if it's solved by mxTryHard() then that's a sufficient solution without going into detail on all the possibilities.

Wed, 11/16/2022 - 17:10

(Reply to #5) #6

ssciarra

Joined: 01/08/2021 - 12:50

Additional resources?

Ok. Thank you very much for that answer! If I wanted to dig a bit deeper into the parameter estimation procedure with definition variables, are there any resources that you might suggest?

Fri, 12/09/2022 - 14:35

mhunter

Joined: 07/31/2009 - 15:26

User's Guide

These two chapters from the user's guide go into some detail:

Definition Means Path Spec
Definition Means Matrix Spec
