You are here

Product of Variables Identification

4 posts / 0 new
Last post
jkarch's picture
Offline
Joined: 03/15/2011 - 15:04
Product of Variables Identification
AttachmentSize
PDF icon model (8).pdf56.42 KB

Hey,

my aim is to use product of variables to fit a random-intercept random-slope model with latent mean centering. To be precise, my model is as follows. $i$ indexes person, $t$ indexes observations within a person.

$$Y_{it} = \beta_i (X_{it}-meanX_i) + meanY_i + residualY_{ij}$$
$$X_{it} = meanX_i + residualX_{ij} $$

$$\beta_i \sim N(\mu_\beta, \sigma_\beta^2), \quad meanX_i \sim N(\mu_X, \sigma_X^2), \quad meanY_i \sim N(\mu_Y, \sigma_Y^2)$$

If using a fixed slope, that is, $\beta_i=\beta$, this can be fitted in any SEM programm. The path diagram is as attached. Apologies for the difference in notation. To make $\beta$ (or a in the path diagram notation) random, it needs to switch from a path loading to a latent variable, and the regression prediction is essentially represented by a product node involving the latent variable $a$ and $X_1$ or $X_2$ etc. Mplus seems to fit this model using a Bayesian approach, and there is also a tutorial on how to fit similar models using brms/stan (https://vuorre.com/posts/latent-mean-centering/).

After reading the recent paper by Steve et al. about Products of variables, I tried to fit it in OpenMx to get frequentist estimates. Unfortunately, there seems to be a problem with identification. I get the warning message: In model 'SEM' Optimizer returned a non-zero status code. It seems the model is not identified. Can somebody help with that? Is the model truely not identified or is this an OpenMx problem? I am not sure how to assess indentification with product variables. Did I make a mistake?

Below is my code:

library(OpenMx)
library(dplyr)
library(tidyr)
 
simulateData <- function(n_subjs, n_timepoints) {
  N <- n_subjs * n_timepoints
  the_data <- data.frame(ID = numeric(N), T = numeric(N), X = numeric(N), Y = numeric(N))
  Var_X <- 1
  residual_Y <- 1
  within_coef_dist <- function() rnorm(1, mean = 0.2)
  for (i in 1:n_subjs) {
    rIcep1 <- rnorm(1)
    rIcep2 <- rIcep1 + rnorm(1)
    dev1 <- rnorm(n_timepoints)
    var1 <- rIcep1 + dev1
    dev2 <- within_coef_dist() * dev1 + rnorm(n_timepoints)
    var2 <- rIcep2 + dev2
    the_data[(i - 1) * n_timepoints + (1:n_timepoints), ] <- c(rep(i, n_timepoints), 1:n_timepoints, var1, var2)
  }
  return(the_data)
}
 
nVars <- 6 # 
varNamesX <- paste0("X_", 1:nVars)
varNamesY <- paste0("Y_", 1:nVars)
resXnames <- paste0("res_x", 1:nVars)
resYnames <- paste0("res_y", 1:nVars)
productVars <- paste("coeffcient", resXnames, sep = "_")
 
# Define model
manifests <- c(varNamesX, varNamesY)
latents <- c("InterceptX", "InterceptY", "coefficient", resXnames, resYnames)
model_template <- mxModel(model = "SEM",
                          type="RAM",
                          manifestVars=manifests,
                          latentVars=latents,
 
                          # random intercepts
                          mxPath(from="one", to=c("InterceptX", "InterceptY"), arrows=1, 
                                 free=TRUE, values=1, labels=c("muX","muY")),
                          mxPath(from=c("InterceptX"), to=varNamesX, arrows=1, free=FALSE, values=1),
                          mxPath(from=c("InterceptY"), to=varNamesY, arrows=1, free=FALSE, values=1),
                          mxPath(from = "InterceptX", to = "InterceptY", arrows = 2 , free = TRUE, label = "covXY"),
                          mxPath(from = "InterceptX", to = "InterceptX", arrows = 2 , free = TRUE, label = "varIX"),
                          mxPath(from = "InterceptY", to = "InterceptY", arrows = 2 , free = TRUE, label = "varIY"),
 
                          # residuals
                          mxPath(from=resXnames, to=varNamesX, arrows=1, free=FALSE, values=1),
                          mxPath(from=resYnames, to=varNamesY, arrows=1, free=FALSE, values=1),
                          mxPath(from=resXnames, to=resXnames, arrows=2, free=TRUE, values=0.1, labels="res_x"),
                          mxPath(from=resYnames, to=resYnames, arrows=2, free=TRUE, values=0.1, labels="res_y"),
 
                          #random slope
                          mxPath(from = "coefficient", to = "coefficient", arrows = 2 , free = TRUE, label = "varCoef"),
                          mxPath(from = "one", to = "coefficient", arrows = 1 , free = TRUE, label = "muCoef")
)
 
 
# add product nodes for random slope
model_template <- mxModel(model_template, productVars=productVars)
for(i in 1:nVars){
  model_template <- mxModel(model_template, 
                            mxPath(from=c("coefficient",resXnames[i]), to=productVars[i], arrows=1, free=FALSE, values=1), # connect the latent predictor and moderator to the product node
                            mxPath(from=productVars[i], to=varNamesY[i], arrows=1, free=FALSE, values=1) # connect the product node to the outcome
  )
}
 
test <- simulateData(300, nVars)
wide_data <- test %>%
  pivot_wider(names_from = "T", values_from = c("X", "Y"))
model <- mxModel(model_template, mxData(wide_data, type = "raw"))
fitted <- mxRun(model)
AdminNeale's picture
Offline
Joined: 03/01/2013 - 14:09
COnfirmed

Hi

I can confirm the issue you report. It seems odd that the optimizer seems happy with the solution, albeit with some very large standard errors for varCoef and res_y yet the local identification check fails. We will be looking into this issue, thank you for raising it!

Best,
Mike

sboker's picture
Offline
Joined: 07/28/2009 - 23:00
The problem is that the

The problem is that the residuals always have a mean of zero. Products with multiplicands that have a mean of zero are unidentified in the method we use.

jkarch's picture
Offline
Joined: 03/15/2011 - 15:04
Thanks for clarifying steve!

Thanks for clarifying steve! Are you aware of another method that would allow fitting this model?