I've been trying to determine the time ordering of events for some biomedical data. Suppose I have two variables, biomarker 1 and biomarker 2 (B1 and B2 in the code below), each measured at three timepoints, for a total of six variables. I have one model where biomarker 1 causes biomarker 2. I write this model in OpenMx as follows:
library(OpenMx)
mat = data.frame(matrix(c(
"B1_1", "B1_2", 1, .2,
"B1_2", "B1_3", 1, .2, #### biomarkers 1 cause themselves across time
"B2_1", "B2_2", 1, .2,
"B2_2", "B2_3", 1, .2, #### biomarkers 2 cause themselves across time
"B1_1", "B2_2", 1, .2,
"B1_2", "B2_3", 1, .2 #### biomarker 1 causes biomarker 2 across time
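), ncol=4, byrow=TRUE), stringsAsFactors=FALSE)
#### label each path "<from>to<to>"; using column 3 (arrows) here would give two paths the same label and constrain them to be equal
paths = mxPath(from=mat[,1], to=mat[,2], arrows=as.numeric(mat[,3]), values=as.numeric(mat[,4]), labels=paste(mat[,1], "to", mat[,2], sep=""), free=TRUE)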
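To be concrete about what I mean by the opposite model, here is a minimal sketch of how its paths might be specified. The names `mat_rev`, `labels_rev`, and `paths_rev` are made up for illustration, and the `mxPath` call is guarded so the snippet still runs where OpenMx is not installed.

```r
# Reverse model: biomarker 2 at one timepoint predicts biomarker 1 at the next.
# mat_rev, labels_rev and paths_rev are illustrative names only.
mat_rev = data.frame(matrix(c(
"B1_1", "B1_2", 1, .2,
"B1_2", "B1_3", 1, .2, #### biomarkers 1 cause themselves across time
"B2_1", "B2_2", 1, .2,
"B2_2", "B2_3", 1, .2, #### biomarkers 2 cause themselves across time
"B2_1", "B1_2", 1, .2,
"B2_2", "B1_3", 1, .2 #### biomarker 2 causes biomarker 1 across time
), ncol=4, byrow=TRUE), stringsAsFactors=FALSE)

# One distinct label per path, built from the "from" and "to" columns.
labels_rev = paste(mat_rev[,1], "to", mat_rev[,2], sep="")

# Only build the OpenMx path object if the package is available.
if (requireNamespace("OpenMx", quietly=TRUE)) {
  paths_rev = OpenMx::mxPath(from=mat_rev[,1], to=mat_rev[,2],
                             arrows=as.numeric(mat_rev[,3]),
                             values=as.numeric(mat_rev[,4]),
                             labels=labels_rev, free=TRUE)
}
```

Only the last two rows differ from the first model, so both models have the same number of free paths.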
Then I write the opposite model (where biomarker 2 causes biomarker 1). When I fit both models, the RMSEA is quite good for one and poor for the other. However, when I look at the residual matrices, neither model looks very good, in particular for the correlations between B2_1, B2_2, and B2_3/B1_1, B1_2, and B1_3.
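By "residual matrix" I mean the observed correlations minus the model-implied ones. A rough sketch of that computation in base R is below; `residual_cor` is a hypothetical helper, and the two 2x2 matrices are toy values chosen only to show the arithmetic, not my actual data. With a fitted OpenMx model, the implied covariance can typically be pulled with something like `mxGetExpected(fit, "covariance")`.

```r
# Residual correlations: observed minus model-implied, both rescaled to correlations.
# residual_cor is a hypothetical helper name, not an OpenMx function.
residual_cor = function(observed_cov, implied_cov) {
  stopifnot(identical(dim(observed_cov), dim(implied_cov)))
  round(cov2cor(observed_cov) - cov2cor(implied_cov), 3)
}

# Toy matrices for illustration only (not real results).
vars = c("B1_1", "B1_2")
observed = matrix(c(1, .5, .5, 1), 2, 2, dimnames=list(vars, vars))
implied  = matrix(c(1, .3, .3, 1), 2, 2, dimnames=list(vars, vars))

res = residual_cor(observed, implied)
print(res)
```

Entries far from zero mark the pairs of variables whose association the model reproduces poorly.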
So here's my question: If I'm only interested in determining which causal relationship is better supported by the data, should I even care that some aspects of the correlation matrix (the ones unimportant to my substantive question) are not well modeled? Am I still safe in saying that one causal structure is more likely than the other?