Let's try to get the degrees of freedom (df) right in OpenMx; at yesterday's meeting we agreed this was a priority. There are one or two things to consider for raw data input, however, because the statistics are not counted the same way in every software package.
In general, df = nstats - nparameters
Suppose there are i=1...ng groups (submodels) in the model. Let the number of variables in submodel i be m_i. Let the number of nonlinear equality constraints be nk. Then, for only covariance matrix input we have:
nstats(cov)= sum_i {m_i*(m_i + 1)/2} + nk
If means are also input, then the m_i means are added to the m_i*(m_i + 1)/2 covariance statistics:
nstats(cov+mean)= sum_i {m_i*(m_i + 3)/2} + nk
if a correlation matrix was input then
nstats(cor)= sum_i {m_i*(m_i - 1)/2} + nk
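As a quick check of the summary-statistic counts, here is a minimal R sketch. The helper `nstats_summary` and its arguments are made up for illustration and are not part of OpenMx. It counts m(m+1)/2 variances and covariances per submodel, adds m means when means are supplied, and drops the m diagonal elements for correlation input.

```r
# Hypothetical helper (not an OpenMx function): count observed statistics
# for summary-data input, given the number of variables in each submodel.
nstats_summary <- function(m, type = c("cov", "cov+mean", "cor"), nk = 0) {
  type <- match.arg(type)
  per_group <- switch(type,
    "cov"      = m * (m + 1) / 2,       # variances + covariances
    "cov+mean" = m * (m + 1) / 2 + m,   # as above, plus m means
    "cor"      = m * (m - 1) / 2        # off-diagonal correlations only
  )
  sum(per_group) + nk                   # nk = nonlinear equality constraints
}

m <- c(4, 4)                            # two submodels, four variables each
nstats_summary(m, "cov")                # 20
nstats_summary(m, "cov+mean")           # 28
nstats_summary(m, "cor")                # 12
```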
Mx version 1 uses the total number of raw data observations. If m_ij variables are observed for subject j (j=1...n) in submodel i, then
nstats(raw) = sum_i sum_j {m_ij} + nk
This is suitable for both raw continuous and raw ordinal. For certain purposes, we might use nstats(cov+mean) in place of nstats(raw) as some other programs use this. The disadvantage is that it is possible to end up with negative degrees of freedom if there are definition variables and parameters relating the definition variables to the observed variables in the model.
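To make the negative-df risk concrete, here is a small R sketch with toy data. The data values and the six-parameter model are assumptions chosen only for illustration.

```r
# Toy raw data for one submodel: 4 subjects, 2 variables, one missing value.
raw <- data.frame(x = c(1.2, NA, 0.7, 1.9),
                  y = c(3.1, 2.8, 2.9, 3.4))

# nstats(raw): each non-missing observation counts as one statistic.
nstats_raw <- sum(!is.na(raw))               # 7

# nstats(cov+mean): depends only on the number of variables m.
m <- ncol(raw)
nstats_covmean <- m * (m + 1) / 2 + m        # 3 + 2 = 5

# Assumed model: 2 means, 2 variances, 1 covariance, plus 1 regression
# of x on a definition variable, so 6 free parameters.
nparameters <- 6

df_raw     <- nstats_raw - nparameters       # 1
df_covmean <- nstats_covmean - nparameters   # -1
```

With the raw count the df stay positive here, while the cov+mean count goes negative as soon as the definition-variable regression is added.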
For the time being, let's just make OpenMx agree with ShutMx.
Cheers
Mike
#1
Currently, we are giving DOF credit for definition variables. This should not be the case.
#2
Let's assume that some model has the following entries for definition variables: submodel1.data.foo, submodel2.data.foo, and submodel2.data.bar. Do I treat submodel1.data.foo and submodel2.data.foo as one definition variable because they have the same identifier ('foo')? Or are they two definition variables because they are from two different data sources?
#3
I think you do not have to keep checks and balances across submodels, because they would have been included twice in the calculation of the number of statistics. Since they are declared as definition variables twice (once each in submodel1 and submodel2), they should be subtracted twice. However, note that they should only be subtracted once per dataset reference.
The tricky bit of calculating the df in the case of a mixture distribution is that the same data may appear in multiple submodels, but should only be counted once.
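One way to read the "once per dataset reference" rule is sketched below in R. The reference strings follow the submodel.data.column pattern from the question in #2; the repeated fourth entry and the variable names are assumptions added for illustration.

```r
# Definition-variable references, written as submodel.data.column.
# The fourth entry repeats the second and is included on purpose.
refs <- c("submodel1.data.foo", "submodel2.data.foo",
          "submodel2.data.bar", "submodel2.data.foo")

# Count each dataset reference once; the same column name in two
# different submodels' data still counts twice.
defvar_refs <- unique(refs)
length(defvar_refs)   # 3
```

For the mixture case, where several submodels share one dataset, the key would have to identify the underlying dataset rather than the submodel name. This sketch does not attempt that.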
NB there does not seem to be any way to get at the reported fit statistics AIC BIC etc. but this may be my ignorance.
#4
i.e.,
fit = mxRun(model)
f = summary(fit)
f$AIC
[1] -2.615998
#5
To see what fields are available from the output of the summary function, use names(summary(modelOut)). Anything from that list can be accessed using the '$' notation, such as "summary(modelOut)$AIC.Mx".
#6
Second, I am thinking that non-linear inequality constraints do not typically add to the degrees of freedom (just narrow the search space by excluding, e.g., areas where the eigenvalue of a matrix is non-positive). So let's *not* include inequality constraints by default. It may be possible to tell if the inequality constraints are at a bound which might translate to a gain in degrees of freedom.
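A rough R sketch of that default: count only equality constraints towards nk, and separately flag inequality constraints that sit at their boundary. The constraint table, its column names, and the tolerance are all hypothetical; OpenMx does not necessarily expose constraints in this form.

```r
# Hypothetical constraint table; value is g(estimates) for constraints
# written as g = 0 or g <= 0.
constraints <- data.frame(
  name  = c("eq1", "ineq1", "ineq2"),
  type  = c("=", "<=", "<="),
  value = c(0, -0.35, -1e-9),
  stringsAsFactors = FALSE
)
tol <- 1e-6

# Only equality constraints contribute to nk by default.
nk <- sum(constraints$type == "=")                                # 1

# Inequality constraints sitting at their boundary, flagged for review.
active <- constraints$type == "<=" & abs(constraints$value) < tol
constraints$name[active]                                          # "ineq2"
```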
Third, now I think about it, the optimizer should be returning a status vector about the parameters - whether they are free, equal to their upper or lower bound, or perhaps irrelevant as far as the fit function goes. I don't know if we get that vector back in usable form, but it might be an idea to let on to the user that all is not right with the parameters.
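Until the optimizer's own status vector is available, something similar can be approximated from the estimates and bounds. This is only a sketch with made-up parameter names, values, and tolerance, and it cannot detect parameters that are irrelevant to the fit function.

```r
# Hypothetical estimates and bounds; NA means "no bound".
est   <- c(a = 0.50, b = 0.00, c = 2.00)
lower <- c(a = NA,   b = 0.00, c = 0.00)
upper <- c(a = NA,   b = NA,   c = 2.00)
tol   <- 1e-6

# Start with every parameter marked free, then overwrite those that
# sit within tol of a bound.
status <- rep("free", length(est))
names(status) <- names(est)
status[!is.na(lower) & abs(est - lower) < tol] <- "at lower bound"
status[!is.na(upper) & abs(est - upper) < tol] <- "at upper bound"
status
```

Here `a` stays free, `b` is reported at its lower bound, and `c` at its upper bound.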
#8
The second issue is a problem when models are marked independent. They sure run faster, but the summary of a mxModel with independent sub-mxModels seems a bit broken. For example, parameter estimates are not printed, and the number of estimated parameters is incorrect. Here are the summaries. The original files can be found at http://www.vipbg.vcu.edu/~vipbg/OpenMx/OpenMx09.shtml - the script is excerpted from
http://www.vipbg.vcu.edu/~vipbg/OpenMx/MultivariateTwin_MatrixRaw.R and the dataset is at
http://www.vipbg.vcu.edu/~vipbg/OpenMx/myData/iqnl.rec
You may not need these to identify the source of the problem, so I have not attached them.
> multivTwinSatFit2<-multivTwinSatFit
> multivTwinSatFit2$MZ@independent<-T
> multivTwinSatFit2$DZ@independent<-T
> summary(multivTwinSatFitInd<-mxRun(multivTwinSatFit2))
Running multivTwinSat
Running MZ
Running DZ
Observed statistics: 960
Estimated parameters: 0 <---- This is wrong
Degrees of freedom: 960
-2 log likelihood: 7077.284
Saturated -2 log likelihood: NA
numObs: 101
Chi-Square: NA
p: NA
AIC (Mx): 5157.284
BIC (Mx): 1323.384
adjusted BIC:
RMSEA: NA
frontend elapsed time: 6.863703 secs
backend elapsed time: 0.0001099110 secs
openmx version number: 0.2.5-1050
> summary(multivTwinSatFitNonInd<-mxRun(multivTwinSatFit))
Running multivTwinSat
    name matrix       row col       Estimate Std.Error
  1      MZ.CholMZ      1 1        8.0691472 1.0732826
  2      MZ.CholMZ      2 1        5.7976804 2.3414357
  3      MZ.CholMZ      3 1       -3.9505130 3.0501696
  4      MZ.CholMZ      4 1        1.9145079 2.3218853
  5      MZ.CholMZ      5 1        7.6819283 2.1861574
  6      MZ.CholMZ      6 1        9.3474455 3.5320825
  7      MZ.CholMZ      7 1        5.8097208 1.1331387
  8      MZ.CholMZ      8 1        4.5412215 2.4398496
  9      MZ.CholMZ      9 1        4.5284555 3.8634005
 10      MZ.CholMZ     10 1        4.9012715 2.1571319
 11      MZ.CholMZ     11 1        7.4675836 2.4044132
 12      MZ.CholMZ     12 1       11.2614615 4.0899246
 13      MZ.CholMZ      2 2       13.7097494 1.7317102
 14      MZ.CholMZ      3 2       -4.8853796 2.5522299
 15      MZ.CholMZ      4 2       -4.4212417 1.9447796
 16      MZ.CholMZ      5 2        7.6019956 1.8809615
 17      MZ.CholMZ      6 2        2.0239043 2.7295962
 18      MZ.CholMZ      7 2        0.3931511 0.9784640
 19      MZ.CholMZ      8 2       12.1701397 1.9840149
 20      MZ.CholMZ      9 2        0.8342502 2.8679624
 21      MZ.CholMZ     10 2       -1.8377437 1.6353945
 22      MZ.CholMZ     11 2        8.3196549 1.7776910
 23      MZ.CholMZ     12 2        3.7657882 2.9853059
 24      MZ.CholMZ      3 3      -16.7576951 2.0090221
 25      MZ.CholMZ      4 3      -11.3341803 1.5014457
 26      MZ.CholMZ      5 3        0.6473425 1.6684793
 27      MZ.CholMZ      6 3       -6.6516852 3.5255159
 28      MZ.CholMZ      7 3       -0.5828628 1.0057496
 29      MZ.CholMZ      8 3       -1.7110786 1.6196522
 30      MZ.CholMZ      9 3      -15.4516763 2.9388766
 31      MZ.CholMZ     10 3       -7.3310101 1.6820049
 32      MZ.CholMZ     11 3       -3.8955226 1.6548237
 33      MZ.CholMZ     12 3       -8.8825913 4.0998821
 34      MZ.CholMZ      4 4        4.3672534 0.6644559
 35      MZ.CholMZ      5 4        2.4726786 1.6212282
 36      MZ.CholMZ      6 4        4.3141255 1.9502488
 37      MZ.CholMZ      7 4        1.0381515 0.9271846
 38      MZ.CholMZ      8 4        1.8804550 1.4832500
 39      MZ.CholMZ      9 4        4.3789660 2.1312612
 40      MZ.CholMZ     10 4        1.0520785 1.3795140
 41      MZ.CholMZ     11 4       -2.7135089 1.4123113
 42      MZ.CholMZ     12 4        1.9183231 2.1026609
 43      MZ.CholMZ      5 5        7.7050120 1.2252174
 44      MZ.CholMZ      6 5        3.1157616 2.6463380
 45      MZ.CholMZ      7 5       -0.7091913 1.0098279
 46      MZ.CholMZ      8 5       -0.8524264 1.5642942
 47      MZ.CholMZ      9 5       -5.2060968 2.7124875
 48      MZ.CholMZ     10 5       -0.3159779 1.7045813
 49      MZ.CholMZ     11 5        2.0654059 1.5173967
 50      MZ.CholMZ     12 5        6.1075191 2.8573487
 51      MZ.CholMZ      6 6       17.1989909 1.9146110
 52      MZ.CholMZ      7 6       -0.7795847 0.8800432
 53      MZ.CholMZ      8 6        0.1614940 1.4568054
 54      MZ.CholMZ      9 6       -7.7386199 2.0894156
 55      MZ.CholMZ     10 6       -3.5022475 1.4173561
 56      MZ.CholMZ     11 6       -0.7767760 1.3909081
 57      MZ.CholMZ     12 6       16.5611240 2.1885774
 58      MZ.CholMZ      7 7        4.3681341 0.6648710
 59      MZ.CholMZ      8 7        3.3580002 1.4040993
 60      MZ.CholMZ      9 7        1.0543458 1.6431174
 61      MZ.CholMZ     10 7       -1.0700938 1.1851657
 62      MZ.CholMZ     11 7        2.4592443 1.2936795
 63      MZ.CholMZ     12 7        4.9634671 1.3061237
 64      MZ.CholMZ      8 8        6.1002963 1.0195596
 65      MZ.CholMZ      9 8        3.3339129 1.7500792
 66      MZ.CholMZ     10 8        2.5644042 1.2164417
 67      MZ.CholMZ     11 8       -0.6913176 1.3360432
 68      MZ.CholMZ     12 8       -2.1227871 1.1660956
 69      MZ.CholMZ      9 9       10.8283653 1.4189795
 70      MZ.CholMZ     10 9        6.3228518 1.0834757
 71      MZ.CholMZ     11 9       -2.3232227 1.3132448
 72      MZ.CholMZ     12 9       -0.9263210 1.1199464
 73      MZ.CholMZ     10 10       3.3021788 0.5525994
 74      MZ.CholMZ     11 10       3.5193018 1.2519846
 75      MZ.CholMZ     12 10       0.5066331 1.0833072
 76      MZ.CholMZ     11 11       4.6429947 0.7592430
 77      MZ.CholMZ     12 11       1.0642005 1.1050532
 78      MZ.CholMZ     12 12       4.3826715 0.7266716
 79      MZ.ExpMeanMZ   1 t1var1  88.8556939 1.4484863
 80      MZ.ExpMeanMZ   1 t1var2  62.9124972 2.2886702
 81      MZ.ExpMeanMZ   1 t1var3  82.6115042 2.2348946
 82      MZ.ExpMeanMZ   1 t1var4  87.4944152 1.8229106
 83      MZ.ExpMeanMZ   1 t1var5  66.0304754 2.4201326
 84      MZ.ExpMeanMZ   1 t1var6  83.4273270 3.9607192
 85      MZ.ExpMeanMZ   1 t2var1  87.9068725 1.2598299
 86      MZ.ExpMeanMZ   1 t2var2  63.6904471 2.3676151
 87      MZ.ExpMeanMZ   1 t2var3  81.2466729 3.8015470
 88      MZ.ExpMeanMZ   1 t2var4  89.5202345 2.1056613
 89      MZ.ExpMeanMZ   1 t2var5  68.1122751 2.4356428
 90      MZ.ExpMeanMZ   1 t2var6  79.8039877 4.4232448
 91      DZ.CholDZ      1 1        7.0114591 0.6376685
 92      DZ.CholDZ      2 1        1.0435618 1.7026270
 93      DZ.CholDZ      3 1        3.8143673 1.6935397
 94      DZ.CholDZ      4 1        2.2550688 1.1390540
 95      DZ.CholDZ      5 1        3.1858991 1.2942147
 96      DZ.CholDZ      6 1        7.3983766 1.8602587
 97      DZ.CholDZ      7 1        1.3373428 1.2943831
 98      DZ.CholDZ      8 1        1.8699949 1.5499771
 99      DZ.CholDZ      9 1        1.4911779 2.3696022
100      DZ.CholDZ     10 1        0.8631757 1.4598629
101      DZ.CholDZ     11 1        4.1052428 1.3541185
102      DZ.CholDZ     12 1       -0.7849423 2.8397895
103      DZ.CholDZ      2 2       14.1162370 1.2828371
104      DZ.CholDZ      3 2        3.1816734 1.8285801
105      DZ.CholDZ      4 2        1.2002686 1.2337240
106      DZ.CholDZ      5 2        6.3535829 1.1419839
107      DZ.CholDZ      6 2        2.2416786 1.9988526
108      DZ.CholDZ      7 2        0.6901015 1.3585201
109      DZ.CholDZ      8 2        2.7993095 1.5503701
110      DZ.CholDZ      9 2       -0.1909845 2.1891052
111      DZ.CholDZ     10 2        0.4830136 1.4386612
112      DZ.CholDZ     11 2        1.7781879 1.3448281
113      DZ.CholDZ     12 2       -0.1945152 3.0019408
114      DZ.CholDZ      3 3       16.7410965 1.3354086
115      DZ.CholDZ      4 3        6.4582793 1.0634667
116      DZ.CholDZ      5 3       -2.7122551 1.0236711
117      DZ.CholDZ      6 3        6.4268824 1.5502505
118      DZ.CholDZ      7 3       -0.3092245 1.2752244
119      DZ.CholDZ      8 3        1.6834900 1.3228721
120      DZ.CholDZ      9 3        1.0536587 1.8321951
121      DZ.CholDZ     10 3        1.2046250 1.2277874
122      DZ.CholDZ     11 3       -0.0278694 1.1184718
123      DZ.CholDZ     12 3        3.4594942 1.9566617
124      DZ.CholDZ      4 4        7.8784233 0.7285716
125      DZ.CholDZ      5 4        0.2073750 1.0223907
126      DZ.CholDZ      6 4        4.6327224 1.6541284
127      DZ.CholDZ      7 4       -1.2652937 1.3652870
128      DZ.CholDZ      8 4       -2.0130171 1.4919314
129      DZ.CholDZ      9 4       -0.8223750 2.5861212
130      DZ.CholDZ     10 4       -0.6911486 1.5839382
131      DZ.CholDZ     11 4       -1.6218766 1.3321906
132      DZ.CholDZ     12 4       -1.5447221 2.7464785
133      DZ.CholDZ      5 5        7.4738408 0.7019570
134      DZ.CholDZ      6 5        3.4039223 1.5502326
135      DZ.CholDZ      7 5       -1.1007670 1.2565001
136      DZ.CholDZ      8 5        1.2290576 1.3280689
137      DZ.CholDZ      9 5       -0.7773175 1.8248106
138      DZ.CholDZ     10 5       -0.3205831 1.2231729
139      DZ.CholDZ     11 5        3.0316184 1.1436088
140      DZ.CholDZ     12 5        3.4933303 2.0566451
141      DZ.CholDZ      6 6       15.6626700 1.4494960
142      DZ.CholDZ      7 6        1.9498420 1.3758274
143      DZ.CholDZ      8 6        2.3498153 1.4311883
144      DZ.CholDZ      9 6        4.4765624 2.1896492
145      DZ.CholDZ     10 6        0.9680028 1.3912950
146      DZ.CholDZ     11 6        0.7862629 1.2333278
147      DZ.CholDZ     12 6       10.7693540 2.5932207
148      DZ.CholDZ      7 7        8.2721798 0.7984110
149      DZ.CholDZ      8 7        0.0597498 1.2610928
150      DZ.CholDZ      9 7        0.4538864 1.4578153
151      DZ.CholDZ     10 7        0.4712694 1.0798613
152      DZ.CholDZ     11 7        1.8290381 1.0234228
153      DZ.CholDZ     12 7        4.7366035 1.6681049
154      DZ.CholDZ      8 8        9.8538518 0.9688942
155      DZ.CholDZ      9 8        6.6522613 1.5444225
156      DZ.CholDZ     10 8        1.1450187 1.1089281
157      DZ.CholDZ     11 8        3.7097223 1.0193224
158      DZ.CholDZ     12 8        7.3142162 1.6144036
159      DZ.CholDZ      9 9       16.4045642 1.3787220
160      DZ.CholDZ     10 9        8.0658829 1.0776572
161      DZ.CholDZ     11 9        0.7821834 0.9569707
162      DZ.CholDZ     12 9        2.8252316 1.7910543
163      DZ.CholDZ     10 10       6.6394925 0.6351437
164      DZ.CholDZ     11 10      -1.5154869 0.9229526
165      DZ.CholDZ     12 10       2.3055531 1.4654856
166      DZ.CholDZ     11 11       6.5972798 0.6456054
167      DZ.CholDZ     12 11       4.7322642 1.7842350
168      DZ.CholDZ     12 12      16.9135068 1.4430224
169      DZ.ExpMeanDZ   1 t1var1  85.3823889 0.8749562
170      DZ.ExpMeanDZ   1 t1var2  65.8991832 1.4248946
171      DZ.ExpMeanDZ   1 t1var3  81.9011386 1.5776110
172      DZ.ExpMeanDZ   1 t1var4  89.2408917 1.1229378
173      DZ.ExpMeanDZ   1 t1var5  72.2635408 1.1677584
174      DZ.ExpMeanDZ   1 t1var6  81.7326824 1.5452943
175      DZ.ExpMeanDZ   1 t2var1  87.3305952 1.2547202
176      DZ.ExpMeanDZ   1 t2var2  65.8939063 1.2765653
177      DZ.ExpMeanDZ   1 t2var3  80.3596417 1.8712746
178      DZ.ExpMeanDZ   1 t2var4  88.5088371 1.2487792
179      DZ.ExpMeanDZ   1 t2var5  68.8412330 1.1728683
180      DZ.ExpMeanDZ   1 t2var6  76.2659867 1.8779957
Observed statistics: 960
Estimated parameters: 180
Degrees of freedom: 780
-2 log likelihood: 7077.284
Saturated -2 log likelihood: NA
numObs: 101
Chi-Square: NA
p: NA
AIC (Mx): 5517.284
BIC (Mx): 1738.745
adjusted BIC:
RMSEA: NA
frontend elapsed time: 0.5050371 secs
backend elapsed time: 37.11894 secs
openmx version number: 0.2.5-1050
>
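The two summaries differ only through the parameter count, which feeds into df and from there into AIC and BIC. The R sketch below reproduces the reported figures. The formulas AIC = -2lnL - 2*df and BIC = 0.5*(-2lnL - df*ln(numObs)) are inferred from the numbers in the output, not taken from the OpenMx source, and `fit_stats` is a made-up helper.

```r
m2ll   <- 7077.284   # -2 log likelihood, identical in both runs
nstats <- 960        # observed statistics
numObs <- 101

# Hypothetical helper: df and the Mx-style fit indices for a given
# number of estimated parameters.
fit_stats <- function(nparameters) {
  df <- nstats - nparameters
  c(df  = df,
    AIC = m2ll - 2 * df,
    BIC = 0.5 * (m2ll - df * log(numObs)))
}

round(fit_stats(180), 3)   # non-independent run: 780, 5517.284, 1738.745
round(fit_stats(0), 3)     # independent run, parameters miscounted as 0:
                           # 960, 5157.284, 1323.384
```

So the AIC and BIC in the independent run are off purely because the 180 parameters were not counted.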
#9
There may be more corrections to the summary function when using independent submodels. And I need to adjust the summary due to constraints.