Improve performance of R front end of OpenMx
Posted on

Forums
So I have this model and it takes 800 seconds to compile in the front end. Yes, it has some 81 submodels in it, but still... Classic Mx had finished running the script in less time whereas Open is still going after 9hrs (preliminary results only, it is possible that model is mis-specified).
This thread caught my eye: http://www.r-bloggers.com/the-new-r-compiler-package-in-r-2-13-0-some-first-experiments/
Is it worth trying to compile some bits of the OpenMx frontend to improve performance there?
We looked at this a couple months ago but discussed it via email
I was curious so I had a bit of a poke around to see what info was out there on what is actually going on with the compiler.
From what I can gather, what is doing is actually very different from most JIT compilers. The Java compiler for example goes through a few different transformations at different times:
*At compile time*: Source Code ==> Abstract Syntax Tree ==> Bytecode
*At run time*: Bytecode is interpreted, Bytecode ==> Native code (only if a function is a hot spot)
What normally happens with R is that when the script is loaded, the source code is transformed into an abstract syntax tree and then that tree is interpreted directly. So for standard R, what happens is:
*At compile time (or script load time)*: Source Code => Abstract Syntax Tree
*At run time*: abstract syntax tree is interpreted
When we turn JIT on, I gather what happens is:
*At compile time (or script load time)*: Source Code => Abstract Syntax Tree
*At run time*: Abstract Syntax Tree => bytecode (when function first called), then bytecode is interpreted
Interpreting bytecode is still very slow compared to native code, and transforming code which only runs a few times would add a lot of overhead, so perhaps it's not so surprising that there isn't a big performance boost.
Wow.. I looked at the code and it turns out that the compiler itself is entirely written in R rather that C, so perhaps its not so surprising that it adds a lot of overhead
Log in or register to post comments
In reply to We looked at this a couple months ago but discussed it via email by rgore
Compiled R
Log in or register to post comments
Err, the short answer is no.
Log in or register to post comments
In reply to Err, the short answer is no. by mspiegel
Yes & no
frontend time: 790.3566 secs
backend time: 0.02779293 secs
independent submodels time: 9.012222e-05 secs
wall clock time: 790.3844 secs
cpu time: 790.3844 secs
openmx version number: 999.0.0-1661
So users' script development production cycle time could be reduced by front-end performance boosts. Corresponding results from Classic (but with parameters freely estimated):
Task Time elapsed (DD:HH:MM:SS)
Reading script & data 0: 0: 0: 0.54
Execution 0: 0: 9:20.46
TOTAL 0: 0: 9:21.00
Yes, I know that the overhead of R is expensive, and that it does buy the user a tremendous amount in terms of programing capabilities, data handling environment, graphics etc. etc. Indeed it was a delight to develop this script in OpenMx compared to Classic, but it is still the case that many users would benefit from any performance boosts we can muster, frontend or back. For example, Tim Bates' non-independent submodel suggestion is a good one and I think should be in 2.0.
Log in or register to post comments
In reply to Yes & no by neale
OK. Here are some
mxRun()
place a call toRprof("filename.out")
and then after the call tomxRun()
place a call toRprof(NULL)
. You can either run the model with no free parameters, or another alternative is to call mxRun() with the argument 'onlyFrontend=TRUE'.Once the run has finished, then at the unix command-line, you want to run
That will identify where are the bottlenecks in the front-end. The other task is to speedup the bottlenecks, or rewrite portions of the implementations to not use those bottlenecks.
Log in or register to post comments
In reply to OK. Here are some by mspiegel
dataframes -> lists or matrices?
I wonder if converting dataframes to lists might buy a speed up in the front end?
I guess too, that mxRun might output a progress bar (just a dot for each model it compiles before kicking off the backend...)
Log in or register to post comments
In reply to OK. Here are some by mspiegel
Profiling
Each sample represents 0.02 seconds.
Total run time: 471.52 seconds.
Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.
% total % self
total seconds self seconds name
100.0 471.46 0.0 0.00 "mxRun"
100.0 471.46 0.0 0.00 "runHelper"
96.7 455.80 0.1 0.66 "translateObjectivesHelper"
96.7 455.80 0.0 0.00 "translateObjectives"
89.7 423.08 1.6 7.48 "imxFlattenModel"
86.5 407.84 72.1 339.90 "flattenModelHelper"
68.8 324.40 0.1 0.28 "standardGeneric"
66.6 314.22 0.1 0.60 "genericObjModelConvert"
66.5 313.62 0.1 0.48 "updateObjectiveDimnames"
66.1 311.78 0.0 0.00 "eval"
66.1 311.68 0.0 0.00 "mxEval"
8.1 37.96 0.1 0.32 "@<-"
8.0 37.70 0.1 0.28 "slot<-"
6.4 30.14 6.4 30.14 ".Call"
6.2 29.00 1.1 5.02 "generateLabelsMatrix"
6.2 29.00 0.0 0.00 "generateLabelsHelper"
6.2 29.00 0.0 0.00 "imxGenerateLabels"
5.5 25.76 5.5 25.76 "append"
4.9 23.26 0.1 0.32 "[<-"
4.9 22.94 2.8 13.26 "[<-.data.frame"
3.4 16.12 0.1 0.56 "lapply"
3.4 15.86 0.9 4.10 "FUN"
3.0 14.02 1.2 5.72 ".local"
2.0 9.58 1.8 8.54 "match"
1.7 8.18 0.0 0.00 "[[<-"
1.5 7.28 0.1 0.52 "checkSlotAssignment"
1.5 6.96 0.3 1.52 "updateModelEntitiesHelper"
1.4 6.86 0.4 2.02 "namespaceSearchReplaceHelper"
1.4 6.84 0.0 0.00 "imxReplaceMethod"
1.4 6.84 0.0 0.00 "namespaceSearchReplace"
1.4 6.68 0.0 0.20 "sapply"
1.2 5.76 0.0 0.00 "initialize"
1.2 5.76 0.0 0.00 "new"
0.9 4.42 0.2 0.84 "%in%"
0.9 4.14 0.0 0.00 "updateModelMatrices"
0.8 3.64 0.1 0.34 "as"
0.7 3.36 0.0 0.00 "convertAlgebras"
0.7 3.34 0.4 1.70 "convertSingleAlgebra"
0.6 2.90 0.2 1.06 "namespaceConvertFormula"
0.6 2.88 0.0 0.20 "genericObjFunNamespace"
0.6 2.88 0.0 0.00 "namespaceConvertObjective"
0.6 2.82 0.0 0.00 "updateModelAlgebras"
0.5 2.24 0.0 0.00 "imxGenerateNamespace"
0.5 2.22 0.0 0.00 "generateLocalNamespace"
0.5 2.14 0.2 0.98 "unlist"
0.5 2.10 0.0 0.00 "omxLapply"
0.4 2.06 0.1 0.44 "imxConvertIdentifier"
0.4 1.88 0.3 1.28 "populateDefInitialValues"
0.4 1.86 0.1 0.22 "elNamed"
0.3 1.66 0.1 0.58 "unique"
0.3 1.56 0.0 0.00 "checkEvaluation"
0.3 1.48 0.3 1.32 "grep"
0.3 1.48 0.0 0.02 "namespaceGetValues"
0.3 1.38 0.0 0.04 ".classEnv"
0.3 1.34 0.0 0.02 "flatNamespaceSearchReplace"
0.3 1.34 0.0 0.00 "flatReplaceMethod"
0.3 1.32 0.2 1.20 "doFlatNamespaceSearchReplace"
0.3 1.32 0.1 0.26 "hasSquareBrackets"
0.2 1.20 0.0 0.08 "getGeneric"
0.2 1.18 0.0 0.06 ".requirePackage"
0.2 1.16 0.1 0.50 "checkAlgebraEvaluation"
0.2 1.12 0.1 0.26 ".getGeneric"
0.2 0.86 0.2 0.86 "strsplit"
0.2 0.86 0.1 0.22 "imxIsDefinitionVariable"
0.2 0.84 0.0 0.02 "convertFormulaInsertModel"
0.2 0.82 0.1 0.56 "insertNumericValue"
0.2 0.78 0.1 0.42 "anyDuplicated"
0.2 0.78 0.0 0.02 "loadedNamespaces"
0.2 0.76 0.0 0.16 "ls"
0.1 0.64 0.0 0.08 "isLocalDefinitionVariable"
0.1 0.62 0.1 0.22 "possibleExtends"
0.1 0.60 0.1 0.48 "identical"
0.1 0.54 0.0 0.20 "NROW"
0.1 0.52 0.1 0.52 "all"
0.1 0.52 0.1 0.52 "sum"
0.1 0.52 0.1 0.28 "unique.default"
0.1 0.52 0.0 0.04 "tryCatch"
0.1 0.50 0.0 0.00 "try"
0.1 0.46 0.1 0.46 "!"
0.1 0.46 0.1 0.46 "inherits"
0.1 0.44 0.0 0.02 "imxExtractNames"
0.1 0.42 0.1 0.22 "getClass"
0.1 0.42 0.0 0.18 "tryCatchList"
0.1 0.38 0.1 0.38 "any"
0.1 0.38 0.1 0.36 "exists"
0.1 0.36 0.1 0.36 "anyDuplicated.default"
0.1 0.36 0.1 0.36 "length"
0.1 0.36 0.0 0.06 "loadMethod"
0.1 0.36 0.0 0.02 "namespaceGetEntities"
0.1 0.32 0.1 0.32 "el"
0.1 0.32 0.0 0.16 ".getMethodsTable"
0.1 0.32 0.0 0.00 "namespaceGetParameters"
0.1 0.30 0.0 0.16 "getClassDef"
0.1 0.28 0.1 0.28 "is.na"
0.1 0.28 0.1 0.24 ".getClassFromCache"
0.1 0.28 0.0 0.10 ".quickCoerceSelect"
0.1 0.28 0.0 0.04 "is.data.frame"
0.1 0.28 0.0 0.02 ".cacheGeneric"
0.1 0.28 0.0 0.00 "evaluateExpression"
0.1 0.28 0.0 0.00 "evaluateSymbol"
0.1 0.26 0.1 0.24 "=="
0.1 0.26 0.0 0.12 ".cacheGenericTable"
0.1 0.26 0.0 0.00 "computeAlgebra"
0.1 0.26 0.0 0.00 "evaluateAlgebra"
0.1 0.22 0.1 0.22 ".row_names_info"
0.1 0.22 0.1 0.22 ">"
0.1 0.22 0.1 0.22 "get"
0.1 0.22 0.1 0.22 "sys.call"
0.1 0.22 0.0 0.08 "tryCatchOne"
0.0 0.20 0.0 0.00 "checkNameAlignment"
0.0 0.18 0.0 0.18 ".identC"
0.0 0.18 0.0 0.00 "evaluateMatrix"
0.0 0.16 0.0 0.00 "computeMatrix"
0.0 0.14 0.0 0.14 "attr"
0.0 0.14 0.0 0.14 "is.null"
0.0 0.14 0.0 0.08 "apply"
0.0 0.14 0.0 0.06 "doTryCatch"
0.0 0.14 0.0 0.06 "imxIdentifier"
0.0 0.14 0.0 0.00 "cycleDetection"
0.0 0.14 0.0 0.00 "intersect"
0.0 0.12 0.0 0.12 "is.character"
0.0 0.12 0.0 0.04 "getNamespace"
0.0 0.12 0.0 0.00 "checkVariablesHelper"
0.0 0.12 0.0 0.00 "imxAddDependency"
0.0 0.12 0.0 0.00 "imxCheckVariables"
0.0 0.12 0.0 0.00 "is"
0.0 0.12 0.0 0.00 "ncol"
0.0 0.10 0.0 0.10 "is.vector"
0.0 0.10 0.0 0.10 "paste"
0.0 0.10 0.0 0.04 "localNamespaceSearchReplace"
0.0 0.10 0.0 0.00 "addAlgebraDetection"
0.0 0.10 0.0 0.00 "addFormulaDetection"
0.0 0.10 0.0 0.00 "evaluateMxObject"
0.0 0.10 0.0 0.00 "nrow"
0.0 0.08 0.0 0.08 "assign"
0.0 0.08 0.0 0.08 "list"
0.0 0.08 0.0 0.08 "match.fun"
0.0 0.08 0.0 0.08 "max"
0.0 0.08 0.0 0.08 "names<-"
0.0 0.08 0.0 0.06 "addEdge"
0.0 0.08 0.0 0.00 "[["
0.0 0.06 0.0 0.06 "&"
0.0 0.06 0.0 0.06 "is.list"
0.0 0.04 0.0 0.04 "as.character"
0.0 0.04 0.0 0.04 "class"
0.0 0.04 0.0 0.04 "class<-"
0.0 0.04 0.0 0.04 "col"
0.0 0.04 0.0 0.04 "environment"
0.0 0.04 0.0 0.04 "is.array"
0.0 0.04 0.0 0.04 "nargs"
0.0 0.04 0.0 0.04 "seq_len"
0.0 0.04 0.0 0.04 "xpdrows.data.frame"
0.0 0.04 0.0 0.02 ".class1"
0.0 0.04 0.0 0.02 "flatNamespaceSearch"
0.0 0.04 0.0 0.02 "is.factor"
0.0 0.04 0.0 0.00 "addNode"
0.0 0.04 0.0 0.00 "callNextMethod"
0.0 0.04 0.0 0.00 "dimnames<-"
0.0 0.04 0.0 0.00 "flatExtractMethod"
0.0 0.04 0.0 0.00 "genericObjNewEntities"
0.0 0.04 0.0 0.00 "library"
0.0 0.04 0.0 0.00 "mxMatrix"
0.0 0.04 0.0 0.00 "packageSlot"
0.0 0.04 0.0 0.00 "union"
0.0 0.02 0.0 0.02 "!="
0.0 0.02 0.0 0.02 "+"
0.0 0.02 0.0 0.02 "<"
0.0 0.02 0.0 0.02 "<="
0.0 0.02 0.0 0.02 "Ops.data.frame"
0.0 0.02 0.0 0.02 "as.name"
0.0 0.02 0.0 0.02 "as.symbol"
0.0 0.02 0.0 0.02 "assignDimnames"
0.0 0.02 0.0 0.02 "file.exists"
0.0 0.02 0.0 0.02 "is.numeric"
0.0 0.02 0.0 0.02 "lazyLoadDBfetch"
0.0 0.02 0.0 0.02 "names"
0.0 0.02 0.0 0.02 "populateDefVarMatrix"
0.0 0.02 0.0 0.00 ".sigLabel"
0.0 0.02 0.0 0.00 "addObjectiveDetection"
0.0 0.02 0.0 0.00 "attachNamespace"
0.0 0.02 0.0 0.00 "checkConflicts"
0.0 0.02 0.0 0.00 "containsCycle"
0.0 0.02 0.0 0.00 "convertFormula"
0.0 0.02 0.0 0.00 "convertObjectives"
0.0 0.02 0.0 0.00 "cycleVisitor"
0.0 0.02 0.0 0.00 "do.call"
0.0 0.02 0.0 0.00 "generateAlgebraList"
0.0 0.02 0.0 0.00 "genericObjDependencies"
0.0 0.02 0.0 0.00 "genericObjFunConvert"
0.0 0.02 0.0 0.00 "imxExtractMethod"
0.0 0.02 0.0 0.00 "imxSameType"
0.0 0.02 0.0 0.00 "imxVerifyMatrix"
0.0 0.02 0.0 0.00 "imxVerifyName"
0.0 0.02 0.0 0.00 "isNumber"
0.0 0.02 0.0 0.00 "lookupNumericValue"
0.0 0.02 0.0 0.00 "namespaceModelSearch"
0.0 0.02 0.0 0.00 "namespaceSearch
0.0 0.02 0.0 0.00 "namespaceSearch"
0.0 0.02 0.0 0.00 "omxCheckMatrices"
0.0 0.02 0.0 0.00 "same.isFn"
0.0 0.02 0.0 0.00 "shareData"
0.0 0.02 0.0 0.00 "shareDataHelper"
0.0 0.02 0.0 0.00 "substituteOperators"
% self % total
self seconds total seconds name
72.1 339.90 86.5 407.84 "flattenModelHelper"
6.4 30.14 6.4 30.14 ".Call"
5.5 25.76 5.5 25.76 "append"
2.8 13.26 4.9 22.94 "[<-.data.frame"
1.8 8.54 2.0 9.58 "match"
1.6 7.48 89.7 423.08 "imxFlattenModel"
1.2 5.72 3.0 14.02 ".local"
1.1 5.02 6.2 29.00 "generateLabelsMatrix"
0.9 4.10 3.4 15.86 "FUN"
0.4 2.02 1.4 6.86 "namespaceSearchReplaceHelper"
0.4 1.70 0.7 3.34 "convertSingleAlgebra"
0.3 1.52 1.5 6.96 "updateModelEntitiesHelper"
0.3 1.32 0.3 1.48 "grep"
0.3 1.28 0.4 1.88 "populateDefInitialValues"
0.2 1.20 0.3 1.32 "doFlatNamespaceSearchReplace"
0.2 1.06 0.6 2.90 "namespaceConvertFormula"
0.2 0.98 0.5 2.14 "unlist"
0.2 0.86 0.2 0.86 "strsplit"
0.2 0.84 0.9 4.42 "%in%"
0.1 0.66 96.7 455.80 "translateObjectivesHelper"
0.1 0.60 66.6 314.22 "genericObjModelConvert"
0.1 0.58 0.3 1.66 "unique"
0.1 0.56 3.4 16.12 "lapply"
0.1 0.56 0.2 0.82 "insertNumericValue"
0.1 0.52 1.5 7.28 "checkSlotAssignment"
0.1 0.52 0.1 0.52 "all"
0.1 0.52 0.1 0.52 "sum"
0.1 0.50 0.2 1.16 "checkAlgebraEvaluation"
0.1 0.48 66.5 313.62 "updateObjectiveDimnames"
0.1 0.48 0.1 0.60 "identical"
0.1 0.46 0.1 0.46 "!"
0.1 0.46 0.1 0.46 "inherits"
0.1 0.44 0.4 2.06 "imxConvertIdentifier"
0.1 0.42 0.2 0.78 "anyDuplicated"
0.1 0.38 0.1 0.38 "any"
0.1 0.36 0.1 0.38 "exists"
0.1 0.36 0.1 0.36 "anyDuplicated.default"
0.1 0.36 0.1 0.36 "length"
0.1 0.34 0.8 3.64 "as"
0.1 0.32 8.1 37.96 "@<-"
0.1 0.32 4.9 23.26 "[<-"
0.1 0.32 0.1 0.32 "el"
0.1 0.28 68.8 324.40 "standardGeneric"
0.1 0.28 8.0 37.70 "slot<-"
0.1 0.28 0.1 0.52 "unique.default"
0.1 0.28 0.1 0.28 "is.na"
0.1 0.26 0.3 1.32 "hasSquareBrackets"
0.1 0.26 0.2 1.12 ".getGeneric"
0.1 0.24 0.1 0.28 ".getClassFromCache"
0.1 0.24 0.1 0.26 "=="
0.1 0.22 0.4 1.86 "elNamed"
0.1 0.22 0.2 0.86 "imxIsDefinitionVariable"
0.1 0.22 0.1 0.62 "possibleExtends"
0.1 0.22 0.1 0.42 "getClass"
0.1 0.22 0.1 0.22 ".row_names_info"
0.1 0.22 0.1 0.22 ">"
0.1 0.22 0.1 0.22 "get"
0.1 0.22 0.1 0.22 "sys.call"
0.0 0.20 1.4 6.68 "sapply"
0.0 0.20 0.6 2.88 "genericObjFunNamespace"
0.0 0.20 0.1 0.54 "NROW"
0.0 0.18 0.1 0.42 "tryCatchList"
0.0 0.18 0.0 0.18 ".identC"
0.0 0.16 0.2 0.76 "ls"
0.0 0.16 0.1 0.32 ".getMethodsTable"
0.0 0.16 0.1 0.30 "getClassDef"
0.0 0.14 0.0 0.14 "attr"
0.0 0.14 0.0 0.14 "is.null"
0.0 0.12 0.1 0.26 ".cacheGenericTable"
0.0 0.12 0.0 0.12 "is.character"
0.0 0.10 0.1 0.28 ".quickCoerceSelect"
0.0 0.10 0.0 0.10 "is.vector"
0.0 0.10 0.0 0.10 "paste"
0.0 0.08 0.2 1.20 "getGeneric"
0.0 0.08 0.1 0.64 "isLocalDefinitionVariable"
0.0 0.08 0.1 0.22 "tryCatchOne"
0.0 0.08 0.0 0.14 "apply"
0.0 0.08 0.0 0.08 "assign"
0.0 0.08 0.0 0.08 "list"
0.0 0.08 0.0 0.08 "match.fun"
0.0 0.08 0.0 0.08 "max"
0.0 0.08 0.0 0.08 "names<-"
0.0 0.06 0.2 1.18 ".requirePackage"
0.0 0.06 0.1 0.36 "loadMethod"
0.0 0.06 0.0 0.14 "doTryCatch"
0.0 0.06 0.0 0.14 "imxIdentifier"
0.0 0.06 0.0 0.08 "addEdge"
0.0 0.06 0.0 0.06 "&"
0.0 0.06 0.0 0.06 "is.list"
0.0 0.04 0.3 1.38 ".classEnv"
0.0 0.04 0.1 0.52 "tryCatch"
0.0 0.04 0.1 0.28 "is.data.frame"
0.0 0.04 0.0 0.12 "getNamespace"
0.0 0.04 0.0 0.10 "localNamespaceSearchReplace"
0.0 0.04 0.0 0.04 "as.character"
0.0 0.04 0.0 0.04 "class"
0.0 0.04 0.0 0.04 "class<-"
0.0 0.04 0.0 0.04 "col"
0.0 0.04 0.0 0.04 "environment"
0.0 0.04 0.0 0.04 "is.array"
0.0 0.04 0.0 0.04 "nargs"
0.0 0.04 0.0 0.04 "seq_len"
0.0 0.04 0.0 0.04 "xpdrows.data.frame"
0.0 0.02 0.3 1.48 "namespaceGetValues"
0.0 0.02 0.3 1.34 "flatNamespaceSearchReplace"
0.0 0.02 0.2 0.84 "convertFormulaInsertModel"
0.0 0.02 0.2 0.78 "loadedNamespaces"
0.0 0.02 0.1 0.44 "imxExtractNames"
0.0 0.02 0.1 0.36 "namespaceGetEntities"
0.0 0.02 0.1 0.28 ".cacheGeneric"
0.0 0.02 0.0 0.04 ".class1"
0.0 0.02 0.0 0.04 "flatNamespaceSearch"
0.0 0.02 0.0 0.04 "is.factor"
0.0 0.02 0.0 0.02 "!="
0.0 0.02 0.0 0.02 "+"
0.0 0.02 0.0 0.02 "<"
0.0 0.02 0.0 0.02 "<="
0.0 0.02 0.0 0.02 "Ops.data.frame"
0.0 0.02 0.0 0.02 "as.name"
0.0 0.02 0.0 0.02 "as.symbol"
0.0 0.02 0.0 0.02 "assignDimnames"
0.0 0.02 0.0 0.02 "file.exists"
0.0 0.02 0.0 0.02 "is.numeric"
0.0 0.02 0.0 0.02 "lazyLoadDBfetch"
0.0 0.02 0.0 0.02 "names"
0.0 0.02 0.0 0.02 "populateDefVarMatrix"
Warning message:
In readLines(con, n = chunksize) :
incomplete final line found on 'RegimeSwitching_MatrixRawFixed.Rprof'
I am not experienced in parsing such output, but:
> R CMD Rprof RegimeSwitching_MatrixRawFixed.Rprof | awk '{if ($4 > 200) print $4,$5}'
seems to implicate the following routines:
339.90 "flattenModelHelper"
total
seconds name
407.84 "flattenModelHelper"
423.08 "imxFlattenModel"
455.80 "translateObjectivesHelper"
314.22 "genericObjModelConvert"
313.62 "updateObjectiveDimnames"
324.40 "standardGeneric"
Would that be your take also?
Log in or register to post comments
In reply to Profiling by neale
I tend to look at the "self
I've checked in a patch to the trunk that brings down the front-end time on your model from 886 seconds to 320 seconds. We have a discrepancy in our baseline performance, but hopefully you will notice some improvement after applying the patch. I was testing with R 2.11.1 on MacPro (Quad-core 3 GHz Xeon).
Log in or register to post comments
In reply to I tend to look at the "self by mspiegel
Neat
Another decidedly odd thing about this model (when the parameters are free) - I added checkpoint=TRUE to the mxRun call, and indeed after about 800 seconds it created a checkpoint file. However, it has not updated it in the last 8 hours, although it is still hitting the CPU.
Log in or register to post comments
In reply to Neat by neale
How many free parameters are
If there are a lot, that might be the reason for the slow checkpointing.
OpenMx currently only checkpoints at major iterations. This is because we don't have a clear way to get speedup from knowing where the last minor iteration was; NPSOL only lets us specify a starting point and a hessian when we restart the optimization. Estimating the initial hessian values takes on the order of k^2 evaluations of the objective function, where k is the number of free parameters. For models with a very large number of free parameters and an objective that takes a long time to process, that first set of calculations can be a doozy.
We've talked before about writing to the checkpoint file at minor iterations even though there's no way to specifically restore the computation from that checkpoint. That's not a big change, and it would at least reassure the user that OpenMx is actually doing something useful and isn't just spinning its wheels. And the trail of computed values of the objective function could be useful to some users.
We're also looking at different ways to speed the computation up, including algebraic computation of gradient and/or hessian values. There's still work to be done on those, but they should start making an appearance over the next several releases.
Log in or register to post comments
In reply to Neat by neale
Is it stuck at the end?
http://openmx.psyc.virginia.edu/thread/830
Disclaimer: I have not read, much less run, your script so cannot be sure this is the issue.
Log in or register to post comments
In reply to Is it stuck at the end? by mdewey
I had forgotten that
Log in or register to post comments
In reply to I had forgotten that by mspiegel
Every Function Evaluation, Please!
That being said, there is clearly some problem with the model I am trying to run. It works when some of the parameters are free, but not when they all are - in which case it just hangs, and no I don't get updated checkpoints. These are resolutely 0 bytes long, even after the 5 minutes or so for front-end compiling.
Anyway, progress to date is that the script with Fixed at the end of the name has some 9 parameters free, and it runs happily since I svn up'd to include mspiegel's faster frontend (which is a great innovation). The second script causes R to hang (and burn CPU) and produces a 0 byte checkpoint file. You have to get rid of the .txt extension on the data file for the scripts to see it.
Log in or register to post comments
In reply to Every Function Evaluation, Please! by neale
You really shouldn't be
Changes to the checkpoint spec may take a bit more work. If we're writing out checkpoints at every function evaluation, we'll need to do some work to figure out which one is the last valid starting point before restoring--I think there are cases where the parameter values at minor iterations are not valid starting points. If it will be useful to users, though, we can tack it onto the feature list for a future release.
Log in or register to post comments
In reply to You really shouldn't be by tbrick
checkpoint validity
Log in or register to post comments
In reply to Every Function Evaluation, Please! by neale
Well, I can tell you why it
Log in or register to post comments
In reply to Well, I can tell you why it by mspiegel
And I can tell you (me) how to avoid the problem
Nevertheless, we should fix this issue - it looks as though the infinite loop may have something to do with a non-positive predicted covariance matrix.
Log in or register to post comments
In reply to I had forgotten that by mspiegel
We could do something that
Hessian/gradient calculation in OpenMx uses an iterative process that essentially takes small steps in each direction to estimate the change to the objective function in those directions. We could save the objective values at those locations and use them, if we restored from a checkpoint, to avoid having to recalculate.
One catch here is that the checkpoint file should probably not be the primary form of user feedback. If we really want to send feedback to the user about the state of the optimization, we should provide users a switch that either sends text directly to R's standard output or to a file or socket.
I haven't really heard much of a call for either of these features (status output or checkpointing during hessian/gradient calculation). Are either of them likely to be helpful?
Log in or register to post comments