Copyright © 2007-2024 The OpenMx Project
This howto provides step-by-step instructions on writing your own objective function for OpenMx. You will first want to prototype your objective function using either an MxAlgebraObjective or an MxRObjective. An algebra objective consists of an arbitrary number of matrix algebra statements that ultimately evaluate to a 1 x 1 matrix. An R objective consists of an arbitrary R function that evaluates a model to return a numeric value. Algebra objectives are computationally faster than R objectives, but algebra objectives are more limited in their expressive power. When you are satisfied prototyping your new objective function, we will guide you through the process of writing the front-end for this objective function in R, and writing the back-end for this objective function in C.
Can you express your objective function using the matrix operators and functions provided by OpenMx? If so, then create an MxAlgebraObjective. The following example is from the SimpleAlgebra demo.
A <- mxMatrix("Full", values = c(1, 2), nrow=1, ncol=2, name="A") B <- mxMatrix("Full", values = c(3, 4), nrow = 2, ncol = 1, free = c(TRUE, FALSE), name="B") algebra <- mxAlgebra(A %*% B - 3, "algebra") algSquared <- mxAlgebra(algebra %*% algebra, "algebraSquared") model <- mxModel("themodel") model <- mxModel(model, A, B) model <- mxModel(model, algebra, algSquared) model <- mxModel(model, mxAlgebraObjective("algebraSquared")) # Test 1: Algebra is a multiply and subtract modelOut <- mxRun(model)
Otherwise, express your objective function in the R front-end. Write a function in R that accepts two arguments. The first argument is the mxModel that should be evaluated, and the second argument is some persistent state information that can be stored between one iteration of optimization to the next iteration. It is valid for the function to simply ignore the second argument. The function must return either a single numeric value, or a list of exactly two elements. If the function returns a list, the first argument must be a single numeric value and the second element will be the new persistent state information to be passed into this function at the next iteration. The single numeric value will be used by the optimizer to perform optimization. For more help on the slots in the MxModel class, see the help provided with the OpenMx package (type '?MxModel' in R). The following example is from the SimpleRObjective demo.
A <- mxMatrix(nrow = 2, ncol = 2, values = c(1:4), free = TRUE, name = 'A') squared <- function(x) { x ^ 2 } objFunction <- function(model, state) { values <- mxEval(A, model) return(squared(values[1,1] - 4) + squared(values[1,2] - 3) + squared(values[2,1] - 2) + squared(values[2,2] - 1)) } objective <- mxRObjective(objFunction) model <- mxModel('model', A, objective) modelOut <- mxRun(model)
In order to write your own objective function, you will need to create your own objective function class in R. For teaching purposes, we are going to use the MxAlgebraObjective class as a guide. First, your class needs to be a sub-class of the MxBaseObjective class. The MxBaseObjective class contains three slots: 'name' which is a character string, 'data' which is either a character string or a number, and 'result' which is a 1x1 matrix that stores the result of optimization. Your class should contain any additional slots that you need to pass along to the optimizer. In the MxAlgebraObjective, we have an additional slot that will inform the optimizer as to which MxAlgebra to use for optimization.
setClass(Class = "MxAlgebraObjective", representation = representation( algebra = "MxCharOrNumber"), contains = "MxBaseObjective")
Next, you will need to overwrite five generic methods for your new objective function class. These six methods are initialize, genericObjModelConvert, genericObjFunNamespace, genericObjFunConvert, genericObjDependencies, and genericObjRename.
The initialize method is a constructor method. It will never be directly called by the user, but it will be critical for making new instances of your class. When writing your initialize method, it is required that the default value for the data slot is as.integer(NA) and the default value for the name slot is 'objective'. The data slot will only be modified by the genericObjFunConvert method, and the name slot will only be modified by the genericObjFunNamespace method.
setMethod("initialize", "MxAlgebraObjective", function(.Object, algebra, data = as.integer(NA), name = 'objective') { .Object@name <- name .Object@algebra <- algebra .Object@data <- data return(.Object) } )
The genericObjModelConvert method is used in the event that you need to perform transformations on the model that are specific to your objective function. One example use-case of this method is to transform RAM objective functions to FIML objective functions, if raw data is provided to the RAM objective function. During that particular transformation, several additional matrices and algebras are added to the model, and the RAM objective function is replaced with a FIML objective function. For now, we will show you the default behavior of this function, which is the most common behavior. You do not need to copy/paste the default behavior into your objective function, it is inherited from "MxBaseObjective".The function genericObjModelConvert accepts five parameters: '.Object' is the current objective function, 'job' is the entire job that was passed to the mxRun() function, 'model' is the individual model associated with the current objective function, 'namespace' is the model namespace, and 'flatJob' is a flattened version of 'job'. This function MUST return the job. Note: As of OpenMx 1.2, this function must return list(job, flatJob)
. This change was implemented to improve performance. If you set job@.newobjects
, job@.newobjective
, and job@.newtree
to FALSE (see next paragraph), then you must manually update flatJob with any changes you make to job. Otherwise, you do not need to update flatJob as the job will be re-flattened.
The function is responsible for updating three private slots in the job object. All three of the slots contain boolean values. job@.newobjects
should report whether additional matrices or algebras were added anywhere in the model. This triggers a regeneration of the namespace. job@.newobjective
should report whether the current objective function was transformed into another type of objective function. This triggers the genericObjModelConvert method on the new objective function. job@.newtree
should report whether any new submodels were introduced in the model. This triggers genericObjModelConvert on the new submodels.
setMethod("genericObjModelConvert", "MxBaseObjective", function(.Object, job, model, namespace, flatJob) { job@.newobjects <- FALSE job@.newobjective <- FALSE job@.newtree <- FALSE return(job) # or return(list(job, flatJob)) in the new API })
genericObjFunNamespace The genericObjFunNamespace method is used to transform an objective function from its local model namespace, into the flattened model namespace. You must use the function omxConvertIdentifier on any slot that is storing name information. Remember to always invoke omxConvertIdentifier on the 'name' slot and the 'data' slot. Shown below is the implementation for the FIML objective function.
setMethod("genericObjFunNamespace", signature("MxFIMLObjective"), function(.Object, modelname, namespace) { .Object@name <- omxIdentifier(modelname, .Object@name) .Object@covariance <- omxConvertIdentifier(.Object@covariance, modelname, namespace) .Object@means <- omxConvertIdentifier(.Object@means, modelname, namespace) .Object@data <- omxConvertIdentifier(.Object@data, modelname, namespace) return(.Object) })
The genericObjFunConvert method is invoked to transform an objective function from a form that interacts with the R front-end, to a form that is suitable for processing by the C back-end. Any character strings in the objective function are converted into matrix/algebra pointers for the back-end to process. You should read the documentation for the function omxLocateIndex to familiarise yourself with matrix/algebra pointers. Two arguments are passed into this function, a flattened version of the model, and the original model itself. When at all possible, it is recommended that you interact with the flattened version of the model. Below we have included the genericObjFunConvert method for both the algebra objective and the FIML objective. The FIML objective has been included so that you may see how the data slot is processed. NOTE: .Object@algebra is a string on input, and is converted into a matrix/algebra pointer.
setMethod("genericObjFunConvert", signature("MxAlgebraObjective"), function(.Object, flatModel, model) { name <- .Object@name algebra <- .Object@algebra if (is.na(algebra)) { modelname <- omxReverseIdentifier(model, .Object@name)[[1]] msg <- paste("The algebra name cannot be NA", "for the algebra objective of model", omxQuotes(modelname)) stop(msg, call. = FALSE) } algebraIndex <- omxLocateIndex(flatModel, algebra, name) .Object@algebra <- algebraIndex return(.Object) })
NOTE: .Object@data, .Object@means, and .Object@covariance are all strings at the beginning of this function, and are converted into data pointers or matrix/algebra pointers.
setMethod("genericObjFunConvert", signature("MxFIMLObjective"), function(.Object, flatModel, model) { modelname <- omxReverseIdentifier(model, .Object@name)[[1]] name <- .Object@name if(is.na(.Object@data)) { msg <- paste("The FIML objective", "does not have a dataset associated with it in model", omxQuotes(modelname)) stop(msg, call.=FALSE) } mxDataObject <- flatModel@datasets[[.Object@data]] if (mxDataObject@type != 'raw') { msg <- paste("The dataset associated with the FIML objective", "in model", omxQuotes(modelname), "is not raw data.") stop(msg, call.=FALSE) } verifyObservedNames(mxDataObject@observed, mxDataObject@type, flatModel, modelname, "FIML") checkNumericData(mxDataObject) meansName <- .Object@means covName <- .Object@covariance dataName <- .Object@data threshNames <- .Object@thresholds .Object@means <- omxLocateIndex(flatModel, .Object@means, name) .Object@covariance <- omxLocateIndex(flatModel, .Object@covariance, name) .Object@data <- omxLocateIndex(flatModel, .Object@data, name) verifyExpectedNames(covName, meansName, flatModel, modelname, "FIML") .Object@definitionVars <- generateDefinitionList(flatModel) .Object@dataColumns <- generateDataColumns(flatModel, covName, dataName) .Object@thresholdColumns <- generateThresholdColumns(flatModel, threshNames, dataName) if (length(mxDataObject@observed) == 0) { .Object@data <- as.integer(NA) } return(.Object) })
The OpenMx library performs cycle detection to determine if the algebraic relationships in a model contain any cycles. You must specify the dependencies of your objective function type. Below we see how the dependencies for the RAM objective function are specified.
setMethod("genericObjDependencies", signature("MxRAMObjective"), function(.Object, dependencies) { sources <- c(.Object@A, .Object@S, .Object@F, .Object@M, .Object@thresholds) sources <- sources[!is.na(sources)] dependencies <- omxAddDependency(sources, .Object@name, dependencies) return(dependencies) })
This method is invoked by the mxRename() function whenever an objective function is encountered. Renaming allows an existing model name to be transformed into a new name, and all identifiers using the original model name are transformed into the new name. A typical need for renaming pieces of the objective function concern any definition variables that are named "oldname.dataset.defvar" must be renamed to "newname.dataset.defvar".
setMethod("genericObjRename", signature("MxFIMLObjective"), function(.Object, oldname, newname) { .Object@covariance <- renameReference(.Object@covariance, oldname, newname) .Object@means <- renameReference(.Object@means, oldname, newname) .Object@data <- renameReference(.Object@data, oldname, newname) .Object@thresholds <- sapply(.Object@thresholds, renameReference, oldname, newname) return(.Object) })
Finally, you should write your own method that creates instances of your new objective function class. This interface will invoke the 'initialize' constructor that you wrote earlier, but should also include some basic sanity-checking before invoking the constructor. Here is the algebra objective function interface. Since this method does not have access to any parts of the model, you will not be able to perform full error-checking at this stage. Full error-checking should be performed in the genericObjFunConvert method.
mxAlgebraObjective <- function(algebra) { if (missing(algebra) || typeof(algebra) != "character") { stop("Algebra argument is not a string (the name of the algebra)") } return(new("MxAlgebraObjective", algebra)) }
The following is the workflow of the mxRun() function, showing where each objective function method occurs in the workflow.
In order to complete the implementation of the objective function, you will have to build a matching construction in the back-end to perform the actual computation of the objective function. This requires populating the various fields of the omxObjective object, which has been designed for the purpose.
In addition to the creation of appropriate memory structures to store the appropriate data, you will need to provide a defintion of at least three of the five objective-related functions: an Init() function, a Call() function, a Free() function, and possibly a NeedsUpdate() function and a Gradient() function. The last two are optional, but it will be difficult to process your objective function effectively without all of the first three. Finally, you will need to list your objective function in the Objective Table, so that the OpenMx back-end will know how to deal with it. With the exception of the last piece, these can all be listed in a single .c file, which should be named after the objective. For this example, we will look at the omxRAMObjective function.
An omxObjective structure is a container that will house the data and functions required for your objective calculation. In order to create a working objective object, you must create functions to fill in the following slots:
Several other functions are also available, though not required:
These functions are not required, and OpenMx will intelligently default if they are not specified.
Steps:
typedef struct { omxMatrix *cov, *means; // Expected covariance and means omxMatrix *A, *S, *F, *M, *I; // Matrices defined in the MxRAMObjective R Structure omxMatrix *C, *X, *Y, *Z, *Ax, *P, *V, *Mns; // Computation storage space // Metadata for RAM calculation int numIters; double logDetObserved; double n; double* work; int lwork; // Metadata for PPML computations ... } omxRAMObjective;
The first line creates locations for the model-implied covariance and means matrices that will be calculated by the RAM objective. The second creates locations for the structures passed to C in the R objective object. The third line creates intermediary matrices used during computation.
The next set of lines includes metadata used during computation; for example, workspace required by the inversion routine and the precomputed value of the saturated model.
The final structures are for ML Preprocessing, and are currently in development; they will not be discussed during this tutorial.
Notice that matrix and algebra elements are all stored as pointers to omxMatrix structures--that is, each one is an omxMatrix*. The omxMatrix* structure is a container which may hold a static matrix, the results of an algebra, or the results of another objective function. You need not worry about which of these is contained there; you may treat them all as matrices. This allows users to specify these matrices in any way they choose in the front-end without requiring any changes from the back-end developer.
The initFun routine must populate each of these structures and allocate memory to fill them.
If your objective function can be calculated using a standard ML computation, it is easiest to set it up as a subobjective of the ML objective function. To do this, your objective function will need to calculate the expected covariance and (if specified) expected means structures at each cycle of computation.
This structure is recommended when the objective function closely resembles Maximum Likelihood, since the new objective function will gain the benefits of existing OpenMx computation routines, such as Full Information computation and faster computations.
In order to set up a subobjective, two steps must occur. First, you must define and add an omxSubobjective structure to the objective tree. Second, createMLObjective must be called on the oo structure passed to your initFun.
For example, in omxInitRAMObjective(), notice the following lines:
omxObjective *subObjective = omxCreateSubObjective(oo); .... // Here, we create RAM structures as above omxCreateMLObjective(oo, rObj, RAMobj->cov, RAMobj->means);
After the subObjective is created, all changes are made to the subObjective object, and the original objective, oo, is left alone. This allows the underlying ML objective to create appropriate structures in the oo structure without interfering with the subobjective.
If your objective function cannot be calculated as a maximum likelihood process or if you do not wish to use the existing ML objective, you may omit these steps and directly use the oo structure instead of the subobjective. Note that in this case you must calculate the total objective value yourself.
Stages of the initFun:
omxRAMObjective *RAMobj = (omxRAMObjective*) R_alloc(1, sizeof(omxRAMObjective));
subObjective->argStruct = (void*) RAMobj;
Functions are registered by setting function pointers in the appropriate slots of the objective object. The functions themselves must either have definitions or forward declarations. These definitions/declarations must either be earlier in the file than the init function, or must be in an included .h file. An easy solution to this is to place the omxInitObjective() function last in your created file.
The two most important functions to register are the objectiveFun and the destructFun, although there are others that can be specified.
subObjective->objectiveFun = omxCallRAMObjective; subObjective->destructFun = omxDestroyRAMObjective;
if(OMX_DEBUG) { Rprintf("Processing M.\n"); } RAMobj->M = omxNewMatrixFromIndexSlot(rObj, currentState, "M"); if(OMX_DEBUG) { Rprintf("Processing A.\n"); } RAMobj->A = omxNewMatrixFromIndexSlot(rObj, currentState, "A"); if(OMX_DEBUG) { Rprintf("Processing S.\n"); } RAMobj->S = omxNewMatrixFromIndexSlot(rObj, currentState, "S"); if(OMX_DEBUG) { Rprintf("Processing F.\n"); } RAMobj->F = omxNewMatrixFromIndexSlot(rObj, currentState, "F");
Extracting non-matrix data requires use of R accessor functions and the PROTECT() macro. PROTECT() builds a stack of R object pointers that protect structures from R's garbage collection functions. It is important to PROTECT() any structures while you extract data from them, and then UNPROTECT() the same number of objects when you are finished. For more information on the PROTECT() macro, see Writing R Extensions.
In general, extracting data from R objects will follow the same format. First, create a SEXP object. Use PROTECT and GET_SLOT() to extract the element. Cast the element to the appropriate type using INTEGER() or NUMERIC(). These return pointers to C arrays of integers or doubles. End by UNPROTECT()ing the same number of elements that were protected.
if(OMX_DEBUG) { Rprintf("Processing expansion iteration depth.\n"); } PROTECT(slotValue = GET_SLOT(rObj, install("depth"))); RAMobj->numIters = INTEGER(slotValue)[0]; if(OMX_DEBUG) { Rprintf("Using %d iterations.", RAMobj->numIters); } UNPROTECT(1);
The if(OMX_DEBUG) lines contain debugging information to be printed to the console if the DEBUG_MX flag is passed to the compiler during construction. Rprintf functions like printf(), but prints to the R console.
/* Identity Matrix, Size Of A */ if(OMX_DEBUG) { Rprintf("Generating I.\n"); } RAMobj->I = omxNewIdentityMatrix(RAMobj->A->rows, currentState); omxRecompute(RAMobj->I); RAMobj->Z = omxInitMatrix(NULL, k, k, TRUE, currentState); RAMobj->Ax = omxInitMatrix(NULL, k, k, TRUE, currentState); RAMobj->Y = omxInitMatrix(NULL, l, k, TRUE, currentState); RAMobj->X = omxInitMatrix(NULL, l, k, TRUE, currentState);
It also creates the omxMatrix() objects to store the covariance and (if applicable) means for the ML objective. The ML objective will notice if the means argument is NULL, and will attempt to calculate the result without including means. If this is not possible (for example, an implied means structure is required for FIML), it will throw an appropriate error.
RAMobj->cov = omxInitMatrix(NULL, l, l, TRUE, currentState); if(RAMobj->M != NULL) { RAMobj->means = omxInitMatrix(NULL, 1, l, TRUE, currentState); } else RAMobj->means = NULL;
omxCreateMLObjective(oo, rObj, RAMobj->cov, RAMobj->means);
omxRAMObjective* oro = (omxRAMObjective*) (oo->argStruct);
Next, extract those elements of the storage structure that will be needed for the calculation into local variables.
omxMatrix* A = oro->A; omxMatrix* S = oro->S; omxMatrix* X = oro->X; omxMatrix* Ax= oro->Ax; omxMatrix* Z = oro->Z; omxMatrix* I = oro->I; int numIters = oro->numIters;
Before using these matrices/algebras, it is recommended that you call omxRecompute() on each one. This will ensure that the matrix is up-to-date, or that the algebra it represents is not dirty before continuing. The omxRecompute() function is fairly light, and will only force actual recomputation when it is needed, so there is little cost in recomputing something that has already been computed.
BLAS wrappers to perform most of the expected BLAS and LAPACK functions will soon be available. In the meantime, you must directly use the BLAS operators provided by R. The omxMatrix object has several useful structures for interacting with BLAS. The data members majority and minority keep appropriate BLAS structures for passing to BLAS to indicate the appropriate transposition of the matrix for computation; use matrix->majority as the TRANSA or TRANSB arguments of BLAS to indicate no transposition, and matrix->minority to indicate transposition. The rows and cols data members keep the number of rows and/or columns of the data structure. matrix->data contains the actual data matrix in BLAS-readable format. The matrix->leading and matrix->lagging members include the "leading edge of the matrix in the original program" and the appropriate trailing edge.
OpenMx also implements a faster (but NA-unsafe) multiplication routine following the BLAS dgemm signature. Unless there is an expected way for an NA to find its way into your objective calculation, this method will usually be safe to use. A standard untransposed calculation of C = A %*% B looks like this:
double oned = 1.0, zerod = 0.0; F77_CALL(omxunsafedgemm)(A->majority, B->majority, &(A->rows), &(B->cols), &(B->rows), &oned, A->data, &(A->leading), B->data, &(B->leading), &zerod, C->data, &(C->leading)); // C = A %*% B
See the RAM objective and the BLAS documentation for more examples.
If you are creating a subobjective, your call function only needs to populate the correct values into those functions passed to the parent objective. In the case of ML, this means that only the model-implied covariance and means matrices need to be populated. The parent objective will take care of the rest.
If you are not creating an ML-based subobjective, the final result of your calculation must be a scalar value. The value calculated should be populated to the 0,0 element of the objective matrix at oo->matrix. The value can be populated using omxSetMatrixElement(omxMatrix* matrix, int row, int col, double value). In practice, this call looks like this:
omxSetMatrixElement(oo->matrix, 0, 0, value);
Where value contains the final output of the objective calculation.
Workflow of the OpenMx back-end objective structures: