
Hi metavid,

I don't think kernel smoothing with factor analysis is terribly common (this post is the top google hit for this topic), but if you tell us a little more about what you want, we can probably help you out. Guessing at what you want, there's an easy version and a hard version.

The easy version is to estimate the same factor model for multiple time points of data, then use a kernel smoother on the resulting parameters. This is a two-step procedure, but each step (multiple group factor analysis, non-parametric smoothing) is well known and well researched. Output your parameters from step 1, grab any of the kernel smoothing packages in R, and you're set to go.
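If it helps, step 2 of the easy version is only a few lines. Here's a sketch in Python/NumPy (rather than R, and with made-up per-timepoint loading estimates): a Nadaraya-Watson smoother is just a kernel-weighted average of the step-1 parameters.

```python
import numpy as np

def nw_smooth(t_obs, y_obs, t_grid, bandwidth):
    """Nadaraya-Watson smoother: Gaussian-kernel weighted average."""
    out = []
    for t0 in t_grid:
        w = np.exp(-0.5 * ((t_obs - t0) / bandwidth) ** 2)
        out.append(np.sum(w * y_obs) / np.sum(w))
    return np.array(out)

# Hypothetical step-1 output: one loading estimate per time point
times = np.arange(10.0)
loadings = 0.6 + 0.02 * times + np.array(
    [0.01, -0.02, 0.015, 0.0, -0.01, 0.02, -0.015, 0.01, 0.0, -0.005])

smooth = nw_smooth(times, loadings, times, bandwidth=2.0)
```

Because each smoothed value is a convex combination of the raw estimates, the result always stays within their range; the bandwidth controls how aggressively the estimates are pooled.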

The hard version is to turn kernel regression into kernel factor analysis. You could work on extending the IRT capabilities of OpenMx (http://openmx.psyc.virginia.edu/docs/OpenMx/2.0.0-3756/ItemFactorAnalysis.html), defining the kernel smoother so that the manifest variables are the y variables and the trait is the x variable. Alternatively, you could find some other way of implementing the EM algorithm, or otherwise integrating over the latent variables, to supply factor values for your smoother.

A middle option is to test for parametric effects of time on factor loadings. You can treat age as a definition variable in an algebra, making each loading l0+l1*time and see if factor loadings change with time.

Regardless, good luck, and let me know what other questions you have.

ryne

Good day,

Thank you for your response.

I think my question revolves around the 'hard' version, as my topic tries to extend the 'easy' version.

It is about showing the trends of the factor model's parameters over time, using kernel smoothing.

The data set and the R code used are attached. The data set has 5 items measured at varying time points, and each time point has a number of observations.

For the analysis:

I use a kernel matrix to generate kernel weights, which are then incorporated into the OpenMx model as 'sampling weights'. A Gaussian kernel is used as the weighting function, with a bandwidth (the SD of the kernel) of 1 in this case. Given the bandwidth, for each kernel centre/initial point a subset of observations is subjected to factor analysis according to the distribution of the kernel weights; some weights are zero because those observations lie too far from the kernel centre.
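In outline, the weighting step works like this (a Python sketch, not the attached R code; the cutoff value and the 5-observations-per-timepoint layout are illustrative):

```python
import numpy as np

def kernel_weights(obs_times, centre, bandwidth, cutoff=3.0):
    """Gaussian kernel weights around one kernel centre; observations
    more than `cutoff` bandwidths away get a weight of exactly zero."""
    z = (obs_times - centre) / bandwidth
    w = np.exp(-0.5 * z ** 2)
    w[np.abs(z) > cutoff] = 0.0
    return w

obs_times = np.repeat(np.arange(10.0), 5)  # e.g. 5 observations per time point
w = kernel_weights(obs_times, centre=4.0, bandwidth=1.0)
```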

Problem:

The problem is that the plots I obtain from the output, which should be smooth, are not smooth at all. An example of the plot is provided in the R code.

Perhaps you could suggest some improvements to the code, or should I instead try something along the lines of your last reply about extending the IRT capabilities?

Thank you for your input.

Regards,

metavid

metavid,

Minor issue: the variable traint was undefined, but setting nrow=2113 made the code run (I figured it out from the for loop). Looking over your sampling weights, it looks like your smoothing window is relatively narrow. When I tried moving it in either direction, I got the same jumps in the smoothing function.

I'm limited by my relative inexperience with kernel smoothers, but I suspect that this is still an issue with smoothing window size. Your kernel smoother appears to give large weight to only the two or three closest time points, and the weights appear to be either very large or very small, which would lead to big jumps. Sorry I can't give better advice.
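One way to quantify this (a Python sketch; I'm assuming a layout like yours, with 5 observations per time point): the Kish effective sample size of the weights tells you roughly how many observations actually drive each local fit, and it shrinks fast as the bandwidth narrows.

```python
import numpy as np

def effective_n(w):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    return w.sum() ** 2 / (w ** 2).sum()

obs_times = np.repeat(np.arange(10.0), 5)   # 5 observations per time point
ess = {}
for bw in (0.5, 1.0, 2.0):
    # Gaussian kernel weights around the centre at t = 4
    w = np.exp(-0.5 * ((obs_times - 4.0) / bw) ** 2)
    ess[bw] = effective_n(w)
```

With a bandwidth of 0.5, fewer than 10 of the 50 observations carry real weight, so each local factor analysis is effectively run on a handful of cases.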

Good luck with your project.

ryne

Thank you for the effort.

I will try and consider the details you stated.

I have another question. You said, in the second post, "You could work on extending the IRT capabilities of OpenMx (http://openmx.psyc.virginia.edu/docs/OpenMx/2.0.0-3756/ItemFactorAnalysi...) and defining the kernel smoother where the manifest variables are the y variables and the trait is the x variable."

Do you think, given the analysis of the data set, that this is doable? The link provides some information about IRT models, but is there a concrete example of how to do this?

Again, thank you for trying to answer my question.

metavid

Glad to help, metavid.

The extension you're discussing is doable in theory, but I'm not sure exactly how to do it. The basics of kernel smoothing depend on actual values of both the exogenous and endogenous variables, and some extensions of SEM into numerical integration and the EM algorithm provide just that, including OpenMx's IRT features. Unfortunately I can't help much beyond that, except to say that it seems wasteful to develop yet another method for factor/trait scores inside an optimization when alternatives already exist.

If you change your data formatting so that all rows are always present in the dataset, though many have zero weight, you can look at the autocorrelations of the weights to see whether your big jumps are caused by discrete changes in the weights. I suspect that the big jumps happen when observations jump in or out of the dataset, and smoothing those changes will reduce your problems.
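A sketch of that check (in Python, with a hypothetical data layout and truncation threshold): track how many observations enter or leave the nonzero-weight set as the kernel centre moves. Spikes in this turnover count are exactly where I'd expect your jumps.

```python
import numpy as np

obs_times = np.repeat(np.arange(10.0), 5)   # 5 observations per time point
centres = np.linspace(0.0, 9.0, 91)         # fine grid of kernel centres
bw = 1.0

turnovers = []
prev = None
for c in centres:
    w = np.exp(-0.5 * ((obs_times - c) / bw) ** 2)
    w[w < 1e-3] = 0.0                       # hard truncation of far observations
    active = w > 0
    if prev is not None:
        # number of observations entering or leaving between adjacent centres
        turnovers.append(int(np.sum(active != prev)))
    prev = active
```

Without the hard truncation every weight stays positive, membership never changes discretely, and the turnover count is zero everywhere.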

As suspected, some modifications to the data set are needed to deal with the abrupt jumps in the plots.

Thank you for your feedback and effort.

Just putting in my two cents on this, I would probably use the two-stage approach. I don't think there's a particular benefit to the one-stage method and the one-stage method is A LOT more work.

If I really had to use a one-stage method to find smoothly varying parameters across 10 time points in a factor analysis, I wouldn't use a kernel smoother. With only 10 time points, I don't see a need to jump to a kernel smoother. A polynomial will also provide a smoothly varying functional form for the parameters and is much easier to implement. If you had 100 time points, a kernel smoother might be worth it, but with 10 time points a 9th-order polynomial will connect all the points perfectly. Lower-order polynomials will smooth more, forcing the estimated parameters onto a more slowly varying curve.
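With made-up loading estimates, the order trade-off looks like this (a Python/NumPy sketch, not OpenMx):

```python
import numpy as np

times = np.arange(10.0)
rng = np.random.default_rng(1)
loadings = 0.6 + 0.02 * times + rng.normal(0.0, 0.02, 10)  # made-up estimates

# A 9th-order polynomial interpolates all 10 points (near-)exactly;
# a lower order smooths toward a slowly varying curve.
fit9 = np.polyval(np.polyfit(times, loadings, 9), times)
fit2 = np.polyval(np.polyfit(times, loadings, 2), times)

err9 = float(np.max(np.abs(fit9 - loadings)))
err2 = float(np.max(np.abs(fit2 - loadings)))
```

The quadratic leaves visible residuals (that's the smoothing), while the 9th-order fit passes through every estimate; you'd pick the order by how much wiggle you believe is real.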

For implementation, I'd create some mxAlgebras and mxMatrices:

PolyOrder <- 4  # e.g., 4 (an order-4 polynomial has 5 coefficients)
time <- 3       # e.g., 3; time is defined for each time point / group

# Polynomial coefficients defining the time-varying pattern for Load1
mxMatrix(name="PolyCoefsLoad1", nrow=1, ncol=PolyOrder+1, values=1, free=TRUE)
# Powers of time for the current time point
mxMatrix(name="PolyIndepVar", nrow=PolyOrder+1, ncol=1, values=time^(0:PolyOrder))
# Gives the smoothed value for Load1 at time 3
mxAlgebra(PolyCoefsLoad1 %*% PolyIndepVar, name=paste("Load1AtTime", time, sep=""))

# Alternatively, for all loadings at once:
nvar <- 7  # number of observed variables
mxMatrix(name="PolyCoefsLoad", nrow=nvar, ncol=PolyOrder+1, values=1, free=TRUE)
mxMatrix(name="PolyIndepVar", nrow=PolyOrder+1, ncol=1, values=time^(0:PolyOrder))
# Gives the smoothed values for all factor loadings at time 3
mxAlgebra(PolyCoefsLoad %*% PolyIndepVar, name=paste("LoadsAtTime", time, sep=""))

Good day,

Thank you for the input. I did the smoothing already by using weights (through a weighting function) but I was wondering why the weights are not working. Attached are the data set and the code.

Thank you.