We wish to test a bivariate Cholesky model, but using a CFA of raw items as input (rather than scale scores).

What springs to mind is to build a raw-data RAM-based CFA, and then use the individual-level cells representing latent variable scores as input to matrix-style models in a twin analysis.

The RAM models would optimise against the raw data, while the matrix models would take no data and instead optimise against the RAM latent factors.

Does that make sense? Are there any examples floating around? Or alternative approaches (a hierarchical factor model and Cholesky all in one, I guess)?

If the number of items is not large (<10 or so), then a common pathway model could be fitted. With larger numbers of items, a marginal maximum likelihood approach might be used. However, if the number of factors is large, this will also be prohibitive in terms of CPU time. At least until we get a grid version of OpenMx with individual-likelihood-level granularity. I've started work on a GPU version of the integration routine, which might also help, but probably not for a year or two.
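Whichever fitting strategy is used, the bivariate Cholesky rests on the same matrix algebra: each variance component is parameterized as a lower-triangular matrix times its transpose, which keeps the implied covariance matrix positive semi-definite by construction. A minimal numpy sketch (the path values are hypothetical, purely for illustration):

```python
import numpy as np

# Bivariate Cholesky parameterization of one variance component (e.g. A):
# a lower-triangular path matrix 'a' implies A = a a', which is positive
# semi-definite by construction. Values below are hypothetical.
a = np.array([[0.8, 0.0],
              [0.5, 0.6]])   # additive-genetic paths (lower triangular)
A = a @ a.T                  # implied additive-genetic covariance matrix

# Positive semi-definiteness follows from the L L' form
assert np.all(np.linalg.eigvalsh(A) >= -1e-12)

# Genetic correlation between the two phenotypes
rA = A[0, 1] / np.sqrt(A[0, 0] * A[1, 1])
print(round(rA, 3))
```

The same `L @ L.T` construction would apply to the C and E components; in OpenMx this is what the lower-triangular `mxMatrix` objects in a Cholesky script express.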

You could, as you suggest, use factor scores as input for a second step in the analysis. This does carry the disadvantage that, except under rather unusual Rasch model conditions, not all factor scores are created equal. That is, they differ in their error of measurement. Of course, this is frequently an issue with sum scores or scale scores generally, with analyses of which the literature is replete. Only the odd purist amongst us seems to mind :).
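To put a number on "not all factor scores are created equal": under a one-factor model with standardized variables, the reliability of regression (Thurstone) factor scores is λ'Σ⁻¹λ, which depends on the loadings of the items that happen to be available. A small numpy sketch (the loadings are hypothetical):

```python
import numpy as np

# Reliability of regression (Thurstone) factor scores under a one-factor
# model with standardized items: rho = lambda' Sigma^{-1} lambda, where
# Sigma = lambda lambda' + Theta. Loadings below are hypothetical.
def factor_score_reliability(loadings):
    lam = np.asarray(loadings, dtype=float)
    theta = 1.0 - lam**2                          # uniquenesses
    sigma = np.outer(lam, lam) + np.diag(theta)   # implied item covariances
    return lam @ np.linalg.solve(sigma, lam)

strong = factor_score_reliability([0.8, 0.8, 0.8])
weak = factor_score_reliability([0.4, 0.4, 0.4])

# Scores built from weakly loading items carry more measurement error,
# so feeding both sets of scores into a second-step twin model as if
# they were equally precise is optimistic.
assert weak < strong < 1.0
print(round(strong, 3), round(weak, 3))
```

The second-step analysis then inherits this heterogeneous, unmodelled error, which is exactly the attenuation problem the one-step (common pathway or marginal likelihood) approaches avoid.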