Creating Composite Scores in R

There may be times when you need scores corrected for measurement error and weighted by latent factor loadings. In these instances it can be beneficial to use composite scores derived from one-factor congeneric models. While easy, they are time consuming to calculate by hand, and I am unaware of any program that does it for you. The following R code takes normal Mplus output (which must include FSCOEFFICIENT FSDETERMINACY in the OUTPUT line of your Mplus input syntax) and produces a vector of fit values and a vector of weights. I have also included the nifty feature of calculating latent variable reliability estimates.

The code is in the early stages of development, so it only supports scales with 2 to 4 items, but you will get the idea. This looks like a lot of work, BUT in reality all you need to do is specify where your Mplus output file is located in the first line of code and then run all the code, which gives you proportional weights for composite score calculation in about 5 seconds. Lines beginning with # are comments that explain each step; the remaining lines are the R code to run.

Creation of proportional factor score regression: Weights for composite score creation

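#First run your model in Mplus and read the output file into R.
#Replace the placeholder path below with the location of your own file
#(in R, use forward slashes or doubled backslashes in Windows paths).
 
out <- readLines( "D:/MPLUS/FILENAME.out" )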
 
#Next we use the grep feature to tell R to find aspects of our data
#to include in the output we will produce. This has the advantage of
#being the same no matter how long or short your Mplus input file is
#(that means you don't need to constantly count lines and change input).
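
To see why searching for a heading is more robust than counting lines, here is a minimal sketch on a made-up character vector standing in for an Mplus output file. The contents of `toy`, the `toy_` variable names and the column position 38 are illustrative assumptions only; real output may be laid out differently.

```r
# Hypothetical stand-in for what readLines() returns; not real Mplus output.
toy <- c(
  "MODEL FIT INFORMATION",
  "",
  "Chi-Square Test of Model Fit",
  "",
  sprintf("%-37s%s", "          Value", "12.345"),
  sprintf("%-37s%s", "          Degrees of Freedom", "2"),
  sprintf("%-37s%s", "          P-Value", "0.0021")
)

# Find the heading, then read values at fixed offsets below it.
toy_start <- grep("Chi-Square Test of Model Fit", toy)[1]
toy_chisq <- as.numeric(substring(toy[toy_start + 2], 38))
toy_df    <- as.numeric(substring(toy[toy_start + 3], 38))
toy_p     <- as.numeric(substring(toy[toy_start + 4], 38))

# Prepend extra lines, as a longer input file would: the heading moves,
# but the values found relative to it stay the same.
toy_longer <- c("INPUT INSTRUCTIONS", "  TITLE: a longer input", "", toy)
toy_start2 <- grep("Chi-Square Test of Model Fit", toy_longer)[1]
toy_chisq2 <- as.numeric(substring(toy_longer[toy_start2 + 2], 38))
```

The absolute line number changes between `toy` and `toy_longer`, yet the extracted chi-square is identical, which is the point of anchoring on the heading.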
 
#Let's start with chi-square. This will give us the value, degrees of freedom and significance.
 
# Chi Square
 
ind1 <- grep( "Chi-Square Test of Model Fit" , out )[1]
 
chisq <- c( as.numeric( substring( out[ ind1 + 2 ] , 38 ) ) )
 
df <- c( as.numeric( substring( out[ ind1 + 3 ] , 38 ) ) )
 
p <- c( as.numeric( substring( out[ ind1 + 4 ] , 38 ) ) )
 
#Next we will get the fit
 
# CFI / TLI
 
ind2 <- grep( "CFI/TLI" , out )
 
cfi <- c( as.numeric( substring( out[ ind2 + 2 ] , 38 ) ) )
 
tli<- c(as.numeric( substring( out[ ind2 + 3 ] , 38 ) ) )
 
#RMSEA
 
ind3 <- grep( "RMSEA" , out )[1]
 
rmsea <- c( as.numeric( substring( out[ ind3 + 2 ] , 38 ) ) )
 
# SRMR
 
ind4 <- grep( "SRMR" , out )[1]
 
srmr <- c( as.numeric( substring( out[ ind4 + 2 ] , 38 ) ) )
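
The fit indices above all follow the same find-the-heading, step-down, cut-out-the-number pattern, so it could be wrapped in a small function. The sketch below is one possible way to do that; `grab_value` is a hypothetical helper that is not part of the original script, and the `demo` lines are made up for illustration.

```r
# Hypothetical helper: return the number found `offset` lines below the
# first line matching `heading`, starting at character column `start`.
grab_value <- function(lines, heading, offset, start = 38) {
  idx <- grep(heading, lines, fixed = TRUE)[1]
  if (is.na(idx)) {
    return(NA_real_)  # heading not present in this output
  }
  as.numeric(substring(lines[idx + offset], start))
}

# Made-up lines standing in for part of an Mplus output file.
demo <- c(
  "RMSEA (Root Mean Square Error Of Approximation)",
  "",
  sprintf("%-37s%s", "          Estimate", "0.045")
)

demo_rmsea   <- grab_value(demo, "RMSEA", 2)
demo_missing <- grab_value(demo, "SRMR", 2)  # heading absent, so NA
```

Returning NA when a heading is absent avoids a confusing zero-length result if the output file lacks a section.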
 
#Next we calculate the latent factor reliabilities. 
#First we grab the factor loadings then the residuals
 
#Factor Loadings
 
ind6 <- grep( "STDYX Standardization" , out )
 
PEST <- c( as.numeric( substring( out[ ind6 + 6 ] , 23, 28) ),
           as.numeric( substring( out[ ind6 + 7 ] , 23, 28) ),
           as.numeric( substring( out[ ind6 + 8 ] , 23, 28) ),
           as.numeric( substring( out[ ind6 + 9 ] , 23, 28) )
)
 
#Residuals
 
RES <- c( as.numeric( substring( out[ ind6 + 21 ] , 23, 28) ),
          as.numeric( substring( out[ ind6 + 22 ] , 23, 28) ),
          as.numeric( substring( out[ ind6 + 23 ] , 23, 28) ),
          as.numeric( substring( out[ ind6 + 24 ] , 23, 28) )
)
 
#Next we calculate the reliability rounded to 2 decimal places.
 
rel <- round( (sum (PEST)^2)/ ( (sum (PEST)^2) + sum (RES) ), digits = 2)
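
The reliability line computes the squared sum of the standardized loadings divided by that same quantity plus the sum of the residual variances. Here is a small worked sketch with made-up loadings (the `toy_` values are assumptions, not taken from any real model) to make the arithmetic visible.

```r
# Made-up standardized loadings for a 4-item scale.
toy_loadings <- c(0.70, 0.80, 0.60, 0.75)

# For standardized items, the residual variance is 1 minus the squared loading.
toy_residuals <- 1 - toy_loadings^2

# In R, sum(x)^2 squares the sum (it is not the sum of squares).
toy_num <- sum(toy_loadings)^2           # (0.70 + 0.80 + 0.60 + 0.75)^2
toy_rel <- round(toy_num / (toy_num + sum(toy_residuals)), digits = 2)
toy_rel
```

With these numbers the squared sum is 8.1225 and the residuals add to 1.9475, so the reliability is 8.1225 / 10.07, which rounds to 0.81.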
 
#Now, specific to the factor scores we want, we will get the factor
#determinacies and regression weights. The determinacy is extracted first
#because it is stored alongside the fit values.
 
#Factor score determinacies
 
ind5 <- grep( "           FACTOR DETERMINACIES"  , out )
 
fsd <- c(as.numeric( substring( out[ ind5 + 2 ] , 22, 27) ) )
 
#We can now wrap the fit into a data frame and
#move on to calculating the proportional weights
#for the composite score calculations.
 
fit <- data.frame(chisq, df, p, cfi, tli, rmsea, srmr, fsd, rel)
 
#Factor score regression weights
 
ind7 <- grep( "SUMMARY OF FACTOR SCORES" , out )
 
FSRW <- c( as.numeric( substring( out[ ind7 + 9 ] , 16, 21) ),
           as.numeric( substring( out[ ind7 + 9 ] , 30, 35) ),
           as.numeric( substring( out[ ind7 + 9 ] , 44, 49) ),
           as.numeric( substring( out[ ind7 + 9 ] , 58, 63) ),
           as.numeric( substring( out[ ind7 + 9 ] , 72, 77) )
)
 
#Now that we have the data, we just calculate the weights
 
FSTOT <- sum(FSRW, na.rm = TRUE)
 
FSPROP <- FSRW / FSTOT
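
The na.rm argument matters when a scale has fewer items than the number of columns read. A minimal sketch with made-up weights for a 3-item scale, assuming the unused slots come back as NA (the `toy_` names and values are illustrative only):

```r
# Made-up factor score regression weights; the last two slots are unused.
toy_fsrw <- c(0.30, 0.45, 0.75, NA, NA)

# Without na.rm the total is NA, which would make every weight NA.
toy_total_bad <- sum(toy_fsrw)

# With na.rm = TRUE the missing slots are skipped.
toy_total <- sum(toy_fsrw, na.rm = TRUE)   # 0.30 + 0.45 + 0.75
toy_prop  <- toy_fsrw / toy_total          # each weight as a share of the total

# The non-missing proportional weights add up to 1.
toy_check <- sum(toy_prop, na.rm = TRUE)
```

Each proportional weight is simply that item's share of the total regression weight, so the three real weights here become 0.2, 0.3 and 0.5.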
 
#Finally we call the output we need.
 
#First we check the proportional weights add to 1 as required
 
sum(FSPROP, na.rm = TRUE)
 
#Next we get the vector of fit values and the latent variable reliability
 
fit
 
#Now all we need is the composite score weights
 
FSPROP
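
The script stops at the weights. One plausible next step, sketched here under assumptions rather than taken from the code above, is to multiply each respondent's item scores by the proportional weights and add them up. The `items` data and the weights `w` below are made up; in practice `w` would be the non-missing entries of FSPROP, in the same order as the item columns.

```r
# Made-up responses: 3 respondents by 4 items.
items <- data.frame(
  item1 = c(1, 3, 5),
  item2 = c(2, 3, 4),
  item3 = c(1, 4, 5),
  item4 = c(2, 2, 5)
)

# Hypothetical proportional weights standing in for FSPROP (they sum to 1).
w <- c(0.1, 0.2, 0.3, 0.4)

# Weighted sum across items for each respondent.
composite <- as.vector(as.matrix(items) %*% w)
composite
```

For the first respondent this is 0.1*1 + 0.2*2 + 0.3*1 + 0.4*2 = 1.6. If FSPROP contains NA slots for a shorter scale, something like FSPROP[!is.na(FSPROP)] would be needed before multiplying.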

About Philip Parker

I am a postdoc in developmental and educational psychology at a German university. I did my PhD at the University of Sydney in stress and well-being. Most days I am hunched over a computer yelling at statistical software or responding to journal editors who always seem to want twice the amount of content with half the words. For fun I like to read up on the latest developments in R and program various functions.