Finding information on how to generate data in R that looks like the data you would use in a research paper is hard. As such, I will post a few examples of how to do so here. Many thanks to Dason from Talk Stats for getting me started (R code is in italics). Let's start with simple regression; I will post more complex examples later.

**So to generate a simple regression**:

#First create a set of random numbers with a mean of 0 and a SD of 1
age<-rnorm(100)
#To set different mean and sd use mean = x, sd = x
#For example age<-rnorm(100, mean=5, sd=1.5)
#You will want to round the values so the data set looks more realistic
age<-round(age, digits = 2)
#Create a model matrix in which age is the predictor
X <- model.matrix(~age)
#Check the column names of the model matrix, (Intercept) then age, so the values in beta are given in the matching order
dimnames(X)[[2]]
#Set the intercept and regression parameter: intercept at 1 and effect of age on y at 1.3 (these can be whatever you like)
beta <- c(1,1.3)
#Add error, keeping in mind the scale of the random variable generated. Vary the sd to see the effect of error on significance levels and parameter bias.
#Taking the number of rows from X makes the error easy to set and change, and means the code does not have to change if we alter the number of 'participants' in our randomly generated data
error <- rnorm(dim(X)[1],mean = 0, sd = 0.5)
#Create the outcome variable from the true parameters plus error (as.vector turns the one-column matrix into a plain vector)
SC <- as.vector(X%*%beta + error)
#round the data to be more realistic
SC <- round(SC, digits = 2)
#package into a data frame
mydata <- data.frame(age,SC)
#See if it works
plot(age, SC)
model <- lm(SC ~ age, data = mydata)
summary(model)
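
The comment on the error term suggests varying the error to see what it does to significance levels. Here is a minimal sketch of one way to try that. It wraps the steps above in a helper I am calling simulate_regression (a hypothetical name, not part of the code above); the seed and the list of error SDs are arbitrary choices for illustration only.

```r
# Hypothetical helper (not from the original code): wraps the steps above so
# that the sample size, true parameters and error SD can be varied.
simulate_regression <- function(n = 100, beta = c(1, 1.3), error_sd = 0.5) {
  age <- round(rnorm(n), digits = 2)
  X <- model.matrix(~ age)
  error <- rnorm(nrow(X), mean = 0, sd = error_sd)
  SC <- round(as.vector(X %*% beta + error), digits = 2)
  data.frame(age, SC)
}

set.seed(123)  # arbitrary seed, only so the run can be reproduced

# Assumed values for illustration: refit the model at several error SDs and
# collect the slope estimate, its standard error and its p-value.
error_sds <- c(0.5, 2, 5, 10)
results <- do.call(rbind, lapply(error_sds, function(s) {
  fit <- lm(SC ~ age, data = simulate_regression(error_sd = s))
  est <- coef(summary(fit))["age", ]
  data.frame(error_sd = s,
             slope = est[["Estimate"]],
             std_error = est[["Std. Error"]],
             p_value = est[["Pr(>|t|)"]])
}))
print(results)
```

Each row of results comes from a fresh random data set, so the slope should sit near the true value of 1.3 when the error is small and wander further from it, with a larger standard error, as the error SD grows.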

## About Philip Parker

I am a postdoc in developmental and educational psychology at a German university. I did my PhD at the University of Sydney in stress and well-being. Most days I am hunched over a computer yelling at statistical software or responding to journal editors who seem to always want twice the amount of content but with half the words. For fun I like to read up on the latest developments in R and program various functions.