lundi 27 juillet 2020

Function in R that performs multiple operations over columns of two datasets

I have two datasets, each with 5 columns and 10,000 rows. I want to calculate y from values in columns between the two datasets, column 1 in data set 1 and column 1 in data set 2; then column 2 in data set 1 and column 2 in data set 2. The yneeds nonetheless to follow a set of rules before being calculated. What I did so far doesn't work, and I cannot figure it out why and if there is a easier way to do all of this.

  1. Create data from t-distributions
mx20 <- as.data.frame(replicate(10000, rt(20,19)))
mx20.50 <- as.data.frame(replicate(10000, rt(20,19)+0.5)) 
  1. Calculates the mean for each simulated sample
m20 <- apply(mx20, FUN=mean, MARGIN=2)
m20.05 <- apply(mx20.50, FUN=mean, MARGIN=2)

The steps 1 and 2_ above are repeated for five sample sizes from t-distributions rt(30,29); rt(50,49); rt(100,99); and rt(1000,999)

  1. Bind tables (create data.frame) for each t-distribution specification
tbl <- cbind(m20, m30, m50, m100, m1000)
tbl.50 <- cbind(m20.05, m30.05, m50.05, m100.05, m1000.05)
  1. Finally, I want to calculate the y as specified above. But here is where I get totally lost. Please see below my best attempt so far.

y = (mtheo-m0)/(m1-m0), where y = 0 when m1 < m0 and y = y when m1 >= m0. mtheo is a constant (e.g. 0.50), m1 is value in column 1 of tbl and m0 is value in column 1 of tbl.50.

ycalc <- function(mtheo, m1, m0) {
  ifelse(m1>=m0) {
    y = (mteo-m0)/(m1-m0)
  } ifelse(m1<m0) {
    y=0
  } returnValue(y)
} 

Aucun commentaire:

Enregistrer un commentaire