Wednesday, 22 May 2019

Applying a label depending on which condition is met using R

I would like to write a simple R function that reads a specified data frame column row by row and, depending on each value, writes a string label to that row in a new column.

So far, I've tried combining loops with individually generated columns that I merge afterwards, but I cannot get the syntax right.

The input looks like this:

head(data,10)
# A tibble: 10 x 5
   Patient T1Score T2Score T3Score T4Score
     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1       3    96.4    75      80.4    82.1
 2       5   100      85.7    53.6    55.4
 3       6    82.1    85.7    NA      NA  
 4       7    82.1    85.7    60.7    28.6
 5       8   100      76.8    64.3    57.7
 6      10    46.4    57.1    NA      75  
 7      11    71.4    NA      NA      NA  
 8      12    98.2    92.9    85.7    82.1
 9      13    78.6    89.3    37.5    42.9
10      14    89.3   100      64.3    87.5

and the function I have written looks like this:

minMax<-function(x){

  #make an empty data frame for the output to go
  output<-data.frame()

    #making sure the rest of the commands only look at what I want them to look at in the input object
  a<-x[2:5]
  #here I'm gathering the columns necessary to perform the calculation
  minValue<-apply(a,1,min,na.rm=T)
  maxValue<-apply(a,1,max,na.rm=T)

  tempdf<-as.data.frame((cbind(minValue,maxValue)))

  Difference<-tempdf$maxValue-tempdf$minValue
  referenceValue<-ave(Difference)
  referenceValue<-referenceValue[1]

  #quick aside to make the first two thirds of the output file
  output<-as.data.frame((cbind(x[1],Difference)))

    #Now I need to define the class based on the referenceValue, and here is where I run into trouble.
  apply(output, 1, FUN = 
  for (i in Difference) {
  ifelse(i>referenceValue,"HIGH","LOW")

  }
  )
  output
  } 

I also tried...

    if (i>referenceValue) {
    apply(output,1,print("HIGH"))
   }else(print("LOW")) {}
  }
  )
  output
  } 



Regardless, both attempts end with this error message:

 c("'for (i in Difference) {' is not a function, character or symbol", "'    ifelse(i > referenceValue, \"HIGH\", \"LOW\")' is not a function, character or symbol", "'}' is not a function, character or symbol") 
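For what it's worth, this message appears to come from how apply() resolves its FUN argument: it expects a function (or the name of one), and a bare for loop is an expression, not a function. A minimal sketch of the difference, using made-up values and hypothetical names (difference, reference_value, labels_loop, labels_vec) rather than anything from the original function:

```r
# Made-up differences for illustration only (not the real data).
difference <- c(21.4, 46.4, 3.6)
reference_value <- mean(difference)

# The apply family needs a function object, so the test is wrapped
# in an anonymous function instead of a bare for loop.
labels_loop <- sapply(
  difference,
  function(d) if (d > reference_value) "HIGH" else "LOW"
)

# ifelse() is already vectorized, so no loop or apply call is needed.
labels_vec <- ifelse(difference > reference_value, "HIGH", "LOW")

print(labels_vec)
```

Both forms should yield the same character vector; the vectorized ifelse() is usually the simpler choice.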

The expected output should look like:

Patient Difference Toxicity
3  21.430000 LOW
5  46.430000 HIGH
6   3.570000 LOW
7  57.140000 HIGH
8  42.310000 HIGH
10  28.570000 HIGH
11   0.000000 LOW
12  16.070000 LOW
13  51.790000 HIGH
14  35.710000 HIGH

Is there a better way for me to organize the last loop?
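
One possible way to reorganize it is to drop the final loop entirely and label all rows at once with a vectorized ifelse(). The sketch below is only an illustration under a few assumptions: min_max and score_cols are hypothetical names, the sample data frame is retyped from the ten rounded rows shown above, and the reference value is taken as the mean of the differences over whatever rows are passed in. Because the reference depends on the rows supplied, labels computed on these ten rows alone may differ from labels computed on the full data set.

```r
# The ten rows shown by head(data, 10), retyped with the rounded values.
data <- data.frame(
  Patient = c(3, 5, 6, 7, 8, 10, 11, 12, 13, 14),
  T1Score = c(96.4, 100, 82.1, 82.1, 100, 46.4, 71.4, 98.2, 78.6, 89.3),
  T2Score = c(75, 85.7, 85.7, 85.7, 76.8, 57.1, NA, 92.9, 89.3, 100),
  T3Score = c(80.4, 53.6, NA, 60.7, 64.3, NA, NA, 85.7, 37.5, 64.3),
  T4Score = c(82.1, 55.4, NA, 28.6, 57.7, 75, NA, 82.1, 42.9, 87.5)
)

# Hypothetical rewrite of minMax(); score_cols selects the score columns.
min_max <- function(x, score_cols = 2:5) {
  scores <- x[score_cols]

  # Row-wise minimum and maximum, ignoring missing scores.
  # Note: a row with no scores at all would give Inf/-Inf plus a warning.
  min_value <- apply(scores, 1, min, na.rm = TRUE)
  max_value <- apply(scores, 1, max, na.rm = TRUE)
  difference <- unname(max_value - min_value)

  # Assumption: the cut-off is the mean difference over the supplied rows.
  reference_value <- mean(difference)

  data.frame(
    Patient = x[[1]],
    Difference = difference,
    # ifelse() compares every row to the cut-off in one call.
    Toxicity = ifelse(difference > reference_value, "HIGH", "LOW"),
    stringsAsFactors = FALSE
  )
}

result <- min_max(data)
print(result)
```

The function returns a new data frame instead of filling an empty one, which avoids the intermediate cbind() steps.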
