Wednesday, 4 April 2018

R: adding an if/else statement to a function to return "null"

I have a question linked to another post from yesterday: R finding the first value in a data frame that falls within a given threshold.

As in the previous post, I have a data frame with optical density (OD) over time:

time    OD
446     0.0368
446.5   0.0353
447     0.0334
447.5   0.032
448     0.0305
448.5   0.0294
449     0.0281
449.5   0.0264
450     0.0255
450.5   0.0246
451     0.0238
451.5   0.0225
452     0.0211
452.5   0.0199
453     0.0189
453.5   0.0175

I have upper and lower threshold values of OD, and I need to locate these within the data frame.

Using:

library(dplyr)

find_time = function(df, threshold){
  return_value = df %>%
    arrange(time) %>%
    filter(OD < threshold) %>%
    slice(1)
  return(return_value)
}

find_time(data, threshold)

which returns exactly what I am looking for:

  time     OD
  <dbl>  <dbl>
   446 0.0368

However, I need both the upper (0.5033239) and lower (-0.3695971) thresholds, so I have changed the function to:

find_time = function(df, threshold_1, threshold_2){
  # first row (by time) where OD is above the upper threshold
  return_value_1 = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice(1)

  # first row (by time) where OD is below the lower threshold
  return_value_2 = df %>%
    arrange(time) %>%
    filter(OD < threshold_2) %>%
    slice(1)

  return(data.frame(return_value_1, return_value_2))
}

When I run the code I get one of two outcomes. Either this error:

Error in data.frame(return_value, return_value_2) :
  arguments imply differing number of rows: 1, 0
Called from: data.frame(return_value, return_value_2)

or an empty data frame:

[1] time   OD     time.1 OD.1  
<0 rows> (or 0-length row.names)

These problems seem to be caused by the fact that, for some study subjects, the OD data never reaches the upper threshold. When this is the case, the function doesn't return the other value either (which does exist in the data) and instead produces the output above.
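A minimal sketch of why this happens, using base R only. The three-row data frame and the 0.035 cut-off are illustrative assumptions, not the real study data. data.frame() refuses to combine a 1-row result with a 0-row result, while two 0-row results combine silently into an empty data frame.

```r
# Illustrative data (first rows of the table above)
data = data.frame(time = c(446, 446.5, 447),
                  OD   = c(0.0368, 0.0353, 0.0334))

one_row  = data[data$OD < 0.035, ][1, ]  # a match exists: 1 row
zero_row = data[data$OD > 0.5033239, ]   # threshold never reached: 0 rows

# Combining 1 row with 0 rows fails; tryCatch captures the error message
msg = tryCatch(data.frame(one_row, zero_row),
               error = function(e) conditionMessage(e))
print(msg)

# Two 0-row results combine without an error, but the result is empty
empty = data.frame(zero_row, zero_row)
print(empty)
```

So the first outcome appears when only one threshold is missed, and the second when both are missed.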

I would like to add an if statement within the function so that when either the upper or the lower threshold is not found, it returns "null" for that threshold but still gives me the value of the other (i.e. if the upper threshold is not reached, return null but still give the time and OD for the lower threshold).

I have tried, but clearly I'm doing it horribly wrong:

find_time = function(df, threshold_1, threshold_2){
  return_value_1 = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice(1)

  # added if statement
  if(OD > threshold_1){
    print(return_value_1)
  } else {
    print("NULL")
  }

  return_value_2 = df %>%
    arrange(time) %>%
    filter(OD < threshold_2) %>%
    slice(1)

  # added if statement
  if(OD < threshold_2){
    print(return_value_2)
  } else {
    print("NULL")
  }

  return(data.frame(return_value_1, return_value_2))
}
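One possible way to get the behaviour described above, offered as a sketch rather than a definitive answer. The attempt fails because OD is a column of the data frame, not a variable visible inside the function, so if(OD > threshold_1) has nothing to test. Checking whether the filtered result has zero rows is one alternative. The helper name first_or_na, the output column names, and the 0.035 lower threshold are assumptions made for illustration. NA is used instead of "NULL" because a data frame cell cannot hold NULL.

```r
library(dplyr)

# Hypothetical helper: pass a 1-row result through unchanged,
# or replace a 0-row result with a single row of NA values
first_or_na = function(rows){
  if (nrow(rows) == 0) {
    return(data.frame(time = NA_real_, OD = NA_real_))
  }
  rows
}

find_time = function(df, threshold_1, threshold_2){
  # first row (by time) above the upper threshold, or NA if none
  upper = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice(1) %>%
    first_or_na()

  # first row (by time) below the lower threshold, or NA if none
  lower = df %>%
    arrange(time) %>%
    filter(OD < threshold_2) %>%
    slice(1) %>%
    first_or_na()

  # both pieces now have exactly one row, so they can be combined
  data.frame(time_upper = upper$time, OD_upper = upper$OD,
             time_lower = lower$time, OD_lower = lower$OD)
}

# Illustrative data (first rows of the table above)
data = data.frame(time = c(446, 446.5, 447, 447.5),
                  OD   = c(0.0368, 0.0353, 0.0334, 0.032))

# The upper threshold is never reached here; 0.035 is an illustrative lower one
result = find_time(data, threshold_1 = 0.5033239, threshold_2 = 0.035)
print(result)
```

With this data the upper columns come back as NA while the lower columns still report time 447 and OD 0.0334, so one missing threshold no longer discards the other.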
