Wednesday, 20 May 2020

Applying functions to conditional rows with output in new column in R

I have a data frame of around 700 observations, and I have written three functions that calculate distances using three different methods. I want to run one of these functions on each row, chosen by conditional statements. I'm fairly new to R, so what I've come up with so far is an if statement inside a for loop.

What I want to do:

sightings <- data.frame(a=c(NA, 1, NA, 1, NA, 1), b=c(NA, NA, "HO", "HO", "LA", "LA"), 
                        c=c(100, 200, 300, 400, 500, 600), d = c(NA, NA, NA, NA, NA, NA))
#our df 

> sightings
   a    b   c  d
1 NA <NA> 100 NA
2  1 <NA> 200 NA
3 NA   HO 300 NA
4  1   HO 400 NA
5 NA   LA 500 NA
6  1   LA 600 NA

#desired output
   a    b   c   d
1 NA <NA> 100 100
2  1 <NA> 200 200
3 NA   HO 300 300
4  1   HO 400 467
5 NA   LA 500 500
6  1   LA 600 666

I have two functions: one that generates the distance in df$d when df$b == "HO" and another for when df$b == "LA" (both use several trigonometric functions and values from columns not shown here).

So where df$b == "HO" and df$a is not NA, I want to run ho.function and store the result in df$d; where df$b == "LA" and df$a is not NA, I want to run la.function and store the result in df$d. In every other row, df$d should simply take the value of df$c.
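The row conditions described above can also be written as logical vectors, which avoids the loop entirely. The following is only a sketch under assumptions: the real ho.function and la.function are not shown in this post, so the two functions below are hypothetical placeholders that take the values of column c and were chosen only so that the toy numbers match the desired output.

```r
# Toy data from the question
sightings <- data.frame(
  a = c(NA, 1, NA, 1, NA, 1),
  b = c(NA, NA, "HO", "HO", "LA", "LA"),
  c = c(100, 200, 300, 400, 500, 600)
)

# Hypothetical placeholders: the real functions are not shown in the post
ho.function <- function(x) x + 67
la.function <- function(x) x + 66

# One logical mask per method; %in% gives FALSE (not NA) where b is missing
use_ho <- !is.na(sightings$a) & sightings$b %in% "HO"
use_la <- !is.na(sightings$a) & sightings$b %in% "LA"

# Start from the fallback (copy c), then overwrite only the matching rows
sightings$d <- sightings$c
sightings$d[use_ho] <- ho.function(sightings$c[use_ho])
sightings$d[use_la] <- la.function(sightings$c[use_la])

print(sightings)
```

This relies on the functions being vectorised, which is usually the case for trigonometric calculations in R. If the real functions need values from other columns, those columns can be subset with the same masks.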

I have devised an if statement and tried putting it inside a for loop, but with no luck (I'm new to both for loops and if statements).

for(i in 1:nrow(df)){
  if(df$b[i] == "LA" && is.na(df$a) == FALSE){
    df$d <- la.function 
  } else if (df$b[i] == "HO" && is.na(df$a) == FALSE){
    df$d <- ho.function
  } else {
    df$d <- df$c
  }
}

The problem is that every value of df$d ends up equal to df$c (df$d == df$c returns TRUE for every row), so it feels like R is just jumping over my first two conditions?

Anyone with any knowledge/experience with this? Am I using the functions correctly?
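A few things in the loop above probably explain the behaviour. First, is.na(df$a) is not indexed with [i], so it tests the whole column; in R versions before 4.3, && silently used only the first element of that vector (newer versions raise an error instead). In the sample data the first value of a is NA, so that part of the condition is FALSE on every pass and only the else branch ever runs. Second, df$d <- ... assigns to the whole column rather than to row i, and la.function is never actually called. Third, df$b[i] == "LA" is NA when b is missing, which can make if() stop with "missing value where TRUE/FALSE needed". A corrected loop might look like the sketch below; the two functions are again hypothetical placeholders, since the real ones are not shown.

```r
# Toy data from the question
sightings <- data.frame(
  a = c(NA, 1, NA, 1, NA, 1),
  b = c(NA, NA, "HO", "HO", "LA", "LA"),
  c = c(100, 200, 300, 400, 500, 600)
)

# Hypothetical placeholders: the real functions are not shown in the post
ho.function <- function(x) x + 67
la.function <- function(x) x + 66

sightings$d <- NA_real_
for (i in seq_len(nrow(sightings))) {
  # Index every column with [i] so each test looks at a single value
  has_a <- !is.na(sightings$a[i])
  b_i <- sightings$b[i]
  if (has_a && !is.na(b_i) && b_i == "LA") {
    sightings$d[i] <- la.function(sightings$c[i])
  } else if (has_a && !is.na(b_i) && b_i == "HO") {
    sightings$d[i] <- ho.function(sightings$c[i])
  } else {
    # Fallback: a is missing, or b is neither "HO" nor "LA"
    sightings$d[i] <- sightings$c[i]
  }
}

print(sightings)
```

The !is.na(b_i) check comes before the comparison so that && short-circuits and the condition never evaluates to NA.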
