Tuesday, 21 November 2017

How to define multiple if conditions, including case_when

I was wondering how to define multiple if cases, each of which has to satisfy a specific condition.

For example, something like if (all(x < 0)) combined with a case_when on a < b and diff(x > 0).

Say we have a data frame,

df <- data.frame(
  gr=gl(3,5),
  percent = c(seq(0.1, 0.5,0.1), seq(-0.5,-0.1,0.1), seq(0.1, 0.5,0.1)),
  per=rep(c(0.25,-0.25,0.25),each=5),
  x=c(c(0,0,0,1,2), c(0,0.1,0,1,2),c(1,1,0,0.1,0)))

> df
   gr percent   per   x
1   1     0.1  0.25 0.0
2   1     0.2  0.25 0.0
3   1     0.3  0.25 0.0
4   1     0.4  0.25 1.0
5   1     0.5  0.25 2.0
6   2    -0.5 -0.25 0.0
7   2    -0.4 -0.25 0.1
8   2    -0.3 -0.25 0.0
9   2    -0.2 -0.25 1.0
10  2    -0.1 -0.25 2.0
11  3     0.1  0.25 1.0
12  3     0.2  0.25 1.0
13  3     0.3  0.25 0.0
14  3     0.4  0.25 0.1
15  3     0.5  0.25 0.0

I would like to classify each group with the user-defined function data_manip below and create a column named eastwood from the result.

In particular, I have trouble defining the multiple conditions for the Dirty classification.

To catch Dirty, I define the if condition like this: all(percent < 0) & (per < percent & any(diff(x > 0)))


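data_manip <- function(x, per, percent) {
  # Good: every percent is positive and the first three x values are zero
  if (all(percent > 0) & all(head(x, 3) == 0) & !isTRUE(per < percent & any(diff(x > 0)))) {
    "Good"
  # Dirty (negative group): every percent is negative, per is below percent, x changes
  } else if (all(percent < 0) & isTRUE(per < percent & any(diff(x > 0)))) {
    "Dirty"
  # Dirty (positive group): every percent is positive, per is above percent, x changes
  } else if (all(percent > 0) & isTRUE(per > percent & any(diff(x > 0)))) {
    "Dirty"
  } else {
    NA
  }
}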

library(dplyr)
df %>%
  group_by(gr)%>%
  do(data.frame(.,eastwood=data_manip(.$x,.$per,.$percent)))




# A tibble: 15 x 5
# Groups:   gr [3]
       gr percent   per     x eastwood
   <fctr>   <dbl> <dbl> <dbl>   <fctr>
 1      1     0.1  0.25   0.0     Good
 2      1     0.2  0.25   0.0     Good
 3      1     0.3  0.25   0.0     Good
 4      1     0.4  0.25   1.0     Good
 5      1     0.5  0.25   2.0     Good
 6      2    -0.5 -0.25   0.0     <NA>
 7      2    -0.4 -0.25   0.1     <NA>
 8      2    -0.3 -0.25   0.0     <NA>
 9      2    -0.2 -0.25   1.0     <NA>
10      2    -0.1 -0.25   2.0     <NA>
11      3     0.1  0.25   1.0     <NA>
12      3     0.2  0.25   1.0     <NA>
13      3     0.3  0.25   0.0     <NA>
14      3     0.4  0.25   0.1     <NA>
15      3     0.5  0.25   0.0     <NA>
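
A likely reason groups 2 and 3 end up as NA is that per < percent (or per > percent) is a vector with one value per row, while isTRUE() only returns TRUE for a single TRUE value. A minimal sketch of this, using the values of group 3 and the assumed helper name cond:

```r
# Values taken from group 3 of df
per     <- rep(0.25, 5)
percent <- seq(0.1, 0.5, 0.1)
x       <- c(1, 1, 0, 0.1, 0)

# Row-wise comparison combined with a single TRUE/FALSE from any()
cond <- per > percent & any(diff(x > 0) != 0)

length(cond)   # 5: one value per row, not a single TRUE/FALSE
isTRUE(cond)   # FALSE: isTRUE() needs a length-one TRUE
any(cond)      # TRUE: at least one row satisfies the comparison
```

So a condition wrapped in isTRUE() never fires here, and only the "Good" branch, where it is negated, can match.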

How can I combine multiple if conditions with case_when in a single call to catch the Dirty groups in the data? I would prefer a solution based on a user-defined function!
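
One possible shape for such a function is sketched below. It is not a definitive answer and rests on two assumptions that the question does not settle: the comparison between per and percent only has to hold for at least one row of the group (hence any()), and "Good" is decided only by all(percent > 0) and the first three x values being zero. The names data_manip2, x_changes and res are made up for this sketch.

```r
library(dplyr)

df <- data.frame(
  gr      = gl(3, 5),
  percent = c(seq(0.1, 0.5, 0.1), seq(-0.5, -0.1, 0.1), seq(0.1, 0.5, 0.1)),
  per     = rep(c(0.25, -0.25, 0.25), each = 5),
  x       = c(c(0, 0, 0, 1, 2), c(0, 0.1, 0, 1, 2), c(1, 1, 0, 0.1, 0))
)

# Hypothetical variant of data_manip: every condition is reduced to a
# single TRUE/FALSE, so case_when() returns one label per group.
data_manip2 <- function(x, per, percent) {
  # Does x switch between zero and positive anywhere in the group?
  x_changes <- any(diff(x > 0) != 0)

  case_when(
    all(percent > 0) && all(head(x, 3) == 0)           ~ "Good",
    all(percent < 0) && any(per < percent) && x_changes ~ "Dirty",
    all(percent > 0) && any(per > percent) && x_changes ~ "Dirty",
    TRUE                                               ~ NA_character_
  )
}

res <- df %>%
  group_by(gr) %>%
  mutate(eastwood = data_manip2(x, per, percent)) %>%
  ungroup()

res
```

Under these assumptions group 1 is labelled Good and groups 2 and 3 are labelled Dirty. If the comparison is meant to hold for every row instead, any() would have to be replaced by all().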

Thanks in advance!
