Tuesday, 24 May 2016

How to create a new column using multiple if else conditions in R? [duplicate]

I am working with a data frame of outliers. Here is an example:

Value <- c(420,440,-490,413,446,-466,454,-433,401,-414)
Residual <- c(230,240,295,253,266,286,254,233,201,214)
St_dev_Sigma <- c(20,40,30,13,46,56,54,33,11,14)

df1 <- data.frame(Value,Residual,St_dev_Sigma )

I am trying to create a new column, "Outlier_Category", that indicates whether each value lies outside the +/-3, +/-4.5, or +/-6 sigma limits.

Here are the upper and lower control limits (UCL and LCL) for 3, 4.5, and 6 sigma:

df1$UCL_3Sig <- Residual+(3*St_dev_Sigma )
df1$LCL_3Sig <- Residual-(3*St_dev_Sigma )
df1$UCL_4_5Sig <- Residual+(4.5*St_dev_Sigma )
df1$LCL_4_5_Sig <- Residual-(4.5*St_dev_Sigma )
df1$UCL_6Sig <- Residual+(6*St_dev_Sigma )
df1$LCL_6Sig <- Residual-(6*St_dev_Sigma )

I am using the following logic to create the new column:

Outlier_Category = "+/-3SIGMA" if (Value >= UCL_3Sig | Value <= LCL_3Sig) &
                              (Value < UCL_4_5Sig & Value > LCL_4_5_Sig)
Outlier_Category = "+/-4.5SIGMA" if (Value >= UCL_4_5Sig | Value <= LCL_4_5_Sig) &
                              (Value < UCL_6Sig & Value > LCL_6Sig)
Outlier_Category = "+/-6SIGMA" if (Value >= UCL_6Sig | Value <= LCL_6Sig)
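The banding rules can also be read as a distance from Residual measured in units of St_dev_Sigma. The sketch below assumes "outside" means at or above the UCL or at or below the LCL, and that St_dev_Sigma is always positive. The column name Sigma_Dist is hypothetical, introduced only for illustration. Values closer than 3 sigma come out as NA, and values sitting exactly on a limit may be classified differently from the explicit UCL/LCL comparison because of floating-point rounding.

```r
# Example data from the question
Value <- c(420, 440, -490, 413, 446, -466, 454, -433, 401, -414)
Residual <- c(230, 240, 295, 253, 266, 286, 254, 233, 201, 214)
St_dev_Sigma <- c(20, 40, 30, 13, 46, 56, 54, 33, 11, 14)
df1 <- data.frame(Value, Residual, St_dev_Sigma)

# Distance from Residual in units of sigma (hypothetical helper column)
df1$Sigma_Dist <- abs(df1$Value - df1$Residual) / df1$St_dev_Sigma

# Bands are [3, 4.5), [4.5, 6), [6, Inf); anything below 3 becomes NA
df1$Outlier_Category <- as.character(cut(
  df1$Sigma_Dist,
  breaks = c(3, 4.5, 6, Inf),
  labels = c("+/-3SIGMA", "+/-4.5SIGMA", "+/-6SIGMA"),
  right = FALSE
))
```

With this formulation the six UCL/LCL columns are not needed for the classification itself, although they can still be kept for reporting.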

My desired output is:

   Value Residual St_dev_Sigma UCL_3Sig LCL_3Sig UCL_4_5Sig LCL_4_5_Sig UCL_6Sig LCL_6Sig Outlier_Category
1    420      230           20      290      170      320.0       140.0      350      110        +/-6SIGMA
2    440      240           40      360      120      420.0        60.0      480        0      +/-4.5SIGMA
3   -490      295           30      385      205      430.0       160.0      475      115        +/-6SIGMA
4    413      253           13      292      214      311.5       194.5      331      175        +/-6SIGMA
5    446      266           46      404      128      473.0        59.0      542      -10        +/-3SIGMA
6   -466      286           56      454      118      538.0        34.0      622      -50        +/-6SIGMA
7    454      254           54      416       92      497.0        11.0      578      -70        +/-3SIGMA
8   -433      233           33      332      134      381.5        84.5      431       35        +/-6SIGMA
9    401      201           11      234      168      250.5       151.5      267      135        +/-6SIGMA
10  -414      214           14      256      172      277.0       151.0      298      130        +/-6SIGMA

I tried it this way, but it does not give my desired output, and I do not know how to handle three conditions:

df1$Outlier_Category = ifelse( (df1$Value >= df1$UCL_3Sig & df1$Value <=df1$LCL_3Sig ) & 
                            (df1$Value < df1$UCL_4_5Sig & df1$Value > df1$LCL_4_5_Sig) 
                            , "+/-3SIGMA", "+/-4.5SIGMA")

How do I get this working? I have tried using if statements but cannot get it right. Please provide some direction.
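One possible way to handle three conditions is to nest ifelse() calls and test the widest band first, so each row falls into the most extreme category it qualifies for. This is a minimal sketch, not a definitive answer. It assumes "outside" means at or above the UCL or at or below the LCL, so the two comparisons are joined with | rather than &. In the ifelse() attempt above, a value can never be both >= UCL_3Sig and <= LCL_3Sig, so that test is always FALSE. Using NA for rows inside the 3 sigma limits is also an assumption.

```r
# Example data and control limits from the question
Value <- c(420, 440, -490, 413, 446, -466, 454, -433, 401, -414)
Residual <- c(230, 240, 295, 253, 266, 286, 254, 233, 201, 214)
St_dev_Sigma <- c(20, 40, 30, 13, 46, 56, 54, 33, 11, 14)
df1 <- data.frame(Value, Residual, St_dev_Sigma)

df1$UCL_3Sig    <- df1$Residual + 3   * df1$St_dev_Sigma
df1$LCL_3Sig    <- df1$Residual - 3   * df1$St_dev_Sigma
df1$UCL_4_5Sig  <- df1$Residual + 4.5 * df1$St_dev_Sigma
df1$LCL_4_5_Sig <- df1$Residual - 4.5 * df1$St_dev_Sigma
df1$UCL_6Sig    <- df1$Residual + 6   * df1$St_dev_Sigma
df1$LCL_6Sig    <- df1$Residual - 6   * df1$St_dev_Sigma

# Test the widest band first; later tests only apply to rows that
# failed the earlier ones, so no explicit upper bound is needed.
df1$Outlier_Category <- with(df1,
  ifelse(Value >= UCL_6Sig | Value <= LCL_6Sig, "+/-6SIGMA",
  ifelse(Value >= UCL_4_5Sig | Value <= LCL_4_5_Sig, "+/-4.5SIGMA",
  ifelse(Value >= UCL_3Sig | Value <= LCL_3Sig, "+/-3SIGMA",
         NA_character_))))  # NA: inside the 3 sigma limits (assumption)
```

Ordering the tests from 6 sigma down to 3 sigma removes the need for the "and below the next limit" half of each rule, which keeps every condition to a single pair of comparisons.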
