Monday, 23 May 2016

Assign value using multiple conditions in R

I would like to apply a function made up of classification rules that assigns a value of high, medium, or low risk to a new column, based on each participant's gender, age, and race.

Let's assume this is my df

   gender age      race
1    male  11 NON_WHITE
2    male   9     WHITE
3  female  36 NON_WHITE
5  female   3     WHITE
6  female  81     WHITE
7  female  14 NON_WHITE
8  female  14 NON_WHITE
9  female  79 NON_WHITE
10   male  44     WHITE

I'd like to assign a value based on gender, age, and race. For example:

High = female; any-age; NON_WHITE OR male; >=70; NON_WHITE

Medium = female; >=75; WHITE OR male; <70; NON_WHITE

Low = female; <75; WHITE OR male; any-age; WHITE

The result would be a value assigned to df$class:

  gender age      race   class
1    male  11 NON_WHITE  Medium
2    male   9     WHITE     Low
3  female  36 NON_WHITE    High
5  female   3     WHITE     Low
6  female  81     WHITE  Medium
7  female  14 NON_WHITE    High
8  female  14 NON_WHITE    High
9  female  79 NON_WHITE    High
10   male  44     WHITE     Low

I wrote a function and applied it to the data frame:

# Classify one row; return the label rather than assigning to df inside the function.
Riskfun <- function(x) {
  age <- as.numeric(x["age"])  # apply() passes each row as a character vector
  if (x["gender"] == "female" && x["race"] == "NON_WHITE")
    return("High")
  if (x["gender"] == "male" && age >= 70 && x["race"] == "NON_WHITE")
    return("High")
  if (x["gender"] == "female" && age >= 75 && x["race"] == "WHITE")
    return("Medium")
  if (x["gender"] == "male" && age < 70 && x["race"] == "NON_WHITE")
    return("Medium")
  if (x["gender"] == "female" && age < 75 && x["race"] == "WHITE")
    return("Low")
  if (x["gender"] == "male" && x["race"] == "WHITE")
    return("Low")
  NA_character_  # no rule matched
}

df$class <- apply(df, 1, Riskfun)
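
One thing worth checking with a row-wise function like this: apply() converts the data frame to a matrix first, and with mixed column types that matrix is character, so x["age"] arrives as a string rather than a number. The small sketch below illustrates this on a two-row data frame (the name demo is made up for the example) and shows the as.numeric() conversion that makes the age comparisons numeric again.

```r
# Two-row stand-in for df; "demo" is a hypothetical name used only here
demo <- data.frame(
  gender = c("male", "female"),
  age    = c(9, 81),
  race   = c("WHITE", "WHITE"),
  stringsAsFactors = FALSE
)

# Inside apply(), each row is a character vector, including the age
row_types <- apply(demo, 1, function(x) class(x["age"]))
print(row_types)

# Converting back to numeric before comparing restores numeric semantics
ages <- apply(demo, 1, function(x) as.numeric(x["age"]))
print(ages >= 75)
```

Without the conversion, a test such as x["age"] >= 70 compares strings, not numbers, so the result may not be what the rule intends.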

Any thoughts or suggestions?
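
One possible approach, as a rough sketch rather than the only way to do it, is to skip the row-wise function and express the rules as vectorized logical conditions with nested ifelse(). The data frame below is rebuilt from the table in the question; the helper names female, non_white, high, and medium are made up for illustration, and the sketch assumes gender is only ever "male" or "female" and race only "WHITE" or "NON_WHITE", so everything that is neither High nor Medium falls through to Low.

```r
# Sample data reconstructed from the table in the question
df <- data.frame(
  gender = c("male", "male", "female", "female", "female",
             "female", "female", "female", "male"),
  age    = c(11, 9, 36, 3, 81, 14, 14, 79, 44),
  race   = c("NON_WHITE", "WHITE", "NON_WHITE", "WHITE", "WHITE",
             "NON_WHITE", "NON_WHITE", "NON_WHITE", "WHITE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper vectors, one logical value per row
female    <- df$gender == "female"
non_white <- df$race == "NON_WHITE"

# High: female NON_WHITE of any age, or male NON_WHITE aged 70 or over
high <- (female & non_white) | (!female & df$age >= 70 & non_white)

# Medium: female WHITE aged 75 or over, or male NON_WHITE under 70
medium <- (female & df$age >= 75 & !non_white) |
          (!female & df$age < 70 & non_white)

# Assumption: anything not High or Medium is Low
df$class <- ifelse(high, "High", ifelse(medium, "Medium", "Low"))
print(df)
```

Because the conditions operate on whole columns, age stays numeric and no per-row coercion is involved. If the data can contain other gender or race values, the final "Low" fallback would need an explicit condition instead.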
