mercredi 28 novembre 2018

R - combining ifelse and substr

My sample data is:

df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")

I'd like to adhere to the following rule:

If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".

I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:

df2 <- df %>%
       mutate(WRE = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
                           substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))

As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" and if not, "NA". However, using substr function I resorted to split both of these categories.

My ideal output is:

     Label       Work
1   10M_Amts      NA
2   D2B_Exp      Work
3   D3C_Exp      Work
4   D2_Amt       Work
5   D5_Amt       Work
6   53D_Amt       NA

As you can see the new column name is Work. In excel, I would write the following:

=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}), 
"Work", "")

where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.

Thank you so much in advance!

Aucun commentaire:

Enregistrer un commentaire