My sample data is:
df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")
I'd like to adhere to the following rule:
If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".
I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:
df2 <- df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))
As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" and if not, "NA". However, using substr function I resorted to split both of these categories.
My ideal output is:
Label Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA
As you can see the new column name is Work. In excel, I would write the following:
=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}),
"Work", "")
where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.
Thank you so much in advance!
Aucun commentaire:
Enregistrer un commentaire