Thursday, 3 September 2020

R - Creating new columns based on multiple conditions and time of event

I need to create new columns based on multiple conditions and time points from previous columns. I have the following data frame:

table <- data.frame(
  RowID = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10", "A11", "A12", "A13", "A14", "A15"),
  Machine = c("Ace", "Ace", "Ace", "Ame", "Ame", "Cay", "Cay", "Cay", "Cay", "Cay", "Gap", "Gap", "Dex", "Dex", "Dex"),
  Time = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3),
  Status = c("Good", "Good", "Bad", "Bad", "Good", "Good", "Bad", "Good", "Good", "Bad", "Good", "Good", "Bad", "Bad", "Good")
)

print(table)
   RowID Machine Time Status
1     A1     Ace    1   Good
2     A2     Ace    2   Good
3     A3     Ace    3    Bad
4     A4     Ame    1    Bad
5     A5     Ame    2   Good
6     A6     Cay    1   Good
7     A7     Cay    2    Bad
8     A8     Cay    3   Good
9     A9     Cay    4   Good
10   A10     Cay    5    Bad
11   A11     Gap    1   Good
12   A12     Gap    2   Good
13   A13     Dex    1    Bad
14   A14     Dex    2    Bad
15   A15     Dex    3   Good

For every Machine, Time shows when the reading was taken. I would like to create two new columns, Verdict and Outcome.

For the Verdict column, I would like to label every row of a Machine "YES" if that Machine has a "Good" status before a "Bad" one (e.g. Ace and Cay), and "NO" otherwise.

For the Outcome column, for Machines with Verdict "YES", I would like to label the row where the "Bad" status first appears as "Event", and the "Good" row directly before it as "BeforeEvent". Any other "Good" row that comes earlier, and so is not directly before the "Bad", should be labeled "Before", and every row after the first "Bad" should be labeled "After", whatever its status (e.g. A10). For Machines with Verdict "NO", Outcome should be "None", as in the expected output below.

The final data frame I am hoping to get is as follows:

table_new <- data.frame(
  RowID = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10", "A11", "A12", "A13", "A14", "A15"),
  Machine = c("Ace", "Ace", "Ace", "Ame", "Ame", "Cay", "Cay", "Cay", "Cay", "Cay", "Gap", "Gap", "Dex", "Dex", "Dex"),
  Time = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3),
  Status = c("Good", "Good", "Bad", "Bad", "Good", "Good", "Bad", "Good", "Good", "Bad", "Good", "Good", "Bad", "Bad", "Good"),
  Verdict = c("YES", "YES", "YES", "NO", "NO", "YES", "YES", "YES", "YES", "YES", "NO", "NO", "NO", "NO", "NO"),
  Outcome = c("Before", "BeforeEvent", "Event", "None", "None", "BeforeEvent", "Event", "After", "After", "After", "None", "None", "None", "None", "None")
)

print(table_new)
   RowID Machine Time Status Verdict     Outcome
1     A1     Ace    1   Good     YES      Before
2     A2     Ace    2   Good     YES BeforeEvent
3     A3     Ace    3    Bad     YES       Event
4     A4     Ame    1    Bad      NO        None
5     A5     Ame    2   Good      NO        None
6     A6     Cay    1   Good     YES BeforeEvent
7     A7     Cay    2    Bad     YES       Event
8     A8     Cay    3   Good     YES       After
9     A9     Cay    4   Good     YES       After
10   A10     Cay    5    Bad     YES       After
11   A11     Gap    1   Good      NO        None
12   A12     Gap    2   Good      NO        None
13   A13     Dex    1    Bad      NO        None
14   A14     Dex    2    Bad      NO        None
15   A15     Dex    3   Good      NO        None

I would really appreciate any help with this. I will need to repeat it many times, so it would be great if it could be automated. Thank you!
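To make the rules above concrete, here is a minimal base R sketch of one way they could be automated. It is not a definitive solution. The helper names `label_outcome` and `add_verdict_outcome` are hypothetical, and the sketch assumes that Status only ever takes the values "Good" and "Bad", and that a Machine counts as "YES" only when its first "Bad" reading is not its first reading overall.

```r
# Example data from the question, built compactly.
table <- data.frame(
  RowID   = paste0("A", 1:15),
  Machine = rep(c("Ace", "Ame", "Cay", "Gap", "Dex"), times = c(3, 2, 5, 2, 3)),
  Time    = c(1:3, 1:2, 1:5, 1:2, 1:3),
  Status  = c("Good", "Good", "Bad", "Bad", "Good", "Good", "Bad", "Good",
              "Good", "Bad", "Good", "Good", "Bad", "Bad", "Good"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label the readings of ONE machine.
# `status` must already be sorted by Time and contain only "Good"/"Bad".
label_outcome <- function(status) {
  n <- length(status)
  first_bad <- match("Bad", status)   # position of the first "Bad", NA if none
  # No "Bad" at all, or "Bad" on the very first reading: nothing to label.
  if (is.na(first_bad) || first_bad == 1) return(rep("None", n))
  outcome <- rep("Before", n)             # default for the early "Good" rows
  outcome[first_bad - 1] <- "BeforeEvent" # the "Good" row right before the event
  outcome[first_bad] <- "Event"           # the first "Bad" row
  if (first_bad < n) outcome[(first_bad + 1):n] <- "After"
  outcome
}

# Hypothetical wrapper: add Verdict and Outcome to a whole data frame.
add_verdict_outcome <- function(df) {
  ord <- order(df$Machine, df$Time)   # time order within each machine
  sorted <- df[ord, ]
  # ave() applies the helper per Machine and keeps each row's position.
  sorted$Outcome <- ave(as.character(sorted$Status), sorted$Machine,
                        FUN = label_outcome)
  # By assumption, a machine is "YES" exactly when it received real labels.
  sorted$Verdict <- ifelse(sorted$Outcome == "None", "NO", "YES")
  out <- sorted[order(ord), ]         # restore the original row order
  out[, c(names(df), "Verdict", "Outcome")]
}

result <- add_verdict_outcome(table)
print(result)
```

Because the logic lives in a function, it can be re-run on any data frame that has Machine, Time and Status columns. On the example data it should give the same Verdict and Outcome values as the expected output above, although edge cases such as a Machine that starts with "Bad" and later has another "Good" followed by "Bad" depend on the assumption stated above.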
