I currently have three values of interest: a, b, and c. I have a column 'v4' that is a binary indicator based on the 'v1' column: 1 if 'v1' is a, b, or c, and 0 if not.
Example:
v1 v2 v3 v4
a b c 1
b b c 1
d b c 0
An issue I have with my data is that sometimes there are years or other characters before the value. For example, I have instances of '2020 c'. This is a valid match, and I want to capture it as a 1 in column 'v4'. However, if the year comes after the value, it is not a valid match. For example, 'c 2020' should appear as a 0 in column 'v4'.
Example of how I want it to look:
v1 v2 v3 v4
a b c 1
b b c 1
d b c 0
c 2020 b c 0
2020 c b c 1
1990 c b a 1
How could I make this work? Currently I am using
df1$v4 <- as.integer(grepl("(a|b|c)$", df1$v1))
This is good at capturing all instances, but I am not able to exclude the instances where other data comes after the value I am trying to capture. Hopefully this makes sense.
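For reference, here is a minimal sketch of one pattern that would give the table above. It assumes 'v1' is a character column and that a value should count only when a, b, or c is the last whitespace-separated token in the string. The sample data frame is a hypothetical reconstruction of the example rows, and the name `pattern` is only for illustration.

```r
# Hypothetical sample data mirroring the example table above
df1 <- data.frame(
  v1 = c("a", "b", "d", "c 2020", "2020 c", "1990 c"),
  v2 = "b",
  v3 = c("c", "c", "c", "c", "c", "a"),
  stringsAsFactors = FALSE
)

# (^|[[:space:]]) : the letter must start the string or follow whitespace
# [abc]           : one of the values of interest
# [[:space:]]*$   : only optional trailing whitespace may follow it
pattern <- "(^|[[:space:]])[abc][[:space:]]*$"

df1$v4 <- as.integer(grepl(pattern, df1$v1))
df1$v4
# 1 1 0 0 1 1
```

Compared with "(a|b|c)$", this also requires the letter to stand alone as a token, so a value such as 'abc' would not match, and it tolerates stray trailing spaces. Whether that is the right rule depends on what the real data looks like.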