Thanks for your time.
I have a question about using ifelse within the mutate function. ifelse is from base R, while mutate is from the dplyr package.
My question is about how ifelse handles NA values.
I have two character vectors: example_character_vector contains some words and occasional NA values, while the other vector, color_indicator, contains only the words Green, Yellow, and Red.
I want to mutate my dataframe example_data_frame to create a new override_color_indicator variable that converts some of the yellows to greens depending on a condition in the example_character_vector.
Example data:
example_character_vector <- c("Basic", NA, "Full", "None", NA, "None", NA)
color_indicator <- c("Green", "Green", "Yellow", "Yellow", "Yellow", "Red", "Red")
example_data_frame <- data.frame(example_character_vector, color_indicator)
This example_data_frame looks like so:
  example_character_vector color_indicator
1                    Basic           Green
2                     <NA>           Green
3                     Full          Yellow
4                     None          Yellow
5                     <NA>          Yellow
6                     None             Red
7                     <NA>             Red
I am using nested ifelse statements within mutate to create a new column called override_color_indicator.
If color_indicator is Yellow and the example_character_vector contains the word "Full", I want the override_color_indicator to be Green (this is a special case within my data). Otherwise, I would like the override_color_indicator to be exactly the same as the color_indicator.
Here is my mutate:
library(dplyr)    # mutate() and %>%
library(stringr)  # str_detect()

example_data_frame <- example_data_frame %>%
  mutate(override_color_indicator =
           ifelse(color_indicator == "Green",
                  "Green",
                  ifelse(color_indicator == "Yellow" &
                           str_detect(example_character_vector, "Full"),
                         "Green",
                         ifelse(color_indicator == "Yellow" &
                                  !str_detect(example_character_vector, "Full") |
                                  color_indicator == "Yellow" &
                                  is.na(example_character_vector),
                                "Yellow",
                                "Red"))))
(Apologies for the formatting - I tried to format this the best I could for Stack Overflow.)
The above code produces this dataframe:
  example_character_vector color_indicator override_color_indicator
1                    Basic           Green                    Green
2                     <NA>           Green                    Green
3                     Full          Yellow                    Green
4                     None          Yellow                   Yellow
5                     <NA>          Yellow                     <NA>
6                     None             Red                      Red
7                     <NA>             Red                      Red
My problem here is that in row 5, an NA is introduced in the override_color_indicator column. Instead of an NA, I would like it to be "Yellow".
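The NA also seems to show up without the data frame. Here is a minimal sketch in base R only that mimics row 5; the names color, text, test, and result are made up for illustration, and I use a plain == comparison in place of str_detect so that no packages are needed (that substitution is my assumption about what matters here, not something I have confirmed):

```r
# Row 5 in miniature: the colour is "Yellow" but the text value is missing.
color <- "Yellow"
text <- NA_character_

# The comparison against NA gives NA, and TRUE & NA is also NA.
test <- color == "Yellow" & text == "Full"

# With an NA test, ifelse() picks neither "Green" nor "Yellow".
result <- ifelse(test, "Green", "Yellow")

print(test)    # NA
print(result)  # NA
```

So an NA in the test seems to pass straight through ifelse into the result, which looks like what happens in my second ifelse for row 5.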
For clarity, this is my desired dataframe:
  example_character_vector color_indicator override_color_indicator
1                    Basic           Green                    Green
2                     <NA>           Green                    Green
3                     Full          Yellow                    Green
4                     None          Yellow                   Yellow
5                     <NA>          Yellow                   Yellow
6                     None             Red                      Red
7                     <NA>             Red                      Red
I've looked quite a bit for an answer, and couldn't find one anywhere. I could just create a workaround and go back and manually assign the entries to Yellow, but I don't love that option from a programmatic standpoint.
Also, I'm just kind of curious as to why this behavior happens. I've run into this problem a few times now.
Thanks for your time!