Friday, 5 February 2021

dplyr ifelse and if_else turn dates into doubles

I have a dataframe with some date columns that occasionally have an NA in them, because a daily or weekly questionnaire was not filled out.

glimpse(df_tmp)
Rows: 69,510
Columns: 4
$ ID                 <dbl> 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 7...
$ Visit_Date         <date> 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, ...
$ Daily_Answer_Date  <date> 2017-04-17, 2017-04-17, 2017-04-17, 2017-04-17, 2017-04-17, 2017-04-17, 2017-04-18, 2017-04-18, 2017-04-18, 2017-04-18, ...
$ Weekly_Answer_Date <date> 2017-04-24, 2017-05-14, 2017-05-21, 2017-05-29, 2017-06-11, 2017-07-02, 2017-04-24, 2017-05-14, 2017-05-21, 2017-05-29, ...

If the daily or weekly answer date is NA, I'd like to replace it with the corresponding visit date. I started by writing:

df_tmp1 <- df_tmp %>%
             mutate(Daily_Answer_Date  = ifelse(is.na("Daily_Answer_Date"),  Visit_Date, Daily_Answer_Date)) %>%
             mutate(Weekly_Answer_Date = ifelse(is.na("Weekly_Answer_Date"), Visit_Date, Weekly_Answer_Date))

but this turns the dates into doubles:

glimpse(df_tmp1)
Rows: 69,510
Columns: 4
$ ID                 <dbl> 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 7...
$ Visit_Date         <date> 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, 2017-04-10, ...
$ Daily_Answer_Date  <dbl> 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17273, 17...
$ Weekly_Answer_Date <dbl> 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17280, 17...
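
Two things appear to be going on here, and a small sketch may make them easier to see. It assumes only base R; the vectors `answer` and `visit` are hypothetical stand-ins for the real columns. First, the column name is quoted in the call above, so `is.na()` tests a character string rather than the column. Second, base `ifelse()` is documented to drop attributes such as the Date class from its result.

```r
# The quoted name is just a character string, so is.na() tests the string itself
cond_quoted <- is.na("Daily_Answer_Date")   # a single FALSE, not one value per row

# Hypothetical two-element vectors standing in for the real columns
answer <- as.Date(c("2017-04-17", NA))
visit  <- as.Date(c("2017-04-10", "2017-04-10"))

# Base ifelse() keeps the shape of the test but drops the Date class
res_base <- ifelse(is.na(answer), visit, answer)
class(res_base)   # "numeric"
```

The numbers that come back are days since 1970-01-01, which is how R stores a Date internally.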

Following Stack Overflow suggestions, I then tried dplyr's if_else, which fails with an error:

df_tmp1 <- df_tmp %>%
             mutate(Daily_Answer_Date  = dplyr::if_else(is.na("Daily_Answer_Date"),  Visit_Date, Daily_Answer_Date)) %>%
             mutate(Weekly_Answer_Date = dplyr::if_else(is.na("Weekly_Answer_Date"), Visit_Date, Weekly_Answer_Date))
Error: Problem with `mutate()` input `Daily_Answer_Date`.
x `true` must be length 1 (length of `condition`), not 69510.
i Input `Daily_Answer_Date` is `dplyr::if_else(is.na("Daily_Answer_Date"), Visit_Date, Daily_Answer_Date)`.


How can I cleanly replace the NAs in Daily_Answer_Date and Weekly_Answer_Date with the corresponding Visit_Date without changing the column type?
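
For reference, here is a minimal sketch of one direction that might work, assuming dplyr is installed. The tibble `df_toy` is hypothetical toy data with the same column names, not the real `df_tmp`. It passes the columns unquoted and uses `dplyr::coalesce()` or `dplyr::if_else()`, both of which keep the Date class.

```r
library(dplyr)

# Hypothetical toy data standing in for df_tmp
df_toy <- tibble(
  ID                 = c(77, 77, 78),
  Visit_Date         = as.Date(c("2017-04-10", "2017-04-10", "2017-05-01")),
  Daily_Answer_Date  = as.Date(c("2017-04-17", NA, "2017-05-03")),
  Weekly_Answer_Date = as.Date(c(NA, "2017-05-14", NA))
)

df_fixed <- df_toy %>%
  mutate(
    # coalesce() takes the first non-missing value, element by element
    Daily_Answer_Date  = coalesce(Daily_Answer_Date, Visit_Date),
    # if_else() with an unquoted column gives a per-row condition and keeps the type
    Weekly_Answer_Date = if_else(is.na(Weekly_Answer_Date), Visit_Date, Weekly_Answer_Date)
  )

glimpse(df_fixed)
```

Either form could be used for both columns; `coalesce()` is shorter when the only goal is to fill missing values from another column.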

Thank you in advance

Thomas Philips
