I'm looking to streamline my code and minimize the manual tweaks I make depending on the data set I run through it. I receive batches of data by country, but each country differs slightly in its fields and field names, so I have to tweak the code each time I run a new country. I would like to eliminate the tweaks and make those steps conditional. (I handle many of these cases easily with ifelse(), but I haven't been able to do a conditional mutate, for example.)
This is a logic question, so please let me know if I should have uploaded a data set.
In this example, I need to create a month-name field (Calendar_Month_txt) and a year field (Calendar_Year) from a date field (e.g. 2018-03-01) for the USA. Other countries already include these fields, so I don't need to create them, just rename() them so they align with my common data set.
Keep in mind, this is part of a much larger block of code that I need all the countries to run through...this is just the illustrative part.
# Import Data and Align Fields and Column Names
P_Region <- Raw_Data %>%
  # This is for USA only...I need to comment this out when not USA
  mutate(Calendar_Month_txt = ifelse(as.character(substr(Date, 6, 7)) == "01", "January",
                              ifelse(as.character(substr(Date, 6, 7)) == "02", "February",
                              ifelse(as.character(substr(Date, 6, 7)) == "03", "March",
                              ifelse(as.character(substr(Date, 6, 7)) == "04", "April",
                              ifelse(as.character(substr(Date, 6, 7)) == "05", "May",
                              ifelse(as.character(substr(Date, 6, 7)) == "06", "June",
                              ifelse(as.character(substr(Date, 6, 7)) == "07", "July",
                              ifelse(as.character(substr(Date, 6, 7)) == "08", "August",
                              ifelse(as.character(substr(Date, 6, 7)) == "09", "September",
                              ifelse(as.character(substr(Date, 6, 7)) == "10", "October",
                              ifelse(as.character(substr(Date, 6, 7)) == "11", "November",
                              ifelse(as.character(substr(Date, 6, 7)) == "12", "December", NA)))))))))))),
         Calendar_Year = as.character((substr(Date, 1,4)))) %>%
  # These I run only for non-USA, as I have created this above, so comment it out for USA
  rename(Calendar_Month_txt = CalendarMonthTextFull,
         Calendar_Year = CalendarYear)
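As a side note, the twelve nested ifelse() calls can probably be collapsed into a single lookup. The sketch below assumes Date is an ISO-style string (or Date object) such as 2018-03-01, and uses base R's built-in month.name vector. The sample Raw_Data is made up for illustration, since the real data is not shown.

```r
# Hypothetical sample data; the real Raw_Data is not shown in the question.
Raw_Data <- data.frame(Date = c("2018-03-01", "2018-12-15", NA),
                       stringsAsFactors = FALSE)

# Characters 6-7 of the date hold the month number; use it as an index
# into month.name ("January", ..., "December"). Missing dates stay NA.
month_num <- as.integer(substr(as.character(Raw_Data$Date), 6, 7))

Raw_Data$Calendar_Month_txt <- month.name[month_num]
Raw_Data$Calendar_Year      <- substr(as.character(Raw_Data$Date), 1, 4)
```

month.name is always in English regardless of locale, which matches the hard-coded names in the ifelse() chain. The same expression can be used inside mutate().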
I tried to use if statements within the dplyr pipeline (I know I can do this as two separate complete blocks, but that seems like a lot of repeated code). Example:
V_USA <- TRUE
P_Region <- Raw_Data %>%
  if(V_USA) {
    mutate(Calendar_Month_txt = ifelse(as.character(substr(Date, 6, 7)) == "01", "January",
                                ifelse(as.character(substr(Date, 6, 7)) == "02", "February",
                                ifelse(as.character(substr(Date, 6, 7)) == "03", "March",
                                ifelse(as.character(substr(Date, 6, 7)) == "04", "April",
                                ifelse(as.character(substr(Date, 6, 7)) == "05", "May",
                                ifelse(as.character(substr(Date, 6, 7)) == "06", "June",
                                ifelse(as.character(substr(Date, 6, 7)) == "07", "July",
                                ifelse(as.character(substr(Date, 6, 7)) == "08", "August",
                                ifelse(as.character(substr(Date, 6, 7)) == "09", "September",
                                ifelse(as.character(substr(Date, 6, 7)) == "10", "October",
                                ifelse(as.character(substr(Date, 6, 7)) == "11", "November",
                                ifelse(as.character(substr(Date, 6, 7)) == "12", "December", NA)))))))))))),
           Calendar_Year = as.character((substr(Date, 1,4))))
  } else { ##### END U.S.
    rename(Calendar_Month_txt = CalendarMonthTextFull,
           Calendar_Year = CalendarYear) }
I tweaked various forms of this and this version was the most promising...I received the error:
Error in if (.) V_USA else { : argument is not interpretable as logical
In addition: Warning message:
In if (.) V_USA else { :
the condition has length > 1 and only the first element will be used
I suspect the error occurs because each data set contains only one country, not all countries, so there is never an else case.
Does anyone know a solution for this that will allow me to keep my code simple enough? Again, I can do this as two separate blocks with the 'if' and 'else' outside the dplyr pipes. Any thoughts are greatly appreciated. (I've tried to understand if mutate_if would work here, but haven't been able to really find much illustrating that...forgive me if I missed something.)
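For illustration, here is a minimal sketch of one way the conditional step might be written. The error text `if (.) V_USA else` suggests that the pipe handed the data frame to `if` as its condition, so the sketch keeps the `if` out of the pipe's direct path in two ways: a small helper function, and a braced block in which `.` refers to the piped data. The helper name add_calendar_fields and the sample data frames are hypothetical, and the month lookup via month.name assumes ISO-style dates.

```r
library(dplyr)

# Hypothetical helper (not from the original code): runs the USA-only
# mutate() or the non-USA rename(), depending on a flag.
add_calendar_fields <- function(df, is_usa) {
  if (is_usa) {
    mutate(df,
           Calendar_Month_txt = month.name[as.integer(substr(as.character(Date), 6, 7))],
           Calendar_Year = substr(as.character(Date), 1, 4))
  } else {
    rename(df,
           Calendar_Month_txt = CalendarMonthTextFull,
           Calendar_Year = CalendarYear)
  }
}

# Made-up sample inputs standing in for Raw_Data
usa_raw   <- data.frame(Date = c("2018-03-01", "2018-11-20"),
                        stringsAsFactors = FALSE)
other_raw <- data.frame(CalendarMonthTextFull = "March", CalendarYear = "2018",
                        stringsAsFactors = FALSE)

P_USA   <- usa_raw   %>% add_calendar_fields(is_usa = TRUE)
P_Other <- other_raw %>% add_calendar_fields(is_usa = FALSE)

# Same idea inline: the braces stop %>% from inserting the data as the
# first argument of `if`, and `.` stands for the piped data frame.
V_USA <- FALSE
P_Inline <- other_raw %>%
  { if (V_USA) mutate(., Calendar_Year = substr(as.character(Date), 1, 4))
    else rename(., Calendar_Year = CalendarYear) }
```

Because the helper takes the data frame as its first argument and returns a data frame, it can sit in the middle of a longer pipeline, so the rest of the shared code does not need to be duplicated per country.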