I have a data frame that contains phone numbers in different formats. I'm trying to clean the wrongly formatted numbers and unify the format by creating a new column. The phone numbers are spread across 3 columns: CountryCode, AreaCode, MobileNumber. I've written the following code to create the new column based on multiple if conditions:
library(dplyr)
data <- mutate(data, Number =
  if (nchar(data$MobileNumber >= 12)) {
    paste("+", data$MobileNumber)
  } else if (nchar(data$MobileNumber >= 9)) {
    paste("+", data$CountryCode, data$MobileNumber)
  } else if (data$CountryCode == data$AreaCode) {
    paste("+", data$CountryCode, data$MobileNumber)
  } else {
    paste("+", data$CountryCode, data$AreaCode, data$MobileNumber)
  })
It acts on the condition of the first row only and gives the following warning:
Warning message:
In if (nchar(data$MobileNumber >= 12)) { :
the condition has length > 1 and only the first element will be used
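For context, the warning appears because `if` evaluates a single condition rather than one per row. In addition, `nchar(data$MobileNumber >= 12)` counts the characters of the comparison result (the text "TRUE" or "FALSE") instead of comparing the length of the number. Below is a minimal sketch of a vectorized version using dplyr's `case_when()`. It assumes the intended test is `nchar(MobileNumber) >= 12`, that the columns can be treated as character, and that no spaces are wanted between the parts (hence `paste0()` rather than `paste()`). The sample data is invented for illustration.

```r
library(dplyr)

# Invented sample data; column names follow the question.
data <- data.frame(
  CountryCode  = c("971", "971", "971", "971"),
  AreaCode     = c("50", "50", "971", "4"),
  MobileNumber = c("971501234567", "501234567", "1234567", "7654321"),
  stringsAsFactors = FALSE
)

data <- data %>%
  mutate(
    # Count digits on a character copy of the number.
    len = nchar(as.character(MobileNumber)),
    # case_when() checks the conditions row by row, in order,
    # and uses the first one that is TRUE for each row.
    Number = case_when(
      len >= 12               ~ paste0("+", MobileNumber),
      len >= 9                ~ paste0("+", CountryCode, MobileNumber),
      CountryCode == AreaCode ~ paste0("+", CountryCode, MobileNumber),
      TRUE                    ~ paste0("+", CountryCode, AreaCode, MobileNumber)
    )
  ) %>%
  select(-len)  # drop the helper column
```

Inside `mutate()` the columns can be referenced by name, so the `data$` prefix is not needed.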
I've also tried creating 3 vectors for CountryCode, AreaCode and MobileNumber, then writing a function that takes the 3 vectors as input and returns the correctly formatted number, using if conditions inside a for loop. That wasn't successful either.
# x is the number, y is the country code, z is the area code, n is the output
x <- data$MobileNumber
y <- as.character(data$CountryCode)
z <- data$AreaCode
# cleaning function
out <- vector("character", nrow(data))
CleanNum <- function(x, y, z) {
  for (i in 1:length(x)) {
    if (nchar(x[i] >= 12)) {
      n[i] <- paste("+", x[i])
    } else if (nchar(x[i] >= 9)) {
      n[i] <- paste("+", y[i], x[i])
    } else if (y[i] == z[i]) {
      n[i] <- paste("+", y[i], x[i])
    } else {
      n[i] <- paste("+", y[i], z[i], x[i])
    }
    out[i] <- n[i]
  }
}
Num_vec <- CleanNum(x, y, z)
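The loop approach can be made to work too. The following is a sketch only, with a hypothetical name `clean_num`. It assumes the intended test is `nchar(x[i]) >= 12` (parenthesis closed before the comparison) and that the parts should be joined without spaces. The result vector is created inside the function and returned explicitly, because assigning to `out` inside a function does not change an `out` defined outside it, and `n` is never created in the original. The example values are invented.

```r
# Hypothetical rewrite of CleanNum; x = number, y = country code, z = area code.
clean_num <- function(x, y, z) {
  # Treat all inputs as character so nchar() and == behave predictably.
  x <- as.character(x)
  y <- as.character(y)
  z <- as.character(z)
  out <- character(length(x))        # result vector lives inside the function
  for (i in seq_along(x)) {
    if (nchar(x[i]) >= 12) {         # compare the character count, not nchar of a logical
      out[i] <- paste0("+", x[i])
    } else if (nchar(x[i]) >= 9) {
      out[i] <- paste0("+", y[i], x[i])
    } else if (y[i] == z[i]) {
      out[i] <- paste0("+", y[i], x[i])
    } else {
      out[i] <- paste0("+", y[i], z[i], x[i])
    }
  }
  out                                # last expression is the return value
}

# Invented example values.
num_vec <- clean_num(
  x = c("971501234567", "501234567", "7654321"),
  y = c("971", "971", "971"),
  z = c("50", "50", "4")
)
```

Missing values are not handled here: an `NA` in the country or area code would make the `if` condition fail, so such rows would need to be filtered or treated separately.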
I have only a little experience with R, and any help is much appreciated.