Please bear with me, as I am new to R and have rarely used this platform for help.
I have written an R script for NLP and am working through thousands of rows of requests, where I need to find a particular sequence of numbers. The condition is: "it should be an 8-digit number and should start with either the digit 1 or the capital letter T". I have written some conditions in the script, but they are not giving me the desired result.
For example, below are two requests that each contain a number sequence, which I need to extract according to the conditions stated above.
Request:
- "Update task duration in TL# 10503944"
- "Update task duration in TL# T0503944"
The result I am after is
- "10503944"
- "T0503944" for the respective requests.
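Stated as a regular expression, the condition above might look like the following. This is only a minimal sketch, assuming the stringr package is loaded and that the number always appears as a separate word; `extract_tl` is a hypothetical helper name and is not part of my script.

```r
library(stringr)

# Hypothetical helper: return the first token that starts with 1 or T
# and is followed by exactly seven digits (8 characters in total).
# Elements with no such token come back as NA.
extract_tl <- function(x) {
  str_extract(x, "\\b[1T]\\d{7}\\b")
}

requests <- c("Update task duration in TL# 10503944",
              "Update task duration in TL# T0503944")
result <- extract_tl(requests)
print(result)
```

Because `str_extract()` is vectorised, a sketch like this could be applied to a whole column of requests at once rather than one row at a time.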
Below is the relevant part of the script, a function that extracts the number:
extr_num <- function(x){
  a <- unlist(str_split(x," "))
  NewdataColumn <- rep(NA, length(a))
  non_alpha <- grep("[[:alpha:]]", a, invert = TRUE)
  NewdataColumn[non_alpha] <- a[non_alpha]
  NewdataColumn <- gsub("[^[:alnum:]]", "", NewdataColumn)
  NewdataColumn <- na.omit(NewdataColumn)
  if(length(NewdataColumn)==1)
  {
    NewdataColumn <- NewdataColumn
  }
  else if(length(NewdataColumn)==0)
  {
    NewdataColumn <- NA
  }
  else
  {
    NewdataColumn <- NewdataColumn[match(max(nchar(NewdataColumn)),nchar(NewdataColumn))]
  }
  if(nchar(NewdataColumn)< 6 | is.na(NewdataColumn))
  {
    NewdataColumn <- NA
  }
  else
  {
    NewdataColumn <- NewdataColumn
  }
  if(str_extract(NewdataColumn,"\\D") == "T" && !is.na(NewdataColumn))
  {
    NewdataColumn <- NewdataColumn[1]
  }
  else
  {
    NewdataColumn <- NA
  }
  if(str_extract(NewdataColumn,"\\d{8}") == "1" && !is.na(NewdataColumn))
  {
    NewdataColumn <- NewdataColumn[1]
  }
  else
  {
    NewdataColumn <- NA
  }
  return(NewdataColumn)
}
The error I'm getting here is
Error in if (str_extract(NewdataColumn, "\\D") == "T" && !is.na(NewdataColumn)) { : missing value where TRUE/FALSE needed
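The condition named in the error can be reproduced in isolation. The sketch below assumes stringr is loaded and that the token being checked is all digits, as in the first request. The variable names are illustrative only, and `isTRUE()` is shown as one possible guard, not necessarily the right fix for the whole function.

```r
library(stringr)

x <- "10503944"

# There is no non-digit character in x, so str_extract() returns NA.
first_non_digit <- str_extract(x, "\\D")

# NA == "T" is NA, and NA && TRUE is also NA.
# Passing that NA to if () is what raises "missing value where TRUE/FALSE needed".
cond <- first_non_digit == "T" && !is.na(x)
print(is.na(cond))

# isTRUE() turns NA into FALSE, so the result is always safe to use in if ().
starts_with_T <- isTRUE(str_extract(x, "^\\D") == "T")
print(starts_with_T)
```

Putting the `is.na()` check on the right-hand side of `&&` does not help here, because the left-hand side is NA and so the right-hand side is still evaluated, giving `NA && TRUE`, which is NA.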
Please note: I may have missed an existing post on this particular issue, but I have tried my best to sort it out myself by adding the is.na() function to the script. I hope this is enough information to answer my question.
Thanks
Suvi