I have been looking for a decent answer for quite some time; in Python a for loop would have solved this in a matter of seconds. I have about 100K URLs and I'm trying to group them based on a specific string that they contain. I have seen many examples similar to mine, but nothing is quite what I need. The most popular answer uses ifelse, which doesn't work in my case because my list of keywords is long. If there is an "if" option (as opposed to ifelse), I will take it.
Reproducible code
list<-c("birthday","anniv")
myData <-data.frame(URL = c("/birthday/promoid:654654","/birthday/products/","/anniversary","/anniversary/?type=gifts","/celebration","/celebration"), PageView=1:6*515)
Then I want to create a new column called "occasion" so I can group the URLs. I expect the result below:
myData$occasion<-ifelse(grepl("birthday", myData$URL),"birthday",
ifelse(grepl("anniv", myData$URL),"anniv","NA")
)
                       URL PageView occasion
1 /birthday/promoid:654654      515 birthday
2      /birthday/products/     1030 birthday
3             /anniversary     1545    anniv
4 /anniversary/?type=gifts     2060    anniv
5             /celebration     2575       NA
6             /celebration     3090       NA
Here I have used nested ifelse, but that is not feasible because the list of keywords will reach 10K. I have looked into lapply but haven't succeeded, as I have no idea how to assign the result to a new column:
lapply(list, function(list)
sub(paste0(".*",list,".*"),list, myData$URL, ignore.case = TRUE)
)
This gives me a list, one element per keyword, rather than a single column:
myData$Occasion<- lapply(list, function(list)
sub(paste0(".*",list,".*"),list, myData$URL, ignore.case = TRUE)
)
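For comparison, here is a minimal sketch of the Python-style for loop I have in mind, written in R. It is only one possible approach and rests on a few assumptions that are mine: the keyword vector is renamed `keywords` (to avoid masking the base function `list`), keywords are lowercase and matched as plain substrings, the first keyword that matches wins, and unmatched URLs get a real missing value (printed as <NA>) instead of the string "NA".

```r
# Hypothetical name: `keywords` replaces `list` so that base::list is not masked
keywords <- c("birthday", "anniv")

myData <- data.frame(
  URL = c("/birthday/promoid:654654", "/birthday/products/", "/anniversary",
          "/anniversary/?type=gifts", "/celebration", "/celebration"),
  PageView = 1:6 * 515
)

# Lowercase the URLs once so plain substring matching is case-insensitive
urls <- tolower(as.character(myData$URL))

# Start with NA everywhere, then fill in the first keyword that matches each URL
myData$occasion <- NA_character_
for (kw in keywords) {
  # Only rows that are still unlabelled and contain the keyword are updated
  hit <- is.na(myData$occasion) & grepl(kw, urls, fixed = TRUE)
  myData$occasion[hit] <- kw
}

print(myData)
```

Each pass of the loop is a single vectorised grepl call over all URLs, so the loop runs once per keyword rather than once per URL. I have not timed it on 10K keywords against 100K URLs.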