I'm writing code to scrape data from a blog. The posts are written by two different authors and I only want the data from one of them, so I wrote a function that uses an if statement to filter by author. But when I run the function on the blog address, I get the following error message: "ERROR: missing value where TRUE/FALSE needed". Does anyone know what this means and what I can do to resolve it?
The function code:
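library(rvest)    # read_html(), html_nodes(), html_attr(), html_text(), %>%
library(stringr)  # str_detect(), str_c()

extract_articles_blogger_preto <- function(x) {
  tryCatch({
    webpage <- read_html(x)
    text <- html_nodes(webpage, ".cabecalho") %>% html_nodes(".corpo")
    i <- 0                # counter that cycles 0, 1, 2, 3 over the nodes
    pular_texto <- FALSE  # TRUE when the current post should be skipped
    article <- ""
    for (p in text) {
      if (i == 0) {
        i <- 1
      } else if (i == 1) {
        i <- 2
      } else if (i == 2) {
        # Collect the link targets in this node and test the second one
        autor <- html_nodes(p, "a[href]") %>% html_attr("href")
        i <- 3
        if (str_detect(autor[2], "rainhafragil")) {
          pular_texto <- FALSE
        } else {
          pular_texto <- TRUE
        }
      } else if (i == 3) {
        # Append the text only when the post was not marked to be skipped
        if (pular_texto == FALSE) {
          article <- str_c(article, html_text(text), "\n")
        }
        i <- 0
      }
    }
    return(article)
  }, error = function(e) {
    # Print the condition message instead of stopping
    cat("ERROR :", conditionMessage(e), "\n")
  })
}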
# Trying to apply the function to the blog address:
extract_articles_blogger_preto("http://web.archive.org/web/20070430023653mp_/http://fragilreino.blogger.com.br/2002_12_01_archive.html")
# Error message: "missing value where TRUE/FALSE needed"
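
For context, R raises this message when the condition of an if statement evaluates to NA. The sketch below reproduces it without touching the network, on the assumption (not verified against the page) that some node has fewer than two links, so that autor[2] is NA and str_detect() returns NA. The vector contents and the helper name is_target_author are hypothetical and used only for illustration.

```r
library(stringr)

# Hypothetical stand-in for a node that contains only one link
autor <- c("http://example.com/perfil")

autor[2]                              # NA: there is no second element
str_detect(autor[2], "rainhafragil")  # NA: str_detect() propagates missing values

# if (NA) is what triggers the error; tryCatch() captures its message here
msg <- tryCatch(
  if (str_detect(autor[2], "rainhafragil")) "match" else "no match",
  error = function(e) conditionMessage(e)
)
msg

# One possible guard (hypothetical helper): isTRUE() turns NA into FALSE,
# so a node without a second link is simply treated as "not this author"
is_target_author <- function(hrefs, pattern = "rainhafragil") {
  isTRUE(str_detect(hrefs[2], pattern))
}

is_target_author(autor)                                      # FALSE
is_target_author(c("a", "http://rainhafragil.example.com"))  # TRUE
```

Under the same assumption, writing the check inside the loop as if (isTRUE(str_detect(autor[2], "rainhafragil"))) would avoid the error. Whether skipping such nodes is the right behaviour depends on how the page is actually structured.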