I'm trying to search for a word in text that I extract from an OCR'd PDF file. The file has multiple pages, so I search each page for the word; once the word is found, I don't want the for loop to continue. I wrote the code below, but it stops after the first page. What am I missing in this code?
Here is the code:
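library(pdftools)   # pdf_convert()
library(tesseract)  # ocr() (assumed to come from the tesseract package)
library(stringi)    # stri_detect_fixed()

for (i in 1:8) {
  # Render page i to a TIFF image, then OCR it
  img_file <- pdftools::pdf_convert("D:/Files_OCR/test.pdf", format = "tiff", pages = i, dpi = 400)
  text <- ocr(img_file)
  ocr_text <- capture.output(cat(text))
  check <- sapply(ocr_text, paste0, collapse = "")
  # Report whether any OCR'd line contains the word (case-insensitive)
  if (length(which(stri_detect_fixed(tolower(check), tolower("school")))) <= 0) {
    print("Not Present")
  } else {
    print("Present")
  }
  break
}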
Any suggestions would be appreciated.
Thanks
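In the loop above, `break` sits outside the `if`/`else`, so it runs at the end of the first iteration whether or not the word was found. A minimal sketch of the intended behaviour moves `break` into the branch taken on a match. Here `page_texts`, `keyword`, and `found_on` are hypothetical names, the character vector is a stand-in for the per-page OCR output so the example runs without a PDF, and base R's `grepl(..., fixed = TRUE)` is swapped in for `stri_detect_fixed()`.

```r
# Hypothetical stand-in for the per-page OCR output; in the real script each
# element would come from ocr() on the image of page i.
page_texts <- c(
  "Annual report for the district",
  "The SCHOOL opened in 1990",
  "Appendix and references"
)

keyword <- "school"
found_on <- NA_integer_  # page number of the first match, NA if none

for (i in seq_along(page_texts)) {
  # fixed = TRUE does a literal (non-regex) match, like stri_detect_fixed()
  if (grepl(tolower(keyword), tolower(page_texts[i]), fixed = TRUE)) {
    print(paste("Present on page", i))
    found_on <- i
    break  # leave the loop only once the word has been found
  } else {
    print(paste("Not present on page", i))
  }
}
```

Applied to the original loop, the same change would mean placing `break` inside the `else` branch that prints "Present", so the remaining pages are skipped only after a match.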