Hello, I need to develop a solution that reads the contents of a PDF via OCR. I saw that Tesseract can do this, but it only reads images. Does anyone know how I can convert a PDF to images and feed them to Tesseract?
Thanks
Text is stored in a PDF as text, unless the text itself is an image of course.
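To check whether a given PDF actually has a text layer, one option is to try extracting it directly before reaching for OCR. The sketch below assumes Poppler's `pdftotext` command-line tool is installed and on the PATH; the function names are made up for illustration.

```python
import subprocess


def pdftotext_command(pdf_path):
    """Build the Poppler command that writes a PDF's text layer to stdout."""
    # "-layout" keeps the rough page layout; "-" sends the text to stdout.
    return ["pdftotext", "-layout", pdf_path, "-"]


def extract_text_layer(pdf_path):
    """Return the embedded text; an empty string suggests an image-only PDF."""
    # Assumption: pdftotext (Poppler) is installed and reachable on the PATH.
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        text=True,
        check=True,
    )
    # Strip whitespace and page-break characters left over from empty pages.
    return result.stdout.strip()
```

If `extract_text_layer("file.pdf")` comes back empty, the pages are most likely scanned images and OCR is the remaining route.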
If the PDF is a scan, then I still need to get the contents through OCR.
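For a scanned PDF, one possible approach is to render each page to an image and pass the images to Tesseract. This is only a rough sketch: it assumes the Poppler `pdftoppm` tool and the `tesseract` command are both installed and on the PATH, and the function names, the 300 DPI default, and the `eng` language default are illustrative choices.

```python
import subprocess
import tempfile
from pathlib import Path


def pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Build the Poppler command that renders every page to a PNG file."""
    # "-r" sets the resolution in DPI; "-png" selects PNG output.
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix]


def tesseract_command(image_path, lang="eng"):
    """Build the Tesseract command that prints recognised text to stdout."""
    # Using "stdout" as the output base makes Tesseract print the text.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_pdf(pdf_path, lang="eng", dpi=300):
    """Render a PDF to page images and OCR them in page order."""
    # Assumption: pdftoppm and tesseract are installed and on the PATH.
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "page")
        subprocess.run(pdftoppm_command(pdf_path, prefix, dpi), check=True)
        pages = []
        # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
        for image in sorted(Path(tmp).glob("page-*.png")):
            result = subprocess.run(
                tesseract_command(str(image), lang),
                capture_output=True,
                text=True,
                check=True,
            )
            pages.append(result.stdout)
    return "\n".join(pages)
```

Calling `ocr_pdf("scan.pdf")` would return the recognised text of all pages as one string. Higher DPI values tend to help recognition but make rendering slower.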