PDF OCR X, character recognition for PDF

When a PDF is a graphical capture, for example a scanned text or a photograph, we have to use a character recognition (OCR) program to extract its content as text. That is precisely what PDF OCR X does, saving the extracted content to a plain text .txt file.

The program has very few configuration options: the recognition language, chosen to make the recognition as accurate as possible; whether the text is laid out in one or more columns; and whether or not carriage returns should be included in the output file. As for the language, more languages are available for download. PDF OCR X comes in two versions: a free one, which can only recognize single-page PDFs, and a paid one, which costs $29.99 and has no such limitation. Remember that if you want to extract a page from a PDF file, you can do so from Preview by simply dragging and dropping that page out of the window, which creates a new PDF file containing it.
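To make the carriage-return option more concrete, here is a minimal Python sketch of what removing line breaks from OCR output can look like. It is not how PDF OCR X works internally; the function name unwrap_lines and the rule "a blank line marks a paragraph break" are assumptions made only for this illustration.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Hypothetical helper: blank lines are assumed to separate paragraphs.
    """
    # Normalize Windows (\r\n) and classic Mac (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split the text wherever one or more blank lines appear.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside each paragraph, collapse every run of whitespace
    # (including line breaks) into a single space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


if __name__ == "__main__":
    sample = "This line was\nwrapped by the scanner.\n\nSecond\nparagraph."
    print(unwrap_lines(sample))
```

Keeping the carriage returns preserves the layout of the scanned page, while removing them, as in this sketch, gives text that reflows cleanly when pasted into another document.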

The results are better or worse depending on the resolution and quality of the images in the PDF. If we need a higher-quality conversion to text, we will have to resort to more professional character recognition programs that produce formatted output rather than plain .txt, but for some specific cases the free version may be good enough.

