PDFSwift

OCR PDF

OCR PDF reads the page images in a scanned PDF and recognizes their text with Tesseract, entirely in your browser. It works on scanned contracts, book pages, receipts, and any other PDF whose pages are just images of text. Nothing is uploaded: the OCR engine runs locally.

100% private · Fast, in-browser · No account needed

How it works

  1. Upload your scanned PDF.

  2. Click Run OCR. A recognition engine loads on first use (about 10 MB, cached afterward).

  3. Copy the recognized text or download it as a .txt file.
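The steps above can be sketched as a page-by-page loop. This is a minimal illustration, not PDFSwift's actual code: the `PageSource` and `Recognizer` interfaces and the `ocrDocument` function are hypothetical names introduced here, standing in for whatever PDF renderer and Tesseract build the tool really uses.

```typescript
// Hypothetical interfaces: the tool's real internals are not documented here.
interface PageSource {
  pageCount: number;
  // Renders one page (1-based) to image bytes the recognizer can read.
  renderPage(pageNumber: number): Promise<Uint8Array>;
}

interface Recognizer {
  // Returns the text recognized in one page image.
  recognize(image: Uint8Array): Promise<string>;
}

// Renders each page in order, recognizes it, and joins the results
// into a single string suitable for copying or saving as a .txt file.
async function ocrDocument(
  source: PageSource,
  recognizer: Recognizer,
  onProgress?: (done: number, total: number) => void,
): Promise<string> {
  const pages: string[] = [];
  for (let n = 1; n <= source.pageCount; n++) {
    const image = await source.renderPage(n);
    const text = await recognizer.recognize(image);
    pages.push(text.trim());
    // Optional callback so a UI could show "page n of total".
    onProgress?.(n, source.pageCount);
  }
  // A blank line separates the text of consecutive pages.
  return pages.join("\n\n");
}
```

Keeping the renderer and recognizer behind small interfaces means the loop can be exercised with fakes, and everything stays on the device because neither interface involves a network call.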

Frequently asked questions

Is my scanned PDF uploaded anywhere?

No. The OCR engine and its recognition data are loaded into your browser and run locally. Your PDF never leaves your device.

How accurate is the OCR?

For clean, high-resolution English scans, accuracy is typically 95% or better. Low-resolution, skewed, or handwritten pages will be less accurate and may need manual cleanup.
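The page does not say how the 95% figure is measured. One common definition, shown here only as an assumption, is character accuracy: one minus the edit distance between the recognized text and a hand-checked reference, divided by the reference length. The function names below are illustrative.

```typescript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, and substitutions that turn `a` into `b`.
// Characters are compared as UTF-16 code units.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(
        prev[j]! + 1,        // deletion
        curr[j - 1]! + 1,    // insertion
        prev[j - 1]! + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Character accuracy in the range [0, 1], relative to a reference text.
function characterAccuracy(recognized: string, reference: string): number {
  if (reference.length === 0) return recognized.length === 0 ? 1 : 0;
  return Math.max(0, 1 - editDistance(recognized, reference) / reference.length);
}
```

Under this definition, a page of 2,000 characters at 95% accuracy would still contain about 100 character errors, which is why manual cleanup can be needed even on good scans.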

What languages are supported?

English is recognized by default. The underlying engine, Tesseract, supports more than 100 languages; additional language packs may be enabled in a future update.

Why is OCR slower than normal text extraction?

OCR has to render every page to an image and then recognize each character, which is much heavier than reading an existing text layer. Expect a few seconds per page on a modern laptop.
