
Doan OCR — Image to Text Extractor

Extract text from images, screenshots, and scanned documents using the open-source Tesseract OCR engine. Supports 20 languages. Runs entirely in your browser.

No file size limit · Files stay private · No sign-in required · Free forever

📦 OCR downloads about 10 MB of data the first time you use each language (cached afterwards).

Drop files here or click to browse

JPG, PNG, WebP, BMP. No file size limit.

Browse files

Your files never leave your device. All processing happens locally in your browser.

Optimize for your platform

Extract Text from Screenshots · Convert Scanned PDFs and Photos to Text · OCR for 20+ Languages · Digitize Business Cards · Make Images Accessible

How It Works

OCR (Optical Character Recognition) here uses the open-source Tesseract engine, originally developed at Hewlett-Packard and later sponsored by Google for over a decade. It is compiled to WebAssembly, so it runs directly in your browser.

1
The OCR engine downloads as WebAssembly
On first use, Tesseract.js loads as a WebAssembly module (~3 MB). It runs inside your browser at near-native speed, with no plugins or extensions needed.
2
A language model downloads for your chosen language
Each language has its own trained model (~5-15 MB each, depending on script complexity). You pick the language; only that language's data downloads. Your browser caches it for future use.
3
You select or paste an image
The image is read into your browser's memory. Tesseract analyzes the pixel data to identify character shapes — no upload, no API call.
4
Text is extracted with a confidence score
For each detected word, Tesseract assigns a confidence percentage. High-contrast images with clean fonts typically score 95% or higher; messy handwriting or low resolution lowers the score.
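
The per-word confidence from step 4 can be used to decide which words deserve a manual check. The snippet below is a minimal sketch, not this tool's actual code: the `OcrWord` shape, the `summarizeConfidence` helper, and the default threshold of 95 are assumptions made for illustration, with confidence taken to be a percentage from 0 to 100.

```typescript
// Hypothetical shape of one recognized word; the field names are assumptions
// modeled on the word-level results an OCR engine reports.
interface OcrWord {
  text: string;
  confidence: number; // assumed to be a percentage from 0 to 100
}

interface ReviewSummary {
  meanConfidence: number; // average over all words, 0 when there are none
  needsReview: OcrWord[]; // words below the threshold, in original order
}

// Split OCR output into an overall score and the words worth a manual check.
function summarizeConfidence(words: OcrWord[], threshold = 95): ReviewSummary {
  const total = words.reduce((sum, w) => sum + w.confidence, 0);
  const meanConfidence = words.length > 0 ? total / words.length : 0;
  const needsReview = words.filter((w) => w.confidence < threshold);
  return { meanConfidence, needsReview };
}

// Example input: two clean words and one the engine was unsure about.
const sample: OcrWord[] = [
  { text: "Invoice", confidence: 98 },
  { text: "Total", confidence: 96 },
  { text: "1O0.00", confidence: 61 },
];

const summary = summarizeConfidence(sample);
console.log(summary.meanConfidence.toFixed(1)); // prints "85.0"
console.log(summary.needsReview.map((w) => w.text)); // prints [ '1O0.00' ]
```

A lower threshold flags fewer words; the right value depends on how costly a misread character is for the document at hand.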

Your images stay on your device and are never uploaded, not even temporarily for processing. That makes the tool safe for ID documents, contracts, medical paperwork, or any other text with sensitive content.