Unexpected White Spaces in IronOCR
Extracted text sometimes contains extra white spaces between characters, even when the source document looks clean. The usual culprit is variation in character width and spacing, often tied to the input font, and image filters alone rarely clear it up.

The following approaches have proven effective at avoiding the extra spaces.
Solution
1. Switch to an OCR-Friendly Font
Render or supply the document in Roboto where you can. It is a clean, modern typeface that reads reliably, so the engine produces fewer spurious spaces than it does with fonts like Arial.
2. Use ReadDocumentAdvanced()
Read the document with ReadDocumentAdvanced() instead of the standard Read() method. It gives you more control over the OCR process and handles complex text layouts better.
3. Set OcrLanguage.EnglishBest
Point the engine at the OcrLanguage.EnglishBest model. This advanced language model is more accurate than the standard English option and reduces misread spacing.
4. Train a Custom Font
When the input PDF uses an unusual or non-standard font, the default settings can struggle and insert spaces. Training a custom font teaches the engine to recognize that specific typeface, which sharpens recognition considerably. See the OCR custom font training guide for the full workflow.
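As a rough illustration, steps 2 and 3 might be combined as in the sketch below. It is a minimal example under stated assumptions, not a definitive implementation: the file name input.pdf is hypothetical, ReadDocumentAdvanced() is assumed to come from the IronOcr.Extensions.AdvancedScan package, OcrLanguage.EnglishBest is assumed to need the matching English language pack, and the result is assumed to expose a Text property. Check each of these against your installed IronOCR version.

```csharp
using System;
using IronOcr;

// Assumption: the IronOcr package, the AdvancedScan extension, and the
// English language pack are installed and licensed.
var ocr = new IronTesseract();

// Step 3: select the higher-accuracy English model.
ocr.Language = OcrLanguage.EnglishBest;

// "input.pdf" is a hypothetical path; replace it with your own document.
using var input = new OcrInput();
input.LoadPdf("input.pdf");

// Step 2: use the advanced read method instead of the standard one.
var result = ocr.ReadDocumentAdvanced(input);

// Assumption: the advanced result exposes the recognized text as Text.
Console.WriteLine(result.Text);
```

If spacing problems persist after this, the font-related steps (1 and 4) are the next things to try, since they change the input rather than the engine settings.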
With these settings applied, the extracted text comes back without the stray spacing.


