Unexpected White Spaces in IronOCR

Extracted text sometimes contains extra white spaces between characters, even when the source document looks clean. The usual culprit is variation in character width and spacing, often tied to the input font, and image filters alone rarely clear it up.

OCR result in the Visual Studio Text Visualizer, with an unexpected space inside the reference number, between the 7 and the 1

The following approaches have proven effective at avoiding the extra spaces.

Solution

1. Switch to an OCR-Friendly Font

Render or supply the document in Roboto where you can. It is a clean, modern typeface that reads reliably, so the engine produces fewer spurious spaces than it does with fonts like Arial.

2. Use ReadDocumentAdvanced()

Read the document with ReadDocumentAdvanced() instead of the standard Read() method. It gives you more control over the OCR process and handles complex text layouts better.

3. Set OcrLanguage.EnglishBest

Point the engine at the OcrLanguage.EnglishBest model. This advanced language model is more accurate than the standard English option and reduces misread spacing.
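
Steps 2 and 3 can be combined in a few lines. The following is a minimal sketch rather than a definitive implementation. It assumes a recent IronOcr release in which OcrInput exposes LoadPdf(), that the package providing ReadDocumentAdvanced() is installed, and that the result exposes a Text property like the standard OcrResult. The file name reference-letter.pdf is hypothetical.

```csharp
using System;
using IronOcr;

class SpacingFixExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Step 3: select the higher-accuracy English model.
        ocr.Language = OcrLanguage.EnglishBest;

        // Hypothetical input file; replace with your own document.
        using var input = new OcrInput();
        input.LoadPdf("reference-letter.pdf");

        // Step 2: use the advanced document reader instead of Read().
        // Assumption: the advanced result exposes a Text property.
        var result = ocr.ReadDocumentAdvanced(input);

        Console.WriteLine(result.Text);
    }
}
```

If the stray spaces persist with both settings in place, the font itself is the more likely cause, which is where font substitution or custom training comes in.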

4. Train a Custom Font

When the input PDF uses an unusual or non-standard font, the default settings can struggle and insert spaces. Training a custom font teaches the engine to recognize that specific typeface, which sharpens recognition considerably. See the OCR custom font training guide for the full workflow.
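
Once training has produced a .traineddata file, loading it might look like the sketch below. This is an illustration under stated assumptions, not the guide's exact workflow. It assumes the file was produced by the training process and that IronTesseract.UseCustomTesseractLanguageFile() is available in your IronOcr version. The names custom-font.traineddata and unusual-font.pdf are hypothetical.

```csharp
using System;
using IronOcr;

class CustomFontExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Hypothetical path to the trained font data produced by the training workflow.
        ocr.UseCustomTesseractLanguageFile("custom-font.traineddata");

        // Hypothetical input that uses the non-standard font.
        using var input = new OcrInput();
        input.LoadPdf("unusual-font.pdf");

        var result = ocr.Read(input);
        Console.WriteLine(result.Text);
    }
}
```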

With these settings applied, the extracted text comes back without the stray spacing:

OCR result after applying the fixes, with the spacing corrected

Tip: Try the OCR-friendly font and EnglishBest model first; they fix most spacing cases without the effort of training a custom font.

Curtis Chau
Technical Writer

Curtis Chau holds a Bachelor’s degree in Computer Science (Carleton University) and specializes in front-end development with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.
