Improving Unsatisfactory OCR Results

OCR quality depends heavily on the input. A configuration, filter, or setting that reads one document cleanly can fall flat on another, leaving you with garbled or incomplete text. When the standard read disappoints, the IronOcr.Extensions.AdvancedScan package usually closes the accuracy gap.

The difference comes down to the engine. Standard IronOCR runs on the Tesseract engine, while IronOcr.Extensions.AdvancedScan is powered by PaddleOCR and adds machine-learning capabilities. That makes it well suited to noisy input like screenshots, and to structured sources like passports and license plates.

The package exposes purpose-built read methods for these cases.
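As a rough illustration of what calling those specialized methods might look like, the sketch below reads a screenshot, a passport scan, and a license plate photo. It assumes the IronOcr.Extensions.AdvancedScan package is installed alongside IronOcr, that the file names (which are hypothetical) point to real images, and that each result object exposes a Text property. The result types differ per method, so treat the property access as an assumption to confirm against the API Reference.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Screenshot: noisy input with mixed fonts and UI elements
using (var screenshot = new OcrInput())
{
    screenshot.LoadImage("screenshot.png"); // hypothetical file name
    var screenshotResult = ocr.ReadScreenShot(screenshot);
    Console.WriteLine(screenshotResult.Text);
}

// Passport: structured identity document
using (var passport = new OcrInput())
{
    passport.LoadImage("passport.jpg"); // hypothetical file name
    var passportResult = ocr.ReadPassport(passport);
    Console.WriteLine(passportResult.Text);
}

// License plate: short text on a vehicle photo
using (var plate = new OcrInput())
{
    plate.LoadImage("plate.jpg"); // hypothetical file name
    var plateResult = ocr.ReadLicensePlate(plate);
    Console.WriteLine(plateResult.Text);
}
```

Each call uses its own OcrInput so that one source image never leaks into another read.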

Solution

1. Start with ReadDocumentAdvanced() and EnhanceResolution()

Before reaching for any other configuration, pair the ReadDocumentAdvanced() method with the EnhanceResolution() preprocessing filter. This combination resolves most of the accuracy problems users hit with the standard read.

Apply the filter to the OcrInput, then pass that same input to ReadDocumentAdvanced():

using System;
using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadPdf("document.pdf");
// Upscale and sharpen low-resolution input before reading
input.EnhanceResolution();
// Requires the IronOcr.Extensions.AdvancedScan package
var result = ocr.ReadDocumentAdvanced(input);
Console.WriteLine(result.Text);
Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    input.LoadPdf("document.pdf")
    ' Upscale and sharpen low-resolution input before reading
    input.EnhanceResolution()
    Dim result = ocr.ReadDocumentAdvanced(input)
    Console.WriteLine(result.Text)
End Using

ReadDocumentAdvanced() runs the PaddleOCR machine-learning engine to parse layout-aware, text-heavy documents, while EnhanceResolution() upscales and sharpens low-resolution input before the read. Low resolution is a common root cause of poor results, so this preprocessing step often matters as much as the engine swap.
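If you only want to pay for the heavier engine when the standard read actually disappoints, one possible approach is to check the standard result's confidence first and fall back to the advanced read below a cutoff. The sketch below is an assumption-laden illustration, not a recommended configuration: the MinConfidence threshold is a hypothetical value, and the Confidence property is assumed here to be reported on a 0 to 100 scale, which is worth confirming in the API Reference before relying on it.

```csharp
using System;
using IronOcr;

// Hypothetical cutoff; tune it against your own documents.
// Assumes Confidence is reported on a 0-100 scale.
const double MinConfidence = 80.0;

var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadPdf("document.pdf");

// First attempt: the standard Tesseract-based read
var standard = ocr.Read(input);

if (standard.Confidence >= MinConfidence)
{
    Console.WriteLine(standard.Text);
}
else
{
    // Fallback: upscale and sharpen the input, then use the AdvancedScan read
    input.EnhanceResolution();
    var advanced = ocr.ReadDocumentAdvanced(input);
    Console.WriteLine(advanced.Text);
}
```

The trade-off is an extra standard pass on difficult documents in exchange for skipping the advanced read on documents that already scan cleanly.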

2. Target x64 on .NET Framework

Configure .NET Framework projects to build as x64. AdvancedScan needs a significant amount of memory, and running it under a different architecture can trigger runtime errors. See the Advanced Scan on .NET Framework guide for the full setup.
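To surface an architecture mismatch early rather than as an obscure runtime error, you could add a startup guard like the sketch below. The ArchitectureGuard class and its message are hypothetical names introduced for illustration. Note that Environment.Is64BitProcess only confirms a 64-bit process, not x64 specifically, so it complements the project's platform target setting rather than replacing it. The example uses a classic Main method because older .NET Framework projects often do not support top-level statements.

```csharp
using System;

// Hypothetical helper: fails fast when the process is not 64-bit.
static class ArchitectureGuard
{
    public static void EnsureSixtyFourBit()
    {
        // Is64BitProcess is false when the app runs as x86
        // (for example with "Prefer 32-bit" enabled).
        if (!Environment.Is64BitProcess)
        {
            throw new PlatformNotSupportedException(
                "AdvancedScan reads should run in a 64-bit process. " +
                "Set the project's platform target to x64.");
        }
    }
}

static class Program
{
    static void Main()
    {
        // Call the guard before any AdvancedScan read method
        ArchitectureGuard.EnsureSixtyFourBit();
        Console.WriteLine("Running as a 64-bit process.");
    }
}
```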

Limitations

AdvancedScan delivers strong accuracy, but a few constraints apply today:

  • Fewer result features: AdvancedScan result objects expose fewer properties than the standard OcrResult object.
  • Different return types: each read method returns a different result object type, so check the API Reference for the specifics of the one you call.
  • No searchable PDFs: building searchable PDFs from AdvancedScan output is not yet supported, though the feature is in development.
  • Limited languages: the package currently supports English, Chinese, Japanese, Korean, and Latin alphabets.

Please note: Searchable PDF output is unavailable from AdvancedScan reads for now. If you depend on that feature, stay on the standard IronOCR read path.
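For reference, staying on the standard read path for searchable PDFs could look like the minimal sketch below. It assumes a standard IronOcr installation, and both file names are hypothetical. EnhanceResolution() is applied here as an optional preprocessing step; whether it helps depends on the input.

```csharp
using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadPdf("document.pdf"); // hypothetical input file

// Optional: the same preprocessing filter also works with the standard read
input.EnhanceResolution();

// The standard read returns an OcrResult, which supports searchable PDF export
var result = ocr.Read(input);
result.SaveAsSearchablePdf("searchable.pdf"); // hypothetical output path
```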

Curtis Chau
Technical Writer
