How to Debug OCR in C#

Updated: March 2, 2026

IronOCR enables you to detect OCR failures at the source, assess recognition quality at the word and character level, and monitor long-running jobs in real time. Built-in tools such as diagnostic file logging, a typed exception hierarchy, per-result confidence scoring, and the OcrProgress event support these workflows in production pipelines.

This guide walks through working examples for each: enabling diagnostic logging, handling typed exceptions, validating output with confidence scores, monitoring job progress in real time, and isolating errors in batch pipelines.

Quickstart: Enable full OCR diagnostic logging

Set LogFilePath and LoggingMode on the Installation class before the first Read call. Two properties are all it takes to capture Tesseract initialization, language pack loading, and processing details to a log file.

Install IronOCR with NuGet Package Manager
PM > Install-Package IronOcr

Copy and run this code snippet.

IronOcr.Installation.LogFilePath = "ocr.log";
IronOcr.Installation.LoggingMode = IronOcr.Installation.LoggingModes.All;

Minimal Workflow (5 steps)

1. Download a C# library for debugging OCR
2. Set LogFilePath to a writable file path
3. Set LoggingMode to All for full diagnostic capture
4. Run your OCR operation and reproduce the issue
5. Inspect the generated log file for engine warnings and processing details

How Do I Enable Diagnostic Logging?

The Installation class exposes three logging controls. Set these before calling any Read method.

using IronOcr;

// Write logs to a specific file
Installation.LogFilePath = "logs/ocr_diagnostics.log";

// Enable all logging channels: file + debug output
Installation.LoggingMode = Installation.LoggingModes.All;

// Or pipe logs into your existing ILogger pipeline.
// myLoggerInstance is a placeholder for your own Microsoft.Extensions.Logging.ILogger instance.
Installation.CustomLogger = myLoggerInstance;

LoggingMode accepts flag values from the LoggingModes enum:

Table 1: LoggingModes Options
Mode	Output Target	Use Case
`None`	Disabled	Production with external monitoring
`Debug`	IDE debug output window	Local development
`File`	`LogFilePath`	Server-side log collection
`All`	Debug + File	Full diagnostic capture

The CustomLogger property supports any Microsoft.Extensions.Logging.ILogger implementation, allowing you to direct OCR diagnostics to Serilog, NLog, or other structured logging sinks in your pipeline. Use ClearLogFiles to remove accumulated log data between runs.
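
The logging example above writes to logs/ocr_diagnostics.log, so the logs directory has to exist and be writable by the process. Whether IronOCR creates missing directories itself is not covered here, so the sketch below takes the cautious route and prepares the path up front. EnsureWritableLogPath is a hypothetical helper name, not part of IronOCR; it uses only System.IO.

```csharp
using System;
using System.IO;

// Hypothetical helper (not an IronOCR API): make sure the log file's
// directory exists and that the file can be opened for writing.
static string EnsureWritableLogPath(string logFilePath)
{
    string fullPath = Path.GetFullPath(logFilePath);
    var directory = Path.GetDirectoryName(fullPath);

    if (!string.IsNullOrEmpty(directory))
    {
        // No-op when the directory already exists.
        Directory.CreateDirectory(directory);
    }

    // Append mode creates the file if needed and preserves existing content.
    // An exception here means the path is not writable for this process.
    using (new FileStream(fullPath, FileMode.Append, FileAccess.Write)) { }

    return fullPath;
}

string logPath = EnsureWritableLogPath(Path.Combine("logs", "ocr_diagnostics.log"));
Console.WriteLine($"Log file ready: {logPath}");

// Then hand the verified path to IronOCR as in the example above:
// Installation.LogFilePath = logPath;
```

Running this check once at startup surfaces permission problems as a clear exception at a known point, rather than as a silently missing log file later.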

With logging in place, the next step is understanding which exceptions IronOCR can throw and how to handle each one.

What Exceptions Does IronOCR Throw?

IronOCR defines typed exceptions under the IronOcr.Exceptions namespace. Catching these specifically, rather than a blanket catch block, lets you route each failure type to the correct remediation path.

Table 2: IronOCR Exception Reference
Exception	Common Cause	Fix
`IronOcrInputException`	Corrupt or unsupported image/PDF	Validate file before loading into `OcrInput`
`IronOcrProductException`	Internal engine error during OCR execution	Enable logging, check log output, update to latest NuGet version
`IronOcrDictionaryException`	Missing or corrupt `.traineddata` language file	Reinstall the language pack NuGet or set `LanguagePackDirectory`
`IronOcrNativeException`	Native C++ interop failure	Install Visual C++ Redistributable; check AVX support
`IronOcrLicensingException`	Missing or expired license key	Set `LicenseKey` before calling `Read`
`LanguagePackException`	Language pack not found at expected path	Verify `LanguagePackDirectory` or reinstall the NuGet language package
`IronOcrAssemblyVersionMismatchException`	Mismatched assembly versions after partial update	Clear NuGet cache, restore packages, ensure all IronOCR packages match

Use the following try-catch block to handle each exception type separately, applying exception filters for conditional logging.

Input

A single-page vendor invoice from IronOCR Solutions to Acme Corporation, loaded via LoadPdf into OcrInput. It includes four line items, tax, and a grand total — enough text variety to give each exception handler a realistic exercise.

invoice_scan.pdf: Vendor invoice (#INV-2024-7829) used to demonstrate each typed exception handler in sequence.

using System;
using IronOcr;
using IronOcr.Exceptions;

var ocr = new IronTesseract();

try
{
    using var input = new OcrInput();
    input.LoadPdf("invoice_scan.pdf");

    OcrResult result = ocr.Read(input);
    Console.WriteLine($"Text: {result.Text}");
    Console.WriteLine($"Confidence: {result.Confidence:P1}");
}
catch (IronOcrInputException ex)
{
    // File could not be loaded — corrupt, locked, or unsupported format
    Console.Error.WriteLine($"Input error: {ex.Message}");
}
catch (IronOcrDictionaryException ex)
{
    // Language pack missing — common in containerized deployments
    Console.Error.WriteLine($"Language pack error: {ex.Message}");
}
catch (IronOcrNativeException ex) when (ex.Message.Contains("AVX"))
{
    // CPU does not support AVX instructions
    Console.Error.WriteLine($"Hardware incompatibility: {ex.Message}");
}
catch (IronOcrLicensingException)
{
    Console.Error.WriteLine("License key is missing or invalid.");
}
catch (IronOcrProductException ex)
{
    // Catch-all for other IronOCR engine errors
    Console.Error.WriteLine($"OCR engine error: {ex.Message}");
    Console.Error.WriteLine($"Stack trace: {ex.StackTrace}");
}

Output

Success Output

The invoice loads cleanly and the engine returns a character count alongside a confidence score.

Failed Output

Order catch blocks from most specific to most general. The when clause on IronOcrNativeException filters for AVX-related failures without catching unrelated native errors. Each handler logs the exception message; the catch-all block also captures the stack trace for post-mortem analysis.

Catching the right exception tells you that something went wrong, but not how well the engine performed when it did succeed. For that, use confidence scores.

How Do I Validate OCR Output with Confidence Scores?

Every OcrResult exposes a Confidence property, a value between 0 and 1 representing the engine's statistical certainty averaged across all recognized characters. You can access this at every level of the result hierarchy: document, page, paragraph, word, and character.

Use a threshold-gated pattern to prevent low-quality results from propagating downstream.

Input

A thermal receipt with itemized line items, discounts, totals, and a barcode, loaded via LoadImage. Its narrow width, monospace font, and faint print make it a practical stress test for per-word confidence thresholds.

receipt.png: Thermal receipt scan used to demonstrate threshold-gated confidence validation and per-word accuracy drill-down.

using System;
using System.Linq;
using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("receipt.png");

OcrResult result = ocr.Read(input);
double confidence = result.Confidence;

Console.WriteLine($"Overall confidence: {confidence:P1}");

// Threshold-gated decision
if (confidence >= 0.90)
{
    Console.WriteLine("ACCEPT — high confidence, processing result.");
    ProcessResult(result.Text);
}
else if (confidence >= 0.70)
{
    Console.WriteLine("FLAG — moderate confidence, queuing for review.");
    QueueForReview(result.Text, confidence);
}
else
{
    Console.WriteLine("REJECT — low confidence, logging for investigation.");
    LogRejection("receipt.png", confidence);
}

// Drill into per-page and per-word confidence for diagnostics
foreach (var page in result.Pages)
{
    Console.WriteLine($"  Page {page.PageNumber}: {page.Confidence:P1}");

    var lowConfidenceWords = page.Words
        .Where(w => w.Confidence < 0.70)
        .ToList();

    foreach (var word in lowConfidenceWords)
    {
        Console.WriteLine($"    Low-confidence word: \"{word.Text}\" ({word.Confidence:P1})");
    }
}

// Placeholder handlers so the sample compiles; replace with your own pipeline logic
void ProcessResult(string text) { }
void QueueForReview(string text, double score) { }
void LogRejection(string fileName, double score) { }

Output

This pattern is essential in pipelines where OCR feeds into data entry, invoice processing, or compliance workflows. The per-word drill-down identifies exactly which regions of the source image caused degradation; you can then apply image quality filters or orientation corrections and re-process. For a deeper look at confidence scoring, see the confidence levels how-to.
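
One way to keep the thresholds testable is to pull the accept/review/reject decision out of the OCR code path into a small pure function. The sketch below assumes confidence is reported on the 0 to 1 scale described above; OcrDecision and Classify are hypothetical names, not IronOCR APIs, and the default thresholds simply mirror the 0.90 and 0.70 values from the example.

```csharp
using System;

// Hypothetical helper: maps a confidence value (0 to 1) to a routing decision.
OcrDecision Classify(double confidence, double acceptAt = 0.90, double reviewAt = 0.70)
{
    if (double.IsNaN(confidence) || confidence < 0 || confidence > 1)
        throw new ArgumentOutOfRangeException(nameof(confidence), "Expected a value between 0 and 1.");

    if (confidence >= acceptAt) return OcrDecision.Accept;
    if (confidence >= reviewAt) return OcrDecision.Review;
    return OcrDecision.Reject;
}

Console.WriteLine(Classify(0.95)); // Accept
Console.WriteLine(Classify(0.70)); // Review (the lower bound is inclusive)
Console.WriteLine(Classify(0.42)); // Reject

// Hypothetical enum for illustration only.
enum OcrDecision { Accept, Review, Reject }
```

Because the function has no dependency on the OCR engine, the thresholds can be tuned per document type and covered by ordinary unit tests.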

For long-running jobs, confidence alone is not enough. You also need to know whether the engine is still making progress, and that is where the OcrProgress event comes in.

How Do I Monitor OCR Progress in Real Time?

For multi-page documents, the OcrProgress event on IronTesseract fires after each page completes. The OcrProgressEventArgs object exposes progress percent, elapsed duration, total pages, and pages complete. The example uses this three-page quarterly report as input: a structured business document spanning an executive summary, revenue breakdown, and operational metrics.

Input

A three-page Q1 2024 financial report loaded via LoadPdf. Page one covers the executive summary with KPI metrics, page two contains revenue tables by product line and region, and page three covers operational processing volumes — each page type produces distinct per-page timing you can observe in the progress callbacks.

quarterly_report.pdf: Three-page Q1 2024 financial report (executive summary, revenue breakdown, operational metrics) used to demonstrate real-time OcrProgress callbacks per page.

using System;
using System.Linq;
using IronOcr;

var ocr = new IronTesseract();

ocr.OcrProgress += (sender, e) =>
{
    Console.WriteLine(
        $"[OCR] {e.ProgressPercent}% complete | " +
        $"Page {e.PagesComplete}/{e.TotalPages} | " +
        $"Elapsed: {e.Duration.TotalSeconds:F1}s"
    );
};

using var input = new OcrInput();
input.LoadPdf("quarterly_report.pdf");

OcrResult result = ocr.Read(input);
Console.WriteLine($"Finished: {result.Pages.Count()} pages, confidence: {result.Confidence:P1}");

Output

Wire this event into your logging infrastructure to track OCR job duration and detect stalls. If the elapsed duration exceeds a threshold without the progress percent advancing, the pipeline can flag the job for investigation. This is particularly useful for batch PDF processing where a single malformed page can stall the entire job.
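
The stall check described above can be sketched as a small tracker that remembers when the progress percent last advanced. StallDetector is a hypothetical class, not an IronOCR type, and the sketch assumes the progress percent arrives as an integer. Timestamps are passed in explicitly so the logic can be exercised without running a real OCR job.

```csharp
using System;

var detector = new StallDetector(TimeSpan.FromSeconds(30));
var start = DateTime.UtcNow;

// Simulated progress reports; in a real pipeline these calls would sit
// inside the OcrProgress handler.
detector.Report(33, start);
Console.WriteLine(detector.IsStalled(start.AddSeconds(10))); // False
Console.WriteLine(detector.IsStalled(start.AddSeconds(45))); // True
detector.Report(66, start.AddSeconds(50));
Console.WriteLine(detector.IsStalled(start.AddSeconds(60))); // False

// Hypothetical helper: flags a job whose progress has not advanced
// within the configured timeout.
sealed class StallDetector
{
    private readonly object _gate = new object();
    private readonly TimeSpan _timeout;
    private int _lastPercent;
    private DateTime _lastAdvanceUtc;
    private bool _started;

    public StallDetector(TimeSpan timeout) => _timeout = timeout;

    // Record a progress report; only a higher percent resets the clock.
    public void Report(int percent, DateTime nowUtc)
    {
        lock (_gate)
        {
            if (!_started || percent > _lastPercent)
            {
                _lastPercent = percent;
                _lastAdvanceUtc = nowUtc;
                _started = true;
            }
        }
    }

    // True when at least one report was seen and none has advanced since the timeout elapsed.
    public bool IsStalled(DateTime nowUtc)
    {
        lock (_gate)
        {
            return _started && nowUtc - _lastAdvanceUtc > _timeout;
        }
    }
}
```

A monitoring timer or health check could poll IsStalled(DateTime.UtcNow) while the handler feeds Report; the lock is there because the handler and the poller may run on different threads.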

Progress monitoring shows execution state, but a file-level failure can still stop the entire batch short if not isolated.

How Do I Handle Errors in Batch OCR Pipelines?

In production, a single file failure should not halt the entire batch. Isolate errors per file, log failures with context, and produce a summary report at the end. The example processes a folder of scanned documents containing an invoice, a purchase order, and a service contract, plus one intentionally corrupted file to trigger the error path.

Input

A folder of PDFs passed to Directory.GetFiles — an invoice, a purchase order, a service contract, and one intentionally corrupted file. The two representative samples below show the document variety the pipeline processes in a single run.

batch-scan-01.pdf: Invoice for Bright Horizon Ltd. (INV-2024-001) — successful OCR pass.

batch-scan-02.pdf: Purchase order for TechSupply Inc. (PO-2024-042) — second document type in the same run.

using System;
using System.Collections.Generic;
using System.IO;
using IronOcr;
using IronOcr.Exceptions;

var ocr = new IronTesseract();
Installation.LogFilePath = "batch_debug.log";
Installation.LoggingMode = Installation.LoggingModes.File;

string[] files = Directory.GetFiles("scans/", "*.pdf");
int succeeded = 0, failed = 0;
double totalConfidence = 0;
var failures = new List<(string File, string Error)>();

foreach (string file in files)
{
    try
    {
        using var input = new OcrInput();
        input.LoadPdf(file);

        OcrResult result = ocr.Read(input);
        totalConfidence += result.Confidence;
        succeeded++;

        Console.WriteLine($"OK: {Path.GetFileName(file)} — {result.Confidence:P1}");
    }
    catch (IronOcrInputException ex)
    {
        failed++;
        failures.Add((file, $"Input error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.Message}");
    }
    catch (IronOcrProductException ex)
    {
        failed++;
        failures.Add((file, $"Engine error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.Message}");
    }
    catch (Exception ex)
    {
        failed++;
        failures.Add((file, $"Unexpected: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.GetType().Name}: {ex.Message}");
    }
}

// Summary report
Console.WriteLine($"\n--- Batch Summary ---");
Console.WriteLine($"Total: {files.Length} | Passed: {succeeded} | Failed: {failed}");
if (succeeded > 0)
    Console.WriteLine($"Average confidence: {totalConfidence / succeeded:P1}");

foreach (var (f, err) in failures)
    Console.WriteLine($"  {Path.GetFileName(f)}: {err}");

Output

The final catch (Exception) block handles unforeseen errors such as network timeouts on shared storage, permission issues, or out-of-memory conditions on large files. Each failure records the file path and error message for the summary, while the loop continues processing the remaining files. The log file at batch_debug.log captures engine-level detail for any file that triggers internal diagnostics.

For non-blocking execution in services or web applications, IronOCR supports ReadAsync, which uses the same try-catch structure.
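
As a rough sketch of how the same per-file isolation might look in asynchronous code, the example below runs files concurrently with a cap on parallelism and converts each failure into a result record instead of letting it cancel the batch. To stay runnable without IronOCR, the OCR call is a stand-in delegate; in a real pipeline that delegate would load the file and await ReadAsync. FileOutcome and BatchRunner are hypothetical names. Whether a single IronTesseract instance can be shared across concurrent reads is something to confirm in the IronOCR documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the real OCR call so the sketch runs on its own.
Func<string, Task<double>> fakeOcr = async file =>
{
    await Task.Delay(10);
    if (file.Contains("corrupt"))
        throw new InvalidOperationException("file could not be read");
    return 0.93; // pretend confidence
};

var demoOutcomes = await BatchRunner.RunAsync(
    new[] { "invoice.pdf", "corrupt.pdf", "contract.pdf" }, fakeOcr, maxParallel: 2);

foreach (var o in demoOutcomes)
    Console.WriteLine(o.Succeeded ? $"OK: {o.File}" : $"FAIL: {o.File}: {o.Message}");

// Hypothetical types for illustration; neither is part of IronOCR.
record FileOutcome(string File, bool Succeeded, double Confidence, string Message);

static class BatchRunner
{
    public static async Task<List<FileOutcome>> RunAsync(
        IEnumerable<string> files, Func<string, Task<double>> ocrAsync, int maxParallel)
    {
        // Limits how many files are processed at the same time.
        using var gate = new SemaphoreSlim(maxParallel);

        var tasks = files.Select(async file =>
        {
            await gate.WaitAsync();
            try
            {
                double confidence = await ocrAsync(file);
                return new FileOutcome(file, true, confidence, "");
            }
            catch (Exception ex)
            {
                // Isolate the failure: record it and let other files continue.
                return new FileOutcome(file, false, 0, $"{ex.GetType().Name}: {ex.Message}");
            }
            finally
            {
                gate.Release();
            }
        }).ToList();

        // WhenAll preserves input order, so outcomes line up with the file list.
        return (await Task.WhenAll(tasks)).ToList();
    }
}
```

Capping parallelism matters because OCR is CPU and memory intensive; an unbounded fan-out over a large folder could exhaust the host before any file finishes.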

If the pipeline runs without errors but the extracted text is still wrong, the root cause is almost always image quality rather than code. Here is how to address that.

How Do I Debug OCR Accuracy?

If confidence scores are consistently low, the cause is usually the source image rather than the OCR engine. IronOCR provides preprocessing tools to address this:

Apply image quality filters such as sharpen, denoise, dilate, and erode to improve text clarity
Use orientation correction to automatically deskew and rotate scanned documents
Adjust the DPI setting for low-resolution images before processing
Use computer vision to detect and isolate text regions in complex layouts
The IronOCR Utility lets you visually test filter combinations and export the optimal C# configuration

For deployment-specific issues, IronOCR maintains dedicated troubleshooting guides for Azure Functions, Docker and Linux, and general environment setup.

Where Should I Go Next?

Now that you understand how to debug IronOCR at runtime, explore:

Navigating OCR result structure and metadata including pages, blocks, paragraphs, words, and coordinates
Understanding confidence scoring at every level of the result hierarchy
Using async and multithreading with ReadAsync for high-throughput pipelines
Browsing the API reference for the full property list

For production use, remember to obtain a license to remove watermarks and access full functionality.

Curtis Chau

Technical Writer

Curtis Chau holds a Bachelor’s degree in Computer Science (Carleton University) and specializes in front-end development with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.
