How to Debug OCR in C#

IronOCR enables you to detect OCR failures at the source, assess recognition quality at the word and character level, and monitor long-running jobs in real time. Built-in tools such as diagnostic file logging, a typed exception hierarchy, per-result confidence scoring, and the OcrProgress event support these workflows in production pipelines.

This guide walks through working examples for each: enabling diagnostic logging, handling typed exceptions, validating output with confidence scores, monitoring job progress in real time, and isolating errors in batch pipelines.

Quickstart: Enable full OCR diagnostic logging

Set LogFilePath and LoggingMode on the Installation class before the first Read call. Two properties are all it takes to capture Tesseract initialization, language pack loading, and processing details to a log file.

  1. Install IronOCR with NuGet Package Manager

    PM > Install-Package IronOcr
  2. Copy and run this code snippet.

    IronOcr.Installation.LogFilePath = "ocr.log";
    IronOcr.Installation.LoggingMode = IronOcr.Installation.LoggingModes.All;


How Do I Enable Diagnostic Logging?

The Installation class exposes three logging controls. Set these before calling any Read method.

using IronOcr;

// Write logs to a specific file
Installation.LogFilePath = "logs/ocr_diagnostics.log";

// Enable all logging channels: file + debug output
Installation.LoggingMode = Installation.LoggingModes.All;

// Or pipe logs into your existing ILogger pipeline
// (myLoggerInstance is a Microsoft.Extensions.Logging.ILogger you have already created)
Installation.CustomLogger = myLoggerInstance;

LoggingMode accepts flag values from the LoggingModes enum:

Table 1: LoggingModes Options
Mode  | Output Target           | Use Case
None  | Disabled                | Production with external monitoring
Debug | IDE debug output window | Local development
File  | LogFilePath             | Server-side log collection
All   | Debug + File            | Full diagnostic capture

The CustomLogger property supports any Microsoft.Extensions.Logging.ILogger implementation, allowing you to direct OCR diagnostics to Serilog, NLog, or other structured logging sinks in your pipeline. Use ClearLogFiles to remove accumulated log data between runs.

With logging in place, the next step is understanding which exceptions IronOCR can throw and how to handle each one.

What Exceptions Does IronOCR Throw?

IronOCR defines typed exceptions under the IronOcr.Exceptions namespace. Catching these specifically, rather than a blanket catch block, lets you route each failure type to the correct remediation path.

Table 2: IronOCR Exception Reference
Exception | Common Cause | Fix
IronOcrInputException | Corrupt or unsupported image/PDF | Validate file before loading into OcrInput
IronOcrProductException | Internal engine error during OCR execution | Enable logging, check log output, update to latest NuGet version
IronOcrDictionaryException | Missing or corrupt .traineddata language file | Reinstall the language pack NuGet or set LanguagePackDirectory
IronOcrNativeException | Native C++ interop failure | Install Visual C++ Redistributable; check AVX support
IronOcrLicensingException | Missing or expired license key | Set LicenseKey before calling Read
LanguagePackException | Language pack not found at expected path | Verify LanguagePackDirectory or reinstall the NuGet language package
IronOcrAssemblyVersionMismatchException | Mismatched assembly versions after partial update | Clear NuGet cache, restore packages, ensure all IronOCR packages match

Use the following try-catch block to handle each exception type separately, applying exception filters for conditional logging.

Input

A single-page vendor invoice from IronOCR Solutions to Acme Corporation, loaded via LoadPdf into OcrInput. It includes four line items, tax, and a grand total, which gives the success path realistic text to extract before the exception handlers are exercised.

invoice_scan.pdf: Vendor invoice (#INV-2024-7829) used to demonstrate each typed exception handler in sequence.

using IronOcr;
using IronOcr.Exceptions;

var ocr = new IronTesseract();

try
{
    using var input = new OcrInput();
    input.LoadPdf("invoice_scan.pdf");

    OcrResult result = ocr.Read(input);
    Console.WriteLine($"Text: {result.Text}");
    Console.WriteLine($"Confidence: {result.Confidence:P1}");
}
catch (IronOcrInputException ex)
{
    // File could not be loaded — corrupt, locked, or unsupported format
    Console.Error.WriteLine($"Input error: {ex.Message}");
}
catch (IronOcrDictionaryException ex)
{
    // Language pack missing — common in containerized deployments
    Console.Error.WriteLine($"Language pack error: {ex.Message}");
}
catch (IronOcrNativeException ex) when (ex.Message.Contains("AVX"))
{
    // CPU does not support AVX instructions
    Console.Error.WriteLine($"Hardware incompatibility: {ex.Message}");
}
catch (IronOcrLicensingException)
{
    Console.Error.WriteLine("License key is missing or invalid.");
}
catch (IronOcrProductException ex)
{
    // Catch-all for other IronOCR engine errors
    Console.Error.WriteLine($"OCR engine error: {ex.Message}");
    Console.Error.WriteLine($"Stack trace: {ex.StackTrace}");
}

Output

Success Output

The invoice loads cleanly and the engine returns a character count alongside a confidence score.

Terminal output showing successful OCR read of invoice_scan.pdf with character count and confidence score

Failed Output

Terminal output showing exception thrown when loading a missing PDF file

Order catch blocks from most specific to most general. The when clause on IronOcrNativeException filters for AVX-related failures without catching unrelated native errors. Each handler logs the exception message; the catch-all block also captures the stack trace for post-mortem analysis.
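
The ordering rule and the when filter can be seen in isolation with stand-in exception types. The following is a minimal sketch, not IronOCR code: EngineException, NativeInteropException, and CatchOrderDemo are hypothetical names invented for illustration, and the inheritance relationship between them is an assumption of the sketch rather than a statement about IronOCR's own hierarchy.

```csharp
using System;

// Hypothetical stand-ins for a library's exception hierarchy (not IronOCR types).
public class EngineException : Exception
{
    public EngineException(string message) : base(message) { }
}

public class NativeInteropException : EngineException
{
    public NativeInteropException(string message) : base(message) { }
}

public static class CatchOrderDemo
{
    // Runs the action and reports which handler matched.
    public static string Classify(Action action)
    {
        try
        {
            action();
            return "ok";
        }
        // Most specific first: only native failures whose message mentions AVX.
        catch (NativeInteropException ex) when (ex.Message.Contains("AVX"))
        {
            return "hardware";
        }
        // Native failures that did not pass the filter fall through to here,
        // because NativeInteropException derives from EngineException in this sketch.
        catch (EngineException)
        {
            return "engine";
        }
        // Most general last; placing this block first is a compile error (CS0160).
        catch (Exception)
        {
            return "unexpected";
        }
    }
}
```

Calling CatchOrderDemo.Classify(() => throw new NativeInteropException("missing DLL")) returns "engine": the filter rejects the exception, and a later block picks it up only because its type matches. The same holds for any filtered handler. An exception that fails the when clause is handled only if some later catch block accepts its type, so it is worth confirming that one does.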

Catching the right exception tells you that something went wrong, but not how well the engine performed when it did succeed. For that, use confidence scores.

How Do I Validate OCR Output with Confidence Scores?

Every OcrResult exposes a Confidence property, a value between 0 and 1 representing the engine's statistical certainty averaged across all recognized characters. You can access this at every level of the result hierarchy: document, page, paragraph, word, and character.

Use a threshold-gated pattern to prevent low-quality results from propagating downstream.

Input

A thermal receipt with itemized line items, discounts, totals, and a barcode, loaded via LoadImage. Its narrow width, monospace font, and faint print make it a practical stress test for per-word confidence thresholds.

Sample thermal receipt from FoodMart showing itemized purchases, totals, and rewards points used as OCR input

receipt.png: Thermal receipt scan used to demonstrate threshold-gated confidence validation and per-word accuracy drill-down.

using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("receipt.png");

OcrResult result = ocr.Read(input);
double confidence = result.Confidence;

Console.WriteLine($"Overall confidence: {confidence:P1}");

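// Threshold-gated decision. ProcessResult, QueueForReview, and LogRejection are
// placeholders for methods your application defines.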
if (confidence >= 0.90)
{
    Console.WriteLine("ACCEPT — high confidence, processing result.");
    ProcessResult(result.Text);
}
else if (confidence >= 0.70)
{
    Console.WriteLine("FLAG — moderate confidence, queuing for review.");
    QueueForReview(result.Text, confidence);
}
else
{
    Console.WriteLine("REJECT — low confidence, logging for investigation.");
    LogRejection("receipt.png", confidence);
}

// Drill into per-page and per-word confidence for diagnostics
foreach (var page in result.Pages)
{
    Console.WriteLine($"  Page {page.PageNumber}: {page.Confidence:P1}");

    var lowConfidenceWords = page.Words
        .Where(w => w.Confidence < 0.70)
        .ToList();

    foreach (var word in lowConfidenceWords)
    {
        Console.WriteLine($"    Low-confidence word: \"{word.Text}\" ({word.Confidence:P1})");
    }
}

Output

Terminal output showing confidence score, accept/flag/reject decision, and per-word low-confidence drill-down for the receipt image

This pattern is essential in pipelines where OCR feeds into data entry, invoice processing, or compliance workflows. The per-word drill-down identifies exactly which regions of the source image caused degradation; you can then apply image quality filters or orientation corrections and re-process. For a deeper look at confidence scoring, see the confidence levels how-to.
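
If several pipelines share the same accept/flag/reject rule, it can help to pull the thresholds out of the call site so they are testable on their own. The sketch below assumes the 0 to 1 confidence scale described above; OcrDecision and ConfidenceGate are hypothetical names, and the default cut-offs simply mirror the 0.90 and 0.70 values used in the example.

```csharp
using System;

public enum OcrDecision { Accept, Flag, Reject }

public static class ConfidenceGate
{
    // Maps a 0..1 confidence score to a decision using two cut-offs.
    public static OcrDecision Triage(double confidence, double acceptAt = 0.90, double flagAt = 0.70)
    {
        if (double.IsNaN(confidence) || confidence < 0 || confidence > 1)
            throw new ArgumentOutOfRangeException(nameof(confidence), "Expected a value between 0 and 1.");
        if (flagAt > acceptAt)
            throw new ArgumentException("flagAt must not exceed acceptAt.", nameof(flagAt));

        if (confidence >= acceptAt) return OcrDecision.Accept;
        if (confidence >= flagAt) return OcrDecision.Flag;
        return OcrDecision.Reject;
    }
}
```

A caller would pass result.Confidence to ConfidenceGate.Triage and switch on the returned value. Rejecting out-of-range input up front is a deliberate choice: a score on a different scale fails loudly instead of being silently accepted.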

For long-running jobs, confidence alone is not enough. You also need to know whether the engine is still making progress, and that is where the OcrProgress event comes in.

How Do I Monitor OCR Progress in Real Time?

For multi-page documents, the OcrProgress event on IronTesseract fires after each page completes. The OcrProgressEventArgs object exposes progress percent, elapsed duration, total pages, and pages complete.

Input

A three-page Q1 2024 financial report loaded via LoadPdf. Page one covers the executive summary with KPI metrics, page two contains revenue tables by product line and region, and page three covers operational processing volumes — each page type produces distinct per-page timing you can observe in the progress callbacks.

quarterly_report.pdf: Three-page Q1 2024 financial report (executive summary, revenue breakdown, operational metrics) used to demonstrate real-time OcrProgress callbacks per page.

using IronOcr;

var ocr = new IronTesseract();

ocr.OcrProgress += (sender, e) =>
{
    Console.WriteLine(
        $"[OCR] {e.ProgressPercent}% complete | " +
        $"Page {e.PagesComplete}/{e.TotalPages} | " +
        $"Elapsed: {e.Duration.TotalSeconds:F1}s"
    );
};

using var input = new OcrInput();
input.LoadPdf("quarterly_report.pdf");

OcrResult result = ocr.Read(input);
Console.WriteLine($"Finished: {result.Pages.Count()} pages, confidence: {result.Confidence:P1}");

Output

Terminal output showing OcrProgress event callbacks per page with percent complete and elapsed time for a three-page PDF

Wire this event into your logging infrastructure to track OCR job duration and detect stalls. If the elapsed duration exceeds a threshold without the progress percent advancing, the pipeline can flag the job for investigation. This is particularly useful for batch PDF processing where a single malformed page can stall the entire job.
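
One way to implement the stall check is a small helper that remembers when the progress percent last advanced. This is a sketch under stated assumptions: StallDetector is a hypothetical class, not part of IronOCR, and it assumes the progress percent can be passed as an integer. Because the progress event only fires when a page completes, a stalled job raises no further events, so IsStalled has to be polled from somewhere else, such as a timer or watchdog thread.

```csharp
using System;

// Hypothetical helper: flags a job as stalled when the progress percent
// has not advanced within a configurable window.
public sealed class StallDetector
{
    private readonly object _gate = new object();
    private readonly TimeSpan _threshold;
    private int _lastPercent = -1;
    private TimeSpan _lastAdvance = TimeSpan.Zero;

    public StallDetector(TimeSpan threshold) => _threshold = threshold;

    // Call from the progress handler with the reported percent and elapsed time.
    public void Report(int percent, TimeSpan elapsed)
    {
        lock (_gate)
        {
            // Only a strictly higher percent counts as an advance.
            if (percent > _lastPercent)
            {
                _lastPercent = percent;
                _lastAdvance = elapsed;
            }
        }
    }

    // Call from a timer or watchdog with the job's current elapsed time.
    public bool IsStalled(TimeSpan elapsed)
    {
        lock (_gate)
        {
            return elapsed - _lastAdvance > _threshold;
        }
    }
}
```

Inside the OcrProgress handler you would call Report with the event's percent and duration, then poll IsStalled with a Stopwatch reading started alongside the Read call. The lock is there because the handler and the watchdog may run on different threads.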

Progress monitoring shows execution state, but a file-level failure can still stop the entire batch short if not isolated.

How Do I Handle Errors in Batch OCR Pipelines?

In production, a single file failure should not halt the entire batch. Isolate errors per file, log failures with context, and produce a summary report at the end. The example processes a folder of scanned documents containing an invoice, a purchase order, and a service contract, plus one intentionally corrupted file to trigger the error path.

Input

A folder of PDFs passed to Directory.GetFiles: an invoice, a purchase order, a service contract, and one intentionally corrupted file.

using IronOcr;
using IronOcr.Exceptions;

var ocr = new IronTesseract();
Installation.LogFilePath = "batch_debug.log";
Installation.LoggingMode = Installation.LoggingModes.File;

string[] files = Directory.GetFiles("scans/", "*.pdf");
int succeeded = 0, failed = 0;
double totalConfidence = 0;
var failures = new List<(string File, string Error)>();

foreach (string file in files)
{
    try
    {
        using var input = new OcrInput();
        input.LoadPdf(file);

        OcrResult result = ocr.Read(input);
        totalConfidence += result.Confidence;
        succeeded++;

        Console.WriteLine($"OK: {Path.GetFileName(file)} — {result.Confidence:P1}");
    }
    catch (IronOcrInputException ex)
    {
        failed++;
        failures.Add((file, $"Input error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.Message}");
    }
    catch (IronOcrProductException ex)
    {
        failed++;
        failures.Add((file, $"Engine error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.Message}");
    }
    catch (Exception ex)
    {
        failed++;
        failures.Add((file, $"Unexpected: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} — {ex.GetType().Name}: {ex.Message}");
    }
}

// Summary report
Console.WriteLine($"\n--- Batch Summary ---");
Console.WriteLine($"Total: {files.Length} | Passed: {succeeded} | Failed: {failed}");
if (succeeded > 0)
    Console.WriteLine($"Average confidence: {totalConfidence / succeeded:P1}");

foreach (var (f, err) in failures)
    Console.WriteLine($"  {Path.GetFileName(f)}: {err}");

Output

Terminal output showing batch pipeline results with per-file character counts, confidence scores, one error from a corrupted PDF, and a summary line

The final catch (Exception) block handles unforeseen errors such as network timeouts on shared storage, permission issues, or out-of-memory conditions on large scans. Each failure records the file path and error message for the summary, while the loop continues processing the remaining files. The log file at batch_debug.log captures engine-level detail for any file that triggers internal diagnostics.

For non-blocking execution in services or web applications, IronOCR supports ReadAsync, which uses the same try-catch structure.
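
The per-file isolation carries over to asynchronous code unchanged: the try-catch wraps the awaited call instead of the blocking one. The sketch below shows only the shape of that loop. AsyncBatch is a hypothetical helper, and the readAsync delegate is a stand-in for whatever asynchronous OCR call you use, so no particular ReadAsync signature is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncBatch
{
    // Awaits readAsync for each file in turn; one failure never stops the loop.
    // readAsync is a placeholder for the caller's asynchronous OCR call.
    public static async Task<(int Succeeded, List<(string File, string Error)> Failures)> RunAsync(
        IEnumerable<string> files,
        Func<string, Task> readAsync)
    {
        int succeeded = 0;
        var failures = new List<(string File, string Error)>();

        foreach (string file in files)
        {
            try
            {
                // An exception from the awaited task surfaces here, per file.
                await readAsync(file);
                succeeded++;
            }
            catch (Exception ex)
            {
                failures.Add((file, $"{ex.GetType().Name}: {ex.Message}"));
            }
        }

        return (succeeded, failures);
    }
}
```

Files are awaited one at a time here to keep the example close to the synchronous loop. Running them concurrently is a separate decision that depends on memory and CPU headroom.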

If the pipeline runs without errors but the extracted text is still wrong, the root cause is almost always image quality rather than code. Here is how to address that.

How Do I Debug OCR Accuracy?

If confidence scores are consistently low, the issue is usually the source image rather than the OCR engine. IronOCR provides preprocessing tools, such as the image quality filters and orientation corrections mentioned earlier, to address this.

For deployment-specific issues, IronOCR maintains dedicated troubleshooting guides for Azure Functions, Docker and Linux, and general environment setup.

Where Should I Go Next?

Now that you understand how to debug IronOCR at runtime, explore the confidence levels how-to and the troubleshooting guides referenced above.

For production use, remember to obtain a license to remove watermarks and access full functionality.

Curtis Chau
Technical Writer
