How to Debug OCR in C#
IronOCR enables you to detect OCR failures at the source, assess recognition quality at the word and character level, and monitor long-running jobs in real time. Built-in tools such as diagnostic file logging, a typed exception hierarchy, per-result confidence scoring, and the OcrProgress event support these workflows in production pipelines.
This guide walks through working examples for each: enabling diagnostic logging, handling typed exceptions, validating output with confidence scores, monitoring job progress in real time, and isolating errors in batch pipelines.
Quickstart: Enable full OCR diagnostic logging
Set LogFilePath and LoggingMode on the Installation class before the first Read call. Two properties are all it takes to capture Tesseract initialization, language pack loading, and processing details to a log file.
- Install IronOCR with NuGet Package Manager:
PM > Install-Package IronOcr
- Copy and run this code snippet:
IronOcr.Installation.LogFilePath = "ocr.log";
IronOcr.Installation.LoggingMode = IronOcr.Installation.LoggingModes.All;
- Deploy to test on your live environment
Minimal Workflow (5 steps)
- Download a C# library for debugging OCR
- Set LogFilePath to a writable file path
- Set LoggingMode to All for full diagnostic capture
- Run your OCR operation and reproduce the issue
- Inspect the generated log file for engine warnings and processing details
How Do I Enable Diagnostic Logging?
The Installation class exposes three logging controls. Set these before calling any Read method.
:path=/static-assets/ocr/content-code-examples/how-to/debugging-enable-logging.cs
using IronOcr;
// Write logs to a specific file
Installation.LogFilePath = "logs/ocr_diagnostics.log";
// Enable all logging channels: file + debug output
Installation.LoggingMode = Installation.LoggingModes.All;
// Or pipe logs into your existing ILogger pipeline
Installation.CustomLogger = myLoggerInstance;
Imports IronOcr
' Write logs to a specific file
Installation.LogFilePath = "logs/ocr_diagnostics.log"
' Enable all logging channels: file + debug output
Installation.LoggingMode = Installation.LoggingModes.All
' Or pipe logs into your existing ILogger pipeline
Installation.CustomLogger = myLoggerInstance
LoggingMode accepts flag values from the LoggingModes enum:
| Mode | Output Target | Use Case |
|---|---|---|
| None | Disabled | Production with external monitoring |
| Debug | IDE debug output window | Local development |
| File | LogFilePath | Server-side log collection |
| All | Debug + File | Full diagnostic capture |
The CustomLogger property supports any Microsoft.Extensions.Logging.ILogger implementation, allowing you to direct OCR diagnostics to Serilog, NLog, or other structured logging sinks in your pipeline. Use ClearLogFiles to remove accumulated log data between runs.
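Because LogFilePath must point at a writable location, it can help to resolve the path and create its parent folder before assigning it. The helper below is a minimal sketch in plain .NET; `OcrLogSetup` and `PrepareLogPath` are hypothetical names, and it assumes nothing about whether IronOCR would create a missing folder on its own.

```csharp
using System;
using System.IO;

public static class OcrLogSetup
{
    // Resolves the log path to an absolute path and creates the parent
    // directory when it does not exist yet. Hypothetical helper, not part of IronOCR.
    public static string PrepareLogPath(string logPath)
    {
        string fullPath = Path.GetFullPath(logPath);
        var directory = Path.GetDirectoryName(fullPath);
        if (!string.IsNullOrEmpty(directory))
        {
            // CreateDirectory is a no-op when the directory already exists.
            Directory.CreateDirectory(directory);
        }
        return fullPath;
    }
}
```

The returned value can then be assigned as in the example above, for instance `Installation.LogFilePath = OcrLogSetup.PrepareLogPath("logs/ocr_diagnostics.log");`.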
With logging in place, the next step is understanding which exceptions IronOCR can throw and how to handle each one.
What Exceptions Does IronOCR Throw?
IronOCR defines typed exceptions under the IronOcr.Exceptions namespace. Catching these specifically, rather than a blanket catch block, lets you route each failure type to the correct remediation path.
| Exception | Common Cause | Fix |
|---|---|---|
| IronOcrInputException | Corrupt or unsupported image/PDF | Validate file before loading into OcrInput |
| IronOcrProductException | Internal engine error during OCR execution | Enable logging, check log output, update to latest NuGet version |
| IronOcrDictionaryException | Missing or corrupt .traineddata language file | Reinstall the language pack NuGet or set LanguagePackDirectory |
| IronOcrNativeException | Native C++ interop failure | Install Visual C++ Redistributable; check AVX support |
| IronOcrLicensingException | Missing or expired license key | Set LicenseKey before calling Read |
| LanguagePackException | Language pack not found at expected path | Verify LanguagePackDirectory or reinstall the NuGet language package |
| IronOcrAssemblyVersionMismatchException | Mismatched assembly versions after partial update | Clear NuGet cache, restore packages, ensure all IronOCR packages match |
Use the following try-catch block to handle each exception type separately, applying exception filters for conditional logging.
Input
A single-page vendor invoice from IronOCR Solutions to Acme Corporation, loaded via LoadPdf into OcrInput. It includes four line items, tax, and a grand total — enough text variety to give each exception handler a realistic exercise.
invoice_scan.pdf: Vendor invoice (#INV-2024-7829) used to demonstrate each typed exception handler in sequence.
:path=/static-assets/ocr/content-code-examples/how-to/debugging-exception-handling.cs
using System;
using IronOcr;
using IronOcr.Exceptions;
var ocr = new IronTesseract();
try
{
    using var input = new OcrInput();
    input.LoadPdf("invoice_scan.pdf");
    OcrResult result = ocr.Read(input);
    Console.WriteLine($"Text: {result.Text}");
    Console.WriteLine($"Confidence: {result.Confidence:P1}");
}
catch (IronOcrInputException ex)
{
    // File could not be loaded: corrupt, locked, or unsupported format
    Console.Error.WriteLine($"Input error: {ex.Message}");
}
catch (IronOcrDictionaryException ex)
{
    // Language pack missing (common in containerized deployments)
    Console.Error.WriteLine($"Language pack error: {ex.Message}");
}
catch (IronOcrNativeException ex) when (ex.Message.Contains("AVX"))
{
    // CPU does not support AVX instructions
    Console.Error.WriteLine($"Hardware incompatibility: {ex.Message}");
}
catch (IronOcrLicensingException)
{
    Console.Error.WriteLine("License key is missing or invalid.");
}
catch (IronOcrProductException ex)
{
    // Catch-all for other IronOCR engine errors
    Console.Error.WriteLine($"OCR engine error: {ex.Message}");
    Console.Error.WriteLine($"Stack trace: {ex.StackTrace}");
}
Imports IronOcr
Imports IronOcr.Exceptions
Dim ocr As New IronTesseract()
Try
    Using input As New OcrInput()
        input.LoadPdf("invoice_scan.pdf")
        Dim result As OcrResult = ocr.Read(input)
        Console.WriteLine($"Text: {result.Text}")
        Console.WriteLine($"Confidence: {result.Confidence:P1}")
    End Using
Catch ex As IronOcrInputException
    ' File could not be loaded: corrupt, locked, or unsupported format
    Console.Error.WriteLine($"Input error: {ex.Message}")
Catch ex As IronOcrDictionaryException
    ' Language pack missing (common in containerized deployments)
    Console.Error.WriteLine($"Language pack error: {ex.Message}")
Catch ex As IronOcrNativeException When ex.Message.Contains("AVX")
    ' CPU does not support AVX instructions
    Console.Error.WriteLine($"Hardware incompatibility: {ex.Message}")
Catch ex As IronOcrLicensingException
    Console.Error.WriteLine("License key is missing or invalid.")
Catch ex As IronOcrProductException
    ' Catch-all for other IronOCR engine errors
    Console.Error.WriteLine($"OCR engine error: {ex.Message}")
    Console.Error.WriteLine($"Stack trace: {ex.StackTrace}")
End Try
Output
Success Output
The invoice loads cleanly and the engine returns the recognized text alongside a confidence score.
Failed Output
Order catch blocks from most specific to most general. The when clause on IronOcrNativeException filters for AVX-related failures without catching unrelated native errors. Each handler logs the exception message; the catch-all block also captures the stack trace for post-mortem analysis.
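The ordering rule and the effect of the when filter can be seen in isolation with stand-in exception types. The sketch below does not use IronOCR's classes; `EngineException` and `NativeInteropException` are hypothetical, and the inheritance between them is an assumption made only to show how a filtered catch falls through to a more general one.

```csharp
using System;

// Stand-in types for illustration only; these are NOT IronOCR's classes.
public class EngineException : Exception
{
    public EngineException(string message) : base(message) { }
}

public class NativeInteropException : EngineException
{
    public NativeInteropException(string message) : base(message) { }
}

public static class CatchOrderDemo
{
    // Runs the action and reports which handler, if any, caught the failure.
    public static string Route(Action action)
    {
        try
        {
            action();
            return "ok";
        }
        catch (NativeInteropException ex) when (ex.Message.Contains("AVX"))
        {
            // The filter matched, so only AVX-related native failures land here.
            return "hardware";
        }
        catch (EngineException)
        {
            // General handler placed last: it also receives native failures
            // that the filter above declined.
            return "engine";
        }
    }
}
```

Placing the general handler first would make the filtered handler unreachable, which the C# compiler reports as an error.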
Catching the right exception tells you that something went wrong, but not how well the engine performed when it did succeed. For that, use confidence scores.
How Do I Validate OCR Output with Confidence Scores?
Every OcrResult exposes a Confidence property, a value between 0 and 1 representing the engine's statistical certainty averaged across all recognized characters. You can access this at every level of the result hierarchy: document, page, paragraph, word, and character.
Use a threshold-gated pattern to prevent low-quality results from propagating downstream.
Input
A thermal receipt with itemized line items, discounts, totals, and a barcode, loaded via LoadImage. Its narrow width, monospace font, and faint print make it a practical stress test for per-word confidence thresholds.
receipt.png: Thermal receipt scan used to demonstrate threshold-gated confidence validation and per-word accuracy drill-down.
:path=/static-assets/ocr/content-code-examples/how-to/debugging-confidence-scoring.cs
using System;
using System.Linq;
using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("receipt.png");
OcrResult result = ocr.Read(input);
double confidence = result.Confidence;
Console.WriteLine($"Overall confidence: {confidence:P1}");
// Threshold-gated decision.
// ProcessResult, QueueForReview, and LogRejection are placeholders for your own application methods.
if (confidence >= 0.90)
{
    Console.WriteLine("ACCEPT - high confidence, processing result.");
    ProcessResult(result.Text);
}
else if (confidence >= 0.70)
{
    Console.WriteLine("FLAG - moderate confidence, queuing for review.");
    QueueForReview(result.Text, confidence);
}
else
{
    Console.WriteLine("REJECT - low confidence, logging for investigation.");
    LogRejection("receipt.png", confidence);
}
// Drill into per-page and per-word confidence for diagnostics
foreach (var page in result.Pages)
{
    Console.WriteLine($"  Page {page.PageNumber}: {page.Confidence:P1}");
    var lowConfidenceWords = page.Words
        .Where(w => w.Confidence < 0.70)
        .ToList();
    foreach (var word in lowConfidenceWords)
    {
        Console.WriteLine($"    Low-confidence word: \"{word.Text}\" ({word.Confidence:P1})");
    }
}
Imports System.Linq
Imports IronOcr
Dim ocr As New IronTesseract()
Using input As New OcrInput()
    input.LoadImage("receipt.png")
    Dim result As OcrResult = ocr.Read(input)
    Dim confidence As Double = result.Confidence
    Console.WriteLine($"Overall confidence: {confidence:P1}")
    ' Threshold-gated decision.
    ' ProcessResult, QueueForReview, and LogRejection are placeholders for your own application methods.
    If confidence >= 0.9 Then
        Console.WriteLine("ACCEPT - high confidence, processing result.")
        ProcessResult(result.Text)
    ElseIf confidence >= 0.7 Then
        Console.WriteLine("FLAG - moderate confidence, queuing for review.")
        QueueForReview(result.Text, confidence)
    Else
        Console.WriteLine("REJECT - low confidence, logging for investigation.")
        LogRejection("receipt.png", confidence)
    End If
    ' Drill into per-page and per-word confidence for diagnostics
    For Each page In result.Pages
        Console.WriteLine($"  Page {page.PageNumber}: {page.Confidence:P1}")
        Dim lowConfidenceWords = page.Words _
            .Where(Function(w) w.Confidence < 0.7) _
            .ToList()
        For Each word In lowConfidenceWords
            Console.WriteLine($"    Low-confidence word: ""{word.Text}"" ({word.Confidence:P1})")
        Next
    Next
End Using
Output
This pattern is essential in pipelines where OCR feeds into data entry, invoice processing, or compliance workflows. The per-word drill-down identifies exactly which regions of the source image caused degradation; you can then apply image quality filters or orientation corrections and re-process. For a deeper look at confidence scoring, see the confidence levels how-to.
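If several pipelines share the same gate, the accept/review/reject decision can be pulled out into a small pure function so the thresholds live in one place and can be unit tested without running OCR. This is a sketch only; `OcrDecision` and `ConfidenceGate` are hypothetical names, and the defaults simply mirror the 0.90 and 0.70 cut-offs used in the example above, on the assumption that confidence is reported on a 0 to 1 scale.

```csharp
using System;

public enum OcrDecision { Accept, Review, Reject }

public static class ConfidenceGate
{
    // Maps a confidence value (assumed to be in the 0..1 range) to a routing decision.
    // Hypothetical helper; thresholds are parameters so each pipeline can tune them.
    public static OcrDecision Classify(double confidence, double acceptAt = 0.90, double reviewAt = 0.70)
    {
        if (reviewAt > acceptAt)
        {
            throw new ArgumentException("reviewAt must not exceed acceptAt.");
        }
        if (confidence >= acceptAt) return OcrDecision.Accept;
        if (confidence >= reviewAt) return OcrDecision.Review;
        // Anything below reviewAt, including NaN, is rejected.
        return OcrDecision.Reject;
    }
}
```

The same function can be applied at word level, for example to count how many words on a page would be rejected under a given threshold.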
For long-running jobs, confidence alone is not enough. You also need to know whether the engine is still making progress, and that is where the OcrProgress event comes in.
How Do I Monitor OCR Progress in Real Time?
For multi-page documents, the OcrProgress event on IronTesseract fires after each page completes. The OcrProgressEventArgs object exposes progress percent, elapsed duration, total pages, and pages complete. The example uses this three-page quarterly report as input: a structured business document spanning an executive summary, revenue breakdown, and operational metrics.
Input
A three-page Q1 2024 financial report loaded via LoadPdf. Page one covers the executive summary with KPI metrics, page two contains revenue tables by product line and region, and page three covers operational processing volumes — each page type produces distinct per-page timing you can observe in the progress callbacks.
quarterly_report.pdf: Three-page Q1 2024 financial report (executive summary, revenue breakdown, operational metrics) used to demonstrate real-time OcrProgress callbacks per page.
:path=/static-assets/ocr/content-code-examples/how-to/debugging-progress-monitoring.cs
using System;
using System.Linq;
using IronOcr;
var ocr = new IronTesseract();
ocr.OcrProgress += (sender, e) =>
{
    Console.WriteLine(
        $"[OCR] {e.ProgressPercent}% complete | " +
        $"Page {e.PagesComplete}/{e.TotalPages} | " +
        $"Elapsed: {e.Duration.TotalSeconds:F1}s"
    );
};
using var input = new OcrInput();
input.LoadPdf("quarterly_report.pdf");
OcrResult result = ocr.Read(input);
Console.WriteLine($"Finished: {result.Pages.Count()} pages, confidence: {result.Confidence:P1}");
Imports System.Linq
Imports IronOcr
Dim ocr = New IronTesseract()
AddHandler ocr.OcrProgress, Sub(sender, e)
                                Console.WriteLine(
                                    $"[OCR] {e.ProgressPercent}% complete | " &
                                    $"Page {e.PagesComplete}/{e.TotalPages} | " &
                                    $"Elapsed: {e.Duration.TotalSeconds:F1}s"
                                )
                            End Sub
Using input As New OcrInput()
    input.LoadPdf("quarterly_report.pdf")
    Dim result As OcrResult = ocr.Read(input)
    Console.WriteLine($"Finished: {result.Pages.Count()} pages, confidence: {result.Confidence:P1}")
End Using
Output
Wire this event into your logging infrastructure to track OCR job duration and detect stalls. If the elapsed duration exceeds a threshold without the progress percent advancing, the pipeline can flag the job for investigation. This is particularly useful for batch PDF processing where a single malformed page can stall the entire job.
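One way to sketch the stall check is a small tracker that records when progress last advanced and is polled by a separate watchdog. Because the event fires only after a page completes, a hung page produces no callback, so the check itself has to run from a timer rather than from the handler. `StallDetector` is a hypothetical class, and passing the percent as an int is an assumption about the event argument's type.

```csharp
using System;

// Hypothetical helper: tracks the elapsed time at which progress last advanced.
public sealed class StallDetector
{
    private readonly TimeSpan _maxSilence;
    private int _lastPercent = -1;
    private TimeSpan _lastAdvance = TimeSpan.Zero;

    public StallDetector(TimeSpan maxSilence)
    {
        _maxSilence = maxSilence;
    }

    // Call from the progress handler with the reported percent and elapsed time.
    public void Report(int percent, TimeSpan elapsed)
    {
        if (percent > _lastPercent)
        {
            _lastPercent = percent;
            _lastAdvance = elapsed;
        }
    }

    // Call from a timer or watchdog with the job's current elapsed time.
    // Returns true when no advance has been seen for longer than the allowed silence.
    public bool IsStalled(TimeSpan elapsedNow)
    {
        return elapsedNow - _lastAdvance > _maxSilence;
    }
}
```

In the handler shown earlier this would be fed with something like `detector.Report(e.ProgressPercent, e.Duration)`, while a `System.Threading.Timer` or a `Stopwatch`-driven loop calls `IsStalled` at whatever interval suits the job.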
Progress monitoring shows execution state, but a file-level failure can still stop the entire batch short if not isolated.
How Do I Handle Errors in Batch OCR Pipelines?
In production, a single file failure should not halt the entire batch. Isolate errors per file, log failures with context, and produce a summary report at the end. The example processes a folder of scanned documents containing an invoice, a purchase order, and a service contract, plus one intentionally corrupted file to trigger the error path. Representative samples are shown below:
Input
A folder of PDFs passed to Directory.GetFiles — an invoice, a purchase order, a service contract, and one intentionally corrupted file. The two representative samples below show the document variety the pipeline processes in a single run.
batch-scan-01.pdf: Invoice for Bright Horizon Ltd. (INV-2024-001) — successful OCR pass.
batch-scan-02.pdf: Purchase order for TechSupply Inc. (PO-2024-042) — second document type in the same run.
:path=/static-assets/ocr/content-code-examples/how-to/debugging-batch-pipeline.cs
using System;
using System.Collections.Generic;
using System.IO;
using IronOcr;
using IronOcr.Exceptions;
var ocr = new IronTesseract();
Installation.LogFilePath = "batch_debug.log";
Installation.LoggingMode = Installation.LoggingModes.File;
string[] files = Directory.GetFiles("scans/", "*.pdf");
int succeeded = 0, failed = 0;
double totalConfidence = 0;
var failures = new List<(string File, string Error)>();
foreach (string file in files)
{
    // Each file gets its own try-catch so one failure cannot stop the loop
    try
    {
        using var input = new OcrInput();
        input.LoadPdf(file);
        OcrResult result = ocr.Read(input);
        totalConfidence += result.Confidence;
        succeeded++;
        Console.WriteLine($"OK: {Path.GetFileName(file)} - {result.Confidence:P1}");
    }
    catch (IronOcrInputException ex)
    {
        failed++;
        failures.Add((file, $"Input error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.Message}");
    }
    catch (IronOcrProductException ex)
    {
        failed++;
        failures.Add((file, $"Engine error: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.Message}");
    }
    catch (Exception ex)
    {
        failed++;
        failures.Add((file, $"Unexpected: {ex.Message}"));
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.GetType().Name}: {ex.Message}");
    }
}
// Summary report
Console.WriteLine("\n--- Batch Summary ---");
Console.WriteLine($"Total: {files.Length} | Passed: {succeeded} | Failed: {failed}");
if (succeeded > 0)
    Console.WriteLine($"Average confidence: {totalConfidence / succeeded:P1}");
foreach (var (f, err) in failures)
    Console.WriteLine($"  {Path.GetFileName(f)}: {err}");
Imports System.Collections.Generic
Imports System.IO
Imports IronOcr
Imports IronOcr.Exceptions
Dim ocr As New IronTesseract()
Installation.LogFilePath = "batch_debug.log"
Installation.LoggingMode = Installation.LoggingModes.File
Dim files As String() = Directory.GetFiles("scans/", "*.pdf")
Dim succeeded As Integer = 0
Dim failed As Integer = 0
Dim totalConfidence As Double = 0
' Error is a reserved word in Visual Basic, so the tuple element is named ErrorMessage
Dim failures As New List(Of (File As String, ErrorMessage As String))()
For Each file As String In files
    Try
        Using input As New OcrInput()
            input.LoadPdf(file)
            Dim result As OcrResult = ocr.Read(input)
            totalConfidence += result.Confidence
            succeeded += 1
            Console.WriteLine($"OK: {Path.GetFileName(file)} - {result.Confidence:P1}")
        End Using
    Catch ex As IronOcrInputException
        failed += 1
        failures.Add((file, $"Input error: {ex.Message}"))
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.Message}")
    Catch ex As IronOcrProductException
        failed += 1
        failures.Add((file, $"Engine error: {ex.Message}"))
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.Message}")
    Catch ex As Exception
        failed += 1
        failures.Add((file, $"Unexpected: {ex.Message}"))
        Console.Error.WriteLine($"FAIL: {Path.GetFileName(file)} - {ex.GetType().Name}: {ex.Message}")
    End Try
Next
' Summary report
Console.WriteLine(vbCrLf & "--- Batch Summary ---")
Console.WriteLine($"Total: {files.Length} | Passed: {succeeded} | Failed: {failed}")
If succeeded > 0 Then
    Console.WriteLine($"Average confidence: {totalConfidence / succeeded:P1}")
End If
For Each failure In failures
    Console.WriteLine($"  {Path.GetFileName(failure.File)}: {failure.ErrorMessage}")
Next
Output
The final catch (Exception) block handles unforeseen errors such as network timeouts on shared storage, permission issues, or out-of-memory conditions on large TIFFs. Each failure records the file path and error message for the summary, while the loop continues processing the remaining files. The log file at batch_debug.log captures engine-level detail for any file that triggers internal diagnostics.
For non-blocking execution in services or web applications, IronOCR supports ReadAsync, which uses the same try-catch structure.
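The per-file isolation carries over to asynchronous code largely unchanged, because an exception from an awaited task surfaces at the await and can be caught there. The sketch below shows only that shape: `AsyncBatch` is a hypothetical helper, and the `readAsync` delegate is a stand-in for whatever ReadAsync call your pipeline makes, so no IronOCR signature is assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncBatch
{
    // readAsync is a stand-in for the real OCR call; it returns a confidence value per file.
    public static async Task<(int Succeeded, List<(string File, string Error)> Failures)> RunAsync(
        IEnumerable<string> files,
        Func<string, Task<double>> readAsync)
    {
        int succeeded = 0;
        var failures = new List<(string File, string Error)>();
        foreach (string file in files)
        {
            try
            {
                // A faulted task rethrows its exception here, at the await.
                await readAsync(file);
                succeeded++;
            }
            catch (Exception ex)
            {
                // Record the failure and keep going with the remaining files.
                failures.Add((file, $"{ex.GetType().Name}: {ex.Message}"));
            }
        }
        return (succeeded, failures);
    }
}
```

Files are awaited one at a time here to keep the sketch simple; the typed catch blocks from the synchronous example can replace the single general handler.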
If the pipeline runs without errors but the extracted text is still wrong, the root cause is almost always image quality rather than code. Here is how to address that.
How Do I Debug OCR Accuracy?
If confidence scores are consistently low, the issue is usually the source image rather than the OCR engine. IronOCR provides preprocessing tools to address this:
- Apply image quality filters such as sharpen, denoise, dilate, and erode to improve text clarity
- Use orientation correction to automatically deskew and rotate scanned documents
- Adjust the DPI setting for low-resolution images before processing
- Use computer vision to detect and isolate text regions in complex layouts
- The IronOCR Utility lets you visually test filter combinations and export the optimal C# configuration
For deployment-specific issues, IronOCR maintains dedicated troubleshooting guides for Azure Functions, Docker and Linux, and general environment setup.
Where Should I Go Next?
Now that you understand how to debug IronOCR at runtime, explore:
- Navigating OCR result structure and metadata including pages, blocks, paragraphs, words, and coordinates
- Understanding confidence scoring at every level of the result hierarchy
- Using async and multithreading with ReadAsync for high-throughput pipelines
- Browsing the full API reference for the full property list
For production use, remember to obtain a license to remove watermarks and access full functionality.
Frequently Asked Questions
What are the common issues when debugging OCR in C#?
Common issues include incorrect OCR results, low confidence scores, and unexpected exceptions. IronOCR provides tools like logging and confidence scoring to help identify and resolve these issues.
How does IronOCR assist with error handling in C#?
IronOCR offers typed exceptions and detailed error messages, which help you understand and handle errors during OCR operations in C# applications.
What logging features does IronOCR provide for debugging?
IronOCR includes built-in logging capabilities that help track OCR processes and identify potential issues by logging detailed information about the OCR operations.
How can confidence scoring improve OCR results?
Confidence scoring in IronOCR helps determine the accuracy of the recognized text, allowing developers to focus on low-confidence areas and improve the OCR results.
Can I track the progress of OCR tasks using IronOCR?
Yes, IronOCR provides progress tracking features that enable developers to monitor the status and duration of OCR tasks, facilitating better resource management and performance optimization.
What try-catch patterns are recommended for OCR error handling?
Catch IronOCR's typed exceptions in order from most specific to most general, and in batch jobs wrap each file in its own try-catch so that one failure does not halt the run.
How can IronOCR's built-in tools enhance OCR debugging?
IronOCR's tools, such as logging, typed exceptions, and confidence scoring, provide comprehensive support for identifying and resolving issues, thus enhancing the debugging process.
Why is error logging important in OCR applications?
Error logging is crucial as it provides insight into what went wrong during OCR processing, allowing developers to quickly diagnose and fix issues in their applications.
What role do typed exceptions play in debugging IronOCR?
Typed exceptions in IronOCR provide specific error information, making it easier for developers to understand the nature of the problem and apply appropriate solutions during debugging.
How can developers benefit from IronOCR's debugging features?
Developers can leverage IronOCR's debugging features to efficiently troubleshoot issues, enhance application stability, and improve the overall quality of the OCR results.

