Migrating from Aspose.OCR to IronOCR
This guide walks .NET developers through a complete migration from Aspose.OCR to IronOCR. It covers the package swap, namespace changes, license initialization, and four concrete code migration examples drawn from real Aspose.OCR usage patterns — recognition settings configuration, area detection modes, batch recognition with confidence-based filtering, and structured output handling. Each example shows the Aspose.OCR approach alongside the IronOCR equivalent so you can translate your existing code without guesswork.
Why Migrate from Aspose.OCR
The reasons teams leave Aspose.OCR cluster around two pressure points: the subscription billing model and the configuration burden imposed by the manual recognition pipeline.
Subscription costs compound without a ceiling. Aspose.OCR has no perpetual license tier. The Developer Small Business license is $999 per developer per year. A five-person team renewing for three years pays $14,985 before writing a single line of business logic. IronOCR's Professional tier is $2,999 once — the same coverage, forever, with no renewal obligation. The math becomes inescapable when finance asks why an OCR dependency renews annually like a SaaS subscription.
Every recognition call requires a ceremony of settings objects. Aspose.OCR separates its configuration surface across RecognitionSettings, DocumentRecognitionSettings, DetectAreasMode, and the PreprocessingFilter collection. Before you can call RecognizeImage, you construct a settings object, populate it, and pass it explicitly. IronOCR collapses that into .Read(). The difference is small per call; it accumulates across a codebase.
Area detection mode selection is manual and consequential. Aspose.OCR exposes DetectAreasMode values (COMBINE, DOCUMENT, TABLE, NONE) that the developer must choose for each document type. Wrong mode on a structured form drops recognition accuracy. IronOCR analyzes document layout automatically and exposes the structured result — paragraphs, lines, words — without requiring upfront mode declaration.
Batch processing requires managing result lists by hand. Saving a batch of recognized pages as a searchable PDF or structured data file in Aspose.OCR means accumulating RecognitionResult objects into a List<RecognitionResult>, then passing that list to SaveMultipageDocument. The list threading is your responsibility. IronOCR accepts multiple inputs through a single OcrInput object and produces one OcrResult covering all pages.
Output format switching touches multiple API surfaces. Exporting to JSON, XML, or plain text in Aspose.OCR each requires a separate SaveFormat enum value passed to SaveMultipageDocument. Filtering those results by confidence before saving requires iterating the list and inspecting each RecognitionAreasConfidence array. IronOCR exposes result.Text, result.Confidence, and result.Pages on a single result object — confidence filtering is a one-line LINQ expression.
The IronOCR licensing model eliminates renewal risk entirely. A purchased license is yours permanently. Updates are included for one year; after that, the last received version continues to work in production without compliance exposure. There is no scenario where a missed payment breaks your deployment.
The Fundamental Problem
Aspose.OCR ties recognition configuration to a settings object that must be constructed, populated, and passed on every call. The mode, language, filters, and area strategy are all properties of that object:
// Aspose.OCR: build a settings object for every recognition call
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT, // must choose the right mode
RecognizeSingleLine = false,
AutoSkew = true
};
var result = api.RecognizeImage("form.jpg", settings);
string text = result.RecognitionText;
// Aspose.OCR: build a settings object for every recognition call
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT, // must choose the right mode
RecognizeSingleLine = false,
AutoSkew = true
};
var result = api.RecognizeImage("form.jpg", settings);
string text = result.RecognitionText;
Imports Aspose.OCR
' Aspose.OCR: build a settings object for every recognition call
Dim api As New AsposeOcr()
Dim settings As New RecognitionSettings With {
.Language = Language.Eng,
.DetectAreasMode = DetectAreasMode.DOCUMENT, ' must choose the right mode
.RecognizeSingleLine = False,
.AutoSkew = True
}
Dim result = api.RecognizeImage("form.jpg", settings)
Dim text As String = result.RecognitionText
IronOCR uses a single .Read() call. Configuration lives on the IronTesseract instance when needed, not on a per-call object:
// IronOCR: one call, no settings object required
var text = new IronTesseract().Read("form.jpg").Text;
// IronOCR: one call, no settings object required
var text = new IronTesseract().Read("form.jpg").Text;
Imports IronOcr
Dim text As String = New IronTesseract().Read("form.jpg").Text
IronOCR vs Aspose.OCR: Feature Comparison
The table below maps the two libraries across the dimensions that matter most during a migration decision.
| Feature | Aspose.OCR | IronOCR |
|---|---|---|
| License model | Annual subscription (no perpetual option) | Perpetual one-time purchase |
| 1-developer cost | $999/year | $999 once |
| 10-developer cost | $4,995/year (Site license) | $2,999 once (Professional) |
| License expiry consequence | Cannot deploy new builds, no security patches | None — purchased version works indefinitely |
| Primary OCR class | AsposeOcr |
IronTesseract |
| Settings object required | Yes (RecognitionSettings or DocumentRecognitionSettings) |
No — optional OcrInput for advanced scenarios |
| Area detection | Manual DetectAreasMode enum selection |
Automatic layout analysis |
| Preprocessing | Manual PreprocessingFilter collection |
Automatic with optional explicit override |
| PDF input | Standard PDFs via RecognizePdf() |
Native via .Read() or OcrInput.LoadPdf() |
| Password-protected PDF | Requires Aspose.PDF (separate license) | Built-in Password: parameter |
| Searchable PDF output | SaveMultipageDocument(path, SaveFormat.Pdf, list) |
result.SaveAsSearchablePdf(path) |
| Confidence value | result.RecognitionAreasConfidence.Average() (array) |
result.Confidence (single double, 0–100) |
| Word-level structured data | Area-level geometry via RecognitionAreasRectangles |
result.Words with X, Y, Width, Height, Confidence |
| Page-level structured data | Not exposed | result.Pages with paragraphs, lines, words, characters |
| Multi-language simultaneous | Single language per call | OcrLanguage.French + OcrLanguage.German |
| Languages included | 130+ in main package | 125+ via NuGet language packs |
| Barcode reading | Not available | Built-in (ocr.Configuration.ReadBarCodes = true) |
| Thread safety | New instance per thread recommended | Fully thread-safe, single shared instance |
| TIFF multi-frame | Not natively | input.LoadImageFrames("file.tiff") |
| hOCR export | Limited | result.SaveAsHocrFile(path) |
| Cross-platform NuGet | Yes | Yes (Windows, Linux, macOS, Docker, Azure, AWS) |
| NuGet package count | 1 main + optional language packs | 1 main + optional language packs |
Quick Start: Aspose.OCR to IronOCR Migration
Step 1: Replace NuGet Package
Remove Aspose.OCR:
dotnet remove package Aspose.OCR
dotnet remove package Aspose.OCR
Install IronOCR from NuGet:
dotnet add package IronOcr
Step 2: Update Namespaces
Replace all Aspose.OCR namespace imports:
// Before (Aspose.OCR)
using Aspose.OCR;
using Aspose.OCR.Models;
using Aspose.OCR.Models.PreprocessingFilters;
// After (IronOCR)
using IronOcr;
// Before (Aspose.OCR)
using Aspose.OCR;
using Aspose.OCR.Models;
using Aspose.OCR.Models.PreprocessingFilters;
// After (IronOCR)
using IronOcr;
Imports IronOcr
Step 3: Initialize License
Remove the Aspose file-based license call and replace it with the IronOCR string key. Place this at application startup — once per process, not once per request:
// Remove Aspose license setup
// var license = new Aspose.OCR.License();
// license.SetLicense("Aspose.OCR.lic");
// Add IronOCR license at startup
IronOcr.License.LicenseKey = "YOUR-LICENSE-KEY";
// Production pattern: read from environment variable
IronOcr.License.LicenseKey = Environment.GetEnvironmentVariable("IRONOCR_LICENSE");
// Remove Aspose license setup
// var license = new Aspose.OCR.License();
// license.SetLicense("Aspose.OCR.lic");
// Add IronOCR license at startup
IronOcr.License.LicenseKey = "YOUR-LICENSE-KEY";
// Production pattern: read from environment variable
IronOcr.License.LicenseKey = Environment.GetEnvironmentVariable("IRONOCR_LICENSE");
' Remove Aspose license setup
' Dim license As New Aspose.OCR.License()
' license.SetLicense("Aspose.OCR.lic")
' Add IronOCR license at startup
IronOcr.License.LicenseKey = "YOUR-LICENSE-KEY"
' Production pattern: read from environment variable
IronOcr.License.LicenseKey = Environment.GetEnvironmentVariable("IRONOCR_LICENSE")
Code Migration Examples
Recognition Settings Configuration
Aspose.OCR centralizes all recognition behavior in a RecognitionSettings object. Language, area detection mode, single-line flag, and threshold all live there as properties. You construct it fresh for each document type or call pattern.
Aspose.OCR Approach:
// Configuring recognition settings for a structured form
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT,
RecognizeSingleLine = false,
AutoSkew = true,
RecognitionAreas = new List<Rectangle>
{
new Rectangle(0, 0, 800, 100) // header zone
}
};
var result = api.RecognizeImage("structured-form.jpg", settings);
Console.WriteLine(result.RecognitionText);
// Configuring recognition settings for a structured form
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT,
RecognizeSingleLine = false,
AutoSkew = true,
RecognitionAreas = new List<Rectangle>
{
new Rectangle(0, 0, 800, 100) // header zone
}
};
var result = api.RecognizeImage("structured-form.jpg", settings);
Console.WriteLine(result.RecognitionText);
Imports System
Imports System.Collections.Generic
Imports AsposeOcr
' Configuring recognition settings for a structured form
Dim api As New AsposeOcr()
Dim settings As New RecognitionSettings With {
.Language = Language.Eng,
.DetectAreasMode = DetectAreasMode.DOCUMENT,
.RecognizeSingleLine = False,
.AutoSkew = True,
.RecognitionAreas = New List(Of Rectangle) From {
New Rectangle(0, 0, 800, 100) ' header zone
}
}
Dim result = api.RecognizeImage("structured-form.jpg", settings)
Console.WriteLine(result.RecognitionText)
IronOCR Approach:
// Recognition behavior configured once on the IronTesseract instance
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.English;
// Region targeting replaces RecognitionAreas in RecognitionSettings
var headerRegion = new CropRectangle(0, 0, 800, 100);
using var input = new OcrInput();
input.LoadImage("structured-form.jpg", headerRegion);
var result = ocr.Read(input);
Console.WriteLine(result.Text);
// Recognition behavior configured once on the IronTesseract instance
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.English;
// Region targeting replaces RecognitionAreas in RecognitionSettings
var headerRegion = new CropRectangle(0, 0, 800, 100);
using var input = new OcrInput();
input.LoadImage("structured-form.jpg", headerRegion);
var result = ocr.Read(input);
Console.WriteLine(result.Text);
Imports IronOcr
' Recognition behavior configured once on the IronTesseract instance
Dim ocr As New IronTesseract()
ocr.Language = OcrLanguage.English
' Region targeting replaces RecognitionAreas in RecognitionSettings
Dim headerRegion As New CropRectangle(0, 0, 800, 100)
Using input As New OcrInput()
input.LoadImage("structured-form.jpg", headerRegion)
Dim result = ocr.Read(input)
Console.WriteLine(result.Text)
End Using
There is no settings object to build per call. Language goes on the IronTesseract instance; region targeting goes on OcrInput at load time. The DetectAreasMode and RecognizeSingleLine decisions are handled by the engine automatically. For detailed guidance on region-based OCR, see the region-based OCR how-to guide.
Area Detection Mode Migration
Aspose.OCR requires you to pick a DetectAreasMode value before each recognition call. COMBINE merges text from different layout regions, DOCUMENT treats the image as a standard document, TABLE optimizes for grid layouts. Picking the wrong mode for the document type causes misaligned or missing output.
Aspose.OCR Approach:
// Three separate calls with different modes for different document types
var api = new AsposeOcr();
// For a document with mixed prose and table content
var docSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.COMBINE,
Language = Language.Eng
};
// For a pure tabular document
var tableSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.TABLE,
Language = Language.Eng
};
// For a single-column text document
var linearSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.DOCUMENT,
Language = Language.Eng
};
string mixedResult = api.RecognizeImage("mixed-layout.jpg", docSettings).RecognitionText;
string tableResult = api.RecognizeImage("data-table.jpg", tableSettings).RecognitionText;
string linearResult = api.RecognizeImage("text-document.jpg", linearSettings).RecognitionText;
// Three separate calls with different modes for different document types
var api = new AsposeOcr();
// For a document with mixed prose and table content
var docSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.COMBINE,
Language = Language.Eng
};
// For a pure tabular document
var tableSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.TABLE,
Language = Language.Eng
};
// For a single-column text document
var linearSettings = new RecognitionSettings
{
DetectAreasMode = DetectAreasMode.DOCUMENT,
Language = Language.Eng
};
string mixedResult = api.RecognizeImage("mixed-layout.jpg", docSettings).RecognitionText;
string tableResult = api.RecognizeImage("data-table.jpg", tableSettings).RecognitionText;
string linearResult = api.RecognizeImage("text-document.jpg", linearSettings).RecognitionText;
Imports AsposeOcr
' Three separate calls with different modes for different document types
Dim api As New AsposeOcr()
' For a document with mixed prose and table content
Dim docSettings As New RecognitionSettings With {
.DetectAreasMode = DetectAreasMode.COMBINE,
.Language = Language.Eng
}
' For a pure tabular document
Dim tableSettings As New RecognitionSettings With {
.DetectAreasMode = DetectAreasMode.TABLE,
.Language = Language.Eng
}
' For a single-column text document
Dim linearSettings As New RecognitionSettings With {
.DetectAreasMode = DetectAreasMode.DOCUMENT,
.Language = Language.Eng
}
Dim mixedResult As String = api.RecognizeImage("mixed-layout.jpg", docSettings).RecognitionText
Dim tableResult As String = api.RecognizeImage("data-table.jpg", tableSettings).RecognitionText
Dim linearResult As String = api.RecognizeImage("text-document.jpg", linearSettings).RecognitionText
IronOCR Approach:
// Single API surface handles all layout types automatically
var ocr = new IronTesseract();
// Same code path for every document type
var mixedResult = ocr.Read("mixed-layout.jpg").Text;
var tableResult = ocr.Read("data-table.jpg").Text;
var linearResult = ocr.Read("text-document.jpg").Text;
// For table documents, structured data is immediately available
var result = ocr.Read("data-table.jpg");
foreach (var page in result.Pages)
{
foreach (var paragraph in page.Paragraphs)
{
Console.WriteLine($"Block at ({paragraph.X}, {paragraph.Y}): {paragraph.Text}");
}
}
// Single API surface handles all layout types automatically
var ocr = new IronTesseract();
// Same code path for every document type
var mixedResult = ocr.Read("mixed-layout.jpg").Text;
var tableResult = ocr.Read("data-table.jpg").Text;
var linearResult = ocr.Read("text-document.jpg").Text;
// For table documents, structured data is immediately available
var result = ocr.Read("data-table.jpg");
foreach (var page in result.Pages)
{
foreach (var paragraph in page.Paragraphs)
{
Console.WriteLine($"Block at ({paragraph.X}, {paragraph.Y}): {paragraph.Text}");
}
}
Imports System
' Single API surface handles all layout types automatically
Dim ocr As New IronTesseract()
' Same code path for every document type
Dim mixedResult As String = ocr.Read("mixed-layout.jpg").Text
Dim tableResult As String = ocr.Read("data-table.jpg").Text
Dim linearResult As String = ocr.Read("text-document.jpg").Text
' For table documents, structured data is immediately available
Dim result = ocr.Read("data-table.jpg")
For Each page In result.Pages
For Each paragraph In page.Paragraphs
Console.WriteLine($"Block at ({paragraph.X}, {paragraph.Y}): {paragraph.Text}")
Next
Next
IronOCR eliminates the mode selection decision entirely. The engine analyzes layout and exposes the result through the Pages, Paragraphs, Lines, and Words hierarchy. The read results guide covers how to navigate the full structured result model for document layout parsing. For table-specific extraction patterns, see the table reading how-to.
Batch Recognition with Confidence-Based Filtering
Aspose.OCR batch workflows accumulate RecognitionResult objects in a list. Confidence filtering requires iterating that list and computing the per-region average before accepting a result. The list must be managed explicitly and passed to SaveMultipageDocument if you want to output multiple pages as a single file.
Aspose.OCR Approach:
// Batch recognition with confidence filtering before output
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT
};
string[] documentPaths = Directory.GetFiles("invoice-archive", "*.jpg");
var acceptedResults = new List<RecognitionResult>();
var rejectedPaths = new List<string>();
foreach (var path in documentPaths)
{
var result = api.RecognizeImage(path, settings);
// Confidence is an array of per-region values — must average manually
float avgConfidence = result.RecognitionAreasConfidence != null
? result.RecognitionAreasConfidence.Average()
: 0f;
if (avgConfidence >= 0.70f)
{
acceptedResults.Add(result);
}
else
{
rejectedPaths.Add(path);
Console.WriteLine($"Rejected: {path} ({avgConfidence:P0} confidence)");
}
}
// Save accepted pages as a single searchable PDF
if (acceptedResults.Any())
{
api.SaveMultipageDocument("high-confidence-invoices.pdf",
SaveFormat.Pdf, acceptedResults);
}
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}");
// Batch recognition with confidence filtering before output
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.DOCUMENT
};
string[] documentPaths = Directory.GetFiles("invoice-archive", "*.jpg");
var acceptedResults = new List<RecognitionResult>();
var rejectedPaths = new List<string>();
foreach (var path in documentPaths)
{
var result = api.RecognizeImage(path, settings);
// Confidence is an array of per-region values — must average manually
float avgConfidence = result.RecognitionAreasConfidence != null
? result.RecognitionAreasConfidence.Average()
: 0f;
if (avgConfidence >= 0.70f)
{
acceptedResults.Add(result);
}
else
{
rejectedPaths.Add(path);
Console.WriteLine($"Rejected: {path} ({avgConfidence:P0} confidence)");
}
}
// Save accepted pages as a single searchable PDF
if (acceptedResults.Any())
{
api.SaveMultipageDocument("high-confidence-invoices.pdf",
SaveFormat.Pdf, acceptedResults);
}
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}");
Imports System
Imports System.IO
Imports System.Linq
Imports Aspose.OCR
' Batch recognition with confidence filtering before output
Dim api As New AsposeOcr()
Dim settings As New RecognitionSettings With {
.Language = Language.Eng,
.DetectAreasMode = DetectAreasMode.DOCUMENT
}
Dim documentPaths As String() = Directory.GetFiles("invoice-archive", "*.jpg")
Dim acceptedResults As New List(Of RecognitionResult)()
Dim rejectedPaths As New List(Of String)()
For Each path In documentPaths
Dim result = api.RecognizeImage(path, settings)
' Confidence is an array of per-region values — must average manually
Dim avgConfidence As Single = If(result.RecognitionAreasConfidence IsNot Nothing,
result.RecognitionAreasConfidence.Average(),
0.0F)
If avgConfidence >= 0.70F Then
acceptedResults.Add(result)
Else
rejectedPaths.Add(path)
Console.WriteLine($"Rejected: {path} ({avgConfidence:P0} confidence)")
End If
Next
' Save accepted pages as a single searchable PDF
If acceptedResults.Any() Then
api.SaveMultipageDocument("high-confidence-invoices.pdf",
SaveFormat.Pdf, acceptedResults)
End If
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}")
IronOCR Approach:
// Batch recognition with confidence filtering using unified result model
var ocr = new IronTesseract();
string[] documentPaths = Directory.GetFiles("invoice-archive", "*.jpg");
var acceptedResults = new List<OcrResult>();
var rejectedPaths = new List<string>();
foreach (var path in documentPaths)
{
var result = ocr.Read(path);
// Single confidence value — no averaging required
if (result.Confidence >= 70.0)
{
acceptedResults.Add(result);
}
else
{
rejectedPaths.Add(path);
Console.WriteLine($"Rejected: {path} ({result.Confidence:F1}% confidence)");
}
}
// Save accepted pages — each result becomes a page in the output PDF
if (acceptedResults.Any())
{
using var outputInput = new OcrInput();
foreach (var path in documentPaths
.Where(p => !rejectedPaths.Contains(p)))
{
outputInput.LoadImage(path);
}
var combined = ocr.Read(outputInput);
combined.SaveAsSearchablePdf("high-confidence-invoices.pdf");
}
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}");
// Batch recognition with confidence filtering using unified result model
var ocr = new IronTesseract();
string[] documentPaths = Directory.GetFiles("invoice-archive", "*.jpg");
var acceptedResults = new List<OcrResult>();
var rejectedPaths = new List<string>();
foreach (var path in documentPaths)
{
var result = ocr.Read(path);
// Single confidence value — no averaging required
if (result.Confidence >= 70.0)
{
acceptedResults.Add(result);
}
else
{
rejectedPaths.Add(path);
Console.WriteLine($"Rejected: {path} ({result.Confidence:F1}% confidence)");
}
}
// Save accepted pages — each result becomes a page in the output PDF
if (acceptedResults.Any())
{
using var outputInput = new OcrInput();
foreach (var path in documentPaths
.Where(p => !rejectedPaths.Contains(p)))
{
outputInput.LoadImage(path);
}
var combined = ocr.Read(outputInput);
combined.SaveAsSearchablePdf("high-confidence-invoices.pdf");
}
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}");
Imports IronTesseract
Imports System.IO
Imports System.Linq
' Batch recognition with confidence filtering using unified result model
Dim ocr As New IronTesseract()
Dim documentPaths As String() = Directory.GetFiles("invoice-archive", "*.jpg")
Dim acceptedResults As New List(Of OcrResult)()
Dim rejectedPaths As New List(Of String)()
For Each path In documentPaths
Dim result = ocr.Read(path)
' Single confidence value — no averaging required
If result.Confidence >= 70.0 Then
acceptedResults.Add(result)
Else
rejectedPaths.Add(path)
Console.WriteLine($"Rejected: {path} ({result.Confidence:F1}% confidence)")
End If
Next
' Save accepted pages — each result becomes a page in the output PDF
If acceptedResults.Any() Then
Using outputInput As New OcrInput()
For Each path In documentPaths.Where(Function(p) Not rejectedPaths.Contains(p))
outputInput.LoadImage(path)
Next
Dim combined = ocr.Read(outputInput)
combined.SaveAsSearchablePdf("high-confidence-invoices.pdf")
End Using
End If
Console.WriteLine($"Accepted: {acceptedResults.Count}, Rejected: {rejectedPaths.Count}")
result.Confidence is a single double ranging from 0 to 100. The Aspose RecognitionAreasConfidence array average returns a float in a range that varies by version — the comparison threshold needs adjustment during migration. The confidence scores guide documents per-word, per-line, and per-page confidence values for document validation workflows. For high-volume batch patterns, see the multithreading example.
Structured Output Handling
Aspose.OCR outputs structured data through SaveMultipageDocument with format-specific SaveFormat enum values. JSON output writes a machine-readable file; XML output writes an annotated document file. Accessing the raw structured data — word positions, line boundaries — requires iterating RecognitionAreasRectangles, which returns geometry at the region level, not the word level.
Aspose.OCR Approach:
// Structured output: JSON and XML via SaveMultipageDocument
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.COMBINE
};
var results = new List<RecognitionResult>();
foreach (var path in new[] { "page1.jpg", "page2.jpg", "page3.jpg" })
{
results.Add(api.RecognizeImage(path, settings));
}
// Export as JSON
api.SaveMultipageDocument("output.json", SaveFormat.Json, results);
// Export as XML
api.SaveMultipageDocument("output.xml", SaveFormat.Xml, results);
// Accessing area-level geometry (not word-level)
foreach (var result in results)
{
var areas = result.RecognitionAreasRectangles;
if (areas != null)
{
foreach (var area in areas)
{
Console.WriteLine($"Area at ({area.X},{area.Y}): {area.Width}x{area.Height}");
}
}
// No direct word-level collection with individual confidence values
}
// Structured output: JSON and XML via SaveMultipageDocument
var api = new AsposeOcr();
var settings = new RecognitionSettings
{
Language = Language.Eng,
DetectAreasMode = DetectAreasMode.COMBINE
};
var results = new List<RecognitionResult>();
foreach (var path in new[] { "page1.jpg", "page2.jpg", "page3.jpg" })
{
results.Add(api.RecognizeImage(path, settings));
}
// Export as JSON
api.SaveMultipageDocument("output.json", SaveFormat.Json, results);
// Export as XML
api.SaveMultipageDocument("output.xml", SaveFormat.Xml, results);
// Accessing area-level geometry (not word-level)
foreach (var result in results)
{
var areas = result.RecognitionAreasRectangles;
if (areas != null)
{
foreach (var area in areas)
{
Console.WriteLine($"Area at ({area.X},{area.Y}): {area.Width}x{area.Height}");
}
}
// No direct word-level collection with individual confidence values
}
Imports System
Imports System.Collections.Generic
Imports AsposeOcr
' Structured output: JSON and XML via SaveMultipageDocument
Dim api As New AsposeOcr()
Dim settings As New RecognitionSettings With {
.Language = Language.Eng,
.DetectAreasMode = DetectAreasMode.COMBINE
}
Dim results As New List(Of RecognitionResult)()
For Each path In New String() {"page1.jpg", "page2.jpg", "page3.jpg"}
results.Add(api.RecognizeImage(path, settings))
Next
' Export as JSON
api.SaveMultipageDocument("output.json", SaveFormat.Json, results)
' Export as XML
api.SaveMultipageDocument("output.xml", SaveFormat.Xml, results)
' Accessing area-level geometry (not word-level)
For Each result In results
Dim areas = result.RecognitionAreasRectangles
If areas IsNot Nothing Then
For Each area In areas
Console.WriteLine($"Area at ({area.X},{area.Y}): {area.Width}x{area.Height}")
Next
End If
' No direct word-level collection with individual confidence values
Next
IronOCR Approach:
// Structured output: navigate a rich result object model
var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("page1.jpg");
input.LoadImage("page2.jpg");
input.LoadImage("page3.jpg");
var result = ocr.Read(input);
// Export as searchable PDF (preserves layout)
result.SaveAsSearchablePdf("output.pdf");
// Export as hOCR (XHTML with bounding box coordinates — feeds downstream tools)
result.SaveAsHocrFile("output.hocr");
// Word-level structured access — direct collection, no indirection
foreach (var page in result.Pages)
{
Console.WriteLine($"Page {page.PageNumber}: {page.Words.Length} words, " +
$"{page.Confidence:F1}% confidence");
foreach (var word in page.Words)
{
Console.WriteLine($" '{word.Text}' at ({word.X},{word.Y}) " +
$"size {word.Width}x{word.Height} — {word.Confidence:F1}%");
}
}
// Paragraph-level layout for document structure analysis
foreach (var paragraph in result.Pages[0].Paragraphs)
{
Console.WriteLine($"Paragraph at ({paragraph.X},{paragraph.Y}): {paragraph.Text}");
}
// Structured output: navigate a rich result object model
var ocr = new IronTesseract();
using var input = new OcrInput();
input.LoadImage("page1.jpg");
input.LoadImage("page2.jpg");
input.LoadImage("page3.jpg");
var result = ocr.Read(input);
// Export as searchable PDF (preserves layout)
result.SaveAsSearchablePdf("output.pdf");
// Export as hOCR (XHTML with bounding box coordinates — feeds downstream tools)
result.SaveAsHocrFile("output.hocr");
// Word-level structured access — direct collection, no indirection
foreach (var page in result.Pages)
{
Console.WriteLine($"Page {page.PageNumber}: {page.Words.Length} words, " +
$"{page.Confidence:F1}% confidence");
foreach (var word in page.Words)
{
Console.WriteLine($" '{word.Text}' at ({word.X},{word.Y}) " +
$"size {word.Width}x{word.Height} — {word.Confidence:F1}%");
}
}
// Paragraph-level layout for document structure analysis
foreach (var paragraph in result.Pages[0].Paragraphs)
{
Console.WriteLine($"Paragraph at ({paragraph.X},{paragraph.Y}): {paragraph.Text}");
}
Imports IronTesseract
' Structured output: navigate a rich result object model
Dim ocr As New IronTesseract()
Using input As New OcrInput()
input.LoadImage("page1.jpg")
input.LoadImage("page2.jpg")
input.LoadImage("page3.jpg")
Dim result = ocr.Read(input)
' Export as searchable PDF (preserves layout)
result.SaveAsSearchablePdf("output.pdf")
' Export as hOCR (XHTML with bounding box coordinates — feeds downstream tools)
result.SaveAsHocrFile("output.hocr")
' Word-level structured access — direct collection, no indirection
For Each page In result.Pages
Console.WriteLine($"Page {page.PageNumber}: {page.Words.Length} words, " &
$"{page.Confidence:F1}% confidence")
For Each word In page.Words
Console.WriteLine($" '{word.Text}' at ({word.X},{word.Y}) " &
$"size {word.Width}x{word.Height} — {word.Confidence:F1}%")
Next
Next
' Paragraph-level layout for document structure analysis
For Each paragraph In result.Pages(0).Paragraphs
Console.WriteLine($"Paragraph at ({paragraph.X},{paragraph.Y}): {paragraph.Text}")
Next
End Using
IronOCR exposes word-level data as a direct Words collection on each page, with individual Confidence values per word. Aspose.OCR's RecognitionAreasRectangles provides region geometry without a word-level confidence breakdown. The hOCR export produces XHTML compatible with tools that consume bounding-box annotated output — see the hOCR export guide for format details. For the full structured result model, the OcrResult API reference documents every property on OcrResult.Page, OcrResult.Paragraph, OcrResult.Line, and OcrResult.Word.
Aspose.OCR API to IronOCR Mapping Reference
| Aspose.OCR | IronOCR Equivalent |
|---|---|
AsposeOcr |
IronTesseract |
RecognitionSettings |
Properties on IronTesseract + OcrInput |
DocumentRecognitionSettings |
OcrInput with LoadPdf() / LoadPdfPages() |
api.RecognizeImage(path, settings) |
ocr.Read(path) or ocr.Read(input) |
api.RecognizePdf(path, settings) |
ocr.Read(path) or ocr.Read(input) |
result.RecognitionText |
result.Text |
result.RecognitionAreasConfidence.Average() |
result.Confidence (single double, 0–100) |
result.RecognitionAreasRectangles |
result.Words (with X, Y, Width, Height, Confidence) |
RecognitionResult |
OcrResult |
Language.Eng |
OcrLanguage.English |
DetectAreasMode.COMBINE |
Automatic — no enum required |
DetectAreasMode.TABLE |
Automatic — use result.Pages[n].Paragraphs for layout |
DetectAreasMode.DOCUMENT |
Automatic — engine handles layout analysis |
settings.RecognizeSingleLine = true |
ocr.Configuration.WhiteListCharacters or single-region crop |
settings.RecognitionAreas = new List<Rectangle> { r } |
input.LoadImage(path, cropRectangle) |
settings.AutoSkew = true |
Automatic, or explicit input.Deskew() |
PreprocessingFilter.AutoSkew() |
input.Deskew() |
PreprocessingFilter.AutoDenoising() |
input.DeNoise() |
PreprocessingFilter.ContrastCorrectionFilter() |
input.Contrast() |
PreprocessingFilter.Binarize() |
input.Binarize() |
PreprocessingFilter.Threshold(value) |
input.Binarize() (auto-threshold) |
PreprocessingFilter.Median() |
input.DeNoise() |
PreprocessingFilter.Scale(factor) |
input.Scale(percent) |
PreprocessingFilter.Invert() |
input.Invert() |
PreprocessingFilter.Rotate(angle) |
input.Rotate(angle) |
api.SaveMultipageDocument(path, SaveFormat.Pdf, list) |
result.SaveAsSearchablePdf(path) |
api.SaveMultipageDocument(path, SaveFormat.Docx, list) |
Via hOCR export: result.SaveAsHocrFile(path) |
api.SaveMultipageDocument(path, SaveFormat.Json, list) |
Navigate result.Pages and serialize directly |
api.PreprocessImage(path, filters) |
input.GetPages()[0].SaveAsImage(path) |
api.CalculateSkew(imagePath) |
input.Deskew() (auto-applies detected angle) |
new Aspose.OCR.License().SetLicense("file.lic") |
IronOcr.License.LicenseKey = "key" |
settings.ThreadsCount = n |
Thread-safe by default; use Parallel.ForEach |
Common Migration Issues and Solutions
Issue 1: DetectAreasMode Has No Direct Equivalent
Aspose.OCR: Code sets settings.DetectAreasMode = DetectAreasMode.TABLE or DetectAreasMode.COMBINE expecting specific layout behavior. Removing the enum leaves the question of how IronOCR handles the same layout.
Solution: Remove the enum entirely. IronOCR performs layout analysis automatically. If you need to inspect the detected layout structure, navigate result.Pages[n].Paragraphs — each paragraph carries an X, Y, Width, Height bounding box and the text it contains. For explicit table extraction, see the table reading how-to:
// No mode to set — read directly and inspect the structure
var result = new IronTesseract().Read("data-table.jpg");
foreach (var paragraph in result.Pages[0].Paragraphs)
{
Console.WriteLine($"Block ({paragraph.X},{paragraph.Y}): {paragraph.Text}");
}
// No mode to set — read directly and inspect the structure
var result = new IronTesseract().Read("data-table.jpg");
foreach (var paragraph in result.Pages[0].Paragraphs)
{
Console.WriteLine($"Block ({paragraph.X},{paragraph.Y}): {paragraph.Text}");
}
Imports IronTesseract
Dim result = New IronTesseract().Read("data-table.jpg")
For Each paragraph In result.Pages(0).Paragraphs
Console.WriteLine($"Block ({paragraph.X},{paragraph.Y}): {paragraph.Text}")
Next
Issue 2: RecognitionAreasConfidence Threshold Mismatch
Aspose.OCR: Existing code compares result.RecognitionAreasConfidence.Average() against a threshold such as 0.75f. IronOCR's result.Confidence is on a different scale.
Solution: result.Confidence is a 0–100 percentage. Multiply your Aspose threshold by 100 to convert: 0.75f becomes 75.0. Then update all comparison logic:
// Aspose.OCR threshold pattern
// if (result.RecognitionAreasConfidence.Average() >= 0.75f)
// IronOCR equivalent — multiply old threshold by 100
var result = new IronTesseract().Read("document.jpg");
if (result.Confidence >= 75.0)
{
Console.WriteLine($"High confidence result: {result.Text}");
}
// Aspose.OCR threshold pattern
// if (result.RecognitionAreasConfidence.Average() >= 0.75f)
// IronOCR equivalent — multiply old threshold by 100
var result = new IronTesseract().Read("document.jpg");
if (result.Confidence >= 75.0)
{
Console.WriteLine($"High confidence result: {result.Text}");
}
Imports IronOcr
Dim result = New IronTesseract().Read("document.jpg")
If result.Confidence >= 75.0 Then
Console.WriteLine($"High confidence result: {result.Text}")
End If
Issue 3: SaveMultipageDocument JSON/XML Output Has No Direct Method
Aspose.OCR: api.SaveMultipageDocument("out.json", SaveFormat.Json, results) writes a JSON file with recognition metadata. Teams consuming this output downstream need to find the equivalent.
Solution: IronOCR does not have a SaveFormat.Json equivalent method. The replacement is to navigate result.Pages and serialize with System.Text.Json. This gives you full control over the schema:
using var input = new OcrInput();
input.LoadImage("document.jpg");
var result = new IronTesseract().Read(input);
// Build your own structured JSON from the result model
var pageData = result.Pages.Select(p => new
{
PageNumber = p.PageNumber,
Confidence = p.Confidence,
Text = p.Text,
Words = p.Words.Select(w => new
{
Text = w.Text,
X = w.X,
Y = w.Y,
Width = w.Width,
Height = w.Height,
Confidence = w.Confidence
}).ToArray()
}).ToArray();
File.WriteAllText("output.json",
System.Text.Json.JsonSerializer.Serialize(pageData,
new System.Text.Json.JsonSerializerOptions { WriteIndented = true }));
using var input = new OcrInput();
input.LoadImage("document.jpg");
var result = new IronTesseract().Read(input);
// Build your own structured JSON from the result model
var pageData = result.Pages.Select(p => new
{
PageNumber = p.PageNumber,
Confidence = p.Confidence,
Text = p.Text,
Words = p.Words.Select(w => new
{
Text = w.Text,
X = w.X,
Y = w.Y,
Width = w.Width,
Height = w.Height,
Confidence = w.Confidence
}).ToArray()
}).ToArray();
File.WriteAllText("output.json",
System.Text.Json.JsonSerializer.Serialize(pageData,
new System.Text.Json.JsonSerializerOptions { WriteIndented = true }));
Imports IronOcr
Imports System.IO
Imports System.Text.Json
Using input As New OcrInput()
input.LoadImage("document.jpg")
Dim result = New IronTesseract().Read(input)
' Build your own structured JSON from the result model
Dim pageData = result.Pages.Select(Function(p) New With {
.PageNumber = p.PageNumber,
.Confidence = p.Confidence,
.Text = p.Text,
.Words = p.Words.Select(Function(w) New With {
.Text = w.Text,
.X = w.X,
.Y = w.Y,
.Width = w.Width,
.Height = w.Height,
.Confidence = w.Confidence
}).ToArray()
}).ToArray()
File.WriteAllText("output.json",
JsonSerializer.Serialize(pageData,
New JsonSerializerOptions With {.WriteIndented = True}))
End Using
For XHTML output with embedded coordinates that downstream tools can parse, result.SaveAsHocrFile("output.hocr") is the closer semantic equivalent.
Issue 4: DocumentRecognitionSettings.StartPage Uses 0-Based Index
Aspose.OCR: DocumentRecognitionSettings.StartPage = 2 means the third page (0-based). This is the most common off-by-one error in Aspose-to-IronOCR migrations.
Solution: IronOCR uses 1-based page indexing throughout. Add 1 to every StartPage value, and recalculate the end page accordingly. Write a targeted test against a known multi-page PDF to catch this before production:
// Aspose.OCR: StartPage = 2 means the 3rd page (0-based)
// var settings = new DocumentRecognitionSettings { StartPage = 2, PagesNumber = 3 };
// IronOCR: page 3 is index 3 (1-based), range of 3 ends at page 5
using var input = new OcrInput();
input.LoadPdfPages("document.pdf", 3, 5);
var result = new IronTesseract().Read(input);
// Aspose.OCR: StartPage = 2 means the 3rd page (0-based)
// var settings = new DocumentRecognitionSettings { StartPage = 2, PagesNumber = 3 };
// IronOCR: page 3 is index 3 (1-based), range of 3 ends at page 5
using var input = new OcrInput();
input.LoadPdfPages("document.pdf", 3, 5);
var result = new IronTesseract().Read(input);
Imports IronOcr
' Aspose.OCR: StartPage = 2 means the 3rd page (0-based)
' Dim settings As New DocumentRecognitionSettings With {.StartPage = 2, .PagesNumber = 3}
' IronOCR: page 3 is index 3 (1-based), range of 3 ends at page 5
Using input As New OcrInput()
input.LoadPdfPages("document.pdf", 3, 5)
Dim result = New IronTesseract().Read(input)
End Using
Issue 5: RecognizeSingleLine Has No Direct Flag
Aspose.OCR: settings.RecognizeSingleLine = true tells the engine to treat the entire image as a single text line. This is used for label recognition, field extraction, and other fixed-format inputs.
Solution: Use a CropRectangle to isolate the text line precisely, which prevents the engine from running full layout detection on a single-line image. For machine-readable zones or label formats, the reading specific documents guide covers the appropriate approach:
// Aspose.OCR: single-line flag
// var settings = new RecognitionSettings { RecognizeSingleLine = true };
// IronOCR: crop to the line region — layout detection skips automatically
var lineRegion = new CropRectangle(10, 45, 600, 30); // x, y, width, height
using var input = new OcrInput();
input.LoadImage("label.jpg", lineRegion);
var result = new IronTesseract().Read(input);
Console.WriteLine(result.Text.Trim());
// Aspose.OCR: single-line flag
// var settings = new RecognitionSettings { RecognizeSingleLine = true };
// IronOCR: crop to the line region — layout detection skips automatically
var lineRegion = new CropRectangle(10, 45, 600, 30); // x, y, width, height
using var input = new OcrInput();
input.LoadImage("label.jpg", lineRegion);
var result = new IronTesseract().Read(input);
Console.WriteLine(result.Text.Trim());
Imports IronOcr
Dim lineRegion As New CropRectangle(10, 45, 600, 30) ' x, y, width, height
Using input As New OcrInput()
input.LoadImage("label.jpg", lineRegion)
Dim result = New IronTesseract().Read(input)
Console.WriteLine(result.Text.Trim())
End Using
Issue 6: Per-Thread Aspose.OCR Instances Not Required
Aspose.OCR: Documentation recommends creating a new AsposeOcr() instance per thread to avoid thread-safety issues in parallel processing. Existing code creates instances inside Parallel.ForEach lambdas.
Solution: IronTesseract is thread-safe. A single instance handles parallel workloads. Remove per-thread instantiation and share one instance:
// Aspose.OCR: per-thread instance due to thread-safety concerns
// Parallel.ForEach(paths, path => { var api = new AsposeOcr(); ... });
// IronOCR: single shared instance, fully thread-safe
var ocr = new IronTesseract();
Parallel.ForEach(documentPaths, path =>
{
var result = ocr.Read(path);
Console.WriteLine($"{Path.GetFileName(path)}: {result.Confidence:F1}%");
});
// Aspose.OCR: per-thread instance due to thread-safety concerns
// Parallel.ForEach(paths, path => { var api = new AsposeOcr(); ... });
// IronOCR: single shared instance, fully thread-safe
var ocr = new IronTesseract();
Parallel.ForEach(documentPaths, path =>
{
var result = ocr.Read(path);
Console.WriteLine($"{Path.GetFileName(path)}: {result.Confidence:F1}%");
});
Imports System.IO
Imports IronOcr
Imports System.Threading.Tasks
Dim ocr As New IronTesseract()
Parallel.ForEach(documentPaths, Sub(path)
Dim result = ocr.Read(path)
Console.WriteLine($"{Path.GetFileName(path)}: {result.Confidence:F1}%")
End Sub)
Aspose.OCR Migration Checklist
Pre-Migration Tasks
Audit all Aspose.OCR references in the codebase:
grep -rn "using Aspose.OCR" --include="*.cs" .
grep -rn "AsposeOcr\|RecognitionSettings\|DocumentRecognitionSettings" --include="*.cs" .
grep -rn "DetectAreasMode\|SaveFormat\|RecognitionResult" --include="*.cs" .
grep -rn "RecognitionAreasConfidence\|RecognitionText\|RecognizePdf" --include="*.cs" .
grep -rn "PreprocessingFilter\|SaveMultipageDocument" --include="*.cs" .
grep -rn "Aspose.OCR.License\|SetLicense" --include="*.cs" .
grep -rn "using Aspose.OCR" --include="*.cs" .
grep -rn "AsposeOcr\|RecognitionSettings\|DocumentRecognitionSettings" --include="*.cs" .
grep -rn "DetectAreasMode\|SaveFormat\|RecognitionResult" --include="*.cs" .
grep -rn "RecognitionAreasConfidence\|RecognitionText\|RecognizePdf" --include="*.cs" .
grep -rn "PreprocessingFilter\|SaveMultipageDocument" --include="*.cs" .
grep -rn "Aspose.OCR.License\|SetLicense" --include="*.cs" .
Document each occurrence by category: recognition calls, settings objects, preprocessing pipelines, output calls, and license initialization. Note all DetectAreasMode values in use — these drive which layout migration path applies. Record all SaveFormat enum values — each non-PDF format needs the custom serialization approach from Issue 3 above.
Code Migration
- Remove
Aspose.OCRNuGet package from all projects in the solution - Install
IronOcrNuGet package in all projects - Replace
using Aspose.OCR;andusing Aspose.OCR.Models;withusing IronOcr; - Replace
new Aspose.OCR.License().SetLicense("file.lic")withIronOcr.License.LicenseKey = "key"at app startup - Replace
new AsposeOcr()withnew IronTesseract() - Remove all
RecognitionSettingsandDocumentRecognitionSettingsconstruction blocks - Remove all
DetectAreasModeenum references — no equivalent needed - Replace
api.RecognizeImage(path, settings)withocr.Read(path) - Replace
api.RecognizePdf(path, settings)withocr.Read(path)orocr.Read(input)usinginput.LoadPdf() - Replace
result.RecognitionTextwithresult.Text - Replace
result.RecognitionAreasConfidence.Average()withresult.Confidenceand multiply old threshold by 100 - Replace
api.SaveMultipageDocument(path, SaveFormat.Pdf, list)withresult.SaveAsSearchablePdf(path) - Replace
api.SaveMultipageDocument(path, SaveFormat.Json, list)with direct serialization ofresult.Pages - Convert all
PreprocessingFilterchains toOcrInputmethod calls (see API mapping table) - Update
DocumentRecognitionSettings.StartPagefrom 0-based to 1-based (add 1 to every value) - Remove per-thread
AsposeOcrinstantiation — share a singleIronTesseractinstance
Post-Migration Testing
- Run OCR on a representative sample from each document type in production use and compare character counts against baseline Aspose.OCR output
- Verify confidence values: IronOCR returns 0–100; confirm all threshold comparisons use the new scale
- Test page range selection on a 10+ page PDF using a 1-based page number and verify the correct pages are returned
- Test password-protected PDF ingestion without Aspose.PDF installed — confirm no dependency exception
- Confirm
result.Textmatches expected output for eachDetectAreasModethat was in use (COMBINE, TABLE, DOCUMENT) - Test the JSON output path: serialize
result.Pagesand validate the schema against any downstream consumers - Run the parallel batch processor and verify no threading exceptions (shared
IronTesseractinstance) - Confirm searchable PDF output opens in a PDF viewer and text is selectable at the correct locations
- Validate that the license key initializes without error in each deployment environment (ASP.NET startup, Azure Function, Docker container)
- Check that preprocessed TIFF inputs still produce expected output through
input.LoadImageFrames()
Key Benefits of Migrating to IronOCR
Predictable total cost of ownership from day one. The $2,999 Professional license covers 10 developers across 10 projects with no annual renewal. Finance closes the OCR line item once. The budget impact of adding engineers, launching new projects, or extending the product lifecycle is zero — there is no license tier math to run, no renewal date to track, and no compliance risk from a missed payment. The IronOCR licensing page documents what each tier covers.
Recognition calls that express intent, not infrastructure. After migration, every recognition call is ocr.Read("document"). The RecognitionSettings construction, DetectAreasMode selection, and PreprocessingFilter population that prefixed every Aspose.OCR call disappear entirely. New engineers reading the OCR layer of the codebase see the business intent — "read this document" — rather than a configuration object being assembled before the actual work begins. The IronTesseract API reference covers every available configuration property.
Encrypted PDF support without a second product. Enterprise document pipelines routinely handle password-protected PDFs. After migration, input.LoadPdf("doc.pdf", Password: "secret") handles them natively. There is no Aspose.PDF subscription to manage, no decryption-to-image pipeline to maintain, and no second renewal date to track. Every PDF format in the pipeline goes through one package, one license, and one code path. The PDF input how-to covers page ranges, non-contiguous selection, and stream input.
Structured result data at word and character level. result.Pages[n].Words returns a collection where each word carries its bounding box, its text, and its individual confidence score. Aspose.OCR's area-level geometry covers regions, not individual tokens. After migration, document layout parsers, form field extractors, and invoice processing pipelines can access per-word positioning without additional processing steps. See the OCR results features page for the complete result hierarchy.
Single-package deployment across all platforms. IronOCR ships as one NuGet package that runs on Windows, Linux, macOS, Docker, Azure App Service, and AWS Lambda without platform-specific configuration. Aspose.OCR works across platforms but may require native library adjustments in container environments. After migration, the Dockerfile for an OCR service is a standard .NET base image with no OCR-specific setup steps beyond the package install. Deployment guides cover Docker, Azure, AWS, and Linux.
Barcode reading in the same pass as OCR. Setting ocr.Configuration.ReadBarCodes = true extracts barcodes, QR codes, and Code 128 symbols from the same image in one engine pass. Aspose.OCR has no barcode capability — a separate barcode library would be required. After migration, documents that mix printed text with barcodes (shipping labels, inventory forms, event tickets) are handled by a single call with results on result.Barcodes. See the barcode OCR how-to for supported symbologies.
Frequently Asked Questions
Why should I migrate from Aspose.OCR for .NET to IronOCR?
Common drivers include eliminating COM interop complexity, replacing file-based license management, avoiding per-page billing, enabling Docker/container deployment, and adopting a NuGet-native workflow that integrates with standard .NET tooling.
What are the main code changes when migrating from Aspose.OCR for .NET to IronOCR?
Replace Aspose.OCR initialization sequences with IronTesseract instantiation, remove COM lifecycle management (explicit Create/Load/Close patterns), and update result property names. The result is significantly fewer boilerplate lines.
How do I install IronOCR to begin the migration?
Run 'Install-Package IronOcr' in Package Manager Console or 'dotnet add package IronOcr' in the CLI. Language packs are separate packages: 'dotnet add package IronOcr.Languages.French' for French, for example.
Does IronOCR match the OCR accuracy of Aspose.OCR for .NET for standard business documents?
IronOCR achieves high accuracy for standard business content including invoices, contracts, receipts, and typed forms. Image preprocessing filters (deskew, noise removal, contrast enhancement) further improve recognition on degraded input.
How does IronOCR handle the language data that Aspose.OCR for .NET installs separately?
Language data in IronOCR is distributed as NuGet packages. 'dotnet add package IronOcr.Languages.German' installs German support. No manual file placement or directory paths are involved.
Does migrating from Aspose.OCR for .NET to IronOCR require changes to deployment infrastructure?
IronOCR requires fewer infrastructure changes than Aspose.OCR for .NET. There are no SDK binary paths, license file placements, or license server configurations. The NuGet package contains the complete OCR engine, and the license key is a string set in application code.
How do I configure IronOCR licensing after migration?
Assign IronOcr.License.LicenseKey = "YOUR-KEY" in application startup code. In Docker or Kubernetes, store the key as an environment variable and read it in startup. Use License.IsValidLicense to validate before accepting traffic.
Can IronOCR process PDFs the same way Aspose.OCR does?
Yes. IronOCR reads both native and scanned PDFs. Instantiate IronTesseract, call ocr.Read(input) where input is a PDF path or OcrPdfInput, and iterate the OcrResult pages. No separate PDF rendering pipeline is required.
How does IronOCR handle threading in high-volume processing?
IronTesseract is safe to instantiate per-thread. Spin up one instance per thread in a Parallel.ForEach or Task pool, run OCR concurrently, and dispose each instance when done. No global state or locking is required.
What output formats does IronOCR support after text extraction?
IronOCR returns structured results including text, word coordinates, confidence scores, and page structure. Export options include plain text, searchable PDF, and structured result objects for downstream processing.
Is IronOCR pricing more predictable than Aspose.OCR for .NET for scaling workloads?
IronOCR uses flat-rate perpetual licensing with no per-page or volume charges. Whether you process 10,000 or 10 million pages, the license cost remains constant. Volume and team licensing options are on the IronOCR pricing page.
What happens to my existing tests after migrating from Aspose.OCR for .NET to IronOCR?
Tests that assert on extracted text content should continue to pass after migration. Tests that validate API call patterns or COM object lifecycle will need updating to reflect IronOCR's simpler initialization and result model.

