OCR stands for "Optical Character Recognition". It is the process of converting paper documents or images into machine-readable text. The input is typically a scan or photograph of the document, and OCR takes the place of manual transcription at a keyboard. It is commonly used to recover the original text from scanned files and PDFs. The process has proven invaluable in criminal cases where documents are too damaged for manual transcription but can still be scanned and interpreted by OCR software.
With the advancement of technology and the ubiquitous adoption of digital tools, OCR has also been implemented in other fields such as document conversion on apps like Google Docs, as well as in academia and the world of business. There are two main types of OCR, "static" and "dynamic". The most common type is static OCR in which the entire document is scanned at once. Dynamic OCR, on the other hand, scans one line at a time and can process more sophisticated layouts such as tabular data.
This article compares two of the most prevalent applications and document libraries for OCR and PDF: PDFTron OCR and IronOCR.
To use OCR with the PDFTron SDK, we have to install a separate OCR module add-on. The module lets the SDK detect text in documents and make it selectable and searchable. The PDFTron SDK supports up to 100 languages. The PDFTron OCR engine uses an open-source LSTM neural network from Tesseract. PDFTron OCR accepts multiple image formats for text detection. PDF files containing only raster images are also accepted, with the output quality depending on the quality of the input image. The best images for OCR are grayscale images at 300 DPI.
Iron Software provides software engineers with IronOCR for .NET to read text content from photos and PDFs in .NET apps and websites. The software helps to scan photos for text and barcodes, supports numerous worldwide languages, and outputs as plain text or structured data. Iron Software's OCR Library can be utilized in Web, console, MVC, and numerous .NET desktop applications. In the case of commercial deployment, direct assistance from the development team is provided alongside purchased licenses.
Open Visual Studio and go to the File menu. Select New Project, then select Console Application. We will be using a Console Application throughout this article.
Enter the project name and select the file path in the appropriate text boxes. Select the required .NET Framework version, then click the Create button.
Visual Studio will now generate the structure for the selected application type. If you selected a Windows, console, or web application, the Program.cs file will open so you can enter the code and build/run your application.
For the next step, we need to add the library to test the code.
PDFTron OCR is installed manually: download it as a zip file from the given link, unzip it, and configure it with the project. This guide will help you run PDFTron samples using the free integrated trial of the PDFTron SDK in a .NET Framework application on Windows. Support from solution engineers and unlimited trial usage are included in the free trial.
Visual Studio: Make sure that the .NET Desktop Development and .NET Framework 4.5.1+ development tools workload is part of your installation. This guide will use Visual Studio 2017 and PDFTron's C# .NET PDF Library for Windows. Download the library using this link, .NET PDF SDK Download.
Extract the folder from the .zip file. PDFNET_BASE is used in this guide to select the path into the folder that you extracted.
// Set the base path where the extracted files are located
PDFNET_BASE = "path/to/extraction/folder/PDFNetDotNet4/";
' Set the base path where the extracted files are located
PDFNET_BASE = "path/to/extraction/folder/PDFNetDotNet4/"
Navigate to the location of the extracted contents. Find and enter the Samples folder (PDFNET_BASE/Samples). This folder holds numerous sample codes for features supported by the PDFTron SDK.
This is called the "PDFTron Hello World" application. It is easy to integrate the rest of the PDFTron SDK if you can open, save, and close a PDF document.
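The IronOCR library can be installed in four ways: using the Visual Studio NuGet Package Manager, using the Visual Studio command line, downloading it directly from the NuGet website, or downloading it directly from the IronOCR website.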
The Visual Studio software provides the NuGet Package manager option to install the package directly to the solution. The screenshot demonstrates how to open the NuGet Package Manager.
This will provide a search box to display the list of the packages from the NuGet website. In the package manager, we need to search for the keyword "IronOCR", as in the below screenshot:
From the above image, we can see the list of the related search results. To install the package into the solution, we must select the required option.
Install-Package IronOcr
Run the command above in the Visual Studio command line (the Package Manager Console). The package will install directly into the current project, which will then be ready to use.
For the third method, we can download the NuGet package straight from the NuGet website.
For the fourth method, you can download the latest package directly from the IronOCR website by clicking this link. Once the package is downloaded, follow the provided instructions to add it to the project.
IronOCR and PDFTron OCR both provide OCR technology that converts images into searchable text.
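On the PDFTron side, the image-to-searchable-PDF step can be sketched roughly as follows. This is a minimal sketch, not taken from the article: it assumes the separately downloaded OCR module has been unzipped into the folder passed to `PDFNet.AddResourceSearchPath`, that `PDFTronLicense.Key` is the license helper class shipped with the PDFTron samples, and that `scan.png` and `scan-searchable.pdf` are placeholder file names.

```csharp
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

class PdfTronOcrSketch
{
    static void Main()
    {
        PDFNet.Initialize(PDFTronLicense.Key);

        // Assumption: the OCR module add-on was extracted into this folder.
        PDFNet.AddResourceSearchPath("path/to/extraction/folder/Lib/");

        // Stop early if the SDK cannot locate the OCR module.
        if (!OCRModule.IsModuleAvailable())
        {
            Console.WriteLine("OCR module not found; check the resource search path.");
            PDFNet.Terminate();
            return;
        }

        using (PDFDoc doc = new PDFDoc())
        {
            // Recognize English text; further languages can be added the same way.
            OCROptions opts = new OCROptions();
            opts.AddLang("eng");

            // Placeholder input image: the recognized text is added to the new PDF.
            OCRModule.ImageToPDF(doc, "scan.png", opts);
            doc.Save("scan-searchable.pdf", SDFDoc.SaveOptions.e_linearized);
        }

        PDFNet.Terminate();
    }
}
```

The module availability check comes first because the OCR add-on is installed separately from the core SDK, so a missing or misplaced module is the most likely setup problem.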
Convert PDF to DOCX, DOC, HTML, SVG, TIFF, PNG, JPEG, XPS, EPUB, TXT, and many other formats.
// Open an existing PDF document
PDFDoc doc = new PDFDoc("sample.pdf");
// Convert PDF document to SVG
Convert.ToSvg(doc, "output.svg");
// Convert PDF document to XPS
Convert.ToXps("sample.pdf", "output.xps");
// Convert PDF document to multipage TIFF
Convert.TiffOutputOptions tiff_options = new Convert.TiffOutputOptions();
tiff_options.SetDPI(200);
tiff_options.SetDither(true);
tiff_options.SetMono(true);
Convert.ToTiff("sample.pdf", "output.tiff", tiff_options);
// Convert PDF to XOD
Convert.ToXod("sample.pdf", "output.xod");
// Convert PDF to HTML
Convert.ToHtml("sample.pdf", "output.html");
' Open an existing PDF document
Dim doc As New PDFDoc("sample.pdf")
' Convert PDF document to SVG
Convert.ToSvg(doc, "output.svg")
' Convert PDF document to XPS
Convert.ToXps("sample.pdf", "output.xps")
' Convert PDF document to multipage TIFF
Dim tiff_options As New Convert.TiffOutputOptions()
tiff_options.SetDPI(200)
tiff_options.SetDither(True)
tiff_options.SetMono(True)
Convert.ToTiff("sample.pdf", "output.tiff", tiff_options)
' Convert PDF to XOD
Convert.ToXod("sample.pdf", "output.xod")
' Convert PDF to HTML
Convert.ToHtml("sample.pdf", "output.html")
// Create an IronTesseract object
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
// Add an image for OCR
Input.AddImage(@"3.png");
// Read the text from the image
var Result = Ocr.Read(Input);
// Print the text to the console
Console.WriteLine(Result.Text);
Console.ReadKey();
}
' Create an IronTesseract object
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
' Add an image for OCR
Input.AddImage("3.png")
' Read the text from the image
Dim Result = Ocr.Read(Input)
' Print the text to the console
Console.WriteLine(Result.Text)
Console.ReadKey()
End Using
The code above converts an image file into text with the Tesseract 5 API. The first line creates an IronTesseract object. To add one or more image files, we then create an OcrInput object and give it the path of each image. The "Read" function of the IronTesseract object parses the image and returns an OCR result, from which the extracted text is available as a string.
Multi-frame images can be added with the "AddMultiFrameTiff" method. Each frame in the image is read in turn and treated as a distinct page until every frame has been scanned. TIFF is the only image format supported by this method.
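A minimal sketch of the multi-frame case, following the same pattern as the single-image example. The file name `multi-frame.tiff` is a placeholder, and the call assumes `AddMultiFrameTiff` accepts a file path in the same way `AddImage` does.

```csharp
using System;
using IronOcr;

class MultiFrameTiffExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Placeholder path: every frame of the TIFF is added as its own page.
            input.AddMultiFrameTiff("multi-frame.tiff");

            // Read all frames in one pass and print the combined text.
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```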
The successful conversion of the data into editable text is displayed in the above image, a result of the IronOCR accuracy.
IronOCR and PDFTron OCR convert PDF files into editable text. PDFTron OCR provides a list of options to the user such as saving the page, editing the image, recognizing the page, etc. Additionally, it can provide save options such as document, text, HTML format, etc. IronOCR also allows us to save a converted OCR file into HTML, text, PDF, etc.
The full sample code shows how to use PDFTron OCR for direct, high-quality conversion between PDF, XPS, EMF, SVG, TIFF, PNG, JPEG, and other image formats.
// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.
using System;
using pdftron;
using pdftron.Common;
using pdftron.Filters;
using pdftron.SDF;
using pdftron.PDF;
// This code demonstrates conversion of documents to formats such as SVG, PDF, EMF, or XPS.
namespace ConvertTestCS
{
class Testfile
{
public string inputFile, outputFile;
public bool requiresWindowsPlatform;
public Testfile(string inFile, string outFile, bool requiresWindowsPlatform_)
{
inputFile = inFile;
outputFile = outFile;
requiresWindowsPlatform = requiresWindowsPlatform_;
}
};
class Class1
{
private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();
static Class1() {}
// Relative paths to the input and output folders.
const string inputPath = "../../../../TestFiles/";
const string outputPath = "../../../../TestFiles/Output/";
static bool ConvertSpecificFormats()
{
bool err = false;
try
{
using (PDFDoc pdfdoc = new PDFDoc())
{
// Convert from XPS to PDF
Console.WriteLine("Converting from XPS");
pdftron.PDF.Convert.FromXps(pdfdoc, inputPath + "simple-xps.xps");
pdfdoc.Save(outputPath + "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
Console.WriteLine("Saved xps2pdf v2.pdf");
}
}
catch (PDFNetException e)
{
Console.WriteLine(e.Message);
err = true;
}
// Other format conversions...
return err;
}
static Boolean ConvertToPdfFromFile()
{
System.Collections.ArrayList testfiles = new System.Collections.ArrayList();
testfiles.Add(new ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", false));
// Add other test files...
bool err = false;
foreach (Testfile file in testfiles)
{
try
{
using (pdftron.PDF.PDFDoc pdfdoc = new PDFDoc())
{
// Conditions and conversions...
pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath + file.inputFile);
pdfdoc.Save(outputPath + file.outputFile, SDFDoc.SaveOptions.e_linearized);
Console.WriteLine("Converted file: " + file.inputFile);
Console.WriteLine("to: " + file.outputFile);
}
}
catch (PDFNetException e)
{
Console.WriteLine(e.Message);
err = true;
}
}
return err;
}
static void Main(string [] args)
{
PDFNet.Initialize(PDFTronLicense.Key);
bool err = false;
err = ConvertToPdfFromFile();
if (err)
{
Console.WriteLine("ConvertFile failed");
}
else
{
Console.WriteLine("ConvertFile succeeded");
}
err = ConvertSpecificFormats();
if (err)
{
Console.WriteLine("ConvertSpecificFormats failed");
}
else
{
Console.WriteLine("ConvertSpecificFormats succeeded");
}
// Finalization...
PDFNet.Terminate();
Console.WriteLine("Done.");
}
}
}
' Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.
Imports System
Imports pdftron
Imports pdftron.Common
Imports pdftron.Filters
Imports pdftron.SDF
Imports pdftron.PDF
' This code demonstrates conversion of documents to formats such as SVG, PDF, EMF, or XPS.
Namespace ConvertTestCS
Friend Class Testfile
Public inputFile, outputFile As String
Public requiresWindowsPlatform As Boolean
Public Sub New(ByVal inFile As String, ByVal outFile As String, ByVal requiresWindowsPlatform_ As Boolean)
inputFile = inFile
outputFile = outFile
requiresWindowsPlatform = requiresWindowsPlatform_
End Sub
End Class
Friend Class Class1
Private Shared pdfNetLoader As pdftron.PDFNetLoader = pdftron.PDFNetLoader.Instance()
Shared Sub New()
End Sub
' Relative paths to the input and output folders.
Private Const inputPath As String = "../../../../TestFiles/"
Private Const outputPath As String = "../../../../TestFiles/Output/"
Private Shared Function ConvertSpecificFormats() As Boolean
Dim err As Boolean = False
Try
Using pdfdoc As New PDFDoc()
' Convert from XPS to PDF
Console.WriteLine("Converting from XPS")
pdftron.PDF.Convert.FromXps(pdfdoc, inputPath & "simple-xps.xps")
pdfdoc.Save(outputPath & "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused)
Console.WriteLine("Saved xps2pdf v2.pdf")
End Using
Catch e As PDFNetException
Console.WriteLine(e.Message)
err = True
End Try
' Other format conversions...
Return err
End Function
Private Shared Function ConvertToPdfFromFile() As Boolean
Dim testfiles As New System.Collections.ArrayList()
testfiles.Add(New ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", False))
' Add other test files...
Dim err As Boolean = False
For Each file As Testfile In testfiles
Try
Using pdfdoc As pdftron.PDF.PDFDoc = New PDFDoc()
' Conditions and conversions...
pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath & file.inputFile)
pdfdoc.Save(outputPath & file.outputFile, SDFDoc.SaveOptions.e_linearized)
Console.WriteLine("Converted file: " & file.inputFile)
Console.WriteLine("to: " & file.outputFile)
End Using
Catch e As PDFNetException
Console.WriteLine(e.Message)
err = True
End Try
Next file
Return err
End Function
Shared Sub Main(ByVal args() As String)
PDFNet.Initialize(PDFTronLicense.Key)
Dim err As Boolean = False
err = ConvertToPdfFromFile()
If err Then
Console.WriteLine("ConvertFile failed")
Else
Console.WriteLine("ConvertFile succeeded")
End If
err = ConvertSpecificFormats()
If err Then
Console.WriteLine("ConvertSpecificFormats failed")
Else
Console.WriteLine("ConvertSpecificFormats succeeded")
End If
' Finalization...
PDFNet.Terminate()
Console.WriteLine("Done.")
End Sub
End Class
End Namespace
PDF files are handled through the OcrInput object. Every page in a document is read by the IronTesseract class, and the text is then extracted from the pages. The "AddPdf" function adds a PDF to our list of input documents and accepts a password if the document is protected. To open a password-protected PDF document, use the code snippet below:
var Ocr = new IronTesseract(); // Configure nothing
using (var Input = new OcrInput())
{
// Add PDF with password
Input.AddPdf("example.pdf", "password");
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
Dim Ocr = New IronTesseract() ' Configure nothing
Using Input = New OcrInput()
' Add PDF with password
Input.AddPdf("example.pdf", "password")
Dim Result = Ocr.Read(Input)
Console.WriteLine(Result.Text)
End Using
Reading and extracting content from a single page of a PDF file can be achieved with the "AddPdfPage" function; simply specify the page number from which we want to extract text. "AddPdfPages" extracts text from multiple pages, which you specify as an IEnumerable<int>.
IEnumerable<int> numbers = new List<int> {2,8,10};
// Create an IronTesseract object
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
// Single page
Input.AddPdfPage("example.pdf",10);
// Multiple pages
Input.AddPdfPages("example.pdf", numbers);
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
// Save result to a text file
Result.SaveAsTextFile("ocrtext.txt");
}
Dim numbers As IEnumerable(Of Integer) = New List(Of Integer) From {2, 8, 10}
' Create an IronTesseract object
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
' Single page
Input.AddPdfPage("example.pdf",10)
' Multiple pages
Input.AddPdfPages("example.pdf", numbers)
Dim Result = Ocr.Read(Input)
Console.WriteLine(Result.Text)
' Save result to a text file
Result.SaveAsTextFile("ocrtext.txt")
End Using
Use the SaveAsTextFile function to store the result directly as a text file at the given output path. To save the result in HTML-based hOCR format, use SaveAsHocrFile.
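The two save options can be used side by side on the same result. The sketch below reuses the `3.png` input from the earlier example; the output file names are placeholders, and it assumes `SaveAsHocrFile` takes an output path just as `SaveAsTextFile` does.

```csharp
using System;
using IronOcr;

class SaveResultsExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            input.AddImage("3.png");
            var result = ocr.Read(input);

            // Plain text output
            result.SaveAsTextFile("ocrtext.txt");

            // hOCR output: HTML markup that also carries layout information
            result.SaveAsHocrFile("ocrtext.html");
        }
    }
}
```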
We can use the PDFTron SDK to extract images from PDF files, along with their positioning information and DPI. Instead of converting PDF images to a Bitmap, you can also extract uncompressed/compressed image data directly using elements.GetImageData() (described in the PDF Data Extraction code sample). Learn more about PDFTron's C# PDF Library and the PDF Parsing and Content Extraction Library.
IronOCR has an impressive number of features that will allow you to read QR codes and barcodes directly from scanned documents. The code snippet below demonstrates how you can scan the barcode from a given image or document.
// Create an IronTesseract object
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.ReadBarCodes = true;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
// Add an image with a barcode
Input.AddImage("barcode.gif");
// Read the image
var Result = Ocr.Read(Input);
// Iterate over all barcodes found and display their values
foreach (var Barcode in Result.Barcodes)
{
Console.WriteLine(Barcode.Value);
}
}
' Create an IronTesseract object
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.ReadBarCodes = True
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
' Add an image with a barcode
Input.AddImage("barcode.gif")
' Read the image
Dim Result = Ocr.Read(Input)
' Iterate over all barcodes found and display their values
For Each Barcode In Result.Barcodes
Console.WriteLine(Barcode.Value)
Next Barcode
End Using
The above code reads barcodes from a given image or PDF document. Multiple barcodes can be read at the same time from a single image or page. Barcode reading is enabled through a dedicated setting, Ocr.Configuration.ReadBarCodes.
After the input is scanned, the data is stored in an OcrResult object. Its Barcodes property holds a list of all the barcodes found, and we can obtain the details of each one with a for-each loop. Scanning the document and reading the barcode values are completed in a single process.
Support for threading options is also available, and multiple OCR processes can be completed at the same time. Additionally, IronOCR can recognize a precise area from a specified region.
// Create an IronTesseract object
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
// Define the content area to be scanned
var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
// Add image specifying the content area
Input.Add("document.png", ContentArea);
// Perform OCR operation
var Result = Ocr.Read(Input);
// Print the text
Console.WriteLine(Result.Text);
}
' Create an IronTesseract object
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
' Define the content area to be scanned
Dim ContentArea = New System.Drawing.Rectangle() With {
.X = 215,
.Y = 1250,
.Height = 280,
.Width = 1335
}
' Add image specifying the content area
Input.Add("document.png", ContentArea)
' Perform OCR operation
Dim Result = Ocr.Read(Input)
' Print the text
Console.WriteLine(Result.Text)
End Using
The code snippet above demonstrates how to perform OCR on a distinct region. You are only required to specify the rectangular region in the PDF/image as IronOCR's Tesseract engine will aid in recognizing the text.
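The threading support mentioned earlier can be approximated with the standard .NET `Parallel.ForEach` loop. This is only a sketch under stated assumptions: the `scans` folder and the `.png` filter are hypothetical, and a separate IronTesseract object is created per file as a conservative choice rather than relying on any particular thread-safety guarantee or IronOCR-specific threading option.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using IronOcr;

class ParallelOcrSketch
{
    static void Main()
    {
        // Hypothetical folder of scanned images to process.
        string[] files = Directory.GetFiles("scans", "*.png");

        // Process the files concurrently; each iteration uses its own objects.
        Parallel.ForEach(files, file =>
        {
            var ocr = new IronTesseract();

            using (var input = new OcrInput())
            {
                input.AddImage(file);
                var result = ocr.Read(input);

                // Write the text next to the image, e.g. scans/page1.txt
                File.WriteAllText(Path.ChangeExtension(file, ".txt"), result.Text);
            }
        });

        Console.WriteLine("Processed " + files.Length + " file(s).");
    }
}
```

Keeping every OCR object local to its iteration avoids shared state between threads, at the cost of constructing one engine per file.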
A 30-Day Money-Back Guarantee: Every purchased license comes with a 30-day money-back guarantee. If you wish to return the product within those 30 days, you will receive your money back.
Easy Integration: Integrating IronOCR into any project and environment is effortless; adding it as a NuGet package takes a single line. Alternatively, it can be downloaded directly from the web.
Perpetual Licensing: Each license purchased does not require renewal.
Free Support and Product Updates: Every license comes with support directly from the team behind the product and a year of free product updates. Extensions can be purchased at any time.
Immediate Licenses: Once payment is received, registered license keys are sent out immediately.
All licenses are perpetual and apply to development, staging, and production.
Lite License: This package allows a single software developer in an organization to utilize Iron Software in one location. Iron Software can be utilized in a single intranet application, web application, or desktop software program. Licenses are non-transferable, and they cannot be shared outside of an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not expressly granted under the Agreement, including OEM redistribution and utilizing the Iron Software as SaaS without purchasing additional coverage.
Pricing: Starts from $749 per year.
Professional License: This license allows a predetermined number of software developers in an organization, up to a maximum of ten, to utilize Iron Software in numerous locations. Iron Software can be used in as many websites, intranet applications, or desktop software applications as you like. Licenses are non-transferable, and they cannot be shared outside of an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not expressly granted under the Agreement, including OEM redistribution and utilizing the Iron Software as SaaS without purchasing additional coverage. This license covers up to a maximum of 10 projects.
Pricing: Starts from $999 per year.
Unlimited License: This license allows an unlimited number of software developers in an organization to utilize Iron Software in an unlimited number of locations. Iron Software can be used in as many intranet applications, desktop software applications, or websites as you like. Licenses are non-transferable, and they cannot be shared outside of an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not expressly granted under the Agreement, including OEM redistribution and utilizing the Iron Software as SaaS without purchasing additional coverage.
Pricing: Starts from $3,999 per year.
Royalty-Free Redistribution: This allows you to distribute the Iron Software as part of several differently packaged commercial products (without having to pay royalties), based on the number of projects covered by the base license. It also allows the deployment of Iron Software within SaaS software services, again based on the number of projects covered by the base license.
Pricing: Starts from $1,599 per year.
PDFTron custom licenses are tailored to match your application and business requirements. Pricing is dependent on your feature scope.
The IronOCR Lite license is a defined package that includes one developer with one year of support and costs around $749. The IronOCR Professional license, which includes a 10-developer package and one year of support, costs $999. PDFTron packages, by contrast, are undefined: to buy a package, you must contact the support center to get a quote.
The IronOCR Lite and Professional packages are also available with OEM or SaaS coverage and a 5-year support option. The Lite version, a one-developer package with 5-year support plus SaaS and OEM coverage, costs $2,897 with a customized support option. The Professional version, a 10-developer package with 5-year support plus SaaS and OEM coverage, costs $3,397. PDFTron's 10-developer package with one year of support, SaaS, and OEM service does not have a defined price.
In the .NET Framework context, IronOCR provides a Tesseract implementation that is straightforward to use and supports photos and PDF documents in numerous ways. It also provides several settings for improving Tesseract OCR performance. A diverse range of languages is supported, with the ability to use several languages in a single operation. Visit their website to learn more about Tesseract OCR.
PDFTron is a software application that uses different engines to recognize images and PDF documents. It also provides various settings to improve the performance of the OCR process and the choice to select multiple languages. PDFTron has limitations on the usage of page conversions. It also has various prices for different operating systems.
IronOCR is a competitive software product and can offer greater accuracy than competing brands. Similar products have at times failed to recognize low-quality images resulting in unknown characters. On the other hand, IronOCR not only provides accurate results but allows us to recognize barcode data and read the value of barcodes from images.
The IronOCR packages provide competitive licensing and support at a single price for all platforms. By comparison, PDFTron's OCR products are all custom-quoted, which tends to make them more expensive. Pricing differs between the two products: IronOCR starts at $749, while PDFTron's starting price is undefined because of its custom quoting. In conclusion, IronOCR provides a wider assortment of features for a lower price.
So, what are you waiting for? The free trial is open to all. Obtain the License here and start immediately!
OCR stands for 'Optical Character Recognition'. It is the process of converting paper documents or images into readable text, often for the purpose of converting scanned files and PDFs to their original text format.
There are two main types of OCR: static and dynamic. Static OCR scans the entire document at once, while dynamic OCR scans one line at a time and can handle more complex layouts such as tabular data.
PDFTron OCR features include making images searchable, converting PDFs to searchable formats, detecting important information from documents, and supporting up to 100 languages. It uses an open-source LSTM neural network from Tesseract.
IronOCR supports reading text, QR codes, and barcodes from PDFs and images, supports 127 languages, corrects low-quality scans, and provides structured data output. It is compatible with multiple operating systems and supports multithreading.
PDFTron OCR is installed manually by downloading a zip file, unzipping it, and configuring it with your project. It requires Visual Studio with .NET Desktop Development and .NET Framework 4.5.1+ tools.
IronOCR can be installed via Visual Studio using the NuGet Package Manager, the Visual Studio Command-Line, direct download from the NuGet website, or from the IronOCR website.
IronOCR offers perpetual licenses with options for a Lite License, Professional License, and Unlimited License. Each license includes perpetual use, free support, and product updates. Pricing starts from $749 for the Lite License.
PDFTron offers custom licenses tailored to application needs. Pricing depends on the feature scope and includes multi-domain pricing, favorable multi-year discounts, and options for OEM redistribution.
IronOCR offers competitive pricing and features, supporting a wide range of languages and formats with a focus on accuracy. PDFTron provides customizable options but tends to be more expensive due to its custom licensing.
IronOCR offers a free trial available to everyone. You can obtain a license and start using it immediately by visiting their licensing page.