A Comparison between IronOCR and Leadtools OCR

If you are looking for information about OCR, you are in the right place. This blog will discuss OCR and related software, and examine how they perform when applied to OCR-related tasks. Let's begin with the question: what is OCR?

Optical character recognition (OCR) uses an optical scanner and specialized software to identify and digitally encode written or printed text. A computer can read static photographs of text and turn them into editable, searchable data using OCR software.

OCR usually consists of three steps: opening or scanning a document in the OCR software, recognizing its text, and saving the result in the format of your choice.

Today, we discuss two OCR software packages and compare their pros and cons, as well as how to install and use their SDKs in C#. The software packages under discussion are:

  • LeadTools OCR

  • IronOCR

1. LeadTools OCR

LEADTOOLS OCR comes from the award-winning line of development toolkits developed and published by LEAD Technologies Inc. LEADTOOLS is a collection of comprehensive toolkits for integrating raster, document, medical, multimedia, and vector imagery into desktop, server, tablet, and mobile applications. File formats (150+), image compression, image processing, color conversion, color processing, image display, special effects, scanning/capture, common dialogs, printing, DICOM, PACS, OCR, barcodes, forms recognition, PDF, document clean-up, annotations and more are all supported by LEADTOOLS. Millions of lines of code are practically at the fingertips of application developers using a LEADTOOLS toolkit. LEADTOOLS is a toolset built to provide you with the most potent image technology available, no matter your programming needs.

LEADTOOLS is a comprehensive toolkit to integrate recognition, documentation, medical, imaging, and multimedia technology into desktop, server, tablet, and mobile systems, powered by unique artificial intelligence and machine-learning algorithms. Why not improve your apps by making good use of more than 30 years of image development knowledge and support for 150+ file types?

The LEADTOOLS OCR class library provides programming software for the quick and efficient incorporation of document optical character recognition (OCR) technology into software programs. Programmers can conduct character recognition on document pictures, and output recognized text to over 20 file formats using the LEADTOOLS OCR Class Library.

1.1. LeadTools OCR Features

The Lead toolkit provides an award-winning line of multimedia technologies for end-users and developers, and is able to perform all types of OCR functions to satisfy its broad range of clients.

1.1.1. Multi-thread

The Lead technology engine supports multi-threaded and server-based OCR operations.

1.1.2. Multiple OCR Documents

The LeadTools Document SDK allows users to create multiple OCR documents in their application. Each document contains its own list of pages.

1.1.3. Languages

The LeadTools line of technologies can recognize text in more than 40 different languages, and allows you to choose which language to use when recognizing OCR pages.

1.1.4. Dictionaries

LeadTools OCR gives developers access to the dictionaries for all supported languages. Moreover, more than one dictionary can be used in a single document.

1.1.5. Document Management System

Recognize a variety of documents, including facsimiles, photocopies, and documents with complex layouts.

1.1.6. Character Recognition

With improved accuracy and speed, the LEADTOOLS OCR Application can conduct Optical Character Recognition (OCR) on pictures, extract text from photos, and convert images to multiple document formats. To modify and share text from a picture, use OCR to extract it and then copy it.

1.1.7. Artificial Intelligence

Lead Technologies uses AI to improve recognition on documents of the same type.

1.1.8. Zone Recognition

The LeadTools Document SDK supports zone recognition, with the following capabilities:

  • Document pages can be shown with or without their zones.
  • Zones can be imported from, and exported to, files.
  • Recognize a page as a single zone.
  • Within each page, manually specify and identify several zones.
  • When constructing multi-layered zones and identifying regions such as tables, rulers, photos, and text, use automated area segmentation.
  • For each zone, provide various, specialized choices, such as OMR, MRZ, and MICR zones.

1.1.9. Cloud Services

LeadTools also provides a high-capacity, scalable Web API. Its user-friendly interface allows you to easily incorporate powerful OCR, barcodes, MICR, and document conversion into any program.

Note:

Download counts on NuGet's official site suggest that .NET developers choose IronOCR more often than LeadTools: LeadTools has 77.8 K downloads, whereas IronOCR has more than 320 K.

2. IronOCR

IronOCR is a C# software library that enables .NET platform programmers to detect and read text from images and PDF documents. It is a pure .NET OCR package built on a powerful Tesseract engine. IronOCR thrives when working with real-world images and flawed documents, such as photos or low-resolution scans with digital noise or defects. With little or no setup, Tesseract 5 (as well as 4 and 3) runs out of the box on Windows, macOS, Linux, Azure, AWS, Lambda, Mono, and Xamarin Mac. There are no native binaries to deal with. Both .NET Framework and .NET Core are supported.

IronOCR is presented as supporting more languages than any other OCR engine, which helps programmers extract text from images in many different scripts. IronOCR supports 125 international languages, but only English is installed as standard.

The service provided by the IronOCR toolkit is easy to integrate, easy to process, and more interactive than any other OCR engine. It offers solutions to .NET developers and allows them to control and connect with their documents digitally, as well as manipulate them however they see fit.

2.1. IronOCR features

IronOCR provides a unique set of features and functions to integrate, sign, export, read graphics and extract details from images, regardless of the technical background of users or the level of sophistication of the hardware being used.

2.1.1. Accuracy

The IronOCR SDK reports an accuracy rate of 99.8%, which is presented as significantly higher than that of other OCR libraries.

2.1.2. Fixing Low Quality Scans and Images

The IronOCR classes give C# developers granular control over OCR (images and PDF to text), so that performance can be finely tuned for each unique case.

With real-world inputs, a good balance between speed and accuracy can be reached by setting the right options. Clean Background Noise, Enhance Contrast, Enhance Resolution, Language, Strategy, Rotate And Straighten, Color Space, Detect White Text On Dark Backgrounds and Input Image Type are just some of the options available.
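As a rough illustration of how such options are applied, the sketch below runs two filters on a scan before reading it. It is not a definitive implementation: it assumes the IronOcr NuGet package is installed, it reuses the `Input.Add` and `Deskew` calls that appear in the examples later in this article, and it assumes that `OcrInput` also exposes a `DeNoise` filter. The file name is hypothetical, and method names may differ between IronOCR versions.

```csharp
using System;
using IronOcr;

// Minimal sketch under the assumptions stated above.
var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Hypothetical file name for a low-quality scan.
    input.Add(@"images\low-quality-scan.png");

    // Rotate and straighten the page.
    input.Deskew();

    // Clean background noise (assumed filter name).
    input.DeNoise();

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Each filter adds processing time, so it is usually worth enabling only the ones a given batch of scans actually needs.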

[Figure: before-and-after images of a low-quality scan being fixed]

2.1.3. Languages

IronOCR provides solutions in 125+ international languages to help developers all over the world.

2.1.4. OCR Text Extraction

Iron Tesseract can read a variety of image types and PDF files, which is not possible with the traditional free Tesseract engines. If scans are of poor quality, the OcrInput class lets you have the relevant defects repaired automatically.

2.1.5. Image Optimization Filters

The OcrInput class gives C# programmers granular control over input. Developers can preprocess image input for speed and accuracy, which removes the need to prepare photos for OCR using Photoshop batch scripts or ImageMagick.

2.1.6. OCR Region of an Image

IronOCR allows developers to select a specific area or region of an image and perform OCR on that region only, which can improve both speed and accuracy. Such a region is known as a ContentArea or CropArea.
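The idea can be sketched as follows. This is an illustration under assumptions rather than the library's definitive API: it assumes an `AddImage` overload on `OcrInput` that accepts a `System.Drawing.Rectangle` as the content area (newer IronOCR versions may use a different rectangle type or method name). The file name and pixel coordinates are hypothetical.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Region to read, in pixels. These coordinates are hypothetical.
    var contentArea = new System.Drawing.Rectangle(x: 215, y: 1250, width: 1335, height: 280);

    // Assumption: an overload that takes the image path plus a content area.
    input.AddImage(@"images\page1.png", contentArea);

    // Only the selected region is read.
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```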

2.1.7. OcrResult Class

IronOCR returns an advanced result object for each page it scans using Tesseract 3, 4, or 5. This contains location data, images, text, statistical confidence, alternative symbol choices, font names, font sizes, decoration, font weights, and a position for each of the following:

  • Pages
  • Paragraphs
  • Lines of Text
  • Words
  • Individual Characters
  • Barcodes
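A short sketch of walking that hierarchy is shown below. The property names (`Pages`, `Paragraphs`, `Lines`, `Words`, `PageNumber`, `WordCount`, `Text`, `Confidence`) reflect our understanding of the result object and should be treated as assumptions to check against the installed IronOCR version; "scan.pdf" is the same placeholder file used elsewhere in this article.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    input.AddPdf("scan.pdf"); // placeholder file name

    var result = ocr.Read(input);

    // Walk the result from pages down to individual words.
    foreach (var page in result.Pages)
    {
        Console.WriteLine($"Page {page.PageNumber}: {page.WordCount} words");

        foreach (var paragraph in page.Paragraphs)
        {
            foreach (var line in paragraph.Lines)
            {
                foreach (var word in line.Words)
                {
                    // Each word carries its text and a statistical confidence.
                    Console.WriteLine($"  {word.Text} (confidence {word.Confidence})");
                }
            }
        }
    }
}
```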

2.1.8. Multiple Languages for 1 Document

IronOCR enables developers to use more than one language for a single document. This feature is of great help for .NET service providers.
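A minimal sketch of reading a document that mixes two languages might look like this. It assumes the `Language` property and `AddSecondaryLanguage` method on `IronTesseract`, and it assumes that the language pack for the primary language has been installed alongside the main package, since only English ships as standard. The file name is hypothetical.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Primary language (requires the matching language pack to be installed).
ocr.Language = OcrLanguage.ChineseSimplified;

// Additional language to recognize in the same document.
ocr.AddSecondaryLanguage(OcrLanguage.English);

using (var input = new OcrInput())
{
    // Hypothetical file containing text in both languages.
    input.Add(@"images\multi-language.png");

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```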

Note:

IronOCR is part of an award-winning product line that provides text recognition as well as document-related conversion and manipulation.

3. Starting a new project in Visual Studio

Open Visual Studio, go to the File menu, and select New Project. Then select Console Application. In this article, we use a console application to run the examples.

Enter the project name and select the path in the appropriate text box. Next, click the create button, and then select the required .NET framework, as in the screenshot below:

Visual Studio will now generate the structure for the selected application. If you selected a console, Windows, or web application, it will open the Program.cs file, where you can enter the code and build/run the application.

Next, we can add the library to test the program.

4. Install the IronOCR Library

The IronOCR library can be downloaded and installed in four ways. These are:

  1. Using the Visual Studio NuGet Package Manager.
  2. Direct download from the NuGet webpage.
  3. Direct download from the IronOCR webpage.
  4. Using the Visual Studio Command-Line.

4.1. Using the Visual Studio NuGet Manager

You can integrate IronOCR in a C# project using the Visual Studio NuGet Package Manager.

  1. Open the Tools menu.
  2. Expand NuGet Package Manager.
  3. Click Manage NuGet Packages for Solution.

A new window will appear. In the search bar, type IronOCR.

By using this method, developers can install the IronOCR library and any language pack of their choice.

4.2. Direct download from the NuGet webpage

IronOCR can be directly downloaded from the NuGet website by following these instructions:

  1. Navigate to the link "https://www.nuget.org/packages/IronOCR/"
  2. Select the download package option from the menu on the right-hand side.
  3. Double-click the downloaded package. It will be installed automatically.
  4. Next, reload the solution and start using it in the project.

4.3. Direct download from the IronOCR webpage

Developers can download the IronOCR library directly from the IronOCR website by using this link. After downloading it, add it to the project:

  1. Right-click the project in the solution window.
  2. Select the reference option and browse to the location of the downloaded reference.
  3. Click OK to add the reference.

4.4. Using the Visual Studio Command-Line

  1. In Visual Studio, go to Tools -> NuGet Package Manager -> Package Manager Console.
  2. Enter the following line in the Package Manager Console tab:

     Install-Package IronOCR

The package will now download/install in the current project and be ready to use.

5. Install the LeadTools OCR

Developers can download the LeadTools OCR SDK in three different ways as shown below. We will discuss them all.

  1. Using the Visual Studio NuGet Package Manager.
  2. Using the NuGet Website.
  3. Downloading from the LeadTools Website.

5.1. Using the Visual Studio NuGet Manager

You can install LeadTools OCR in a C# project using the Visual Studio NuGet Package Manager:

  1. Open the Tools menu.
  2. Expand NuGet Package Manager.
  3. Click Manage NuGet Packages for Solution.

A new window will appear. In the search bar, type LeadTools OCR.

By following these steps, developers can install the LeadTools OCR library and any language pack of their choice.

5.2. Using the NuGet Website

LeadTools OCR can be downloaded directly from the NuGet website by following these instructions:

  1. Navigate to the link "https://www.nuget.org/packages/Leadtools.Ocr/"
  2. Select the download package option from the menu on the right-hand side.
  3. Double-click the downloaded package. It will be installed automatically.
  4. Next, reload the solution and start using it in the project.

5.3. Download from the LeadTools Website

Developers can download the LeadTools Document SDK directly from the LeadTools website. Simply go to the website and download one of the packs containing the OCR library.

6. Multi-thread OCR

Both libraries under discussion support multi-threaded OCR. Under this heading we look at how each one exposes it.

6.1. The LeadTools Multi-Thread OCR

LeadTools supports running more than one instance of OCR at a time, depending on the number of physical cores in each system. This feature saves a lot of time for .NET developers.

// ocrEngineInstance is assumed to be an IOcrEngine that has already been started
// Create an instance of an OCR document from the engine
IOcrDocument ocrDocument = ocrEngineInstance.DocumentManager.CreateDocument();
// Add the pages, recognize them, and save them
// to the final document:
ocrDocument.Pages.AddPages(imageFileName, null);
ocrDocument.Recognize(null);
ocrDocument.Save(documentFileName, DocumentFormat.Pdf, null);

The equivalent VB.NET code:

' Create an instance of an OCR document from the engine
Dim ocrDocument As IOcrDocument = ocrEngineInstance.DocumentManager.CreateDocument()
' Add page, zone them, recognize them and save them
' to the final document:
ocrDocument.Pages.AddPages(imageFileName, Nothing)
ocrDocument.Recognize(Nothing)
ocrDocument.Save(documentFileName, DocumentFormat.Pdf, Nothing)

6.2. The IronOCR Multi-Thread OCR

Using IronOCR's multi-threading is easy and saves developers time. Iron Tesseract automatically attempts to use all threads available on all cores, while keeping the main/GUI thread responsive.

using IronOcr;

var Ocr = new IronTesseract();

using (var Input = new OcrInput())
{
    Input.AddPdf("scan.pdf");

    // Image processing is automatically multi-threaded
    Input.Deskew();

    // OCR reading is automatically multi-threaded too
    var Result = Ocr.Read(Input);
}

The equivalent VB.NET code:

Imports IronOcr

Dim Ocr = New IronTesseract()

Using Input = New OcrInput()
    Input.AddPdf("scan.pdf")

    ' Image processing is automatically multi-threaded
    Input.Deskew()

    ' OCR reading is automatically multi-threaded too
    Dim Result = Ocr.Read(Input)
End Using

7. Create Searchable PDFs

This section shows how to create searchable PDFs using both the IronOCR SDK and the LeadTools OCR SDK.

7.1. Create Searchable PDFs with IronOCR

IronOCR creates searchable PDFs by detecting text characters in images and turning them into searchable PDF text. A code example is below:

using IronOcr;

var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    Input.Add(@"images\page1.png");
    Input.Add(@"images\page2.bmp");
    Input.Add(@"images\page3.tiff");

    // Straighten the pages before reading
    Input.Deskew();

    var Result = Ocr.Read(Input);
    // Write the recognized text into a searchable PDF
    Result.SaveAsSearchablePdf("searchable.pdf");
}

The equivalent VB.NET code:

Imports IronOcr

Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
    Input.Add("images\page1.png")
    Input.Add("images\page2.bmp")
    Input.Add("images\page3.tiff")

    Input.Deskew()

    Dim Result = Ocr.Read(Input)
    Result.SaveAsSearchablePdf("searchable.pdf")
End Using

7.2. Create Searchable PDFs with LeadTools OCR

LeadTools can also create searchable PDFs. However, the code is a little more complicated than that used for IronOCR, partly because this example is a WinForms event handler that includes a save dialog.

private void saveAsSearchablePDFToolStripMenuItem_Click(object sender, EventArgs e) 
{ 
   try 
   { 
      // Create a document   
      using (IOcrDocument ocrDocument = _ocrEngine.DocumentManager.CreateDocument(null, OcrCreateDocumentOptions.AutoDeleteFile)) 
      { 
         // Create IOcrPage from loaded image 
         _ocrPage = _ocrEngine.CreatePage(_viewer.Image, OcrImageSharingMode.AutoDispose); 
         // Recognize Text 
         _ocrPage.Recognize(null); 
         // Add the page  
         ocrDocument.Pages.Add(_ocrPage); 
         // Save page as documentation 
         SaveFileDialog saveDlg = new SaveFileDialog(); 
         saveDlg.InitialDirectory = @"C:\LEADTOOLS22\Resources\Images"; 
         saveDlg.Filter = "Adobe Portable Document Format|*.pdf"; 
         if (saveDlg.ShowDialog(this) != DialogResult.OK) 
            return; 
         ocrDocument.Save(saveDlg.FileName, DocumentFormat.Pdf, null); 
         MessageBox.Show($"OCR output saved to {saveDlg.FileName}"); 
      } 
   } 
   catch (Exception ex) 
   { 
      MessageBox.Show(ex.ToString()); 
   } 
} 

The equivalent VB.NET code:

Private Sub saveAsSearchablePDFToolStripMenuItem_Click(ByVal sender As Object, ByVal e As EventArgs)
   Try
      ' Create a document
      Using ocrDocument As IOcrDocument = _ocrEngine.DocumentManager.CreateDocument(Nothing, OcrCreateDocumentOptions.AutoDeleteFile)
         ' Create IOcrPage from loaded image
         _ocrPage = _ocrEngine.CreatePage(_viewer.Image, OcrImageSharingMode.AutoDispose)
         ' Recognize Text
         _ocrPage.Recognize(Nothing)
         ' Add the page
         ocrDocument.Pages.Add(_ocrPage)
         ' Save page as documentation
         Dim saveDlg As New SaveFileDialog()
         saveDlg.InitialDirectory = "C:\LEADTOOLS22\Resources\Images"
         saveDlg.Filter = "Adobe Portable Document Format|*.pdf"
         If saveDlg.ShowDialog(Me) <> DialogResult.OK Then
            Return
         End If
         ocrDocument.Save(saveDlg.FileName, DocumentFormat.Pdf, Nothing)
         MessageBox.Show($"OCR output saved to {saveDlg.FileName}")
      End Using
   Catch ex As Exception
      MessageBox.Show(ex.ToString())
   End Try
End Sub

8. Compatibility

This section covers the platforms that each package supports. Both packages support many platforms and operating systems.

8.1. IronOCR Compatibility

The IronOCR .NET SDK is compatible with almost all the .NET platforms and operating systems that support the C# programming language. IronOCR also supports different image formats such as JPEG, TIFF and many more.

.NET Languages:

  • C#
  • VB.NET
  • F#

Platforms:

  • .NET 5
  • .NET Core 2x & 3x
  • .NET Standard 2
  • .NET Framework 4x

App Types:

  • Console, Web, & Desktop

OS:

  • Microsoft Windows
  • Linux (Debian, CentOS, Ubuntu)
  • macOS
  • Docker (Windows, Linux, Azure)
  • Azure (VPS, Webapps, Websites, Functions)

IDEs:

  • Microsoft Visual Studio
  • JetBrains ReSharper & Rider

8.2. LeadTools Compatibility

Lead Technologies supports the integration of its products and apps across different platforms, and provides SDK support for its users and developers.

Operating Systems for Deployment

  • Windows
  • macOS 10.10
  • iOS 8.0
  • Android 4.0 +

Component Type

  • .NET
    • C#, VB, C++/CLI, XAML
    • WinForms, WPF, Web Forms
  • Web Services
  • iOS & macOS
  • Android
  • Linux

Compatible Containers

  • Microsoft Visual Studio
  • .NET Framework 4.5
  • .NET Framework 4.0
  • .NET Framework 3.5
  • .NET Framework 3.0
  • .NET Framework 2.0

9. Licensing

A license is required to use either of the software packages discussed above. This section outlines how each one is licensed and priced.

9.1. LeadTools Licensing

LeadTools provides two key licensing components in the SDK license:

  • A "Development License" permits a programmer to use the SDK for development.
  • A "Deployment License" allows the customer to deploy or distribute the end-user application created using the SDK redistributable files that contain LeadTools' intellectual property.

Development License

To develop with LEADTOOLS, you'll need a Development License. The Development License can be purchased directly from LEAD or through a LEAD-authorized reseller or distributor.

The Development License enables a customer to install the SDK on a development machine, and use it to create an end-user application by including specific redistributable libraries and files into the application using the SDK sample code and documentation.

Deployment License

The customer's use of the SDK-developed end-user application ("End User Software") is governed by the Deployment License.

Unlike a standard end-user application license agreement, which prohibits any copying of the application, an SDK license allows the user to copy and redistribute a portion of the SDK. In order to reproduce LEAD's intellectual property and deploy it with end-user software produced using the LEAD SDK, LEAD's clients must obtain the necessary deployment license.

9.1.2. Pricing

LeadTools does not provide free licenses for its developers. Instead, it provides comprehensive developer-based licensing. To see the Lead Technologies OCR SDK price structure, visit here.

9.2. IronOCR Licensing

IronOCR provides a free license for development. IronOCR also has a distinct pricing structure; the Lite bundle starts at $499 with no hidden fees. The redistribution of SaaS and OEM products is also possible. All licenses come with a 30-day money-back guarantee, a year of software support and upgrades, dev/staging/production validity, and a perpetual license (one-time purchase). To see IronOCR's entire price structure and licensing details, go here.

Royalty-free redistribution for SaaS and OEM products is available for a one-time purchase of $1599.

10. Summary and Conclusion

10.1. Summary

IronOCR is a .NET SDK library that uses a powerful Tesseract-based engine called Iron Tesseract. It supports a total of 125+ international languages. IronOCR offers many imaging features, such as text extraction from images, fixing low-resolution images, and performing OCR on a specific region of an image. IronOCR focuses on providing speed with accuracy, and its stated accuracy rate of 99.8% is higher than that of any other Tesseract-based OCR out there. IronOCR works out of the box with no need to tune performance or heavily modify input images. On top of all of that, you can get all five of the Iron Software products for the price of just two. Click here to see the webpage. The five products are:

  1. IronPDF
  2. IronOCR
  3. IronXL
  4. Iron Barcode
  5. IronWebScraper

LeadTools OCR is a toolkit from LeadTools that provides most recognition features quickly and efficiently. Programmers can conduct character recognition on document pictures, and output recognized text to over 20 file formats using the LEADTOOLS OCR class library. Its library can be integrated with most programming languages and nearly all available platforms. Its features include:

  • Fast and Accurate OCR with multithreaded support
  • Broad OCR language character-set support, including Latin, Cyrillic, East Asian and Arabic
  • Save OCR results to over 40 output formats including searchable PDF, PDF/A, Word and XML
  • Full-page and zonal OCR
  • Built-in and custom spelling dictionaries to improve OCR results
  • Powerful document image cleanup and preprocessing functions to improve OCR results from scanned images

10.2. Conclusion

IronOCR and LeadTools OCR are both top-of-the-line tools and provide the features that a C# or .NET developer is likely to need. IronOCR is easier to use and code than its competitor. Neither package incurs ongoing costs, but IronOCR is a lot more price-efficient than the LeadTools OCR library. IronOCR reports higher accuracy than its competitors, and it supports 125+ international languages, whereas LeadTools supports 40+. Taking these aspects into account, we conclude that IronOCR holds significant advantages over LeadTools OCR.

You can download the software product from this link.