A Comparison Between IronOCR and Nanonets OCR

Optical Character Recognition (OCR) provides the ability to convert an image file into machine-encoded text. This is incredibly useful given that scanned documents are saved as image files, and the data in these image files cannot be searched, edited, or saved in text format using a normal text editor or even a word processing application. OCR processing helps to convert these images into machine-readable text for further processing by users.

In this modern age, documents shared over the internet are usually in digital format, mostly in the form of PDFs or images. There are numerous online resources available that convert images to text. However, most businesses require this functionality in their software applications. Bearing this in mind, there are many libraries that provide OCR processing technology to be embedded in software applications.

In this article, we are going to discuss two of the most popular OCR libraries for C#. These are:

  • IronOCR
  • Nanonets OCR API

Introduction

IronOCR - C# Library

IronOCR for .NET is a C# library that enables users to scan, search, and read images and PDFs. It takes an image or PDF file as input and uses a custom-built .NET OCR engine based on Tesseract 5 to output text, structured data, or searchable PDF documents. Tesseract is available in 125+ languages, and IronOCR offers cross-platform support across .NET Standard and .NET Core, from version 2.0 up to .NET 7.

IronOCR is a user-friendly API that allows C# developers to automatically convert images to text using the IronTesseract class. The library prioritizes speed, accuracy, and ease of use.

Another powerful feature of IronOCR is its ability to scan barcodes and QR codes from all image files and read their text. Additional important features of IronOCR are listed below.

Features

  • International Languages: 127+ languages, plus support for custom languages.
  • Text and Barcode Reading: Read text and numbers from multiple languages at once.
  • Specialist Documents: Read text from specialty documents such as receipts, checks, and invoices.
  • Read from Many Formats: Read from images (PNG, JPG, GIF, TIFF, BMP), System.Drawing Objects, streams, PDF documents, and more.
  • Preprocessing Features: Includes preprocessing features such as the Filter Wizard, image correction, orientation correction, and color correction.
  • Simple Data Output: Outputs in .NET Text Strings, Barcode, QR, and Image format.

Now, let's have a look at Nanonets OCR API.

Nanonets OCR API

Nanonets OCR API is a REST API that provides real-time data extraction tailored to your business needs for automated workflows. The OCR API is AI-powered and can securely capture, categorize, and extract data from unstructured documents within seconds. With Nanonets, you can automate manual data entry, reducing the manual effort required.

Nanonets understands documents using machine learning, even those that do not follow a standard template. You can upload any unstructured document and capture only the desired information based on different fields. Unlike traditional OCR, the Nanonets OCR model can be trained for better results. As your business grows, the Nanonets intelligent document processing OCR model also grows and learns with every new document, providing fast and accurate results.

Additionally, Nanonets provides a Python package that enables easy integration and data capture in Python applications without requiring API requests. Other features include:

  • GDPR compliance
  • Automated data capture
  • Validation of extracted data
  • Model training and retraining capabilities
  • Fast API response time
  • On-premises deployment available in multiple languages
  • Continuous AI learning model
  • No template setup required
  • Multiple export options available

The rest of the article is organized as follows:

  1. Creating the Visual Studio Project
  2. Installing IronOCR
  3. Installing Nanonets OCR
  4. Images to Text
  5. Barcode and QR code to Text
  6. PDF to Text
  7. Licensing
  8. Conclusion

1. Creating the Visual Studio Project

In this tutorial, we are going to use the latest version of Visual Studio 2022. If you do not already have it downloaded and installed, you can do so from the Visual Studio website.

Now, we need to create a console project to get started with both libraries. Follow these steps to create a project:

  • Open your Visual Studio 2022.
  • Click on Create a new Project.

    A Comparison Between IronOCR and Nanonets OCR: Figure 1

  • Select C# Console Application from the given options.

    A Comparison Between IronOCR and Nanonets OCR: Figure 2

  • Configure your new project with a name and location. For example, call it "OCRProject".
  • Click Next.

    A Comparison Between IronOCR and Nanonets OCR: Figure 3

  • Under Additional information, select .NET 6.0 as the target framework, as it is a long-term support (LTS) release.

    A Comparison Between IronOCR and Nanonets OCR: Figure 4

  • Now, click Create and the project will be created in your specified location.

Next, we will install the libraries in our project for comparison.

2. Installing IronOCR

There are multiple ways to install the IronOCR library. Let's have a look at them one by one.

2.1. Using Visual Studio NuGet Package Manager

NuGet is the package manager for downloading and installing dependencies in your project. Its packages contain compiled code (DLLs) and a manifest file. Access it using the following method:

  • Click the Tools menu.
  • Expand the NuGet Package Manager option.
  • Click Manage NuGet Packages for Solution.

    A Comparison Between IronOCR and Nanonets OCR: Figure 5

Alternatively:

  • Right-click the project in Solution Explorer.
  • Click Manage NuGet Packages.

    A Comparison Between IronOCR and Nanonets OCR: Figure 6

Now, the NuGet Package Manager window will open. Browse for IronOCR and click Install.

A Comparison Between IronOCR and Nanonets OCR: Figure 7

2.2. Download from NuGet Website

IronOCR can also be downloaded directly from the official NuGet website. Follow these steps:

  1. Open the IronOcr package page on the NuGet website.
  2. Click the "download package" option on the right-hand side of the page.
  3. Open the downloaded package and it will start installing.
  4. Finally, reload the solution and it's done.

2.3. Download using the IronOCR Webpage

Simply visit the Iron Software website and navigate to the IronOCR for .NET webpage. Scroll to the bottom and click Download DLL or Download Windows installer.

A Comparison Between IronOCR and Nanonets OCR: Figure 8

A zip file will be downloaded. Extract the project file or run the Windows installer. Follow the steps below to add it to your project.

  1. In Solution Explorer, right-click the Dependencies node of the project.
  2. Then, select the option Add Project Reference.
  3. Browse to the location of the downloaded DLL file.
  4. Finally, click on OK to add a project reference.

2.4. Using the Package Manager Console in Visual Studio

  1. Navigate to the Tools menu in Visual Studio.
  2. Expand the NuGet Package Manager option.
  3. Select Package Manager Console and type the following command:

PM> Install-Package IronOcr

This will automatically download and install IronOCR in your project.

Now, we are ready to use IronOCR in our project.

2.5. Adding the Necessary IronOCR Namespaces

Only one namespace is required. Add it at the top of each source code file that needs to access IronOCR's functions.

using IronOcr;

Now, let's install Nanonets OCR API.

3. Installing Nanonets OCR

Nanonets can be used in multiple ways to capture data. It provides an online OCR facility that can be used for instant data extraction, reducing turnaround times. Because it is a REST API, it can be called from many programming languages. Here, we will demonstrate how to integrate it into a C# application.

To automate data capture using the Nanonets OCR API in C#, you will need the following:

  1. Sign up for Nanonets - You can sign up for a free trial using either your Gmail account or an email address registered with Nanonets.
  2. Create an OCR Model - This will generate a model ID that will be used later when making API calls.
  3. Get a free API key - Go to the Accounts Info tab and click on API Keys. Here, you can add new keys or use an existing one.

3.1. Adding RestSharp Namespace

RestSharp is a simple REST and HTTP client library for .NET. It is used to send API requests and handle the responses. It is needed here because Nanonets is exposed as a REST API, and the sample code in this article calls it through RestSharp.

To install RestSharp, open the NuGet Package Manager for your solution, browse for RestSharp, and install it. Alternatively, you can open the Package Manager Console and type the following command:

PM> Install-Package RestSharp

Now, everything is set up and ready to use.

4. Images to Text

Reading data from images can be quite a tedious task. Image resolution and quality play an important role when extracting content. Both IronOCR and Nanonets provide optical character recognition functionality to extract text from images.

4.1. Using IronOCR

IronOCR makes it very easy for developers to read the contents of an image file with its powerful IronTesseract class. We will use the following code to read text from a PNG image file:

using IronOcr;

// Create the OCR engine and read the text from a single PNG image.
var ocr = new IronTesseract();
using (var input = new OcrInput())
{
    input.AddImage("test-files/employmentapp.png");
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

Input Image

A Comparison Between IronOCR and Nanonets OCR: Figure 9

Output

A Comparison Between IronOCR and Nanonets OCR: Figure 10

The output of IronOCR matches the original image given to it. The code is clean and easy to understand without any technicalities.

4.2. Using Nanonets OCR

Nanonets also provides the facility to extract text from images. To do this, an API call is made with the authentication key, and then the image is uploaded to the Nanonets server. The fast OCR tool will then return the extracted text as a response to the application. Here is an example of the code:

using System.Text;
using RestSharp;

// Written against RestSharp 107 or later (assumption based on the Method.Post naming).
var client = new RestClient("https://app.nanonets.com/api/v2/OCR/FullText");
// The first argument is the resource path; the HTTP method is the second argument.
var request = new RestRequest("", Method.Post);
// HTTP Basic authentication: the API key is the user name and the password is empty.
request.AddHeader("Authorization", "Basic " + Convert.ToBase64String(Encoding.Default.GetBytes("REPLACE_YOUR_API_KEY:")));
request.AddFile("file", "FILE_PATH");
RestResponse response = client.Execute(request);
Console.WriteLine(response.Content);

A Comparison Between IronOCR and Nanonets OCR: Figure 11

The output is not perfect. The image contained structured data, only some of which is properly fetched. With another simple text image, the output was fine. Note that the model can be trained for more accurate results.
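
The Authorization header in the request above follows the HTTP Basic scheme: the API key acts as the user name, the password is left empty, and the resulting "key:" string is Base64-encoded. The following is a minimal sketch of that encoding as a standalone helper. BuildBasicAuthHeader is a hypothetical name introduced for illustration, not part of RestSharp or the Nanonets API, and UTF-8 is assumed in place of Encoding.Default.

```csharp
using System;
using System.Text;

// Hypothetical helper: builds the value of the Authorization header
// from a Nanonets API key (user name = key, password = empty).
static string BuildBasicAuthHeader(string apiKey)
{
    if (string.IsNullOrWhiteSpace(apiKey))
        throw new ArgumentException("API key must not be empty.", nameof(apiKey));

    // The trailing colon separates the user name from the (empty) password.
    byte[] credentials = Encoding.UTF8.GetBytes(apiKey + ":");
    return "Basic " + Convert.ToBase64String(credentials);
}

// Placeholder key, as in the article; replace it with your own.
Console.WriteLine(BuildBasicAuthHeader("REPLACE_YOUR_API_KEY"));
```

The returned string could be passed to request.AddHeader("Authorization", ...) in place of the inline expression, which keeps the key handling in one place.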

5. Barcode and QR code to Text

5.1. Using IronOCR

IronOCR can also detect and read barcodes and QR codes while reading an image. To enable this feature, set the ReadBarCodes configuration property to true before processing the image. Once the OCR processing is complete, iterate through the barcodes in the OCR result to get the value of each one. Below is an example code snippet for reading barcodes with IronOCR:

using IronOcr;

var ocr = new IronTesseract();
// Enable barcode and QR code detection before reading.
ocr.Configuration.ReadBarCodes = true;
using (var input = new OcrInput())
{
    input.AddImage("test-files/Barcode.png");
    var result = ocr.Read(input);
    // Print the decoded value of every barcode found in the image.
    foreach (var barcode in result.Barcodes)
    {
        Console.WriteLine(barcode.Value);
    }
}

INPUT IMAGE

OUTPUT

All three barcodes in the input image are read successfully, and their encoded values are displayed.

5.2. Using Nanonets OCR

Nanonets OCR API can detect QR codes. However, this functionality is only available in the Enterprise plan, and you will need to contact sales to use it. Additionally, Nanonets allows you to detect specific parts of documents or receipts. It also offers solutions for accounts payable, invoice processing, and accounting automation.

6. PDF to Text

6.1. Using IronOCR

Reading PDF files is just as simple as reading image files with IronOCR. The only change required is to use the AddPdf method instead of AddImage in the code for reading images. The code is as follows:

using IronOcr;

// Same flow as for images, but the input is loaded with AddPdf.
var ocr = new IronTesseract();
using (var input = new OcrInput())
{
    input.AddPdf("test-files/example.pdf");
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

The extracted text is in the same format as the PDF file.

6.2. Using Nanonets OCR

Reading data from PDF files is also available in the Nanonets OCR API. The code is almost identical to the image text detection code, except for the URL used in the request. Let's take a look at the code:

using System.Text;
using RestSharp;

// Written against RestSharp 107 or later (assumption based on the Method.Post naming).
var client = new RestClient("https://app.nanonets.com/api/v2/OCR/Model/{{model_id}}/LabelFile/?async=false");
// The first argument is the resource path; the HTTP method is the second argument.
var request = new RestRequest("", Method.Post);
request.AddHeader("authorization", "Basic " + Convert.ToBase64String(Encoding.Default.GetBytes("REPLACE_YOUR_API_KEY:")));
request.AddHeader("accept", "Multipart/form-data");
request.AddFile("file", "test-files/example.pdf");
RestResponse response = client.Execute(request);
Console.WriteLine(response.Content);

In the above code, replace the model_id with your OCR model ID. Also, replace the API key with your own API key. Then, replace the PDF file path with the path to your own file.
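
Rather than editing the URL string by hand, the model ID can be substituted programmatically. The following is a small sketch under that assumption. BuildLabelFileUrl is a hypothetical helper name, and the URL pattern is copied from the request above rather than taken from any additional Nanonets documentation.

```csharp
using System;

// Hypothetical helper: inserts the model ID into the LabelFile endpoint
// used in the request above.
static string BuildLabelFileUrl(string modelId)
{
    if (string.IsNullOrWhiteSpace(modelId))
        throw new ArgumentException("Model ID must not be empty.", nameof(modelId));

    // Escape the ID so that unexpected characters cannot alter the URL path.
    string escapedId = Uri.EscapeDataString(modelId);
    return $"https://app.nanonets.com/api/v2/OCR/Model/{escapedId}/LabelFile/?async=false";
}

// "YOUR_MODEL_ID" is a placeholder; use the ID generated when the model was created.
Console.WriteLine(BuildLabelFileUrl("YOUR_MODEL_ID"));
```

The result could be passed to the RestClient constructor in place of the hard-coded URL.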

The output is similar to that of IronOCR, but the Nanonets OCR output includes extra spaces and new lines.
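
If the extra spaces and new lines get in the way of further processing, the extracted text can be cleaned up after it has been pulled out of the response. The following is one possible sketch using only the .NET base class library. NormalizeWhitespace is a hypothetical helper, and the rules it applies (collapse runs of spaces, keep at most one blank line) are an assumption about what "clean" means here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: tidies whitespace in text returned by an OCR service.
static string NormalizeWhitespace(string text)
{
    // Use "\n" as the only line ending.
    string unified = text.Replace("\r\n", "\n").Replace('\r', '\n');
    // Collapse runs of spaces and tabs into a single space.
    string collapsed = Regex.Replace(unified, @"[ \t]+", " ");
    // Remove spaces immediately before or after a line break.
    string trimmedLines = Regex.Replace(collapsed, @" ?\n ?", "\n");
    // Keep at most one blank line between blocks of text.
    string compact = Regex.Replace(trimmedLines, @"\n{3,}", "\n\n");
    return compact.Trim();
}

string sample = "Name:   John \r\n\r\n\r\n\r\nAge:\t 42  ";
Console.WriteLine(NormalizeWhitespace(sample));
```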

7. Licensing

IronOCR is free for development purposes, but it must be licensed for commercial use. A free trial is available for evaluating the full feature set. The Lite package starts at $749 with a 30-day money-back guarantee. The first year of product support and updates is included; after that, support and updates cost $399 per year. All licenses are perpetual: a one-time purchase with no hidden charges. Royalty-free redistribution coverage for SaaS and OEM products is available as a $1999 one-time purchase. For more information on license packages and pricing plans, see the IronOCR licensing page.

Nanonets OCR API offers three different packages. You can sign up for the Starter package for free. The first 500 pages are free, after which each page costs $0.30, so you only pay for what you use. For more detailed pricing information, see the Nanonets pricing page.
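
To make the pay-per-use figures concrete, the sketch below estimates the Starter package cost for a given page count. It assumes the 500 free pages are a single allowance rather than a recurring monthly one, which the text above does not specify. EstimateStarterCost is a hypothetical helper, not part of any Nanonets SDK.

```csharp
using System;

// Hypothetical helper: usage cost in US dollars for the Starter package,
// using the figures quoted above (500 free pages, then $0.30 per page).
static decimal EstimateStarterCost(int pages, int freePages = 500, decimal pricePerPage = 0.30m)
{
    if (pages < 0)
        throw new ArgumentOutOfRangeException(nameof(pages), "Page count cannot be negative.");

    // Only pages beyond the free allowance are billed.
    int billablePages = Math.Max(0, pages - freePages);
    return billablePages * pricePerPage;
}

// 2,000 pages means 1,500 billable pages at $0.30 each.
Console.WriteLine(EstimateStarterCost(2000));
```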

8. Conclusion

IronOCR provides C# developers with the advanced Tesseract API available on most platforms. It can be deployed on Windows, Linux, Mac, Azure, AWS, and Lambda, and supports .NET Framework projects as well as .NET Standard and .NET Core. IronOCR also enables reading barcodes in OCR scans, and even exporting OCR as HTML and searchable PDFs. For more information on C# Tesseract OCR, click here.

Nanonets OCR API offers a variety of OCR tools. It provides ready-to-use OCR solutions for multiple document types, such as invoices, receipts, bills, forms, and ID cards, to automate data capture. No template setup is required, there are no hidden charges, and Nanonets advertises a 90% time saving and a 10x productivity gain.

IronOCR licenses are developer-based, so you purchase a license according to the number of developers who will use the product. Nanonets pricing plans are based on the number of images or PDF pages processed. The Pro and Enterprise plans are billed monthly per model, so, unlike an IronOCR license, the cost rises as the number of models and pages grows. Moreover, IronOCR licenses are a one-time purchase that can be used for a lifetime, and they support OEM and SaaS distribution.
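
As a rough illustration of how the two pricing models relate, the sketch below computes the page count at which usage-based charges reach the price of a one-time license. It uses only the Starter figures quoted in the Licensing section (500 free pages, $0.30 per page) and the $749 Lite license price. It ignores the Pro and Enterprise plans, support renewals, and the number of developers, so it is a simplification rather than a full cost comparison. BreakEvenPages is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: smallest page count at which pay-per-page charges
// reach or exceed a one-time license price.
static int BreakEvenPages(decimal licensePrice, int freePages = 500, decimal pricePerPage = 0.30m)
{
    if (licensePrice < 0 || pricePerPage <= 0)
        throw new ArgumentOutOfRangeException(nameof(licensePrice), "Prices must be positive.");

    // Billable pages needed for the usage cost to reach the license price.
    int billablePages = (int)Math.Ceiling(licensePrice / pricePerPage);
    return freePages + billablePages;
}

// Lite license price quoted in the Licensing section.
Console.WriteLine(BreakEvenPages(749m));
```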

Overall, both APIs provide AI- and ML-based OCR functionality. IronOCR has a slight advantage over Nanonets because it can be used offline and provides more reliable results even for unstructured documents. IronOCR can use custom-trained data, with fast integration, for more accurate results. Nanonets OCR lets you train the model on key fields, but those fields can be difficult to detect if the model is not trained properly. Moreover, IronOCR provides multilingual support for 127+ international languages.

Now you can get five Iron products for the price of two as part of the complete Iron Suite. Visit this link to explore more.

IronOCR also provides a free trial with a money-back guarantee. You can download IronOCR from this link.