A Comparison Between IronOCR and ABBYY FineReader

In this article, we compare two common tools for running OCR on PDF documents and images. These are:

  • ABBYY FineReader PDF Software
  • IronOCR

1. Introduction

1.1 ABBYY FineReader PDF — Introduction and Features

ABBYY FineReader PDF is an optical character recognition (OCR) application created by ABBYY. It converts image documents (pictures, scans, PDF files) and screen captures into editable file formats such as Microsoft Word, Microsoft Excel, Microsoft PowerPoint, Rich Text Format, HTML, PDF/A, searchable PDF, CSV, and plain text.

ABBYY FineReader is a desktop application available for Windows, Linux, and macOS. It can also create editable formats from PDF files, and it can be used to read PDFs much as we would with Adobe Acrobat. ABBYY FineReader integrates scanned documents into digital workflows.

Manage and complete documents in a simple and efficient manner to save time and effort. Work with any document in the same methodical way, whether it was created digitally or converted from paper. You can alter the text, tables, and full layout of your PDF without having to convert it first.

ABBYY FineReader PDF can create PDFs from more than 25 different file formats, straight from paper documents or by printing to a PDF printer from practically any application. PDF/A-1 to PDF/A-3 are supported for long-term archiving, and PDF/UA ensures that content is accessible when using assistive software such as screen readers. It also empowers professionals to maximize efficiency in the digital workplace.

Create and update your own interactive PDF forms using ABBYY FineReader to successfully collect information and standardize documents. Create forms by combining interactive fields of various types, setting actions, editing existing PDF forms, or adding form elements to a conventional PDF.

ABBYY FineReader can instantly convert paper documents, scans, and scanned PDFs into searchable PDFs, allowing you to retrieve documents from digital archives and access the information they contain. FineReader PDF supports all compliance levels and variants of the PDF/A format, the industry standard for long-term archiving, from PDF/A-1 through PDF/A-3.

ABBYY’s latest AI-based OCR technology, FineReader PDF, makes it easier to digitize, retrieve, edit, protect, share, and collaborate on all kinds of documents in the same workflow. FineReader also includes document comparison, which helps us to compare original documents, as well as converted PDFs and image files.

1.2 IronOCR — Introduction and Features

IronOCR is a .NET library from Iron Software that lets engineers read text content from photos and PDFs in .NET apps and websites. It scans photos for text and barcodes, and supports numerous worldwide languages; it can then provide output as either plain text or structured data. The library can be used in MVC, Web, console, and desktop .NET applications. For commercial deployments, licensing is provided with direct assistance from the development team.

  • Using the latest Tesseract 5 engine, IronOCR reads text, barcodes, and QR codes from any picture or PDF format. This library quickly adds OCR to desktop, console, and web applications.
  • IronOCR supports 125 international languages. It also supports custom language and word lists.
  • IronOCR is able to read more than 20 barcode formats and QR codes.
  • IronOCR supports multi-page GIF and TIFF image formats.
  • IronOCR provides correction for low-quality scanned images.
  • IronOCR supports multithreading, so several OCR operations can run at the same time.
  • IronOCR can provide structured data output for pages, paragraphs, lines, words, characters, etc.
  • IronOCR supports a variety of operating systems such as Windows, Linux, macOS, etc.

2. Creating a New Project in Visual Studio

Open Visual Studio and go to the File menu. Select "New Project", then select "Console Application".

[Screenshot 1: Creating a New Project in Visual Studio]

Enter the project name and select the file path in the appropriate text box. Then, click the create button and select the required .NET Framework, as in the screenshot below.

[Screenshot 2: Creating a New Project in Visual Studio]

Visual Studio will now generate the structure for the selected application. If you have selected a console, Windows, or web application, it will open the Program.cs file, where you can enter the code and build and run the application.

[Screenshot 3: Creating a New Project in Visual Studio]

Next, we can add the library to test the code.

3. Install

3.1 Install ABBYY FineReader PDF

We can download the ABBYY FineReader here.

[Screenshot 4: Install ABBYY FineReader PDF]

The above image shows that there are two versions, Individual and Business, that you can download as per your requirements. Select the "download free trial" option. It will redirect you to a form as in the image below:

[Screenshot 5: Install ABBYY FineReader PDF]

We will need to fill out the form to get the EXE file location. Click the download option to download the file.

Once the file download is completed, we can double-click the EXE file to start the installation. Once completed, it will display a popup message, and it is now ready to use.

3.2 Install IronOCR

IronOCR Library can be downloaded and installed in four ways.

These are:

  • Using Visual Studio
  • Using the Visual Studio Command-Line
  • Direct download from the NuGet website
  • Direct download from the IronOCR website

3.2.1 Using Visual Studio

The Visual Studio software provides the NuGet Package Manager option to install the package directly to the solution. The below screenshot shows how to open the NuGet Package Manager.

[Screenshot 6: Using Visual Studio]

It provides a search box to show the list of packages from the NuGet website. In the package manager, we need to search for the keyword IronOCR, as in the screenshot below:

[Screenshot 7: Using Visual Studio]

From the above image, we will get the list of related search items. We need to select the required option to install the package to the solution.

3.2.2 Using the Visual Studio Command-Line

In Visual Studio, go to Tools -> NuGet Package Manager -> Package Manager Console

Enter the following line in the Package Manager Console tab:

Install-Package IronOcr

This command will download and install the package in the current project, and it will be ready to use.

3.2.3 Direct download from the NuGet website

The third way is to download the NuGet package directly from the website.

  • Navigate to the Link.
  • Select the download package option from the menu on the right-hand side.
  • Double-click the downloaded package. It will be installed automatically.
  • Next, reload the solution and start using it in the project.

3.2.4 Direct download from the IronOCR website

Click the link here to download the latest package direct from the website. Once downloaded, follow the steps below to add the package to the project.

  • Right-click the project from the solution window.
  • Then, select the "Add Reference" option and browse the location of the downloaded reference.
  • Next, click OK to add the reference.

4. OCR Image

Both IronOCR and ABBYY FineReader have OCR technology that will convert the image into text.

4.1 Using ABBYY FineReader PDF

Next, open the ABBYY FineReader PDF app which will open with multiple options, as in the image below.

[Screenshot 8: Using ABBYY FineReader PDF]

Next, select the option "Open" from the OCR Editor options. This will prompt an option to select image files:

[Screenshot 9: Using ABBYY FineReader PDF]

After selecting a file, it will automatically start scanning the image into editable text, and then show the result in the window as in the screenshot below:

[Screenshot 10: Using ABBYY FineReader PDF]

The above image shows the source image converted into editable text. However, the result is not entirely accurate: some of the numbers are not recognized by the ABBYY FineReader PDF app. This is visible in the comparison windows, where the source image is on the left and the OCR-converted text is on the right.

4.2 Using IronOCR

using System;
using IronOcr;

// Create an instance of IronTesseract for OCR operations
var Ocr = new IronTesseract();

// Configure OCR language and Tesseract version
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;

// Create a new OcrInput object to manage input images
using (var Input = new OcrInput())
{
    // Add an image to the input for processing
    Input.AddImage(@"3.png");

    // Perform OCR to read text from the image
    var Result = Ocr.Read(Input);

    // Output the extracted text to the console
    Console.WriteLine(Result.Text);
    Console.ReadKey();
}

VB.NET:

' Create an instance of IronTesseract for OCR operations
Dim Ocr = New IronTesseract()

' Configure OCR language and Tesseract version
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5

' Create a new OcrInput object to manage input images
Using Input = New OcrInput()
	' Add an image to the input for processing
	Input.AddImage("3.png")

	' Perform OCR to read text from the image
	Dim Result = Ocr.Read(Input)

	' Output the extracted text to the console
	Console.WriteLine(Result.Text)
	Console.ReadKey()
End Using

The code above demonstrates the Tesseract 5 API for converting image files into text. We create an instance of IronTesseract, then use an OcrInput object, which can hold one or more image files. Each image is added by passing its path to the OcrInput method AddImage, and any number of images can be added. The Read method of the IronTesseract object then performs OCR on the input and returns a result object whose Text property holds the extracted text as a string.

IronOCR can also read multi-frame images. A separate method, AddMultiFrameTiff, is used for this operation. The library reads each frame in the image and treats it as a distinct page, starting with the first frame and continuing until all of the image's frames have been scanned. This method supports only the TIFF image format.
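
As a minimal sketch of the multi-frame case, the snippet below follows the single-image example but uses AddMultiFrameTiff instead of AddImage. The file name multi-frame.tiff is hypothetical, and the overload that takes a single file path is an assumption that should be checked against the IronOCR version in use.

```csharp
using System;
using IronOcr;

// Create the OCR engine, as in the single-image example
var Ocr = new IronTesseract();

using (var Input = new OcrInput())
{
    // Hypothetical file name; each frame of the TIFF is treated as a page.
    // Assumption: AddMultiFrameTiff accepts a file path.
    Input.AddMultiFrameTiff("multi-frame.tiff");

    // Read all frames and print the combined text
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
```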

[Screenshot 11: Using IronOCR]

The above image is the output of the IronOCR result, which is accurate and shows the data correctly converted into editable text.

5. OCR PDF File

Both IronOCR and ABBYY FineReader PDF can convert a PDF file into editable text. ABBYY FineReader PDF offers the user options such as saving the page, editing the image, and recognizing the page, along with save formats such as plain text, document, and HTML. IronOCR also allows us to save converted OCR output as HTML, text, PDF, and other formats.

5.1 Using ABBYY FineReader PDF

Open the ABBYY FineReader PDF software. This will open a page like the image below, offering multiple options.

[Screenshot 12: Using ABBYY FineReader PDF]

Next, select the option "Open" from the OCR Editor options. This will prompt an option to select the image/PDF. We can select either a PDF or an image, or we can select both files.

[Screenshot 13: Using ABBYY FineReader PDF]

After selecting the file, click the OK button. It will automatically start scanning the image into editable text and show the result in a window like the screenshot below.

[Screenshot 14: Using ABBYY FineReader PDF]

The above image shows the source PDF converted into editable text. However, the result is not completely accurate. Some of the numbers are not recognized by the ABBYY FineReader PDF application. This is clearly shown in the comparison windows — on the left side is the source PDF, and on the right side is the OCR converted text.

5.2 Using IronOCR

We can also use OcrInput to manage PDF files. The IronTesseract class reads every page of the document and extracts its text. The AddPdf method adds a PDF to the input list and accepts a password as a second argument if the document is protected. The following code demonstrates how to open a password-protected PDF document:

using System;
using IronOcr;

// Create an instance of IronTesseract for OCR operations
var Ocr = new IronTesseract();

// Create OcrInput to manage input PDFs
using (var Input = new OcrInput())
{
    // Add a password-protected PDF to the input
    Input.AddPdf("example.pdf", "password");

    // Perform OCR to read text from the PDF
    var Result = Ocr.Read(Input);

    // Output the extracted text to the console
    Console.WriteLine(Result.Text);
}

VB.NET:

' Create an instance of IronTesseract for OCR operations
Dim Ocr = New IronTesseract()

' Create OcrInput to manage input PDFs
Using Input = New OcrInput()
	' Add a password-protected PDF to the input
	Input.AddPdf("example.pdf", "password")

	' Perform OCR to read text from the PDF
	Dim Result = Ocr.Read(Input)

	' Output the extracted text to the console
	Console.WriteLine(Result.Text)
End Using

The following methods are also provided by OcrInput:

  • AddPdfPage
  • AddPdfPages

AddPdfPage reads and extracts content from a single page of a PDF document; we only need to specify the number of the page we want. AddPdfPages extracts text from several pages, whose page numbers we pass as an IEnumerable<int>. In both cases we must also give the file location, including the file extension. This is demonstrated in the following code example:

using System;
using System.Collections.Generic;
using IronOcr;

// Define the page numbers to extract from the PDF
IEnumerable<int> numbers = new List<int> { 2, 8, 10 };

// Create an instance of IronTesseract for OCR operations
var Ocr = new IronTesseract();

// Create OcrInput to manage input PDFs
using (var Input = new OcrInput())
{
    // Alternative: add a single page from the PDF for OCR
    // Input.AddPdfPage("example.pdf", 10);

    // Add multiple specific pages from the PDF for OCR
    Input.AddPdfPages("example.pdf", numbers);

    // Perform OCR to read text from the specified pages
    var Result = Ocr.Read(Input);

    // Output the extracted text to the console
    Console.WriteLine(Result.Text);

    // Save the extracted text to a file
    Result.SaveAsTextFile("ocrtext.txt");
}

VB.NET:

' Define numbers representing pages to extract from the PDF
Dim numbers As IEnumerable(Of Integer) = New List(Of Integer) From {2, 8, 10}

' Create an instance of IronTesseract for OCR operations
Dim Ocr = New IronTesseract()

' Create OcrInput to manage input PDFs
Using Input = New OcrInput()
	' Alternative: add a single page from the PDF for OCR
	' Input.AddPdfPage("example.pdf", 10)

	' Add multiple specific pages from the PDF for OCR
	Input.AddPdfPages("example.pdf", numbers)

	' Perform OCR to read text from the specified pages
	Dim Result = Ocr.Read(Input)

	' Output the extracted text to the console
	Console.WriteLine(Result.Text)

	' Save the extracted text to a file
	Result.SaveAsTextFile("ocrtext.txt")
End Using

Using the SaveAsTextFile method, we can store the result as a text file at the given output path. We can also save the result as an hOCR file, which is HTML-based, using SaveAsHocrFile.
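
As a minimal sketch of the two save options, the snippet below reads the image from the earlier example and writes both outputs. The output file names are hypothetical, and passing a file path to SaveAsHocrFile is an assumption made by analogy with SaveAsTextFile.

```csharp
using System;
using IronOcr;

// Create the OCR engine
var Ocr = new IronTesseract();

using (var Input = new OcrInput())
{
    // Same sample image as in the image example
    Input.AddImage(@"3.png");
    var Result = Ocr.Read(Input);

    // Plain-text output, as in the example above
    Result.SaveAsTextFile("ocrtext.txt");

    // hOCR (HTML-based) output; hypothetical file name.
    // Assumption: SaveAsHocrFile accepts an output path.
    Result.SaveAsHocrFile("ocrtext.html");
}
```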

6. Other Features

6.1 Using ABBYY FineReader PDF

FineReader has some additional options such as: Draw Text Area, Draw Picture Area, Draw Table Area, Draw Recognize Area, etc. These help the user to improve the performance of the OCR. Further, in addition to performing OCR, the application also enables users to complete operations such as combining PDFs, splitting PDFs, editing PDFs, etc.

6.2 Using IronOCR

IronOCR has unique features which allow us to read barcodes and QR codes from scanned documents. The below code shows how we can read barcodes from a given image or document.

using System;
using IronOcr;

// Create an instance of IronTesseract for OCR operations
var Ocr = new IronTesseract();

// Configure OCR language and barcode reading
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.ReadBarCodes = true;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;

// Create OcrInput to manage input images
using (var Input = new OcrInput())
{
    // Add an image containing barcodes
    Input.AddImage("barcode.gif");

    // Perform OCR to read text and barcodes from the image
    var Result = Ocr.Read(Input);

    // Iterate through detected barcodes and output their values
    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
    }
}

VB.NET:

' Create an instance of IronTesseract for OCR operations
Dim Ocr = New IronTesseract()

' Configure OCR language and barcode reading
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.ReadBarCodes = True
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5

' Create OcrInput to manage input images
Using Input = New OcrInput()
	' Add an image containing barcodes
	Input.AddImage("barcode.gif")

	' Perform OCR to read text and barcodes from the image
	Dim Result = Ocr.Read(Input)

	' Iterate through detected barcodes and output their values
	For Each Barcode In Result.Barcodes
		Console.WriteLine(Barcode.Value)
	Next Barcode
End Using

The code above reads barcodes from a given image or PDF document, and it can read more than one barcode from a page or image. Barcode reading is controlled by the setting Ocr.Configuration.ReadBarCodes, whose default value is false.

After the input is read, the data is stored in an OcrResult object, which has a property called Barcodes that collects all the detected barcode data into a list. Using a foreach loop, we can get each barcode's details one by one. The same pass both locates each barcode and reads its value, so two operations are completed in one process.

Furthermore, threading is supported, meaning we can perform multiple OCR processes at the same time. IronOCR can also restrict recognition to a specified rectangular region of an image.

using System;
using IronOcr;

// Create an instance of IronTesseract for OCR operations
var Ocr = new IronTesseract();

// Create OcrInput to manage input images
using (var Input = new OcrInput())
{
    // Define a specific rectangular area on the image for OCR
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };

    // Add an image specifying the area to be processed
    Input.Add("document.png", ContentArea);

    // Perform OCR to read text from the specified area
    var Result = Ocr.Read(Input);

    // Output the extracted text to the console
    Console.WriteLine(Result.Text);
}

VB.NET:

' Create an instance of IronTesseract for OCR operations
Dim Ocr = New IronTesseract()

' Create OcrInput to manage input images
Using Input = New OcrInput()
	' Define a specific rectangular area on the image for OCR
	Dim ContentArea = New System.Drawing.Rectangle() With {
		.X = 215,
		.Y = 1250,
		.Height = 280,
		.Width = 1335
	}

	' Add an image specifying the area to be processed
	Input.Add("document.png", ContentArea)

	' Perform OCR to read text from the specified area
	Dim Result = Ocr.Read(Input)

	' Output the extracted text to the console
	Console.WriteLine(Result.Text)
End Using

The sample above performs OCR on a specific region. We only need to specify the rectangular region of the image or PDF, and the Tesseract engine in IronOCR recognizes the text inside it.

7. Conclusion

IronOCR makes Tesseract straightforward and easy to use in .NET applications. It supports photos and PDF documents in a variety of ways, and it provides a number of settings for improving the Tesseract OCR library's performance. Many languages are supported, including several languages in a single operation. To discover more about Tesseract OCR, visit their website.

ABBYY FineReader PDF is a software application that uses an artificial intelligence engine to recognize text in an image or PDF document. It also provides various settings to improve the performance of the OCR process, and it offers the option to select multiple languages. ABBYY FineReader PDF does have some limitations on the number of page conversions, and its price differs between operating systems. To learn more about ABBYY FineReader PDF pricing, click here.

In our testing, IronOCR performed well compared to ABBYY FineReader PDF. In the specific test cases presented in this comparison, FineReader failed to recognize some characters and numbers in low-quality images, while IronOCR gave more accurate results for those particular scenarios. IronOCR can also recognize barcodes and read their values from images. The IronOCR package comes with a lifetime license, with no ongoing costs, and it supports multiple platforms at a single price. To learn more about IronOCR pricing, click here.

Please note: ABBYY FineReader PDF Software is a registered trademark of its respective owner. This site is not affiliated with, endorsed by, or sponsored by ABBYY FineReader PDF Software. All product names, logos, and brands are property of their respective owners. Comparisons are for informational purposes only and reflect publicly available information at the time of writing.

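Frequently Asked Questions

What makes IronOCR a better alternative to ABBYY FineReader?

IronOCR is considered the better option because of its accurate OCR performance, its ability to handle low-quality images, and its broad feature set, such as reading barcodes and QR codes. It also offers a lifetime license with no ongoing costs, which makes it cost-effective.

How does IronOCR handle low-quality images?

IronOCR provides advanced image correction features that can enhance low-resolution or low-quality scanned images, which improves the accuracy of the OCR results.

Which platforms does IronOCR support?

IronOCR supports multiple platforms, including Windows, Linux, and macOS, under a single lifetime license.

Can IronOCR perform OCR on barcodes?

Yes. IronOCR can read barcodes in images when the library is configured to detect and extract barcode values as part of its OCR processing.

What are the benefits of IronOCR's multithreading feature?

Multithreading in IronOCR allows multiple OCR processes to run at the same time, which greatly improves performance and processing speed.

Which languages does IronOCR support?

IronOCR supports OCR in 125 different languages, making it a versatile tool for global applications.

How does IronOCR's licensing compare with ABBYY FineReader's?

IronOCR offers a lifetime license with no ongoing costs, whereas ABBYY FineReader's price may vary by operating system and may involve ongoing fees.

How do I integrate IronOCR into my C# project?

You can integrate IronOCR into your C# project using Visual Studio, the Visual Studio command line, or a download from the NuGet website.

Which file formats can be converted using IronOCR?

IronOCR can convert images and PDFs into several editable formats, including Microsoft Word, Excel, and searchable PDF.

Why is IronOCR preferred for reading QR codes?

IronOCR is preferred for reading QR codes because of its high accuracy and robust feature set, including the ability to handle a wide range of image formats and quality levels.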

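Kannapat Udonpant
Software Engineer
Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he can learn directly from the developer who writes most of the IronPDF code. Besides learning from his peers, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.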