Windows OCR Engine vs Tesseract: A Detailed Comparison

Kannapat Udonpant

April 3, 2024

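In today's digital era, optical character recognition (OCR) technology has become an essential part of many industries, turning images and scanned documents into editable, searchable text.

Among the many OCR tools available, such as Google Cloud Vision (Cloud Vision API), Adobe Acrobat Pro DC, and ABBYY FineReader, three stand out as prominent contenders for document analysis: Windows OCR Engine, Tesseract, and IronOCR. Each offers its own features and capabilities.

This article compares these three OCR engines on accuracy, performance, and ease of integration.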

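1. Introduction to OCR Engines

An OCR engine is a software tool that recognizes and extracts plain text from images, PDFs, and other scanned documents. It uses image-analysis algorithms and machine learning techniques to identify characters and convert them into machine-readable text. Windows OCR Engine, Tesseract, and IronOCR are three widely used OCR solutions, each with its own strengths and applications.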

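2. Windows OCR Engine

Windows OCR Engine is built into the Windows operating system and offers a convenient way to extract text from input images and scanned documents. It recognizes printed text in a range of languages and font styles. The engine is exposed through the Windows Runtime API (the Windows.Media.Ocr namespace), so it integrates directly with Windows applications, including console programs.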

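2.1 Key Features of Windows OCR Engine

Language support: Windows OCR Engine supports multiple languages, which makes it suitable for multilingual documents.
Image processing: It applies image processing algorithms to improve the accuracy of printed-text recognition, even on low-quality images.
Integration with Windows applications: Windows OCR Engine integrates directly with Windows applications, so developers can build OCR functionality into their software.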
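
Which languages the engine can actually recognize depends on the OCR language packs installed on the machine. The following is a minimal sketch, not part of the original article, of how an application might check this before running recognition. It assumes a Windows-specific target framework so that the Windows.Media.Ocr and Windows.Globalization namespaces are available; the class name OcrLanguageCheck and the "en-US" language tag are illustrative choices only.

```csharp
using System;
using Windows.Globalization;
using Windows.Media.Ocr;

class OcrLanguageCheck
{
    static void Main()
    {
        // List every language for which an OCR recognizer is installed
        Console.WriteLine("Installed OCR languages:");
        foreach (Language language in OcrEngine.AvailableRecognizerLanguages)
        {
            Console.WriteLine($"  {language.LanguageTag} ({language.DisplayName})");
        }

        // "en-US" is an illustrative choice, not something the article prescribes
        var requested = new Language("en-US");
        if (!OcrEngine.IsLanguageSupported(requested))
        {
            Console.WriteLine($"No OCR recognizer is installed for {requested.LanguageTag}.");
            return;
        }

        // TryCreateFromLanguage returns null if the engine cannot be created
        OcrEngine engine = OcrEngine.TryCreateFromLanguage(requested);
        Console.WriteLine(engine != null
            ? $"Engine ready for {engine.RecognizerLanguage.DisplayName}."
            : "The OCR engine could not be created.");
    }
}
```

Checking support up front lets an application report a missing language pack clearly instead of failing later during recognition.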

2.2 Code Example

// Requires a Windows-specific target framework (for example net6.0-windows10.0.19041.0)
// so that the Windows.* namespaces are available
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;
using Windows.Media.Ocr;

class Program
{
    static async Task Main(string[] args)
    {
        // Path to the image file to recognize
        string imagePath = "sample.png";
        try
        {
            // Extract the text from the image
            string extractedText = await ExtractText(imagePath);
            // Display the extracted text
            Console.WriteLine("Extracted Text:");
            Console.WriteLine(extractedText);
        }
        catch (Exception ex)
        {
            Console.WriteLine("An error occurred: " + ex.Message);
        }
    }

    public static async Task<string> ExtractText(string image)
    {
        // Collects the recognized text line by line
        StringBuilder text = new StringBuilder();
        // Open the image file stream
        using (var fileStream = File.OpenRead(image))
        {
            // Create a BitmapDecoder from the image file stream
            var bmpDecoder = await BitmapDecoder.CreateAsync(fileStream.AsRandomAccessStream());
            // Get the software bitmap from the decoder
            using (var softwareBmp = await bmpDecoder.GetSoftwareBitmapAsync())
            {
                // Create an OCR engine from the user profile languages;
                // this returns null when none of those languages supports OCR
                var ocrEngine = OcrEngine.TryCreateFromUserProfileLanguages();
                if (ocrEngine == null)
                {
                    throw new InvalidOperationException("No OCR-capable language is installed for the current user profile.");
                }
                // Recognize text from the software bitmap
                var ocrResult = await ocrEngine.RecognizeAsync(softwareBmp);
                // Append each line of recognized text to the StringBuilder
                foreach (var line in ocrResult.Lines)
                {
                    text.AppendLine(line.Text);
                }
            }
        }
        // Return the extracted text; exceptions propagate to the caller unchanged
        return text.ToString();
    }
}

2.2.1 Output

Figure 1 - Console output of the Windows OCR Engine code

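3. Tesseract

Tesseract is an open-source OCR engine, originally developed at Hewlett-Packard and later sponsored by Google, that has become widely popular for its accuracy and versatility. It supports more than 100 languages and handles a variety of image formats, including TIFF, JPEG, and PNG. Since version 4, Tesseract has used an LSTM-based neural network for text recognition, which gives it high accuracy across a wide range of applications.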

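3.1 Key Features of Tesseract

Language support: The Tesseract engine supports more than 100 languages, including complex scripts such as Arabic and Chinese.
Image preprocessing: It offers image preprocessing capabilities, including deskewing, binarization, and noise reduction, to improve text recognition accuracy.
Customization options: Tesseract lets users fine-tune OCR parameters and train custom models for specific use cases, improving accuracy and performance.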
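
To make the binarization step mentioned above more concrete, here is a minimal, self-contained sketch of one common approach: Otsu's method, which picks a single global threshold that best separates dark and light pixels. It is an illustration only and does not call Tesseract. The Binarizer class, its method names, and the sample pixel values are hypothetical, and the input is assumed to be 8-bit grayscale pixels in a flat array.

```csharp
using System;

static class Binarizer
{
    // Returns the Otsu threshold (0-255) for 8-bit grayscale pixels:
    // the value that maximizes the variance between the two pixel classes.
    public static int OtsuThreshold(byte[] gray)
    {
        var histogram = new long[256];
        foreach (byte p in gray) histogram[p]++;

        long total = gray.Length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += i * (double)histogram[i];

        double sumBackground = 0;
        long weightBackground = 0;
        double bestVariance = -1;
        int bestThreshold = 0;

        for (int t = 0; t < 256; t++)
        {
            weightBackground += histogram[t];
            if (weightBackground == 0) continue;      // no dark pixels yet
            long weightForeground = total - weightBackground;
            if (weightForeground == 0) break;         // no light pixels left

            sumBackground += t * (double)histogram[t];
            double meanBackground = sumBackground / weightBackground;
            double meanForeground = (sumAll - sumBackground) / weightForeground;
            double diff = meanBackground - meanForeground;
            double variance = (double)weightBackground * weightForeground * diff * diff;
            if (variance > bestVariance)
            {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    // Pixels brighter than the threshold become white (255), the rest black (0).
    public static byte[] Binarize(byte[] gray, int threshold)
    {
        var result = new byte[gray.Length];
        for (int i = 0; i < gray.Length; i++)
            result[i] = gray[i] > threshold ? (byte)255 : (byte)0;
        return result;
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical pixel values: three dark (ink) and three light (paper)
        byte[] gray = { 10, 10, 12, 200, 205, 210 };
        int threshold = Binarizer.OtsuThreshold(gray);
        byte[] binary = Binarizer.Binarize(gray, threshold);
        Console.WriteLine($"Threshold: {threshold}");
        Console.WriteLine(string.Join(", ", binary));
    }
}
```

A global threshold like this works best on evenly lit scans; pages with shadows or gradients usually need an adaptive, per-region threshold instead.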

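3.2 Code Example

using System;
using Patagames.Ocr;

// Patagames.Ocr is a .NET wrapper around the Tesseract engine
using (var api = OcrApi.Create())
{
    // Load the English language data
    api.Init(Patagames.Ocr.Enums.Languages.English);
    // Recognize the text in the image and return it as a plain string
    string plainText = api.GetTextFromImage(@"C:\Users\buttw\source\repos\ironqr\ironqr\bin\Debug\net5.0\Iron.png");
    Console.WriteLine(plainText);
}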

3.2.1 Output

Figure 2 - Console output of the Tesseract code

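4. IronOCR

IronOCR, developed by Iron Software, is a powerful OCR engine known for its accuracy, ease of use, and broad language support. It runs OCR on-premises and supports more than 127 languages, which makes it suitable for global applications. IronOCR uses machine learning algorithms and computer vision techniques to deliver precise text recognition, even in challenging scenarios.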

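4.1 Key Features of IronOCR

High accuracy: IronOCR offers industry-leading text recognition accuracy, giving reliable results across document types and languages.
Multilingual support: It supports more than 127 languages and provides language packs for multilingual text recognition.
Simple integration: IronOCR integrates directly with .NET applications through an intuitive API and extensive documentation, and it includes image preprocessing and post-processing steps that help extract text from raw images.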

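4.2 Installing IronOCR

Before moving on to the code example, let's see how to install IronOCR with the NuGet Package Manager.

1. In Visual Studio, open the Tools menu and select NuGet Package Manager.
2. In the submenu that appears, select Manage NuGet Packages for Solution.
3. In the window that opens, go to the Browse tab and type "IronOcr" in the search bar.
4. From the list of packages, select the latest IronOCR package and click Install.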

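4.3 Code Example (C#)

using System;
using IronOcr;

// IronTesseract is IronOCR's main OCR engine class
var ocr = new IronTesseract();
// Recognize English text
ocr.Language = OcrLanguage.English;
// Read the image and print the recognized text
var result = ocr.Read("C:\\Users\\buttw\\source\\repos\\ironqr\\ironqr\\bin\\Debug\\net5.0\\Iron.png");
Console.WriteLine(result.Text);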

4.3.1 Output

Figure 5 - Console output of the IronOCR code

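5. Comparative Evaluation

5.1 Accuracy and Performance

Windows OCR Engine and Tesseract: Both provide reasonably accurate recognition but may struggle with complex layouts.
IronOCR: Excels in accuracy, giving reliable results across document types and languages, including noisy images.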
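
Accuracy claims like these are easier to compare when they are measured the same way for every engine. One common metric is the character error rate (CER): the edit distance between the OCR output and a known-correct transcription, divided by the length of the transcription. The sketch below is a hypothetical helper, not something taken from any of the three libraries, and the sample strings are invented for illustration.

```csharp
using System;

static class OcrAccuracy
{
    // Levenshtein distance: the minimum number of single-character insertions,
    // deletions, and substitutions needed to turn `a` into `b`.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Character error rate: edit distance divided by the reference length.
    public static double CharacterErrorRate(string reference, string recognized)
    {
        if (reference.Length == 0)
            return recognized.Length == 0 ? 0.0 : 1.0;
        return (double)EditDistance(reference, recognized) / reference.Length;
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical ground truth and OCR output, for illustration only
        string reference = "IronOCR reads text";
        string recognized = "Iron0CR reads test";
        double cer = OcrAccuracy.CharacterErrorRate(reference, recognized);
        Console.WriteLine($"Character error rate: {cer:P1}");
    }
}
```

Running each engine over the same set of images and averaging the CER against hand-checked transcriptions gives a like-for-like comparison; lower is better.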

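5.2 Ease of Integration

Windows OCR Engine: Integrates directly with Windows applications but offers few customization options.
Tesseract: Requires additional configuration and dependencies to integrate but offers extensive customization options.
IronOCR: Integrates simply with .NET applications through an intuitive API and comprehensive documentation.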

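5.3 Language Support

Windows OCR Engine: Supports a limited number of languages compared with Tesseract and IronOCR.
Tesseract: Supports more than 100 languages.
IronOCR: Supports more than 127 languages, which makes it suitable for global applications.

6. Conclusion

In conclusion, while Windows OCR Engine and Tesseract are popular choices for text recognition, IronOCR emerges as the most accurate and versatile of the three engines. Its industry-leading accuracy, broad language support, and simple integration make it a strong choice for businesses and developers who need reliable OCR. With IronOCR, organizations can streamline document processing workflows, improve the accuracy of data extraction, and gain useful insights from scanned documents and images.

IronOCR offers a free trial. To learn more about IronOCR and its features, visit the IronOCR website.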

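Kannapat Udonpant

Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, he was also a member of the Vehicle Robotics Laboratory in the Department of Bioproduction Engineering. In 2022, he used his C# skills to join the engineering team at Iron Software, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who wrote most of the code used in IronPDF. Beyond peer learning, he also enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat is usually gaming on his PS5 or rewatching The Last of Us.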
