Azure OCR vs Google OCR (OCR Feature Comparison)

Kannapat Udonpant

April 3, 2024

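In today's digital landscape, optical character recognition (OCR) technology has become indispensable for businesses that need to extract text efficiently from images, PDFs, and other documents. Among the many OCR solutions available, Microsoft Azure OCR, Google OCR, and IronOCR stand out as leading contenders, each with distinct features and capabilities. This article discusses these OCR services, what they offer, and which one to choose.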

1. Introduction to OCR Services

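OCR services use machine learning algorithms to extract text from images and documents. Azure OCR and Google OCR are cloud-based platforms, while IronOCR is a .NET library that runs on your own machines. All three are widely used, and each has its own strengths and typical applications.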

2. Azure OCR

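Azure OCR, part of the Microsoft Azure Cognitive Services suite, offers a reliable and scalable solution for text recognition tasks. It supports many languages and file formats, which makes it suitable for a range of use cases. Microsoft Azure OCR uses deep learning models for high-accuracy text extraction, helping businesses streamline their document processing workflows. Azure is better thought of as a computer vision service, with OCR as one of its capabilities.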

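2.1 Key Features of Azure OCR

Language support: Microsoft Azure OCR supports more than 70 languages, including complex scripts such as Arabic and Chinese.
Document formats: It can process a variety of document formats, including images, PDFs, and scanned documents.
Scalability: Azure OCR scales seamlessly to handle large volumes of text extraction requests, which makes it suitable for enterprise-level applications.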

2.2 Code Example (C#)

using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System;
using System.Threading.Tasks;
class Program
{
    static async Task Main(string[] args)
    {
        // Create an instance of the ComputerVisionClient
        ComputerVisionClient client = new ComputerVisionClient(new ApiKeyServiceClientCredentials("YOUR_API_KEY"))
        {
            Endpoint = "https://YOUR_REGION.api.cognitive.microsoft.com/"
        };
        // Specify the image URL
        string imageUrl = "https://example.com/image.jpg";
        // Perform OCR on the image (the first argument enables orientation detection)
        OcrResult result = await client.RecognizePrintedTextAsync(true, imageUrl);
        // Display the extracted text
        foreach (var region in result.Regions)
        {
            foreach (var line in region.Lines)
            {
                foreach (var word in line.Words)
                {
                    Console.Write(word.Text + " ");
                }
                Console.WriteLine();
            }
        }
    }
}

The same example in VB.NET:

Imports Microsoft.Azure.CognitiveServices.Vision.ComputerVision
Imports Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models
Imports System
Imports System.Threading.Tasks
Friend Class Program
	' VB.NET does not allow an Async entry point, so Main blocks on an async helper
	Shared Sub Main(ByVal args() As String)
		RunAsync().GetAwaiter().GetResult()
	End Sub

	Private Shared Async Function RunAsync() As Task
		' Create an instance of the ComputerVisionClient
		Dim client As New ComputerVisionClient(New ApiKeyServiceClientCredentials("YOUR_API_KEY")) With {.Endpoint = "https://YOUR_REGION.api.cognitive.microsoft.com/"}
		' Specify the image URL
		Dim imageUrl As String = "https://example.com/image.jpg"
		' Perform OCR on the image (the first argument enables orientation detection)
		Dim result As OcrResult = Await client.RecognizePrintedTextAsync(True, imageUrl)
		' Display the extracted text
		For Each region In result.Regions
			For Each line In region.Lines
				For Each word In line.Words
					Console.Write(word.Text & " ")
				Next word
				Console.WriteLine()
			Next line
		Next region
	End Function
End Class

2.2.1 Output

Azure OCR vs Google OCR (OCR Feature Comparison): Figure 1 - Console output of the Azure OCR code
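
The sample in section 2.2 calls the synchronous OCR operation. The same client library also exposes an asynchronous Read operation, which is submitted first and then polled until it finishes. The following is a minimal sketch of that pattern, assuming the same Microsoft.Azure.CognitiveServices.Vision.ComputerVision package. The environment variable names AZURE_VISION_KEY and AZURE_VISION_ENDPOINT and the one-second polling interval are illustrative choices, not something the original sample prescribes.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class ReadApiSketch
{
    static async Task Main()
    {
        // Assumed environment variable names; use whatever your deployment provides
        string key = Environment.GetEnvironmentVariable("AZURE_VISION_KEY");
        string endpoint = Environment.GetEnvironmentVariable("AZURE_VISION_ENDPOINT");

        var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
        {
            Endpoint = endpoint
        };

        // Submit the image; the service replies with an Operation-Location URL
        ReadHeaders headers = await client.ReadAsync("https://example.com/image.jpg");

        // The operation ID is the last path segment of that URL
        string location = headers.OperationLocation;
        Guid operationId = Guid.Parse(location.Substring(location.LastIndexOf('/') + 1));

        // Poll until the operation leaves the NotStarted/Running states
        ReadOperationResult result;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(1));
            result = await client.GetReadResultAsync(operationId);
        }
        while (result.Status == OperationStatusCodes.NotStarted ||
               result.Status == OperationStatusCodes.Running);

        if (result.Status != OperationStatusCodes.Succeeded)
        {
            Console.WriteLine("Read operation did not succeed: " + result.Status);
            return;
        }

        // Each ReadResult is one page; each page holds the recognized lines
        foreach (ReadResult page in result.AnalyzeResult.ReadResults)
        {
            foreach (Line line in page.Lines)
            {
                Console.WriteLine(line.Text);
            }
        }
    }
}
```

Reading the key and endpoint from the environment keeps credentials out of source control, which is usually preferable to the hard-coded placeholders used for brevity in the sample.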

3. Google OCR

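Google OCR, part of Google Cloud, provides a powerful platform for text recognition and document analysis. Built on Google's machine learning algorithms, it delivers accurate text extraction and offers additional cloud-based capabilities such as image labeling and object detection. Google Cloud Platform OCR is widely used across industries for tasks such as invoice processing, form recognition, and content digitization.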

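3.1 Key Features of Google OCR

Multilingual support: Google OCR supports more than 200 languages and can recognize many scripts, including Latin, Cyrillic, and Han (Chinese).
Image analysis: It offers advanced image analysis features such as label detection, face detection, and landmark recognition.
Integration with Google Cloud services: Google OCR integrates seamlessly with the other Google Cloud Vision API services, so developers can build comprehensive solutions for document management and analysis.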

3.2 Code Example (C#)

using System;
using Google.Cloud.Vision.V1;

// Build a client from a service-account credentials file
var clientBuilder = new ImageAnnotatorClientBuilder { CredentialsPath = "path-to-credentials.json" };
var client = clientBuilder.Build();
// Load the image from disk
var image = Image.FromFile("path-to-your-image.jpg");
// Run text detection; the response is a list of annotations
var response = client.DetectText(image);
foreach (var annotation in response)
{
    Console.WriteLine(annotation.Description);
}

The same example in VB.NET:

Imports Google.Cloud.Vision.V1
Imports System
Module Program
	Sub Main()
		' Build a client from a service-account credentials file
		Dim clientBuilder = New ImageAnnotatorClientBuilder With {.CredentialsPath = "path-to-credentials.json"}
		Dim client = clientBuilder.Build()
		' Load the image with the Vision API's own Image type (not System.Drawing.Image)
		Dim image = Google.Cloud.Vision.V1.Image.FromFile("path-to-your-image.jpg")
		' Run text detection; the response is a list of annotations
		Dim response = client.DetectText(image)
		For Each annotation In response
			Console.WriteLine(annotation.Description)
		Next annotation
	End Sub
End Module

3.2.1 Output

Azure OCR vs Google OCR (OCR Feature Comparison): Figure 2 - Console output of the Google OCR code
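
In the Vision API's text detection response, the first annotation carries the whole detected text, and the remaining annotations describe individual words together with their bounding polygons. The sketch below separates the two. It assumes Application Default Credentials are configured, so it uses ImageAnnotatorClient.Create() instead of the credentials file from the sample in section 3.2, and it reuses the same placeholder image path.

```csharp
using System;
using System.Linq;
using Google.Cloud.Vision.V1;

class TextAnnotationSketch
{
    static void Main()
    {
        // Create() picks up Application Default Credentials (an assumption for this sketch)
        ImageAnnotatorClient client = ImageAnnotatorClient.Create();
        Image image = Image.FromFile("path-to-your-image.jpg");

        var annotations = client.DetectText(image);
        if (annotations.Count == 0)
        {
            Console.WriteLine("No text detected.");
            return;
        }

        // First annotation: the full text block
        Console.WriteLine("Full text:");
        Console.WriteLine(annotations[0].Description);

        // Remaining annotations: individual words with their bounding polygons
        foreach (EntityAnnotation word in annotations.Skip(1))
        {
            string corners = string.Join(" ",
                word.BoundingPoly.Vertices.Select(v => $"({v.X},{v.Y})"));
            Console.WriteLine($"{word.Description}: {corners}");
        }
    }
}
```

Printing every annotation, as the simpler sample does, therefore shows the full text once and then each word again on its own line.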

4. IronOCR

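IronOCR, developed by Iron Software, is a versatile OCR library for .NET applications that offers industry-leading OCR accuracy and performance. Unlike cloud-based OCR services, IronOCR extracts text on-premises, which makes it suitable for applications with data privacy and security requirements. IronOCR excels in accuracy, especially with complex layouts and noisy images, making it a preferred choice for businesses that need reliable OCR.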

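4.1 Key Features of IronOCR

High accuracy: IronOCR delivers very high accuracy in text recognition, giving reliable results across a wide range of document types and languages.
On-premises OCR: It extracts text locally, so businesses can process sensitive documents on their own machines without relying on external services.
Multilingual support: IronOCR supports more than 127 languages and provides comprehensive language packs for seamless multilingual text recognition.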

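4.2 Installing IronOCR

IronOCR can be installed from the NuGet Package Manager Console by running a single command.

1. Open Visual Studio and create a new project or open an existing one.
2. In the menu bar, go to Tools and select NuGet Package Manager.
3. From the submenu that appears, select Package Manager Console.
4. When the console opens, run the following command and press Enter.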

Install-Package IronOcr

Installing IronOCR takes a little while, but once it finishes we can move on to the code example.

4.3 Code Example (C#)

using IronOcr;
using System;
class Program
{
    static void Main(string[] args)
    {
        // Specify the path to the image file
        string imagePath = "path-to-your-image.jpg";
        // Instantiate the IronTesseract OCR engine
        var ocr = new IronTesseract();
        // Set the language for text recognition
        ocr.Language = OcrLanguage.English;
        // Perform text recognition on the image
        var result = ocr.Read(imagePath);
        // Display the extracted text
        Console.WriteLine("Extracted Text:");
        Console.WriteLine(result.Text);
    }
}

The same example in VB.NET:

Imports IronOcr
Imports System
Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Specify the path to the image file
		Dim imagePath As String = "path-to-your-image.jpg"
		' Instantiate the IronTesseract OCR engine
		Dim ocr = New IronTesseract()
		' Set the language for text recognition
		ocr.Language = OcrLanguage.English
		' Perform text recognition on the image
		Dim result = ocr.Read(imagePath)
		' Display the extracted text
		Console.WriteLine("Extracted Text:")
		Console.WriteLine(result.Text)
	End Sub
End Class

4.3.1 Output

Azure OCR vs Google OCR (OCR Feature Comparison): Figure 4 - Console output of the IronOCR code

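5. Comparative Evaluation

5.1 Accuracy and Performance

Microsoft Azure OCR and Google OCR both extract text with high accuracy and are suitable for a wide range of applications.
IronOCR excels in accuracy, especially with complex layouts and noisy images.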

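5.2 Ease of Integration

Microsoft Azure OCR and Google Cloud OCR are cloud-based OCR services that integrate easily with cloud applications and services.
IronOCR provides on-premises OCR for .NET and integrates seamlessly with .NET applications through an intuitive API and extensive documentation.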

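5.3 Scalability

Microsoft Azure OCR and Google OCR scale seamlessly to handle large volumes of text extraction requests, which makes them suitable for enterprise-level applications.
IronOCR's scalability depends on the application's own infrastructure, because it runs on-premises.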
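
Because an on-premises library scales with the hardware it runs on, throughput is usually raised by processing files in parallel. The sketch below shows one way to do that. BatchOcr and its recognize delegate are hypothetical names introduced for illustration, and the helper uses only the .NET base class library. The delegate would wrap whichever engine is in use, for example path => new IronTesseract().Read(path).Text built from the calls in the IronOCR sample in section 4.3. Whether a given engine can safely be called concurrently is something to confirm in its documentation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BatchOcr
{
    // Runs the supplied OCR delegate over many files in parallel and
    // returns the extracted text keyed by file path.
    public static IDictionary<string, string> RecognizeAll(
        IEnumerable<string> imagePaths,
        Func<string, string> recognize,
        int maxParallelism)
    {
        var results = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        Parallel.ForEach(imagePaths, options, path =>
        {
            results[path] = recognize(path);
        });

        return results;
    }
}

class Program
{
    static void Main()
    {
        var files = new[] { "scan-1.jpg", "scan-2.jpg", "scan-3.jpg" };

        // Stand-in delegate so the sketch runs without any OCR package installed
        Func<string, string> fakeOcr = path => "text from " + path;

        var texts = BatchOcr.RecognizeAll(files, fakeOcr, Environment.ProcessorCount);
        foreach (var file in files)
        {
            Console.WriteLine(file + " -> " + texts[file]);
        }
    }
}
```

Capping the degree of parallelism at the processor count is a starting point only; OCR is CPU and memory intensive, so the right limit depends on image size and available RAM.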

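6. Conclusion

Among OCR tools, Azure OCR, the Google Vision API, and IronOCR are recognized as powerful solutions that deliver high accuracy and performance in text extraction tasks. Azure OCR and Google OCR provide cloud-based OCR services with scalable infrastructure and broad language support, while IronOCR stands out as the most accurate solution.

IronOCR is especially well suited to applications that need on-premises text extraction and excellent accuracy. By using IronOCR, businesses can streamline document processing workflows, improve the accuracy of data extraction, and gain valuable insights from scanned documents and images, which makes it the preferred choice.

To learn more about IronOCR and its services, visit the IronOCR documentation and licensing pages, which can help you start improving the way you process images.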

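Kannapat Udonpant

Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory in the Department of Bioproduction Engineering. In 2022, he used his C# skills to join the engineering team at Iron Software, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the IronPDF code. Besides learning from his peers, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat is usually gaming on his PS5 or rewatching The Last of Us.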
