与其他组件比较

IronOCR和Dynamsoft OCR之间的对比

发布 2022年六月13日
分享:

光学字符识别(或 OCR)是一种数据录入程序,涉及书面和印刷文本的识别和数字化。它是一种计算机技术,利用图像分析将印刷文本的数码照片转换成文字处理器等其他程序可以使用的字母和数字。文本被转换成字符编码,以便在计算机上进行搜索和更改。

过去是一个所有文档都是实体文档的世界,而未来可能是一个所有文档都是数字文档的社会,但现在却处于变化之中。实体文件和数字文件在这种过渡状态下共存,因此,OCR 等技术对于前后转换至关重要。

文档恢复、数据录入和可访问性只是 OCR 的部分应用。大多数 OCR 应用都来自扫描文件,但偶尔也会使用照片。OCR 可以节省大量时间,因为重打材料往往是唯一的选择。下面举例说明如何使用 OCR:

  • 可从扫描文件(包括传真)中恢复可编辑的文本文件。
  • 表单可根据手写内容进行近似分类。
  • 利用图书扫描创建可搜索和可编辑的电子图书。
  • 利用截图照片搜索和更改文本。

  • 使用文本转语音技术为视觉障碍者朗读书籍。

虽然这些只是 OCR 应用的一小部分,但它们显示了该技术在各行各业的广泛用途。几乎所有公司的所有员工每天都要大量使用文档,因此业务用途是 OCR 系统开发的一个重要考虑因素。

在本文中,我们将比较两款功能最强大的 OCR 阅读器:

  • IronOCR
  • Dynamsoft OCR

IronOCR 和 Dynamsoft OCR 是两个 .NET OCR 库,支持扫描图像的转换和 PDF 文档的 OCR 处理。只需几行代码,您就可以将图像转换为可搜索的文本。您还可以检索单个单词、字母和段落。

IronOCR - 卓越的功能

IronOCR 具有独一无二的能力,可以检测、读取和解释未经精确扫描的图片和 PDF 文档中的文本。IronOCR 提供了从文件和照片中提取文本的最简单方法,尽管它并不总是最快的,因为它能自动锐化和修正低质量扫描,减少倾斜、失真、背景噪点和透视问题,同时还能提高分辨率和对比度。

IronOCR 允许开发人员向其发送单页或多页扫描图像,它会返回所有文本、条形码和 QR 信息。OCR 库中的一组类为基于 Web、桌面或控制台的应用程序增加了 OCR 功能。Tesseract OCR C# 以及网络应用程序 JPG、PNG、TIFF、PDF、GIF 和 BMP 只是可用作输入的几种格式。

IronOCR 的光学字符识别功能 (光学字符识别) 引擎可以读取使用多种常见字体、斜体、加权和下划线编写的文本。裁剪类使 OCR 能够快速、精确地工作。在处理多页文档时,IronOCR 的多线程引擎可加快 OCR 的速度。

IronOCR 功能

在 Tesseract 管理方面,我们使用 IronOCR,因为它在以下方面独一无二:

  • 可直接在纯 .NET 环境中工作
  • 无需在计算机上安装 Tesseract
  • 运行最新的引擎Tesseract 5 ( 以及魔方 4 和 3)
  • 适用于任何 .NET 项目:.NET Framework 4.5 +、.NET Standard 2 +、.NET Core 2、3 和 .NET 5
  • 与传统的 Tesseract 相比,精度更高,速度更快
  • 支持 Xamarin、Mono、Azure 和 Docker
  • 使用 NuGet 软件包管理复杂的 Tesseract 词典系统
  • 无需配置即可支持 PDFS、多帧 Tiffs 和所有主要图像格式

  • 可纠正低质量和倾斜扫描,以获得 Tesseract 的最佳效果。

Dynamsoft OCR - 功能

Dynamsoft.NET OCR 库是一个 .NET 组件,可提供快速、可靠的光学字符识别功能。它可用于在 C# 或 VB.NET 中创建 .NET 桌面应用程序。您只需创建代码,就能将 PDF 或照片中的无用文本转换为数字文本,以便使用基本的 OCR API 进行编辑、搜索、存档等操作。

*可通过以下方式从扫描仪和其他符合 TWAIN 标准的设备获取图像:

  • 支持本地、缓冲内存和磁盘文件图像传输机制。
  • 利用自动文档进纸器,可以进行批量扫描 (ADF).
  • TWAIN 属性可用于修改常用设备功能。
  • 自动进纸(IfAutoFeed)、自动扫描(IfAutoScan)、分辨率(Resolution)、位深度(BitDepth)、亮度(Brightness)、对比度(Contrast)、单位(Unit)、双面(Duplex)和其他功能都可以更改。
  • 支持空白页检测。
  • 允许更改和保存扫描仪配置文件。

从符合 UVC 和 WIA 标准的网络摄像头捕捉图像:

  • 从所选网络摄像头捕捉照片的同时显示实时视频馈送。
  • 自定义摄像头的设置:亮度、对比度、色调、饱和度、清晰度、伽马值、白平衡、背光补偿、增益、色彩启用、变焦、对焦、曝光、光圈、摇镜头、倾斜、滚动。

可靠的图像加载/查看

  • 可加载 BMP、JPEG、PNG、TIFF 和多页 TIFF 格式的图像。
  • 支持放大和缩小照片。
  • 可从本地硬盘、FTP 服务器、HTTP 服务器或数据库中检索图片。
  • 使用最全面的 .NET 图像组件之一,对 BMP、JPEG、PNG 和 TIFF 进行图像解码

保存和上传/下载

  • 可通过文件流读写照片。
  • 支持将捕获的照片保存为 BMP、JPEG、PNG、TIFF 或多页 TIFF 格式,并将其保存到本地硬盘、网络服务器或数据库中。
  • 支持 RLE、G3/G4、LZW、PackBits 和 TIFF 压缩。
  • 支持 HTTPS 上传和下载。
  • 支持 BMP、JPEG、PNG 和 TIFF 图像编码,是目前最广泛的.NET 图像组件之一。
  • 可将新获得的照片附加到现有的 TIFF 文件中。

在 ASP.NET 中从扫描的 PDF 或其他图像中读取文本 (光学字符识别)

在当今快节奏的世界中,客户希望工作能迅速完成。有紧急项目的客户经常与我们联系。如果项目需要扫描包含图像的文件,我们的技术可以简单地识别图像内容并将其转换为文本。光学字符识别 (光学字符识别) 可为公司节省时间和金钱,同时减少数据输入错误。

使用 IronOCR

IronOCR 使用 IronOcr.IronTesseract 类来执行 OCR 转换。

在这个基本示例中,我们使用 IronOcr.IronTesseract 类从图像中读取文本,并自动将结果返回为字符串。

// PM> Install-Package IronOcr
using IronOcr;
var Result = new IronTesseract().Read(@"img\Screenshot.png");
Console.WriteLine(Result.Text);
// PM> Install-Package IronOcr
using IronOcr;
var Result = new IronTesseract().Read(@"img\Screenshot.png");
Console.WriteLine(Result.Text);
' PM> Install-Package IronOcr
Imports IronOcr
Private Result = (New IronTesseract()).Read("img\Screenshot.png")
Console.WriteLine(Result.Text)
VB   C#

因此,下面这段话是百分之百准确的:


IronOCR 简单示例

在这个简单的示例中,我们将测试 C# OCR 库从 PNG 中读取文本的准确性。

图像中读取文本的准确性。这是一个非常基本的测试,但随着教程的深入,情况会越来越复杂。

敏捷的棕狐狸跳过懒惰的狗

虽然表面上看似简单,但背后却隐藏着复杂的行为:扫描图像以确定对齐、质量和分辨率,查看其属性,优化 OCR 引擎,最后像人类一样读取文本。

对于机器来说,OCR 是一项艰巨的任务,其阅读速度可能与人类不相上下。换句话说,OCR 不是一个快速的程序。但在这种情况下,它是完全正确的。

C# OCR 应用程序结果的准确性

在大多数实际场景中,开发人员都希望自己的项目能尽快运行。在这种情况下,我们建议您使用 IronOCR add ons 命名空间的 OcrInput 和 IronTesseract 类。

您可以使用 OcrInput 设置 OCR 作业的具体功能,例如

  • JPEG、TIFF、GIF、BMP 和 PNG 只是其中几种可使用的图像格式
  • 导入 PDF 文档的全部或部分内容
  • 增强图像的对比度、分辨率和大小
  • 旋转、扫描噪音、数字噪音、倾斜和负片校正

铁魔方

从数百种预打包语言和方言中进行选择

  • 立即使用 Tesseract 5、4 或 3 OCR 引擎
  • 如果我们要查看的是截图、片段或整个文档,请指定文档类型
  • 识别条形码
  • 可存档的 PDF、Hocr HTML、DOM 和字符串都是 OCR 结果的选项
using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.tiff")) {
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.tiff")) {
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
Imports IronOcr
Private Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.tiff")
Dim Result = Ocr.Read(Input)
Console.WriteLine(Result.Text)
End Using
VB   C#

即使是中等质量的扫描件,我们也能保证 100% 的准确性。

C# OCR Scan From Tiff 示例

正如您所看到的,阅读文本 (如果需要,还可以使用条形码) 从 TIFF 等扫描图像中进行 OCR 识别非常容易。这项 OCR 工作的准确率为 100%。

接下来,我们将尝试对同一页面进行质量低得多的扫描,这种扫描的 DPI 很低,而且存在大量失真和数字噪音,原纸也有破损。

C# OCR 低分辨率扫描与数字噪声

这也是 IronOCR 与 Tesseract 等其他 OCR 库相比真正的亮点所在。我们会发现,其他 OCR 项目避免讨论在真实世界的扫描图像上使用 OCR,而不是通过数字创建不切实际的 "完美 "测试案例,以达到 100% 的 OCR 准确率。

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
Input.Deskew(); // removes rotation and perspective
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
Input.Deskew(); // removes rotation and perspective
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
Imports IronOcr
Private Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.LowQuality.tiff")
Input.Deskew() ' removes rotation and perspective
Dim Result = Ocr.Read(Input)
Console.WriteLine(Result.Text)
End Using
VB   C#

不添加输入.纠偏() 我们得到的精确度为 52.5%。这还不够好。

添加输入纠偏() 准确率达到99.8%,几乎与高质量扫描的 OCR 识别率相当。

使用 Dynamsoft OCR

我们将介绍一些使用 Dynamic Web TWAIN 在 JavaScript 中进行 TWAIN 扫描和客户端 OCR 的代码片段。

扫描图像

您可以使用 Dynamic Web TWAIN 的简单应用程序接口更改扫描设置并从 TWAIN 扫描仪获取照片。

function acquireImage()
{
DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex); //select an available TWAIN scanners

    //set scanning settings like pixel type, resolution, ADF etc.
    DWObject.IfShowUI = false; //don't show the user interface of the scanner
    DWObject.PixelType = 1; //scan in gray
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; //scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    //acquire images from scanners
    DWObject.AcquireImage();
}
function acquireImage()
{
DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex); //select an available TWAIN scanners

    //set scanning settings like pixel type, resolution, ADF etc.
    DWObject.IfShowUI = false; //don't show the user interface of the scanner
    DWObject.PixelType = 1; //scan in gray
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; //scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    //acquire images from scanners
    DWObject.AcquireImage();
}
Private Function acquireImage() As [function]
DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex) 'select an available TWAIN scanners

	'set scanning settings like pixel type, resolution, ADF etc.
	DWObject.IfShowUI = False 'don't show the user interface of the scanner
	DWObject.PixelType = 1 'scan in gray
	DWObject.Resolution = 300
	DWObject.IfFeederEnabled = True 'scan from auto feeder
	DWObject.IfDuplexEnabled = False
	DWObject.IfDisableSourceAfterAcquire = True

	'acquire images from scanners
	DWObject.AcquireImage()
End Function
VB   C#

下载 OCR 专业模块

要将 OCR Professional 模块用于客户端 OCR,您需要在头部包含 ocrpro.js 并下载 OCR Pro DLL。


<script type="text/javascript" src="Resources/addon/dynamsoft.webtwain.addon.ocrpro.js"> </script>

Make edits to the .js file:

var CurrentPathName = unescape(location.pathname);
CurrentPath = CurrentPathName.substring(0, CurrentPathName.lastIndexOf("/") + 1);
DWObject.Addon.OCRPro.Download(CurrentPath + "Resources/addon/OCRPro.zip", OnSuccess, OnFailure);
JAVASCRIPT

Recognize text using OCR

Using the JS OCR recognition API to extract text from scanned images is as simple as inserting the code below.

DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image
DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image
DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo) ' 0 is the index of the image
VB   C#

Reading Cropped Regions of Images

Both sets of software offer solutions for cropping images for OCR.

Reading cropped regions with IronOCR

Iron's branch of Tesseract OCR is adept at reading specific regions of images, as shown in the following code sample.

We can make use of System.Drawing.Rectangle that is used to describe the exact region of an image to be read in pixels.

When dealing with a standardized form that is filled out, and only a portion of the content changes from case to case, this can be really handy.

Scanning a Section of a Page: We can make use of System.Drawing.Rectangle to designate a region in which we shall read a document. Pixels are always the unit of measurement.

We shall find that this improves speed while also avoiding reading needless text. In this example, we will read a student's name from a central region of a standardized paper.

C# OCR Scan From Tiff Example
C# OCR Scan From Tiff Example
using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
// a 41% improvement on speed
var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
Input.AddImage("img/ComSci.png", ContentArea);
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
// a 41% improvement on speed
var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
Input.AddImage("img/ComSci.png", ContentArea);
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}
Imports IronOcr
Private Ocr = New IronTesseract()
Using Input = New OcrInput()
' a 41% improvement on speed
Dim ContentArea = New System.Drawing.Rectangle() With {
	.X = 215,
	.Y = 1250,
	.Height = 280,
	.Width = 1335
}
Input.AddImage("img/ComSci.png", ContentArea)
Dim Result = Ocr.Read(Input)
Console.WriteLine(Result.Text)
End Using
VB   C#

This results in a 41 percent boost in speed, while also allowing us to be more specific. This is extremely valuable for .NET OCR applications involving documents that are comparable and consistent, including invoices, receipts, checks, forms, expense claims, and so on.

When reading PDFs, ContentAreas (OCR cropping) is also supported.

Reading cropped regions with Dynamsoft OCR

To begin, launch Visual Studio and build a new C# Windows Forms Application, or open an existing one.

We will need to include DynamicDotNetTWAIN.dll, DynamicOCR.dll, and the appropriate language package. To do so, navigate to Tools -> Choose Toolbox Items, then to the.NET Framework Components tab, click the Browse... button, and locate DynamicDotNetTWAIN.dll in "..Program Files (x86)DynamsoftDynamic.NET TWAIN 4.3 TrialBinv4.0" or v2.0 (depends on the .NET Framework version you are using). Click the OK button. The DynamicDotNetTwain component will then appear in the Toolbox dialog (under the View menu), as illustrated in the accompanying image.

Add Dynamic .NET TWAIN .NET Component

Right-click the project file in Solution Explorer and select Add-> Existing Item... Then, in the file type filter's drop-down list, select All Files. Navigate to  “..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\OCRResources” to add items to the project folder. The .NET TWAIN component can then be dragged and dropped onto the form.

This is the code for clicking the LoadImage button:

private void button1_Click(object sender, EventArgs e) { OpenFileDialog filedlg = new OpenFileDialog(); if (filedlg.ShowDialog() == DialogResult.OK) { dynamicDotNetTwain1.LoadImage(filedlg.FileName);
// choose an image from your local disk and load it into Dynamic .NET TWAIN
} }

We can now attempt to OCR the loaded image and turn it into a searchable text file.

private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom) { dynamicDotNetTwain1.OCRTessDataPath = "../../"; // the path of the language package (tessdata)
dynamicDotNetTwain1.OCRLanguage = "eng";
// the language type
dynamicDotNetTwain1.OCRDllPath = "../../";
//the relative path of the OCR DLL file
dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text; byte [] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
// OCR the selected area of the image
if (sbytes != null) { SaveFileDialog filedlg = new SaveFileDialog(); filedlg.Filter = "Text File(*.txt) *.txt"; if (filedlg.ShowDialog() == DialogResult.OK) { FileStream fs = File.OpenWrite(filedlg.FileName); fs.Write(sbytes, 0, sbytes.Length);
//save the OCR result as a text file
fs.Close(); } MessageBox.Show("OCR successful"); } else { MessageBox.Show(dynamicDotNetTwain1.ErrorString); } }
private void button1_Click(object sender, EventArgs e) { OpenFileDialog filedlg = new OpenFileDialog(); if (filedlg.ShowDialog() == DialogResult.OK) { dynamicDotNetTwain1.LoadImage(filedlg.FileName);
// choose an image from your local disk and load it into Dynamic .NET TWAIN
} }

We can now attempt to OCR the loaded image and turn it into a searchable text file.

private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom) { dynamicDotNetTwain1.OCRTessDataPath = "../../"; // the path of the language package (tessdata)
dynamicDotNetTwain1.OCRLanguage = "eng";
// the language type
dynamicDotNetTwain1.OCRDllPath = "../../";
//the relative path of the OCR DLL file
dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text; byte [] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
// OCR the selected area of the image
if (sbytes != null) { SaveFileDialog filedlg = new SaveFileDialog(); filedlg.Filter = "Text File(*.txt) *.txt"; if (filedlg.ShowDialog() == DialogResult.OK) { FileStream fs = File.OpenWrite(filedlg.FileName); fs.Write(sbytes, 0, sbytes.Length);
//save the OCR result as a text file
fs.Close(); } MessageBox.Show("OCR successful"); } else { MessageBox.Show(dynamicDotNetTwain1.ErrorString); } }
Private Sub button1_Click(ByVal sender As Object, ByVal e As EventArgs)
	Dim filedlg As New OpenFileDialog()
	If filedlg.ShowDialog() = DialogResult.OK Then
		dynamicDotNetTwain1.LoadImage(filedlg.FileName)
' choose an image from your local disk and load it into Dynamic .NET TWAIN
	End If
End Sub

We can now attempt [to] OCR the loaded image [and] turn it into a searchable text file.private Sub dynamicDotNetTwain1_OnImageAreaSelected(ByVal sImageIndex As Short, ByVal left As Integer, ByVal top As Integer, ByVal right As Integer, ByVal bottom As Integer)
	dynamicDotNetTwain1.OCRTessDataPath = "../../" ' the path of the language package (tessdata)
dynamicDotNetTwain1.OCRLanguage = "eng"
' the language type
dynamicDotNetTwain1.OCRDllPath = "../../"
'the relative path of the OCR DLL file
dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text
Dim sbytes() As Byte = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom)
' OCR the selected area of the image
If sbytes IsNot Nothing Then
	Dim filedlg As New SaveFileDialog()
	filedlg.Filter = "Text File(*.txt) *.txt"
	If filedlg.ShowDialog() = DialogResult.OK Then
		Dim fs As FileStream = File.OpenWrite(filedlg.FileName)
		fs.Write(sbytes, 0, sbytes.Length)
'save the OCR result as a text file
fs.Close()
	End If
	MessageBox.Show("OCR successful")
Else
	MessageBox.Show(dynamicDotNetTwain1.ErrorString)
End If
End Sub
VB   C#

This is how the application looks.

Demo App of Zone OCR using Dynamic .NET TWAIN OCR SDK

Image Performance Tuning

The quality of the input image is the most crucial determinant in the speed of an OCR task. The lower the background noise and the higher the dpi, with a great goal value of around 200 dpi, the faster and more accurate the OCR output.

Image Processing Techniques for Dynamsoft OCR

We need to use OCR in a variety of situations, such as scanning a credit card number with our phone or extracting text from paper documents. OCR capabilities are included in Dynamsoft Label Recognition (DLR) and Dynamic Web TWAIN (DWT).

Although they can do an excellent job in general, we can improve the results by using various image processing techniques.

Lighten/remove shadows

Poor illumination may have an impact on the OCR result. To improve the outcome, we can whiten photos or eliminate shadows from images.

Invert

Because the OCR engine is often trained on text in dark colors, text in light colors can be harder to discover and recognize.

Light text

It will be easier to recognize if we invert its color

Light text inverted

To perform the inversion, we can use the GrayscaleTransformationModes parameter in DLR.

Here are the JSON settings:

"GrayscaleTransformationModes": [
    {
        "Mode": "DLR_GTM_INVERTED"
    }
]
"GrayscaleTransformationModes": [
    {
        "Mode": "DLR_GTM_INVERTED"
    }
]
'INSTANT VB TODO TASK: The following line uses invalid syntax:
'"GrayscaleTransformationModes": [{ "Mode": "DLR_GTM_INVERTED" }]
VB   C#

DLR .net’s reading result:

Light text result

Rescale

If the letter height is too low, the OCR engine may not produce a good result. In general, the image should have a DPI of at least 300.

There is a ScaleUpModes parameter in DLR 1.1 that allows you to scale up letters. We may, of course, scale the image ourselves.

Reading the image directly yields the incorrect result:

1x image

After scaling up the image x2, the result is correct:

2x image

Deskew

It is fine if the text is a little distorted. However, if it is overly skewed, the outcome will be adversely altered. To improve the outcome, we need to crop the image.

To accomplish this, we can use the Hough Line Transform in OpenCV.

Skewed image

Here is the code to deskew the image above.

#coding=utf-8
import numpy as np
import cv2
import math
from PIL import Image

def deskew():
src = cv2.imread("neg.jpg",cv2.IMREAD_COLOR)
gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5,5),np.uint8)
erode_Img = cv2.erode(gray,kernel)
eroDil = cv2.dilate(erode_Img,kernel) # erode and dilate
showAndWaitKey("eroDil",eroDil)

    canny = cv2.Canny(eroDil,50,150) # edge detection
    showAndWaitKey("canny",canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90,minLineLength=100,maxLineGap=10) # Hough Lines Transform
    drawing = np.zeros(src.shape [:], dtype=np.uint8)

    maxY=0
    degree_of_bottomline=0
    index=0
    for line in lines:        
        x1, y1, x2, y2 = line [0]            
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        k = float(y1-y2)/(x1-x2)
        degree = np.degrees(math.atan(k))
        if index==0:
            maxY=y1
            degree_of_bottomline=degree # take the degree of the line at the bottom
        else:        
            if y1>maxY:
                maxY=y1
                degree_of_bottomline=degree
        index=index+1
    showAndWaitKey("houghP",drawing)

    img=Image.fromarray(src)
    rotateImg = img.rotate(degree_of_bottomline)
    rotateImg_cv = np.array(rotateImg) 
    cv2.imshow("rotateImg",rotateImg_cv)
    cv2.imwrite("deskewed.jpg",rotateImg_cv)
    cv2.waitKey()

def showAndWaitKey(winName,img):
cv2.imshow(winName,img)
cv2.waitKey()

if __name__ == "__main__":              
deskew()
#coding=utf-8
import numpy as np
import cv2
import math
from PIL import Image

def deskew():
src = cv2.imread("neg.jpg",cv2.IMREAD_COLOR)
gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
kernel = np.ones((5,5),np.uint8)
erode_Img = cv2.erode(gray,kernel)
eroDil = cv2.dilate(erode_Img,kernel) # erode and dilate
showAndWaitKey("eroDil",eroDil)

    canny = cv2.Canny(eroDil,50,150) # edge detection
    showAndWaitKey("canny",canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90,minLineLength=100,maxLineGap=10) # Hough Lines Transform
    drawing = np.zeros(src.shape [:], dtype=np.uint8)

    maxY=0
    degree_of_bottomline=0
    index=0
    for line in lines:        
        x1, y1, x2, y2 = line [0]            
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        k = float(y1-y2)/(x1-x2)
        degree = np.degrees(math.atan(k))
        if index==0:
            maxY=y1
            degree_of_bottomline=degree # take the degree of the line at the bottom
        else:        
            if y1>maxY:
                maxY=y1
                degree_of_bottomline=degree
        index=index+1
    showAndWaitKey("houghP",drawing)

    img=Image.fromarray(src)
    rotateImg = img.rotate(degree_of_bottomline)
    rotateImg_cv = np.array(rotateImg) 
    cv2.imshow("rotateImg",rotateImg_cv)
    cv2.imwrite("deskewed.jpg",rotateImg_cv)
    cv2.waitKey()

def showAndWaitKey(winName,img):
cv2.imshow(winName,img)
cv2.waitKey()

if __name__ == "__main__":              
deskew()
#coding=utf-8
'INSTANT VB TODO TASK: The following line uses invalid syntax:
'import TryCast(numpy, np) import cv2 import math from PIL import Image def deskew(): src = cv2.imread("neg.jpg",cv2.IMREAD_COLOR) gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY) kernel = np.ones((5,5),np.uint8) erode_Img = cv2.erode(gray,kernel) eroDil = cv2.dilate(erode_Img,kernel) # erode @and dilate showAndWaitKey("eroDil",eroDil) canny = cv2.Canny(eroDil,50,150) # edge detection showAndWaitKey("canny",canny) lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90,minLineLength=100,maxLineGap=10) # Hough Lines Transform drawing = np.zeros(src.shape [:], dtype=np.uint8) maxY=0 degree_of_bottomline=0 index=0 for line in lines: x1, y1, x2, y2 = line [0] cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA) k = float(y1-y2)/(x1-x2) degree = np.degrees(math.atan(k)) if index==0: maxY=y1 degree_of_bottomline=degree # take the degree @of the line at the bottom else: if y1> maxY: maxY=y1 degree_of_bottomline=degree index=index+1 showAndWaitKey("houghP",drawing) img=Image.fromarray(src) rotateImg = img.rotate(degree_of_bottomline) rotateImg_cv = np.array(rotateImg) cv2.imshow("rotateImg",rotateImg_cv) cv2.imwrite("deskewed.jpg",rotateImg_cv) cv2.waitKey() def showAndWaitKey(winName,img): cv2.imshow(winName,img) cv2.waitKey() if __name__ == "__main__": deskew()
VB   C#

Lines detected:

Lines detected

Deskewed:

Deskewed image

Image Processing Techniques for IronOCR

The quality of the input image is not important here because IronOCR excels at repairing defective documents (though this is time-consuming and will cause your OCR jobs to use more CPU cycles).

Choosing input image formats with less digital noise, such as TIFF or PNG, can also result in speedier outcomes than lossy image formats, such as JPEG.

The image filters listed below can significantly enhance performance:

OcrInput.Rotate (double degrees) — Rotates images clockwise by a specified number of degrees. Negative integers are used for anti-clockwise rotation.

OcrInput.Binarize() — This image filter makes every pixel either black or white, with no in-between. It may improve OCR performance in circumstances where the text-to-background contrast is very low.

OcrInput.ToGrayScale() — This image filter converts every pixel to a grayscale shade. It is unlikely to improve OCR accuracy, but it may increase speed.

OcrInput.Contrast() — Automatically increases contrast. In low-contrast scans, this filter frequently improves OCR speed and accuracy.

OcrInput.DeNoise() — This filter should be used only when noise is expected.

OcrInput.Invert() — Reverses all colors. For example, white becomes black: black becomes white.

OcrInput.Dilate() — Advanced morphology. Dilation is the process of adding pixels to the edges of objects in an image. (Erode's inverse)

OcrInput. Erode() — an advanced morphology function. Erosion is the process of removing pixels from the edges of objects. (Dilate's inverse)

OcrInput. Deskew() — Rotates an image so that it is orthogonal and the right way up. Because Tesseract tolerance for skewed scans can be as low as 5 degrees, this is quite useful for OCR.

DeepCleanBackgroundNoise() — Removes a lot of background noise. Only use this filter if you know there is a lot of background noise in the document because it can reduce OCR accuracy on clear documents and is quite CPU intensive.

OcrInput.EnhanceResolution — Improves the resolution of low-resolution photos. Because of OcrInput, this filter is rarely used. OcrInput and will detect and resolve low resolution automatically.

We may want to use Iron Tesseract to speed up OCR on higher-quality scans.

If we're looking for speed, we might start here and subsequently turn features back on until the proper balance is struck.

using IronOcr;
var Ocr = new IronTesseract();
// Configure for speed
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\\";
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
Ocr.Language = OcrLanguage.EnglishFast;
using (var Input = new OcrInput(@"img\Potter.tiff"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
using IronOcr;
var Ocr = new IronTesseract();
// Configure for speed
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\\";
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
Ocr.Language = OcrLanguage.EnglishFast;
using (var Input = new OcrInput(@"img\Potter.tiff"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
Imports IronOcr
Private Ocr = New IronTesseract()
' Configure for speed
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\"
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly
Ocr.Language = OcrLanguage.EnglishFast
Using Input = New OcrInput("img\Potter.tiff")
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using
VB   C#

This result is 99.8% accurate compared to the baseline of 100% — but 35% faster.

Licensing and Pricing

Dynamsoft Licensing and Pricing

Per year license. All rates include one year of maintenance, which includes free software upgrades and premium support.

Dynamsoft offers two types of licenses:

Per client device license

The "One Client Device License" provides access to a same-origin Application (same protocol, same host, and same port) to use the software's features from a single client device. An inactive client device is one that has not accessed any software capability for 90 days in a row. An inactive client device's license seat will be instantly freed and made available for usage by any other active client device. When you reach the maximum number of license seats allowed, Dynamsoft will give you an extra 10% of your client device allowance for emergency use. Once the additional client device allowance has been depleted, no new client devices can access and use the software until there are available license seats again. Please keep in mind that exceeding your client device allowance has no effect on any client devices that have already been licensed.

Per-server license

To deploy the application to a single server, a "One Server License" is required. Servers refer to both physical and virtual servers and include, but are not limited to, production servers, failover servers, development servers that are also used for testing, quality assurance servers, testing servers, and staging servers, all of which require a license. Additional licenses are not required for continuous integration servers (build servers) or localhost development servers. The per-server license is only valid for on-premises server installations, and not for cloud deployments.

Pricing for Dynamsoft OCR starts at USD 1,249/year.

IronOCR Licensing and Pricing

As developers, we all want to accomplish our projects with the least amount of money and resources possible — budgeting is critical. Examine the chart to determine which license is best suited to your requirements and budget.

IronOCR provides licenses with a customizable number of developers, projects, and locations, allowing you to fulfill the needs of your project while only paying for the coverage you require.

IronOCR licensing keys enable you to publish your product without a watermark.

Licenses start from $749 and include one year of support and upgrades.

You can also use a trial license key to try IronOCR for free.

Conclusion

Tesseract OCR on Mac, Windows, Linux, Azure OCR, and Docker are all available with IronOCR for C#. .NET Framework 4.0 or above is required,  .NET Standard 2.0+, .NET Core 2.0+, .NET 5, Mono for macOS and Linux, and Xamarin for macOS are all examples of cross-platform development. IronOCR also uses the latest Tesseract 5 engine to read text, barcodes, and QR codes from all major image and PDF formats. In minutes, this library adds OCR functionality to your desktop, console, or web apps! The OCR can also read PDFs and multi-page TIFFs, and it can be saved as a searchable PDF document or XHTML in any OCR Scan. Plain text, barcode data, and an OCR result class encompassing paragraphs, lines, words, and characters are among its data output choices. It is available in 125 languages, including Arabic, Chinese, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, and Spanish, but keep in mind that bespoke language packs can also be generated.

The Dynamic .NET TWAIN OCR add-on is a quick and reliable .NET component for Optical Character Recognition that you can use in WinForms and WPF applications written in C# or VB .NET. You can scan documents or capture photos from webcams using Dynamic .NET TWAIN's image capture module, and then conduct OCR on the images to convert the text in the images to text, searchable PDF files, or strings. Multiple Asian languages, as well as Arabic, are offered in addition to English.

IronOCR offers better licensing than Dynamsoft OCR; IronOcr starts at $749 with one year free, while Dynamsoft starts at $1249 with a free trial. IronOCR also offers licenses for multiple users, while with Dynamsoft, you only get one license per user.

While both sets of software aim at offering the best performance in terms of OCR readings of barcodes, image to text, and image to text, IronOCR stands out in that it shines its light even on images that are in pretty bad shape. It automatically puts in place its sophisticated tuning methods to give you the best OCR results. IronOCR also makes use of Tesseract to give you optimal results with little or no errors.

Iron Software is also offering its customers and users the option to grab its entire suite of software in just two clicks. This means that for the price of two of the components in the Iron Software suite, you can currently get all five components and uninterrupted support.

< 前一页
IronOCR与Tesseract.NET的比较
下一步 >
IronOCR与Abbyy Finereader的比较

准备开始了吗? 版本: 2024.9 刚刚发布

免费NuGet下载 总下载量: 2,319,721 查看许可证 >