OCR图像优化过滤器
OcrInput 类为 C# 和 .NET 开发人员提供了细粒度的控制,可在预处理图像输入以提高速度和准确性。
这否定了使用Photoshop批处理脚本或ImageMagick来准备OCR图像的常见做法。
如何在 Tesseract 中交替使用 OCR 过滤器
- 安装一个OCR库以使用OCR过滤器
- 使用图像路径创建一个OcrInput对象
- (可选)使用滤波方法处理图像。
- 使用Read方法。
- 使用OcrResult's的Text属性显示结果。
using IronOcr;
using System;
var ocrTesseract = new IronTesseract();
using var ocrInput = new OcrInput();
// First load all image(s)
ocrInput.LoadImage(@"images\image.png");
// Note: You don't need all of them; most users only need Deskew() and occasionally DeNoise()
ocrInput.WithTitle("My Document");
ocrInput.Binarize();
ocrInput.Contrast();
ocrInput.Deskew();
ocrInput.DeNoise();
ocrInput.Despeckle();
ocrInput.Dilate();
ocrInput.EnhanceResolution(300);
ocrInput.Invert();
ocrInput.Rotate(90);
ocrInput.Scale(150);
ocrInput.Sharpen();
ocrInput.ToGrayScale();
ocrInput.Erode();
// WIZARD - If you are unsure use the debug-wizard to test all combinations:
string codeToRun = OcrInputFilterWizard.Run(@"images\image.png", out double confidence, ocrTesseract);
Console.WriteLine(codeToRun);
// Optional: Export modified images so you can view them.
foreach (var page in ocrInput.GetPages())
{
    page.SaveAsImage($"filtered_{page.Index}.bmp");
}
var ocrResult = ocrTesseract.Read(ocrInput);
Console.WriteLine(ocrResult.Text);Imports IronOcr
Imports System
Private ocrTesseract = New IronTesseract()
Private ocrInput = New OcrInput()
' First load all image(s)
ocrInput.LoadImage("images\image.png")
' Note: You don't need all of them; most users only need Deskew() and occasionally DeNoise()
ocrInput.WithTitle("My Document")
ocrInput.Binarize()
ocrInput.Contrast()
ocrInput.Deskew()
ocrInput.DeNoise()
ocrInput.Despeckle()
ocrInput.Dilate()
ocrInput.EnhanceResolution(300)
ocrInput.Invert()
ocrInput.Rotate(90)
ocrInput.Scale(150)
ocrInput.Sharpen()
ocrInput.ToGrayScale()
ocrInput.Erode()
' WIZARD - If you are unsure use the debug-wizard to test all combinations:
Dim confidence As Double
Dim codeToRun As String = OcrInputFilterWizard.Run("images\image.png", confidence, ocrTesseract)
Console.WriteLine(codeToRun)
' Optional: Export modified images so you can view them.
For Each page In ocrInput.GetPages()
	page.SaveAsImage($"filtered_{page.Index}.bmp")
Next page
Dim ocrResult = ocrTesseract.Read(ocrInput)
Console.WriteLine(ocrResult.Text)Install-Package IronOcr
OcrInput 类为 C# 和 .NET 开发人员提供了细粒度的控制,可在预处理图像输入以提高速度和准确性。
这否定了使用Photoshop批处理脚本或ImageMagick来准备OCR图像的常见做法。
OcrInput对象Read方法。OcrResult's的Text属性显示结果。