OCR Image Optimization Filters
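The OcrInput class gives C# and .NET developers fine-grained control over how image input is preprocessed before OCR, which improves both speed and accuracy.
This removes the need for the common practice of preparing images for OCR with Photoshop batch scripts or ImageMagick.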
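How to Use OCR Filters in a Tesseract Alternative
- Install the OCR library to use OCR filters.
- Create an OcrInput object from an image path.
- (Optional) Process the image with the filter methods.
- Run OCR with the Read method.
- Display the result using the Text property of the OcrResult output.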
using IronOcr;
using System;

var ocrTesseract = new IronTesseract();
using var ocrInput = new OcrInput();

// First load all image(s)
ocrInput.LoadImage(@"images\image.png");

// Note: You don't need all of them; most users only need Deskew() and occasionally DeNoise()
ocrInput.WithTitle("My Document");
ocrInput.Binarize();
ocrInput.Contrast();
ocrInput.Deskew();
ocrInput.DeNoise();
ocrInput.Despeckle();
ocrInput.Dilate();
ocrInput.EnhanceResolution(300);
ocrInput.Invert();
ocrInput.Rotate(90);
ocrInput.Scale(150);
ocrInput.Sharpen();
ocrInput.ToGrayScale();
ocrInput.Erode();

// WIZARD - If you are unsure use the debug-wizard to test all combinations:
string codeToRun = OcrInputFilterWizard.Run(@"images\image.png", out double confidence, ocrTesseract);
Console.WriteLine(codeToRun);

// Optional: Export modified images so you can view them.
foreach (var page in ocrInput.GetPages())
{
    page.SaveAsImage($"filtered_{page.Index}.bmp");
}

var ocrResult = ocrTesseract.Read(ocrInput);
Console.WriteLine(ocrResult.Text);
Imports IronOcr
Imports System

Module Program
    Sub Main()
        Dim ocrTesseract = New IronTesseract()
        Using ocrInput = New OcrInput()
            ' First load all image(s)
            ocrInput.LoadImage("images\image.png")

            ' Note: You don't need all of them; most users only need Deskew() and occasionally DeNoise()
            ocrInput.WithTitle("My Document")
            ocrInput.Binarize()
            ocrInput.Contrast()
            ocrInput.Deskew()
            ocrInput.DeNoise()
            ocrInput.Despeckle()
            ocrInput.Dilate()
            ocrInput.EnhanceResolution(300)
            ocrInput.Invert()
            ocrInput.Rotate(90)
            ocrInput.Scale(150)
            ocrInput.Sharpen()
            ocrInput.ToGrayScale()
            ocrInput.Erode()

            ' WIZARD - If you are unsure use the debug-wizard to test all combinations:
            Dim confidence As Double
            Dim codeToRun As String = OcrInputFilterWizard.Run("images\image.png", confidence, ocrTesseract)
            Console.WriteLine(codeToRun)

            ' Optional: Export modified images so you can view them.
            For Each page In ocrInput.GetPages()
                page.SaveAsImage($"filtered_{page.Index}.bmp")
            Next page

            Dim ocrResult = ocrTesseract.Read(ocrInput)
            Console.WriteLine(ocrResult.Text)
        End Using
    End Sub
End Module
Install-Package IronOcr