如何使用 IronOCR 取得 C# OCR 讀取信心分數
IronOCR 的讀取信心值顯示 OCR 系統對識別文字準確度的信心程度,數值範圍為 0 至 100,數值越高代表可靠性越高——可透過任何 OcrResult 物件的 Confidence 屬性取得此數值。
OCR(光學字元辨識)的辨識信心值,指的是 OCR 系統針對其從影像或文件中辨識出的文字準確性所賦予的確定性或可靠性程度。 這是用來衡量 OCR 系統對所識別文字正確性的信心程度。 在處理掃描文件、照片或任何文字品質可能不一的圖像時,此指標尤為重要。
高信心分數表示識別結果的準確性極高,而低信心分數則表示識別結果的可靠性可能較低。 理解這些信心等級有助於開發人員在其應用程式中實作適當的驗證邏輯與錯誤處理機制。
快速入門:一行程式碼即可取得 OCR 讀取信心值
請使用 IronTesseract 的 Read 方法並傳入圖片檔案路徑,接著存取回傳結果 Confidence 上的 OcrResult 屬性,即可了解 IronOCR 對其文字辨識結果的信心程度。 這是一種簡單且可靠的方式,可藉此開始評估 OCR 輸出的準確性。
簡化工作流程(5 個步驟)
- 下載 C# 函式庫以存取讀取信心
- 請準備目標圖片及 PDF 文件
- 存取 OCR 結果的
Confidence屬性 - 檢索頁面、段落、行、WORD及字元的信心分數
- 請查閱"
選項"屬性以查看替代詞彙
如何在 C# 中取得讀取信心?
對輸入圖像執行 OCR 後,文字的信心水準會儲存於 Confidence 屬性中。 請使用 'using' 語句,以便在使用完畢後自動釋放物件。 請分別使用 OcrImageInput 和 OcrPdfInput 類別,加入圖片和 PDF 等文件。 Read 方法將傳回一個 OcrResult 物件,該物件允許存取 Confidence 屬性。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-get-confidence.cs
using IronOcr;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Add image
using var imageInput = new OcrImageInput("sample.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Get confidence level
double confidence = ocrResult.Confidence;
Imports IronOcr
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Add image
Private imageInput = New OcrImageInput("sample.tiff")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Get confidence level
Private confidence As Double = ocrResult.Confidence
回傳的信心值範圍為 0 至 100,其中:
- 90-100:極佳信心 - 文本高度可靠
- 80-89:信心程度良好 - 內容大致準確,僅有輕微不確定性
- 70-79:中等信心 - 文字可能包含部分錯誤
- 低於 70:信心不足 - 文本應經審閱或重新處理
如何在不同層級建立信心?
您不僅可以取得整份文件的信心水準,還能查閱每頁、每段、每行、每個單字及每個字元的信心水準。 此外,您可以透過"區塊"來建立信心,此處的"區塊"代表一組緊密相鄰的段落集合。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-confidence-level.cs
// Get page confidence level
double pageConfidence = ocrResult.Pages[0].Confidence;
// Get paragraph confidence level
double paragraphConfidence = ocrResult.Paragraphs[0].Confidence;
// Get line confidence level
double lineConfidence = ocrResult.Lines[0].Confidence;
// Get word confidence level
double wordConfidence = ocrResult.Words[0].Confidence;
// Get character confidence level
double characterConfidence = ocrResult.Characters[0].Confidence;
// Get block confidence level
double blockConfidence = ocrResult.Blocks[0].Confidence;
' Get page confidence level
Dim pageConfidence As Double = ocrResult.Pages(0).Confidence
' Get paragraph confidence level
Dim paragraphConfidence As Double = ocrResult.Paragraphs(0).Confidence
' Get line confidence level
Dim lineConfidence As Double = ocrResult.Lines(0).Confidence
' Get word confidence level
Dim wordConfidence As Double = ocrResult.Words(0).Confidence
' Get character confidence level
Dim characterConfidence As Double = ocrResult.Characters(0).Confidence
' Get block confidence level
Dim blockConfidence As Double = ocrResult.Blocks(0).Confidence
實用範例:依信心程度篩選
當處理品質參差不齊的文件(例如低品質的掃描檔)時,您可以使用信心分數來篩選結果:
using IronOcr;
using System.Linq;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Configure for better accuracy
ocrTesseract.Configuration.ReadBarCodes = false;
ocrTesseract.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd;
// Add image
using var imageInput = new OcrImageInput("invoice.png");
// Apply filters to improve quality
imageInput.Deskew();
imageInput.DeNoise();
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Filter words with confidence above 85%
var highConfidenceWords = ocrResult.Words
.Where(word => word.Confidence >= 85)
.Select(word => word.Text)
.ToList();
// Process only high-confidence text
string reliableText = string.Join(" ", highConfidenceWords);
Console.WriteLine($"High confidence text: {reliableText}");
// Flag low-confidence words for manual review
var lowConfidenceWords = ocrResult.Words
.Where(word => word.Confidence < 85)
.Select(word => new { word.Text, word.Confidence })
.ToList();
foreach (var word in lowConfidenceWords)
{
Console.WriteLine($"Review needed: '{word.Text}' (Confidence: {word.Confidence:F2}%)");
}
using IronOcr;
using System.Linq;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Configure for better accuracy
ocrTesseract.Configuration.ReadBarCodes = false;
ocrTesseract.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd;
// Add image
using var imageInput = new OcrImageInput("invoice.png");
// Apply filters to improve quality
imageInput.Deskew();
imageInput.DeNoise();
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Filter words with confidence above 85%
var highConfidenceWords = ocrResult.Words
.Where(word => word.Confidence >= 85)
.Select(word => word.Text)
.ToList();
// Process only high-confidence text
string reliableText = string.Join(" ", highConfidenceWords);
Console.WriteLine($"High confidence text: {reliableText}");
// Flag low-confidence words for manual review
var lowConfidenceWords = ocrResult.Words
.Where(word => word.Confidence < 85)
.Select(word => new { word.Text, word.Confidence })
.ToList();
foreach (var word in lowConfidenceWords)
{
Console.WriteLine($"Review needed: '{word.Text}' (Confidence: {word.Confidence:F2}%)");
}
Imports IronOcr
Imports System.Linq
' Instantiate IronTesseract
Dim ocrTesseract As New IronTesseract()
' Configure for better accuracy
ocrTesseract.Configuration.ReadBarCodes = False
ocrTesseract.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd
' Add image
Using imageInput As New OcrImageInput("invoice.png")
' Apply filters to improve quality
imageInput.Deskew()
imageInput.DeNoise()
' Perform OCR
Dim ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Filter words with confidence above 85%
Dim highConfidenceWords = ocrResult.Words _
.Where(Function(word) word.Confidence >= 85) _
.Select(Function(word) word.Text) _
.ToList()
' Process only high-confidence text
Dim reliableText As String = String.Join(" ", highConfidenceWords)
Console.WriteLine($"High confidence text: {reliableText}")
' Flag low-confidence words for manual review
Dim lowConfidenceWords = ocrResult.Words _
.Where(Function(word) word.Confidence < 85) _
.Select(Function(word) New With {Key .Text = word.Text, Key .Confidence = word.Confidence}) _
.ToList()
For Each word In lowConfidenceWords
Console.WriteLine($"Review needed: '{word.Text}' (Confidence: {word.Confidence:F2}%)")
Next
End Using
OCR 中的字元選擇有哪些?
除了信心水準之外,還有另一個名為 Choices 的有趣屬性。 "選項"欄位包含替代詞彙清單及其統計相關性。 此資訊可讓使用者存取其他可能的字元。 此功能在處理多種語言或特殊字型時特別實用。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-get-choices.cs
using IronOcr;
using static IronOcr.OcrResult;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Add image
using var imageInput = new OcrImageInput("Potter.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Get choices
Choice[] choices = ocrResult.Characters[0].Choices;
Imports IronOcr
Imports IronOcr.OcrResult
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Add image
Private imageInput = New OcrImageInput("Potter.tiff")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Get choices
Private choices() As Choice = ocrResult.Characters(0).Choices
替代字元選擇有何助益?
採用替代字元可帶來多項優勢:
- 歧義解決:當"O"與"0"、或"l"與"1"等字元容易混淆時
- 字型變體:針對風格化或裝飾性字型的不同詮釋
- 品質問題:處理文字品質不佳的情況時有多種處理方式
- 語言語境:基於語言規則的替代詮釋
字元選擇的處理
以下是一個完整的範例,展示如何透過字詞選擇來提升翻譯精準度:
using IronOcr;
using System;
using System.Linq;
using static IronOcr.OcrResult;
// Configure IronTesseract for detailed results
IronTesseract ocrTesseract = new IronTesseract();
// Process image with potential ambiguities
using var imageInput = new OcrImageInput("ambiguous_text.png");
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Analyze character choices for each word
foreach (var word in ocrResult.Words)
{
Console.WriteLine($"\nWord: '{word.Text}' (Confidence: {word.Confidence:F2}%)");
// Check each character in the word
foreach (var character in word.Characters)
{
if (character.Choices != null && character.Choices.Length > 1)
{
Console.WriteLine($" Character '{character.Text}' has alternatives:");
// Display all choices sorted by confidence
foreach (var choice in character.Choices.OrderByDescending(c => c.Confidence))
{
Console.WriteLine($" - '{choice.Text}': {choice.Confidence:F2}%");
}
}
}
}
using IronOcr;
using System;
using System.Linq;
using static IronOcr.OcrResult;
// Configure IronTesseract for detailed results
IronTesseract ocrTesseract = new IronTesseract();
// Process image with potential ambiguities
using var imageInput = new OcrImageInput("ambiguous_text.png");
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Analyze character choices for each word
foreach (var word in ocrResult.Words)
{
Console.WriteLine($"\nWord: '{word.Text}' (Confidence: {word.Confidence:F2}%)");
// Check each character in the word
foreach (var character in word.Characters)
{
if (character.Choices != null && character.Choices.Length > 1)
{
Console.WriteLine($" Character '{character.Text}' has alternatives:");
// Display all choices sorted by confidence
foreach (var choice in character.Choices.OrderByDescending(c => c.Confidence))
{
Console.WriteLine($" - '{choice.Text}': {choice.Confidence:F2}%");
}
}
}
}
Imports IronOcr
Imports System
Imports System.Linq
Imports IronOcr.OcrResult
' Configure IronTesseract for detailed results
Dim ocrTesseract As New IronTesseract()
' Process image with potential ambiguities
Using imageInput As New OcrImageInput("ambiguous_text.png")
Dim ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Analyze character choices for each word
For Each word In ocrResult.Words
Console.WriteLine(vbCrLf & $"Word: '{word.Text}' (Confidence: {word.Confidence:F2}%)")
' Check each character in the word
For Each character In word.Characters
If character.Choices IsNot Nothing AndAlso character.Choices.Length > 1 Then
Console.WriteLine($" Character '{character.Text}' has alternatives:")
' Display all choices sorted by confidence
For Each choice In character.Choices.OrderByDescending(Function(c) c.Confidence)
Console.WriteLine($" - '{choice.Text}': {choice.Confidence:F2}%")
Next
End If
Next
Next
End Using
進階信心策略
在處理護照、車牌或 MICR 支票等特殊文件時,信心分數對於驗證結果至關重要:
using IronOcr;
public class DocumentValidator
{
private readonly IronTesseract ocr = new IronTesseract();
public bool ValidatePassportNumber(string imagePath, double minConfidence = 95.0)
{
using var input = new OcrImageInput(imagePath);
// Configure for passport reading
ocr.Configuration.ReadBarCodes = true;
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleLine;
// Apply preprocessing
input.Deskew();
input.Scale(200); // Upscale for better accuracy
var result = ocr.Read(input);
// Find passport number pattern
var passportLine = result.Lines
.Where(line => line.Text.Contains("P<") || IsPassportNumberFormat(line.Text))
.FirstOrDefault();
if (passportLine != null)
{
Console.WriteLine($"Passport line found: {passportLine.Text}");
Console.WriteLine($"Confidence: {passportLine.Confidence:F2}%");
// Only accept if confidence meets threshold
return passportLine.Confidence >= minConfidence;
}
return false;
}
private bool IsPassportNumberFormat(string text)
{
// Simple passport number validation
return System.Text.RegularExpressions.Regex.IsMatch(text, @"^[A-Z]\d{7,9}$");
}
}
using IronOcr;
public class DocumentValidator
{
private readonly IronTesseract ocr = new IronTesseract();
public bool ValidatePassportNumber(string imagePath, double minConfidence = 95.0)
{
using var input = new OcrImageInput(imagePath);
// Configure for passport reading
ocr.Configuration.ReadBarCodes = true;
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleLine;
// Apply preprocessing
input.Deskew();
input.Scale(200); // Upscale for better accuracy
var result = ocr.Read(input);
// Find passport number pattern
var passportLine = result.Lines
.Where(line => line.Text.Contains("P<") || IsPassportNumberFormat(line.Text))
.FirstOrDefault();
if (passportLine != null)
{
Console.WriteLine($"Passport line found: {passportLine.Text}");
Console.WriteLine($"Confidence: {passportLine.Confidence:F2}%");
// Only accept if confidence meets threshold
return passportLine.Confidence >= minConfidence;
}
return false;
}
private bool IsPassportNumberFormat(string text)
{
// Simple passport number validation
return System.Text.RegularExpressions.Regex.IsMatch(text, @"^[A-Z]\d{7,9}$");
}
}
Imports IronOcr
Public Class DocumentValidator
Private ReadOnly ocr As New IronTesseract()
Public Function ValidatePassportNumber(imagePath As String, Optional minConfidence As Double = 95.0) As Boolean
Using input As New OcrImageInput(imagePath)
' Configure for passport reading
ocr.Configuration.ReadBarCodes = True
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.SingleLine
' Apply preprocessing
input.Deskew()
input.Scale(200) ' Upscale for better accuracy
Dim result = ocr.Read(input)
' Find passport number pattern
Dim passportLine = result.Lines _
.Where(Function(line) line.Text.Contains("P<") OrElse IsPassportNumberFormat(line.Text)) _
.FirstOrDefault()
If passportLine IsNot Nothing Then
Console.WriteLine($"Passport line found: {passportLine.Text}")
Console.WriteLine($"Confidence: {passportLine.Confidence:F2}%")
' Only accept if confidence meets threshold
Return passportLine.Confidence >= minConfidence
End If
Return False
End Using
End Function
Private Function IsPassportNumberFormat(text As String) As Boolean
' Simple passport number validation
Return System.Text.RegularExpressions.Regex.IsMatch(text, "^[A-Z]\d{7,9}$")
End Function
End Class
優化以提升信心
為獲得更高的信心分數,建議考慮使用影像濾鏡與預處理技術:
using IronOcr;
// Create an optimized OCR workflow
IronTesseract ocr = new IronTesseract();
using var input = new OcrImageInput("low_quality_scan.jpg");
// Apply multiple filters to improve confidence
input.Deskew(); // Correct rotation
input.DeNoise(); // Remove noise
input.Sharpen(); // Enhance edges
input.Dilate(); // Thicken text
input.Scale(150); // Upscale for clarity
// Configure for accuracy over speed
ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
ocr.Configuration.EngineMode = TesseractEngineMode.TesseractOnly;
var result = ocr.Read(input);
Console.WriteLine($"Document confidence: {result.Confidence:F2}%");
// Generate confidence report
var confidenceReport = result.Pages
.Select((page, index) => new
{
PageNumber = index + 1,
Confidence = page.Confidence,
WordCount = page.Words.Length,
LowConfidenceWords = page.Words.Count(w => w.Confidence < 80)
});
foreach (var page in confidenceReport)
{
Console.WriteLine($"Page {page.PageNumber}: {page.Confidence:F2}% confidence");
Console.WriteLine($" Total words: {page.WordCount}");
Console.WriteLine($" Low confidence words: {page.LowConfidenceWords}");
}
using IronOcr;
// Create an optimized OCR workflow
IronTesseract ocr = new IronTesseract();
using var input = new OcrImageInput("low_quality_scan.jpg");
// Apply multiple filters to improve confidence
input.Deskew(); // Correct rotation
input.DeNoise(); // Remove noise
input.Sharpen(); // Enhance edges
input.Dilate(); // Thicken text
input.Scale(150); // Upscale for clarity
// Configure for accuracy over speed
ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
ocr.Configuration.EngineMode = TesseractEngineMode.TesseractOnly;
var result = ocr.Read(input);
Console.WriteLine($"Document confidence: {result.Confidence:F2}%");
// Generate confidence report
var confidenceReport = result.Pages
.Select((page, index) => new
{
PageNumber = index + 1,
Confidence = page.Confidence,
WordCount = page.Words.Length,
LowConfidenceWords = page.Words.Count(w => w.Confidence < 80)
});
foreach (var page in confidenceReport)
{
Console.WriteLine($"Page {page.PageNumber}: {page.Confidence:F2}% confidence");
Console.WriteLine($" Total words: {page.WordCount}");
Console.WriteLine($" Low confidence words: {page.LowConfidenceWords}");
}
Imports IronOcr
' Create an optimized OCR workflow
Dim ocr As New IronTesseract()
Using input As New OcrImageInput("low_quality_scan.jpg")
' Apply multiple filters to improve confidence
input.Deskew() ' Correct rotation
input.DeNoise() ' Remove noise
input.Sharpen() ' Enhance edges
input.Dilate() ' Thicken text
input.Scale(150) ' Upscale for clarity
' Configure for accuracy over speed
ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
ocr.Configuration.EngineMode = TesseractEngineMode.TesseractOnly
Dim result = ocr.Read(input)
Console.WriteLine($"Document confidence: {result.Confidence:F2}%")
' Generate confidence report
Dim confidenceReport = result.Pages _
.Select(Function(page, index) New With {
.PageNumber = index + 1,
.Confidence = page.Confidence,
.WordCount = page.Words.Length,
.LowConfidenceWords = page.Words.Count(Function(w) w.Confidence < 80)
})
For Each page In confidenceReport
Console.WriteLine($"Page {page.PageNumber}: {page.Confidence:F2}% confidence")
Console.WriteLine($" Total words: {page.WordCount}")
Console.WriteLine($" Low confidence words: {page.LowConfidenceWords}")
Next
End Using
摘要
理解並善用 OCR 信心分數,對於建構穩健的文件處理應用程式至關重要。 透過運用 IronOCR 的信心屬性與字元選擇功能,開發人員可在其 OCR 工作流程中實作智慧型驗證、錯誤處理及品質保證機制。 無論您處理的是螢幕截圖、表格或專業文件,信心分數都能提供確保文字精準擷取所需的評估指標。
常見問題
什麼是 OCR 信心值,它為何重要?
OCR 信心值是一個介於 0 到 100 之間的數值,用以表示 OCR 系統對文字辨識準確度的信心程度。IronOCR 透過任何 OcrResult 物件上的 Confidence 屬性提供此指標,協助開發人員評估辨識文字的可靠性,特別是在處理掃描文件、照片或文字品質參差不齊的圖像時。
如何在 C# 中快速檢查 OCR 信心分數?
透過 IronOCR,您只需一行程式碼即可取得 OCR 信心分數:double confidence = new IronOcr.IronTesseract().Read("input.png").Confidence; 此方法會傳回 0 至 100 之間的信心分數,用以表示 IronOCR 對其文字辨識結果的確定程度。
不同的信心分數區間代表什麼意思?
IronOCR 信心分數說明:90-100(優秀)表示文字高度可靠;80-89(良好)表示文字大致準確,僅有輕微不確定性;70-79(中等)表示文字可能包含部分錯誤;低於 70(低)表示文字應經審閱或重新處理。
如何查看不同文字元素的信心等級?
IronOCR 允許您在多個層級(頁面、段落、行、單字及個別字元)檢視辨識信心等級。執行 OCR 後,您可以透過 OcrResult 物件結構,存取各層級的 Confidence 屬性。
能否提供附有信心分數的替代詞彙建議?
是的,IronOCR 提供了一個 Choices 屬性,會提供替代單字選項及其信心分數。當 OCR 引擎識別出同一段文字有多種可能的解釋時,此功能便能派上用場,讓您得以實作智慧驗證邏輯。
我該如何在應用程式中實作基於信度的驗證?
使用 IronOCR 的 Read 方法後,請檢查 OcrResult 的 Confidence 屬性。根據信心閾值實作條件邏輯——例如,自動接受信心值高於 90 的結果,將信心值介於 70 至 90 之間的結果標記為待審查,並重新處理或手動驗證信心值低於 70 的結果。
IronOCR 能否整合至現有應用程式中?
IronOCR 設計上可輕鬆透過 C# 整合至現有應用程式中,讓開發人員能以最少的努力,為其軟體增添 OCR 功能。
使用 IronOCR 進行文件管理有哪些好處?
使用 IronOCR 進行文件管理,可將掃描文件轉換為可搜尋且可編輯的文字,從而簡化工作流程,減少人工資料輸入的需求,並提升文件的可存取性。
IronOCR 如何提升資料準確性?
IronOCR 透過其先進的辨識演算法與影像校正功能來提升資料準確性,確保文字擷取過程既可靠又精確。
IronOCR 是否有提供免費試用版?
是的,Iron Software 提供 IronOCR 的免費試用版,讓使用者能在決定購買前測試其功能與效能。

