How to Get Read Confidence

Chaknith Bin

October 22, 2023

Updated December 10, 2024


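In OCR (optical character recognition), read confidence is the level of certainty the OCR system assigns to the accuracy of the text it recognized in an image or document. It measures how sure the OCR system is that the recognized text is correct.

A high confidence score indicates that the recognition is likely accurate, while a low score suggests the recognition may be less reliable.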


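How to Get Read Confidence

Download a C# library to access read confidence
Prepare the target image and PDF document
Access the Confidence property of the OCR result
Retrieve the confidence of pages, paragraphs, lines, words, and characters
Check the Choices property for alternative word choices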

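Get Read Confidence Example

After performing OCR on the input image, the confidence level of the text is stored in the Confidence property. Use a 'using' statement so the input object is disposed of automatically. Add documents such as images and PDFs with the OcrImageInput and OcrPdfInput classes, respectively. The Read method returns an 'OcrResult' object, which exposes the Confidence property.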


using IronOcr;

// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();

// Add image
using var imageInput = new OcrImageInput("sample.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);

// Get confidence level
double confidence = ocrResult.Confidence;

Imports IronOcr

' Instantiate IronTesseract
Dim ocrTesseract As New IronTesseract()

' Add image (the Using block disposes the input automatically)
Using imageInput As New OcrImageInput("sample.tiff")
    ' Perform OCR
    Dim ocrResult As OcrResult = ocrTesseract.Read(imageInput)

    ' Get confidence level
    Dim confidence As Double = ocrResult.Confidence
End Using

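One common use of the overall score is to decide whether a result can be accepted as-is or should be checked by a person. The following is a minimal sketch of that idea, not part of IronOCR itself: the `ConfidenceGate` class, the `NeedsReview` method, and the cutoff of 80 are hypothetical, and the sample value stands in for `ocrResult.Confidence`. The scale of the score is assumed here, so confirm it against the library documentation before choosing a cutoff.

```csharp
using System;

// Hypothetical cutoff; tune it against your own documents.
// Assumption: it uses the same scale as OcrResult.Confidence.
const double ReviewThreshold = 80.0;

// In a real project this value would come from ocrResult.Confidence.
double confidence = 72.5; // sample value for illustration only

Console.WriteLine(ConfidenceGate.NeedsReview(confidence, ReviewThreshold)
    ? "Low confidence: route to manual review"
    : "Confidence acceptable: accept OCR text");

static class ConfidenceGate
{
    // Returns true when the score falls below the cutoff.
    public static bool NeedsReview(double confidence, double threshold)
        => confidence < threshold;
}
```

Keeping the cutoff as a parameter makes it easy to use a stricter value for sensitive documents and a looser one for drafts.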

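Get Read Confidence at Different Levels

You can retrieve the confidence level not only for the whole document but also for each page, paragraph, line, word, and character. You can also obtain the confidence of a block, which represents a group of one or more paragraphs located close together.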


// Get page confidence level
double pageConfidence = ocrResult.Pages[0].Confidence;

// Get paragraph confidence level
double paragraphConfidence = ocrResult.Paragraphs[0].Confidence;

// Get line confidence level
double lineConfidence = ocrResult.Lines[0].Confidence;

// Get word confidence level
double wordConfidence = ocrResult.Words[0].Confidence;

// Get character confidence level
double characterConfidence = ocrResult.Characters[0].Confidence;

// Get block confidence level
double blockConfidence = ocrResult.Blocks[0].Confidence;

' Get page confidence level
Dim pageConfidence As Double = ocrResult.Pages(0).Confidence

' Get paragraph confidence level
Dim paragraphConfidence As Double = ocrResult.Paragraphs(0).Confidence

' Get line confidence level
Dim lineConfidence As Double = ocrResult.Lines(0).Confidence

' Get word confidence level
Dim wordConfidence As Double = ocrResult.Words(0).Confidence

' Get character confidence level
Dim characterConfidence As Double = ocrResult.Characters(0).Confidence

' Get block confidence level
Dim blockConfidence As Double = ocrResult.Blocks(0).Confidence

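Per-word scores make it possible to point a reviewer at the specific words most likely to be wrong. The sketch below illustrates this with sample data instead of a real OCR result, so it runs without an image file. The `LowConfidence` class, the `Find` method, the sample words, and the cutoff of 80 are all hypothetical. With IronOCR, a similar list could presumably be built from `ocrResult.Words`, assuming each word exposes its recognized text alongside `Confidence`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data standing in for OCR output (illustrative values only).
// Assumption: with IronOCR this could be projected from ocrResult.Words,
// pairing each word's text with its Confidence value.
var words = new List<(string Text, double Confidence)>
{
    ("Invoice", 96.1),
    ("T0tal", 54.3),
    ("2023", 88.0),
};

// Print every word whose score falls below the hypothetical cutoff.
foreach (var (text, confidence) in LowConfidence.Find(words, 80.0))
{
    Console.WriteLine($"{text}: {confidence}");
}

static class LowConfidence
{
    // Returns the items scoring below the threshold, least confident first.
    public static List<(string Text, double Confidence)> Find(
        IEnumerable<(string Text, double Confidence)> items, double threshold)
        => items.Where(i => i.Confidence < threshold)
                .OrderBy(i => i.Confidence)
                .ToList();
}
```

Sorting the least confident items first lets a reviewer start with the words that most likely need correction.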

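Get Character Choices

Apart from the confidence level, there is another interesting property called Choices. Choices contains a list of alternative word choices and their statistical relevance. This information lets the user access other possible characters.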


using IronOcr;
using static IronOcr.OcrResult;

// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();

// Add image
using var imageInput = new OcrImageInput("Potter.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);

// Get choices
Choice[] choices = ocrResult.Characters[0].Choices;

Imports IronOcr
Imports IronOcr.OcrResult

' Instantiate IronTesseract
Dim ocrTesseract As New IronTesseract()

' Add image (the Using block disposes the input automatically)
Using imageInput As New OcrImageInput("Potter.tiff")
    ' Perform OCR
    Dim ocrResult As OcrResult = ocrTesseract.Read(imageInput)

    ' Get choices
    Dim choices() As Choice = ocrResult.Characters(0).Choices
End Using

