如何获得阅读自信
读取 OCR 的信心 (光学字符识别) 是指 OCR 系统对图像或文档中已识别文本的准确性所赋予的确定性或可靠性级别。它衡量的是 OCR 系统对所识别文本的正确性有多大信心。
置信度分值高,表示识别准确的把握性高,而置信度分值低,则表示识别的可靠性可能较低。
如何获得阅读自信
开始在您的项目中使用IronPDF,并立即获取免费试用。
查看 IronOCR 上 Nuget 用于快速安装和部署。它有超过800万次下载,正在使用C#改变OCR。
Install-Package IronOcr
考虑安装 IronOCR DLL 直接。下载并手动安装到您的项目或GAC表单中: IronOcr.zip
手动安装到你的项目中
下载DLL获取读取信心示例
对输入图像执行 OCR 后,文本的置信度将存储在 Confidence 属性中。利用 "using "语句自动处理对象。分别使用 OcrImageInput
和 OcrPdfInput
类添加图像和 PDF 等文档。Read "方法将返回一个 "OcrResult "对象,该对象允许访问置信度属性。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-get-confidence.cs
using IronOcr;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Add image
using var imageInput = new OcrImageInput("sample.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Get confidence level
double confidence = ocrResult.Confidence;
Imports IronOcr
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Add image
Private imageInput = New OcrImageInput("sample.tiff")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Get confidence level
Private confidence As Double = ocrResult.Confidence
获取不同级别的读取机密信息
您不仅可以检索整个文档的置信度,还可以访问每一页、段落、行、单词和字符的置信度。此外,您还可以获取段落块的置信度,段落块是一个或多个紧密相连段落的集合。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-confidence-level.cs
// Get page confidence level
double pageConfidence = ocrResult.Pages[0].Confidence;
// Get paragraph confidence level
double paragraphConfidence = ocrResult.Paragraphs[0].Confidence;
// Get line confidence level
double lineConfidence = ocrResult.Lines[0].Confidence;
// Get word confidence level
double wordConfidence = ocrResult.Words[0].Confidence;
// Get character confidence level
double characterConfidence = ocrResult.Characters[0].Confidence;
// Get block confidence level
double blockConfidence = ocrResult.Blocks[0].Confidence;
' Get page confidence level
Dim pageConfidence As Double = ocrResult.Pages(0).Confidence
' Get paragraph confidence level
Dim paragraphConfidence As Double = ocrResult.Paragraphs(0).Confidence
' Get line confidence level
Dim lineConfidence As Double = ocrResult.Lines(0).Confidence
' Get word confidence level
Dim wordConfidence As Double = ocrResult.Words(0).Confidence
' Get character confidence level
Dim characterConfidence As Double = ocrResult.Characters(0).Confidence
' Get block confidence level
Dim blockConfidence As Double = ocrResult.Blocks(0).Confidence
获取角色选择
除了置信度,还有一个有趣的属性叫做 选择。选择包含一个备选字词列表及其统计相关性。用户可以利用这些信息访问其他可能的字符。
:path=/static-assets/ocr/content-code-examples/how-to/tesseract-result-confidence-get-choices.cs
using IronOcr;
using static IronOcr.OcrResult;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Add image
using var imageInput = new OcrImageInput("Potter.tiff");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Get choices
Choice[] choices = ocrResult.Characters[0].Choices;
Imports IronOcr
Imports IronOcr.OcrResult
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Add image
Private imageInput = New OcrImageInput("Potter.tiff")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Get choices
Private choices() As Choice = ocrResult.Characters(0).Choices