How to Read Photos Using IronOCR

Curtis Chau

February 16, 2025

Updated February 16, 2025

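When processing large volumes of documents, especially scanned images such as TIFF files, extracting text by hand is slow and prone to human error. This is where optical character recognition (OCR) comes in: it provides an automated way to convert the text in images into digital data accurately. OCR can handle the complexities of images such as scanned documents or photographs and turn them into searchable, editable text. This speeds up document processing and typically yields more accurate data than manual transcription.

Formats like TIFF can be hard to read because of their size, color depth, or compression. Applying OCR to them lets businesses and developers digitize and manage large amounts of data quickly. With an OCR solution such as IronOCR's ReadPhoto method, developers can extract text from images and even perform more advanced operations, such as searching for keywords or converting scanned data into searchable PDFs. The technology is particularly useful in industries that handle legal documents, archives, or receipts, where efficient data retrieval is essential.

This tutorial briefly presents a sample input, shows how to use ReadPhoto and how to work with the result object, and describes the situations in which developers might choose ReadPhoto over IronOCR's standard Read.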

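How to Read Photos Using IronOCR

Download the C# library for reading photos
Import the image for processing
Use the appropriate import method for the image type
Use the ReadPhoto method to extract data from the image
Access the OcrPhotoResult properties to view and manipulate the extracted data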

To use this feature, you must also install the IronOcr.Extensions.AdvancedScan package.

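Read Photo Example

Reading high-quality photo formats such as TIFF and GIF with IronOCR is relatively straightforward. First, create a new variable and assign it an OcrInput, then load the image with LoadImageFrame. Finally, call the ReadPhoto method to obtain the result.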

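Please note

Because a TIFF can contain multiple frames in a single image, the frameNumber parameter is required. Indexing starts at 0, not 1.
The method currently works only for English, Chinese, Japanese, Korean, and the Latin alphabet.
Using the advanced scan feature on .NET Framework requires the project to run on the x64 architecture.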
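
Because frame indexes are zero-based, a multi-frame TIFF can be read one frame at a time. The following is a minimal sketch rather than a definitive implementation. It relies only on the LoadImageFrame and ReadPhoto calls described in this article; the file name `ocr.tiff` and the `frameCount` value are assumptions that you would replace with your own.

```csharp
using IronOcr;
using System;

// Assumption: "ocr.tiff" is a multi-frame TIFF whose frame count is known in advance.
const string path = "ocr.tiff";
const int frameCount = 3; // hypothetical value; replace with your file's frame count

var ocr = new IronTesseract();

// Frame indexes are zero-based, so the valid range is 0 .. frameCount - 1.
for (int frame = 0; frame < frameCount; frame++)
{
    // A fresh OcrInput per frame; it is disposed at the end of each iteration.
    using var input = new OcrInput();
    input.LoadImageFrame(path, frame);

    OcrPhotoResult result = ocr.ReadPhoto(input);
    Console.WriteLine($"Frame {frame}: {result.Text}");
}
```

Passing `frameCount` itself as the index would point one past the last frame, which is the usual off-by-one mistake when a page counter starts at 1.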

Input

Since most browsers do not support the TIFF format natively, you can download the TIFF input here. To display the TIFF file, I converted it to WEBP.

Code

using IronOcr;
using IronSoftware.Drawing;
using System;

// Instantiate the OCR engine
var ocr = new IronTesseract();

// Load the first frame (index 0) of the TIFF
using var inputPhoto = new OcrInput();
inputPhoto.LoadImageFrame("ocr.tiff", 0);

// Read the photo
OcrPhotoResult result = ocr.ReadPhoto(inputPhoto);

// Frame number of the first text region
int number = result.TextRegions[0].FrameNumber;

// Extract the text in the first region
string textInRegion = result.TextRegions[0].TextInRegion;

// Extract the coordinates of the first text region
Rectangle region = result.TextRegions[0].Region;

var output = $"Text in First Region: {textInRegion}\n"
             + $"Text Region:\n"
             + $"Starting X: {region.X}\n"
             + $"Starting Y: {region.Y}\n"
             + $"Region Width: {region.Width}\n"
             + $"Region Height: {region.Height}\n"
             + $"Result Confidence: {result.Confidence}\n\n"
             + $"Full Scanned Photo Text: {result.Text}";

Console.WriteLine(output);

Imports Microsoft.VisualBasic
Imports IronOcr
Imports IronSoftware.Drawing
Imports System

Module Program
    Sub Main()
        ' Instantiate the OCR engine
        Dim ocr = New IronTesseract()

        ' Load the first frame (index 0) of the TIFF
        Using inputPhoto = New OcrInput()
            inputPhoto.LoadImageFrame("ocr.tiff", 0)

            ' Read the photo
            Dim result As OcrPhotoResult = ocr.ReadPhoto(inputPhoto)

            ' Frame number of the first text region
            Dim number As Integer = result.TextRegions(0).FrameNumber

            ' Extract the text in the first region
            Dim textInRegion As String = result.TextRegions(0).TextInRegion

            ' Extract the coordinates of the first text region
            Dim region As Rectangle = result.TextRegions(0).Region

            Dim output = $"Text in First Region: {textInRegion}" & vbLf & $"Text Region:" & vbLf & $"Starting X: {region.X}" & vbLf & $"Starting Y: {region.Y}" & vbLf & $"Region Width: {region.Width}" & vbLf & $"Region Height: {region.Height}" & vbLf & $"Result Confidence: {result.Confidence}" & vbLf & vbLf & $"Full Scanned Photo Text: {result.Text}"

            Console.WriteLine(output)
        End Using
    End Sub
End Module

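Output

Text: The text extracted from the OCR input.

Confidence: A double property indicating the statistical accuracy confidence, averaged over every character, with 1 being the highest and 0 the lowest.

TextRegions: A list of text regions indicating where OCR text was found in the input and what it contains. The example above reads the frame number of the first region and prints that region's text and bounding rectangle.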
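
The example above inspects only the first entry of TextRegions, which would fail if no regions were detected. As a hedged sketch, every region can be visited with a loop instead. It uses only the properties shown in this article (FrameNumber, TextInRegion, and Region) and assumes the same `ocr.tiff` input.

```csharp
using IronOcr;
using System;

var ocr = new IronTesseract();

using var input = new OcrInput();
input.LoadImageFrame("ocr.tiff", 0); // assumed input file, first frame

OcrPhotoResult result = ocr.ReadPhoto(input);

// Walk every detected region instead of indexing only the first one.
int index = 0;
foreach (var textRegion in result.TextRegions)
{
    var box = textRegion.Region;
    Console.WriteLine(
        $"Region {index} (frame {textRegion.FrameNumber}): " +
        $"x={box.X}, y={box.Y}, w={box.Width}, h={box.Height}");
    Console.WriteLine($"  Text: {textRegion.TextInRegion}");
    index++;
}

// Report the empty case explicitly rather than throwing on TextRegions[0].
if (index == 0)
{
    Console.WriteLine("No text regions were detected.");
}
```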

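Difference Between ReadPhoto and Read

The main differences between the ReadPhoto method and the standard Read method are the result object and the file formats it accepts. LoadImageFrame specifically accepts only TIFF and GIF, and does not accept formats such as JPEG, for several reasons.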

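Comparison Between TIFF and JPG Images

As a file format, TIFF is typically lossless and is often used to pack multiple pages or frames into a single file. It is commonly used for high-quality, multi-image storage (for example, legal documents and medical imaging). It is considerably more complex than the standard JPG format, so extracting text from it in full requires a different approach.

In addition, TIFF images use different compression methods, so IronOCR has to use a specialized method to parse the text.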
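
To make the contrast between the two methods concrete, the sketch below runs both side by side. Treat it as an illustration under stated assumptions, not as the library's prescribed usage: the LoadImage call and the OcrResult type are not shown in this article and are assumed here from IronOCR's standard reading workflow, and `photo.jpg` is a hypothetical file name.

```csharp
using IronOcr;
using System;

var ocr = new IronTesseract();

// Standard path: load a JPEG (assumed LoadImage call) and use Read,
// which returns an OcrResult.
using var jpegInput = new OcrInput();
jpegInput.LoadImage("photo.jpg"); // hypothetical file name
OcrResult standard = ocr.Read(jpegInput);
Console.WriteLine($"Read: {standard.Text}");

// Photo path: load one TIFF frame and use ReadPhoto,
// which returns an OcrPhotoResult with TextRegions.
using var tiffInput = new OcrInput();
tiffInput.LoadImageFrame("ocr.tiff", 0);
OcrPhotoResult photo = ocr.ReadPhoto(tiffInput);
Console.WriteLine($"ReadPhoto: {photo.Text}");
```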

Below is a further comparison of TIFF and JPG.

Feature	TIFF (Tagged Image File Format)	JPG/JPEG (Joint Photographic Experts Group)
Compression	Lossless or uncompressed (preserves quality)	Lossy compression (reduces quality for smaller file size)
File Size	Large (due to high quality and optional lack of compression)	Smaller, optimized for web use and fast loading
Image Quality	High (ideal for professional use, retains all details)	Lower (due to lossy compression, some quality is sacrificed)
Color Depth	Supports high color depth (up to 16-bit or 32-bit per channel)	24-bit color (16.7 million colors)
Use Case	Professional photography, publishing, scanning, archiving	Web images, social media, everyday photos
Transparency	Supports transparency and alpha channels	Does not support transparency
Editing	Good for multiple edits (no quality loss with resaving)	Quality degrades with repeated edits and saves
Compatibility	Widely supported in professional software	Universally supported across all platforms and devices
Animation	Does not support animation	Does not support animation
Metadata	Stores extensive metadata (EXIF, layers, etc.)	Stores EXIF metadata but is more limited