How to Read Text from an Image in Blazor

Kannapat Udonpant

June 20, 2023

Updated January 29, 2024

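Blazor is a framework built by the ASP.NET team for developing interactive web UIs with HTML and C# instead of JavaScript. With the WebAssembly hosting model, Blazor runs C# code directly in the web browser; with the Blazor Server hosting model, which this article uses, the C# code runs on the server. In both cases, components that carry their own logic are easy to build and reuse, which has made Blazor a popular choice among developers for building user interfaces in C#.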

In this article, we will create a Blazor Server App that uses IronOCR to read text from image files through optical character recognition (OCR).

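How to Read Text from an Image Using Optical Character Recognition in Blazor?

Prerequisites

The latest version of Visual Studio. You can download it from the official Visual Studio download page.
The ASP.NET and web development workload. Select this workload when installing Visual Studio, because this project requires it.
The IronOCR C# library. We will use IronOCR to convert image data into machine-readable text. You can download the package directly from the NuGet website, but installing it from the NuGet Package Manager inside Visual Studio is more convenient.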

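Create a Blazor Server App

Open Visual Studio and follow these steps to create a Blazor Server App:

Click Create a new project, then select "Blazor Server App" from the listed project templates.
Figure: Creating a new Blazor Server App in Visual Studio
Next, give your project a suitable name. Here, we name it "BlazorReadText".
Figure: Configuring the Blazor project
Finally, set the additional information and click Create.
Figure: Selecting the long-term support .NET version and additional project information
The Blazor Server App is now created. Next, we need to install the necessary packages before extracting image data with IronOCR.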

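Add the Necessary Packages

`BlazorInputFile`

The first step is to install the BlazorInputFile package. It is a component for Blazor applications that uploads single or multiple files to the server. We will use it to upload image files on a Razor page in the Blazor application. Open Manage NuGet Packages for Solution and browse for BlazorInputFile.

Figure 4: Installing the BlazorInputFile package

Select the checkbox for the project and click Install.

Now, open the _Host.cshtml file in the Pages folder and add the following JavaScript file: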

 <script src="_content/BlazorInputFile/inputfile.js"></script>

Figure 5: Navigating to the _Host.cshtml file from Solution Explorer

Finally, add the following code to the _Imports.razor file.

@using BlazorInputFile

IronOCR

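IronOCR is a C# library for scanning and reading images in different formats. It can process images in more than 127 languages.

To install IronOCR, open the NuGet Package Manager and search for IronOCR. Select the project and click the Install button.

Figure 6: Installing the IronOcr package in the NuGet Package Manager

Add the IronOcr namespace to the _Imports.razor file. C# namespaces are case-sensitive, so the directive must use the same spelling as the code-behind file:

@using IronOcr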

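Create the Blazor UI Component

A component represents a piece of user interface together with the business logic that gives it dynamic behavior. Blazor builds its applications from Razor components, which can be nested, reused, and shared between projects. By default, the application includes the Counter and FetchData pages; remove them for simplicity.

Right-click the Pages folder under the BlazorReadText application and select Add > Razor Component. If you cannot find Razor Component, click New Item and select "Razor Component" from the C# components. Name the component "OCR.razor" and click Add.

Figure 7: Adding a new Razor component

A good practice is to separate the code for this Razor page into its own class. Right-click the Pages folder again and select Add > Class. Name the file after the page (OCR.razor.cs) and click Add. Visual Studio nests the file under the page of the same name in Solution Explorer, and the page connects to the class through the @inherits directive shown later.

Figure 8: Creating an OCR.razor.cs code file for the OCR.razor Razor component

Now, let's move on to the actual implementation, which reads image data using IronOCR.

Blazor OCR.razor UI Component Source Code to Read Image Data

To recognize the text in an image, upload the image, convert it to binary data, and then apply the IronOCR method to extract the text.

Open the OCR.razor.cs class and write the following example source code: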


using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using BlazorInputFile;
using IronOcr;
using Microsoft.AspNetCore.Components;

namespace BlazorReadText.Pages
{
    // Code-behind for OCR.razor; the page uses it through @inherits OCRModel.
    public class OCRModel : ComponentBase
    {
        protected string imageText;
        protected string imagePreview;
        private byte[] imageFileBytes;

        private const string DefaultStatus = "Maximum size allowed for the image is 4 MB";
        protected string status = DefaultStatus;

        private const int MaxFileSize = 4 * 1024 * 1024; // 4 MB

        // Validates the selected file, keeps its bytes for OCR, and builds the preview.
        protected async Task ViewImage(IFileListEntry[] files)
        {
            var file = files.FirstOrDefault();
            if (file == null)
            {
                return;
            }

            if (file.Size > MaxFileSize)
            {
                status = $"The file size is {file.Size} bytes, which is more than the allowed limit of {MaxFileSize} bytes.";
                return;
            }

            // Image MIME types start with "image/", for example "image/png".
            if (!file.Type.StartsWith("image/"))
            {
                status = "Please upload a valid image file";
                return;
            }

            // Copy the upload into memory; the stream is disposed once the bytes are extracted.
            using (var memoryStream = new MemoryStream())
            {
                await file.Data.CopyToAsync(memoryStream);
                imageFileBytes = memoryStream.ToArray();
            }

            // Embed the image in a data URI, using the uploaded file's own MIME type.
            imagePreview = $"data:{file.Type};base64,{Convert.ToBase64String(imageFileBytes)}";
            status = DefaultStatus;
        }

        // Runs OCR on the uploaded bytes. Read is a synchronous call, so the method is not async.
        protected void GetText()
        {
            if (imageFileBytes != null)
            {
                IronTesseract ocr = new IronTesseract();
                using (OcrInput input = new OcrInput(imageFileBytes))
                {
                    OcrResult result = ocr.Read(input);
                    imageText = result.Text;
                }
            }
        }
    }
}


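In the code above, the ViewImage method takes the uploaded file from the file input and checks that it is an image and that it is smaller than the specified size limit. If the file is too large or is not an image, the method stores an error message in status and returns. Otherwise, the image is copied into a MemoryStream and converted to a byte array, because IronOcr.OcrInput accepts images in binary form.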
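The size and type checks described above can be tried on their own, without Blazor or IronOCR. The following is a minimal sketch under stated assumptions: ValidateUpload is a hypothetical helper that is not part of either library, and it takes plain values in place of an uploaded file entry.

```csharp
using System;

// Hypothetical helper (not part of BlazorInputFile or IronOCR).
// Returns Ok = true when the upload passes both checks, otherwise
// Ok = false with a message suitable for the status line.
static (bool Ok, string Message) ValidateUpload(long sizeInBytes, string contentType, long maxBytes)
{
    if (sizeInBytes > maxBytes)
    {
        return (false, $"The file size is {sizeInBytes} bytes, which is more than the allowed limit of {maxBytes} bytes.");
    }

    // Image MIME types start with "image/", for example "image/png".
    if (string.IsNullOrEmpty(contentType) ||
        !contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase))
    {
        return (false, "Please upload a valid image file");
    }

    return (true, "OK");
}

const long MaxFileSize = 4 * 1024 * 1024; // 4 MB, the same limit as the component

Console.WriteLine(ValidateUpload(1024, "image/png", MaxFileSize).Message);
Console.WriteLine(ValidateUpload(5_000_000, "image/jpeg", MaxFileSize).Message);
Console.WriteLine(ValidateUpload(1024, "application/pdf", MaxFileSize).Message);
```

Keeping the checks in a function that takes plain values makes them easy to unit test separately from the component.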

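The GetText method uses IronOCR to read text from the input image. IronOCR uses the latest Tesseract 5 engine and supports more than 127 languages. The Read method processes the input image and returns the result. IronTesseract, developed by the IronOCR team, is an extended version of Google Tesseract. For more information, see the C# Tesseract OCR example.

Finally, the extracted text is stored in the imageText variable for display. The library reads images of English text without any additional configuration. To see how to use other languages, refer to the code examples page.

Blazor Frontend UI Component Source Code

Now, create the user interface for the application. Open the OCR.razor file and write the following code: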

@page "/IronOCR"
@inherits OCRModel

<h2>Optical Character Recognition (OCR) Using Blazor and IronOCR Software</h2>

<div class="row">
    <div class="col-md-5">
        <textarea disabled class="form-control" rows="10" cols="15">@imageText</textarea>
    </div>
    <div class="col-md-5">
        <div class="image-container">
            <img class="preview-image" width="800" height="500" src="@imagePreview">
        </div>
        <BlazorInputFile.InputFile OnChange="@ViewImage" />
        <p>@status</p>
        <hr />
        <button class="btn btn-primary btn-lg" @onclick="GetText">
            Extract Text
        </button>
    </div>
</div>

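In the code above, the UI contains a file input for selecting an image file and an image tag for displaying the image. Below the input is a button that triggers the GetText method. A text area displays the text extracted from the image data.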
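The image tag shows the upload through a data URI, a string that embeds the Base64-encoded bytes directly in the src attribute. As a minimal sketch of how such a string is built: ToDataUri is a hypothetical helper introduced only for illustration, and passing the file's actual content type (rather than a fixed one) is an assumption of this sketch.

```csharp
using System;

// Hypothetical helper: builds a data URI that an <img> tag can use as its src.
static string ToDataUri(byte[] bytes, string contentType)
{
    return $"data:{contentType};base64,{Convert.ToBase64String(bytes)}";
}

// The first four bytes of the PNG signature, used here as stand-in image data.
byte[] sample = { 0x89, 0x50, 0x4E, 0x47 };

Console.WriteLine(ToDataUri(sample, "image/png"));
// prints "data:image/png;base64,iVBORw=="
```

Because the whole image travels inside the rendered markup, this approach suits small previews such as the 4 MB limit used here.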

Add a Link to the Navigation Menu

Finally, add a link to the OCR.razor page in the NavMenu.razor file under the Shared folder.

<div class="nav-item px-3">
    <NavLink class="nav-link" href="IronOCR">
        <span class="oi oi-plus" aria-hidden="true"></span> Read Image Text
    </NavLink>
</div>

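Remove the links to Counter and FetchData, as they are not needed.

Everything is now in place. Press F5 to run the application.

The frontend should look like this:

Figure 9: The user interface of the Blazor Server App

Let's upload an image and extract its text to see the output.

Figure 10: The uploaded image and the extracted text

The output text is clean and can be copied from the text area.

Summary

This article showed how to create a Blazor UI component with code-behind in a Blazor Server App to read text from images. IronOCR is a versatile library for extracting text in any C#-based application. It supports the latest .NET versions and works well with Razor applications. IronOCR is cross-platform: it runs on Windows, Linux, and macOS, and it also works with MAUI. In addition, IronOCR builds on the best results from Tesseract to deliver high accuracy without any additional setup. It supports PDF files and all popular image formats, and it can also read barcode values from images.

You can also try IronOCR with a free trial by downloading the library.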

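Kannapat Udonpant

Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join the engineering team at Iron Software, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the code used in IronPDF. Besides peer learning, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.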
