How to Use Async and Multithreading

Chipego

November 14, 2023

Updated December 10, 2024

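In the ever-changing landscape of software development, processing large volumes of text data efficiently remains a key challenge. This article explores how async support and multithreading work together in the context of IronOCR and Tesseract. Asynchronous programming introduces a non-blocking paradigm, so the application stays responsive while OCR tasks run. Multithreading, in turn, lets recognition work run in parallel, which can significantly improve the performance of text recognition operations. Together, these techniques help developers improve the efficiency and responsiveness of OCR-powered applications.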

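How to Use Async and Multithreading with Tesseract

Download a C# library that supports Tesseract with async and multithreading
Utilize the multithreading managed by IronOCR
Prepare the PDF document and images for reading
Use the OcrReadTask object to take advantage of asynchronous concurrency
Use the ReadAsync method for ease of use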

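Understanding Multithreading

In IronOCR, image processing and OCR reading are multithreaded out of the box, with no specialized API required from the developer. IronTesseract automatically uses all available threads across multiple cores, making use of system resources for fast, responsive OCR execution. This built-in multithreading simplifies development and significantly improves performance.

A multithreaded read is therefore as simple as this: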

:path=/static-assets/ocr/content-code-examples/how-to/async-simple-multithreading.cs

using IronOcr;
using System;

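// IronTesseract handles multithreading internally; no extra setup is needed
var ocr = new IronTesseract();

// The using block disposes the PDF input once reading is done
using (var input = new OcrPdfInput(@"example.pdf"))
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}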

Imports IronOcr
Imports System

Dim ocr = New IronTesseract()

Using input = New OcrPdfInput("example.pdf")
	Dim result = ocr.Read(input)
	Console.WriteLine(result.Text)
End Using
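
To make the idea of spreading recognition work across cores more concrete, here is a minimal sketch that uses only the .NET standard library. It is not IronOCR's internal implementation, which the article does not describe; `RecognizePage` and `pageCount` are hypothetical names invented for this illustration, and the `Thread.Sleep` call merely stands in for real recognition work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for recognizing one page (not an IronOCR API).
static string RecognizePage(int pageNumber)
{
    Thread.Sleep(200); // pretend each page takes 200 ms of work
    return $"page {pageNumber} text";
}

int pageCount = 8; // assumed page count for the illustration
var pageTexts = new string[pageCount];

// Parallel.For lets the runtime schedule iterations across thread-pool threads.
Parallel.For(0, pageCount, i =>
{
    pageTexts[i] = RecognizePage(i + 1); // each iteration writes to its own slot
});

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
Console.WriteLine(string.Join(Environment.NewLine, pageTexts));
```

With IronTesseract none of this scheduling code is needed, since the library manages its own threads; the sketch only shows the kind of work that is being taken off the developer's hands.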

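Understanding Async Support

In optical character recognition (OCR), asynchronous programming, or "async", plays a key role in keeping applications responsive. Async support lets developers run OCR tasks without blocking the main thread. Consider processing a large document or image for text recognition: with async support, the system can continue handling other tasks while the OCR operation is in progress.

This section looks at async support in IronOCR and shows several ways to make your OCR calls non-blocking.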
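
As a library-independent illustration of the non-blocking pattern described here, the following sketch starts a slow operation, continues with other work, and awaits the result only when it is needed. `RecognizeText` and the file name `scan.png` are hypothetical stand-ins, not IronOCR APIs; in a real application the slow call would be an OCR read.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a slow, blocking OCR call (not an IronOCR API).
static string RecognizeText(string path)
{
    Thread.Sleep(500); // pretend recognition takes half a second
    return $"text from {path}";
}

// Start the blocking work on a thread-pool thread; the caller is not blocked.
Task<string> ocrTask = Task.Run(() => RecognizeText("scan.png"));

// The calling thread is free to do other work in the meantime.
Console.WriteLine("Doing other work while OCR runs...");

// Await only at the point where the result is needed.
string text = await ocrTask;
Console.WriteLine(text);
```

The key design choice is the gap between starting the task and awaiting it: anything placed in that gap runs while the recognition is still in progress.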

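Using the OcrReadTask Object

When working with IronOCR, OcrReadTask objects give you more control and flexibility over the OCR process. These objects encapsulate an OCR operation, allowing developers to manage text recognition tasks explicitly. This section provides an example of using an OcrReadTask object in an IronOCR workflow, showing how to start a task and retrieve its result. Whether you are orchestrating complex document processing or fine-tuning the responsiveness of an OCR-based application, OcrReadTask objects help you get the most out of IronOCR.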

:path=/static-assets/ocr/content-code-examples/how-to/async-ocrtask.cs

using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();

OcrPdfInput largePdf = new OcrPdfInput("chapter1.pdf");

Func<OcrResult> reader = () =>
{
    return ocr.Read(largePdf);
};

OcrReadTask readTask = new OcrReadTask(reader.Invoke);
// Start the OCR task asynchronously
readTask.Start();

// Continue with other tasks while OCR is in progress
DoOtherTasks();

// Wait for the OCR task to complete and retrieve the result
OcrResult result = await Task.Run(() => readTask.Result);

Console.Write($"##### OCR RESULTS ###### \n {result.Text}");

largePdf.Dispose();
readTask.Dispose();

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}

Imports Microsoft.VisualBasic
Imports IronOcr

Dim ocr As New IronTesseract()

Dim largePdf As New OcrPdfInput("chapter1.pdf")

Dim reader As Func(Of OcrResult) = Function()
	Return ocr.Read(largePdf)
End Function

Dim readTask As New OcrReadTask(AddressOf reader.Invoke)
' Start the OCR task asynchronously
readTask.Start()

' Continue with other tasks while OCR is in progress
DoOtherTasks()

' Wait for the OCR task to complete and retrieve the result
Dim result As OcrResult = Await Task.Run(Function() readTask.Result)

Console.Write($"##### OCR RESULTS ###### " & vbLf & $" {result.Text}")

largePdf.Dispose()
readTask.Dispose()

' Helper method; declare it at module or class level
Sub DoOtherTasks()
	' Simulate other tasks being performed while OCR is in progress
	Console.WriteLine("Performing other tasks...")
	System.Threading.Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
End Sub

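Using the Async Method

ReadAsync() provides a simple, intuitive way to start an OCR operation asynchronously. Developers can integrate asynchronous OCR into their applications without complex thread handling or task management. The method frees the main thread from blocking on the OCR task, so the application stays responsive.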

:path=/static-assets/ocr/content-code-examples/how-to/async-read-async.cs

using IronOcr;
using System;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();

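using (OcrPdfInput largePdf = new OcrPdfInput("PDFs/example.pdf"))
{
    // Start the OCR read without blocking
    var readTask = ocr.ReadAsync(largePdf);
    // Continue with other tasks while OCR is in progress
    DoOtherTasks();
    // Await the result only when it is needed
    var result = await readTask;
    Console.Write($"##### OCR RESULTS ###### " +
                $"\n {result.Text}");
}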

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    System.Threading.Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}

Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Threading.Tasks

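Dim ocr As New IronTesseract()

Using largePdf As New OcrPdfInput("PDFs/example.pdf")
	' Start the OCR read without blocking
	Dim readTask = ocr.ReadAsync(largePdf)
	' Continue with other tasks while OCR is in progress
	DoOtherTasks()
	' Await the result only when it is needed
	Dim result = Await readTask
	Console.Write($"##### OCR RESULTS ###### " & vbLf & $" {result.Text}")
End Using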

' Helper method; declare it at module or class level
Sub DoOtherTasks()
	' Simulate other tasks being performed while OCR is in progress
	Console.WriteLine("Performing other tasks...")
	System.Threading.Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
End Sub
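
When several documents need to be read, the same awaitable pattern can be fanned out. The sketch below is an illustration under stated assumptions rather than a confirmed IronOCR recipe: `ReadDocumentAsync` is a hypothetical stand-in for an asynchronous read, and the file names are examples only. Whether a single IronTesseract instance can safely serve several concurrent reads is not stated in this article, so check the library documentation before swapping in real calls.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for an asynchronous OCR read (not an IronOCR API).
static async Task<string> ReadDocumentAsync(string path)
{
    await Task.Delay(300); // simulate recognition time without blocking a thread
    return $"text from {path}";
}

string[] paths = { "chapter1.pdf", "chapter2.pdf", "chapter3.pdf" }; // example names

// Start every read first, so the operations overlap rather than run one by one.
Task<string>[] reads = paths.Select(p => ReadDocumentAsync(p)).ToArray();

// Task.WhenAll completes once every read has finished; results keep input order.
string[] texts = await Task.WhenAll(reads);

for (int i = 0; i < paths.Length; i++)
{
    Console.WriteLine($"{paths[i]}: {texts[i]}");
}
```

Because all reads are started before the first await, the total wait is close to the slowest single read rather than the sum of all of them.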

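Conclusion

In summary, the multithreading built into IronOCR is a significant advantage when optimizing OCR tasks. IronOCR's inherent multithreading, combined with easy-to-use methods such as ReadAsync(), simplifies the processing of large volumes of text data. Together they keep your application responsive and efficient, making IronOCR a strong tool for building high-performance software with streamlined text recognition.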

Chipego

Software Engineer

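Chipego has a natural skill for listening that helps him understand customer issues and offer intelligent solutions. He joined the Iron Software team in 2023 after earning a Bachelor of Science in Information Technology. IronPDF and IronOCR are the two products Chipego focuses on, but his knowledge of all the products grows daily as he finds new ways to support customers. He enjoys how collaborative life is at Iron Software, with team members from across the company bringing their varied experience together to create effective, innovative solutions. When Chipego is away from his desk, he can often be found enjoying a good book or playing football.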