How to use Async and Multithreading

by Chipego Kalinda

Processing large volumes of text efficiently is a common challenge in OCR applications. This article shows how to use async support and multithreading in IronOCR with Tesseract. Asynchronous calls keep an application responsive while OCR runs, and multithreading lets recognition work run in parallel across CPU cores. The sections below cover both techniques and show how to apply them in your own OCR-powered applications.


C# NuGet Library for OCR

Install with NuGet

Install-Package IronOcr
or download the C# OCR DLL and manually install it into your project.

Understanding Multithreading

IronOCR multithreads image processing and OCR reading automatically, so developers do not need a specialized API. IronTesseract uses all available threads across multiple cores, which makes full use of system resources and keeps OCR execution fast. Because the parallelism is built in, no extra threading code is required.

A multithreaded read therefore looks the same as an ordinary read:

C#:

using IronOcr;
using System;

// IronTesseract spreads the work across available cores internally
var ocr = new IronTesseract();

using (var input = new OcrPdfInput(@"example.pdf"))
{
    // A plain Read() call; no threading code is needed
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

VB:

Imports IronOcr
Imports System

' Place the statements below inside a method; VB has no top-level statements
Dim ocr = New IronTesseract()

Using input = New OcrPdfInput("example.pdf")
	' A plain Read() call; no threading code is needed
	Dim result = ocr.Read(input)
	Console.WriteLine(result.Text)
End Using

Understanding Async Support

In Optical Character Recognition (OCR), asynchronous programming ("async") lets an application run OCR tasks without blocking the main thread, so it stays responsive. When a large document or image is being processed for text recognition, the application can keep handling other work while the OCR operation is underway.

This section shows two ways to make OCR calls non-blocking in IronOCR: OcrReadTask objects and the ReadAsync() method.
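Before turning to the IronOCR calls, the non-blocking idea itself can be sketched with plain .NET tasks. This is only an illustration: SlowRecognize is a hypothetical stand-in for a slow OCR call and is not part of IronOCR, and the file name is made up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a slow OCR call; it is not part of IronOCR.
static string SlowRecognize(string documentName)
{
    Thread.Sleep(500); // Simulate 500 ms of recognition work
    return $"text of {documentName}";
}

// Start the slow work on a thread-pool thread; this call returns immediately.
Task<string> ocrTask = Task.Run(() => SlowRecognize("chapter1.pdf"));

// The calling thread is free to do other work in the meantime.
Console.WriteLine("Doing other work while recognition runs...");

// Await only when the result is actually needed.
string text = await ocrTask;
Console.WriteLine(text);
```

The key point is the gap between starting the task and awaiting it: anything placed in that gap runs while the slow work is still in progress.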

Using An OcrReadTask Object

An OcrReadTask object encapsulates an OCR operation, which gives you more control over when the work starts and when its result is collected. The example below wraps a Read() call in an OcrReadTask, starts it, carries on with other work, and then retrieves the result once the task completes. This pattern suits both complex document processing and applications that need to stay responsive during OCR.

C#:

using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();

OcrPdfInput largePdf = new OcrPdfInput("chapter1.pdf");

// Wrap the blocking Read() call in a delegate
Func<OcrResult> reader = () =>
{
    return ocr.Read(largePdf);
};

OcrReadTask readTask = new OcrReadTask(reader.Invoke);
// Start the OCR task asynchronously
readTask.Start();

// Continue with other tasks while OCR is in progress
DoOtherTasks();

// Wait for the OCR task to complete and retrieve the result
OcrResult result = await Task.Run(() => readTask.Result);

Console.Write($"##### OCR RESULTS ###### \n {result.Text}");

// Release the task and the input once the result has been read
readTask.Dispose();
largePdf.Dispose();

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}

VB:

Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Threading
Imports System.Threading.Tasks

' Place the statements below inside an Async method; VB has no top-level statements

' Simulate other tasks being performed while OCR is in progress
Dim doOtherTasks As Action = Sub()
	Console.WriteLine("Performing other tasks...")
	Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
End Sub

Dim ocr As New IronTesseract()

Dim largePdf As New OcrPdfInput("chapter1.pdf")

' Wrap the blocking Read() call in a delegate
Dim reader As Func(Of OcrResult) = Function()
	Return ocr.Read(largePdf)
End Function

Dim readTask As New OcrReadTask(AddressOf reader.Invoke)
' Start the OCR task asynchronously
readTask.Start()

' Continue with other tasks while OCR is in progress
doOtherTasks()

' Wait for the OCR task to complete and retrieve the result
Dim result As OcrResult = Await Task.Run(Function() readTask.Result)

Console.Write($"##### OCR RESULTS ###### {vbLf} {result.Text}")

' Release the task and the input once the result has been read
readTask.Dispose()
largePdf.Dispose()

Use Async Methods

ReadAsync() starts an OCR operation asynchronously without any manual threading or task management. Because the call does not block, the main thread stays free for other work and the application remains responsive. To overlap other work with the read, start ReadAsync() first and await its result only when you need it.

C#:

using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();

using (OcrPdfInput largePdf = new OcrPdfInput("PDFs/example.pdf"))
{
    // Start the read without awaiting it yet
    var readTask = ocr.ReadAsync(largePdf);

    // Other work runs while OCR is in progress
    DoOtherTasks();

    // Await the result once it is needed
    var result = await readTask;
    Console.Write($"##### OCR RESULTS ###### \n {result.Text}");
}

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}

VB:

Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Threading
Imports System.Threading.Tasks

' Place the statements below inside an Async method; VB has no top-level statements

' Simulate other tasks being performed while OCR is in progress
Dim doOtherTasks As Action = Sub()
	Console.WriteLine("Performing other tasks...")
	Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
End Sub

Dim ocr As New IronTesseract()

Using largePdf As New OcrPdfInput("PDFs/example.pdf")
	' Start the read without awaiting it yet
	Dim readTask = ocr.ReadAsync(largePdf)

	' Other work runs while OCR is in progress
	doOtherTasks()

	' Await the result once it is needed
	Dim result = Await readTask
	Console.Write($"##### OCR RESULTS ###### {vbLf} {result.Text}")
End Using
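When several documents need to be read, the same start-then-await idea extends to a batch. The sketch below uses only plain .NET: FakeReadAsync is a hypothetical stand-in for an asynchronous OCR read, and the file names are invented for illustration. In a real application the stand-in would presumably be replaced by a call such as ocr.ReadAsync(...), but whether overlapping reads on one IronTesseract instance are supported is an assumption to check against the IronOCR documentation.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for an asynchronous OCR read; not an IronOCR API.
static async Task<string> FakeReadAsync(string path)
{
    await Task.Delay(200); // Simulate recognition time
    return $"text of {path}";
}

// Hypothetical file names, used only for illustration.
string[] paths = { "chapter1.pdf", "chapter2.pdf", "chapter3.pdf" };

// Start every read first so that they can overlap.
Task<string>[] reads = paths.Select(p => FakeReadAsync(p)).ToArray();

// Wait for all of them; Task.WhenAll returns results in input order.
string[] texts = await Task.WhenAll(reads);

for (int i = 0; i < paths.Length; i++)
{
    Console.WriteLine($"{paths[i]}: {texts[i]}");
}
```

Starting all the tasks before awaiting any of them is what allows the reads to overlap; awaiting each one inside the loop that creates it would run them one after another.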

Conclusion

In summary, IronOCR multithreads OCR work automatically, and OcrReadTask objects and the ReadAsync() method let you run reads without blocking the main thread. Together, these features simplify the handling of large volumes of text and keep your applications responsive and efficient.

Chipego

Software Engineer

Chipego has a natural skill for listening that helps him comprehend customer issues and offer intelligent solutions. He joined the Iron Software team in 2023, after studying for a Bachelor of Science in Information Technology. IronPDF and IronOCR are the two products Chipego has been focusing on, but his knowledge of all products is growing daily as he finds new ways to support customers. He enjoys how collaborative life is at Iron Software, with team members from across the company bringing their varied experience to contribute to effective, innovative solutions. When Chipego is away from his desk, he can often be found enjoying a good book or playing football.