How to use Async and Multithreading

Processing large volumes of text efficiently is a common challenge in OCR applications. This article covers Async Support and Multithreading in the context of IronOCR and Tesseract. Asynchronous programming keeps an application responsive by not blocking the calling thread while OCR runs, and multithreading uses parallelism to speed up the text recognition itself. The sections below show how to apply both techniques in OCR-powered applications.

Understanding Multithreading

IronOCR multithreads image processing and OCR reading automatically, so developers do not need a specialized API. IronTesseract uses all available threads across multiple cores, which keeps OCR execution fast and responsive without any extra code.

Here is what a multithreaded read might look like in C#:

// Importing the necessary namespaces.
using IronOcr;
using System;

// Instantiate IronTesseract, the OCR engine used for recognizing text.
var ocr = new IronTesseract();

// Using a using statement for OcrInput ensures proper resource management.
// OcrInput is used to process various formats, including PDFs.
using (var input = new OcrInput(@"example.pdf")) // Specifying the path of the input PDF file.
{
    // Read the PDF input to extract text.
    var result = ocr.Read(input);
    
    // Output the extracted text result to the console.
    Console.WriteLine(result.Text);
}

The equivalent in VB.NET:

' Importing the necessary namespaces.
Imports IronOcr
Imports System

Module Program
	Sub Main()
		' Instantiate IronTesseract, the OCR engine used for recognizing text.
		Dim ocr = New IronTesseract()

		' The Using block disposes the OcrInput when the read is finished.
		' OcrInput is used to process various formats, including PDFs.
		Using input = New OcrInput("example.pdf") ' Path of the input PDF file.
			' Read the PDF input to extract text.
			Dim result = ocr.Read(input)

			' Output the extracted text result to the console.
			Console.WriteLine(result.Text)
		End Using
	End Sub
End Module

Understanding Async Support

In Optical Character Recognition (OCR), asynchronous programming ("async") lets developers run OCR tasks without blocking the main thread, so the application stays responsive. When large documents or images are being processed for text recognition, the application can continue handling other work while the OCR operations are underway.

This section shows different ways to make OCR calls non-blocking with IronOCR.

Using An OcrReadTask Object

OcrReadTask objects encapsulate OCR operations, giving developers more control over when a text recognition task starts and when its result is collected. In the example below, the read runs as a Task<OcrResult> on a background thread, the program continues with other work, and the result is retrieved once the task completes.

using System;
using System.Threading;
using System.Threading.Tasks;
using IronOcr;

// Create an instance of IronTesseract for OCR processing
IronTesseract ocr = new IronTesseract();

// Load a large PDF; the using declaration disposes it when the program exits
using var largePdf = new OcrPdfInput("chapter1.pdf");

// Start the OCR read on a thread-pool thread and keep a handle to the task
Task<OcrResult> ocrTask = Task.Run(() => ocr.Read(largePdf));

// Continue with other work while OCR is in progress
DoOtherTasks();

// Await the OCR task to get the result once processing completes
OcrResult result = await ocrTask;

// Print the OCR results
Console.WriteLine($"##### OCR RESULTS ######\n{result.Text}");

// Simulates other work being performed while the OCR is in progress
static void DoOtherTasks()
{
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulate 2000 milliseconds of work
}

The equivalent in VB.NET:

Imports Microsoft.VisualBasic
Imports System
Imports System.Threading
Imports System.Threading.Tasks
Imports IronOcr

Module Program
	Sub Main()
		' Create an instance of IronTesseract for OCR processing
		Dim ocr As New IronTesseract()

		' Load a large PDF; the Using block disposes it when the work is done
		Using largePdf As New OcrPdfInput("chapter1.pdf")
			' Define a function that reads the PDF with the OCR engine and returns the result
			Dim reader As Func(Of OcrResult) = Function() ocr.Read(largePdf)

			' Start the OCR read on a thread-pool thread
			Dim ocrTask As Task(Of OcrResult) = Task.Run(reader)

			' Continue with other work while OCR is in progress
			DoOtherTasks()

			' Block until the OCR task completes, then take its result
			Dim result As OcrResult = ocrTask.Result

			' Print the OCR results
			Console.WriteLine("##### OCR RESULTS ######" & vbLf & result.Text)
		End Using
	End Sub

	' Simulates other work being performed while the OCR is in progress
	Sub DoOtherTasks()
		Console.WriteLine("Performing other tasks...")
		Thread.Sleep(2000) ' Simulate 2000 milliseconds of work
	End Sub
End Module
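
The idea of continuing with other work while a read is in progress can also be sketched without IronOCR at all. The example below is only an illustration using standard .NET types: `SlowRecognize` is a hypothetical stand-in for a slow synchronous call such as `ocr.Read(input)`, and `ReadWhileWorkingAsync` is an assumed helper name, not part of any library. It starts the stand-in on a thread-pool thread and performs small units of other work until the task reports completion.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundReadSketch
{
    // Hypothetical stand-in for a slow, synchronous OCR call such as ocr.Read(input).
    public static string SlowRecognize(string documentName)
    {
        Thread.Sleep(300); // Simulate recognition time
        return $"text of {documentName}";
    }

    // Runs the stand-in on a thread-pool thread and performs small units of
    // other work until the background task reports completion.
    public static async Task<(string Text, int UnitsOfOtherWork)> ReadWhileWorkingAsync(string documentName)
    {
        Task<string> readTask = Task.Run(() => SlowRecognize(documentName));

        int units = 0;
        while (!readTask.IsCompleted)
        {
            await Task.Delay(50); // One unit of simulated other work
            units++;
        }

        // The task has already finished, so this await returns its result
        // immediately (or rethrows an exception thrown by the stand-in).
        string text = await readTask;
        return (text, units);
    }

    public static async Task Main()
    {
        var (text, units) = await ReadWhileWorkingAsync("chapter1.pdf");
        Console.WriteLine($"{text} (units of other work completed: {units})");
    }
}
```

Checking `IsCompleted` in a loop suits cases where the other work comes in small pieces. When there is nothing else to do, awaiting the task directly is simpler.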

Use Async Methods

ReadAsync() starts an OCR operation asynchronously without manual thread or task management. Because the call does not block, the main thread stays free while the read is in progress, and the application remains responsive.

using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

// Instantiate IronTesseract, which will be used to recognize text from PDFs
IronTesseract ocr = new IronTesseract();

// Asynchronous method that encapsulates the OCR process
async Task PerformOcrAsync()
{
    // Open the PDF file using OcrPdfInput and ensure proper disposal with 'using'
    using (OcrPdfInput largePdf = new OcrPdfInput("PDFs/example.pdf"))
    {
        // Start the read with ReadAsync without awaiting it yet
        var ocrTask = ocr.ReadAsync(largePdf);

        // Perform other work while OCR is being processed
        DoOtherTasks();

        // Await the result once the other work is done
        var result = await ocrTask;

        // Output the OCR results to the console
        Console.WriteLine("##### OCR RESULTS #####");
        Console.WriteLine(result.Text);
    }
}

// Run the OCR method and wait for it to complete
await PerformOcrAsync();

// Simulates other work being performed while OCR is in progress
static void DoOtherTasks()
{
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulate 2000 milliseconds of work
}

The equivalent in VB.NET:

Imports IronOcr
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module Program
	' Instantiate IronTesseract, which will be used to recognize text from PDFs
	Private ReadOnly ocr As New IronTesseract()

	Sub Main()
		' Main is not asynchronous here, so block until the async work completes
		PerformOcrAsync().GetAwaiter().GetResult()
	End Sub

	' Asynchronous function that encapsulates the OCR process
	Async Function PerformOcrAsync() As Task
		' Open the PDF file using OcrPdfInput and ensure proper disposal with 'Using'
		Using largePdf As New OcrPdfInput("PDFs/example.pdf")
			' Start the read with ReadAsync without awaiting it yet
			Dim ocrTask = ocr.ReadAsync(largePdf)

			' Perform other work while OCR is being processed
			DoOtherTasks()

			' Await the result once the other work is done
			Dim result = Await ocrTask

			' Output the OCR results to the console
			Console.WriteLine("##### OCR RESULTS #####")
			Console.WriteLine(result.Text)
		End Using
	End Function

	' Simulates other work being performed while OCR is in progress
	Sub DoOtherTasks()
		Console.WriteLine("Performing other tasks...")
		Thread.Sleep(2000) ' Simulate 2000 milliseconds of work
	End Sub
End Module
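
When several documents need to be read, the same asynchronous pattern extends to starting all reads first and awaiting them together. The sketch below uses only standard .NET types: `RecognizeAsync` is a hypothetical stand-in for an asynchronous call such as `ocr.ReadAsync(input)`, and the file names are placeholders. Whether a single IronTesseract instance can serve several overlapping ReadAsync() calls is an assumption that should be checked against the IronOCR documentation before adapting this.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentReadsSketch
{
    // Hypothetical stand-in for an asynchronous OCR call such as ocr.ReadAsync(input).
    public static async Task<string> RecognizeAsync(string documentName)
    {
        await Task.Delay(200); // Simulate recognition time
        return $"text of {documentName}";
    }

    // Starts one read per document, then awaits all of them together.
    public static async Task<string[]> ReadAllAsync(string[] documentNames)
    {
        // ToArray() forces every read to start before any of them is awaited
        Task<string>[] reads = documentNames
            .Select(name => RecognizeAsync(name))
            .ToArray();

        // Task.WhenAll returns the results in the same order as the input tasks
        return await Task.WhenAll(reads);
    }

    public static async Task Main()
    {
        string[] texts = await ReadAllAsync(new[] { "a.pdf", "b.pdf", "c.pdf" });
        foreach (string text in texts)
        {
            Console.WriteLine(text);
        }
    }
}
```

Because every read is started before the first await, the simulated reads overlap instead of running one after another.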

Conclusion

In summary, IronOCR's built-in multithreading speeds up OCR tasks without extra code, and methods like ReadAsync() keep applications responsive while large volumes of text data are processed. Used together, they make IronOCR well suited to high-performance applications that rely on text recognition.

Frequently Asked Questions

What is the advantage of using Async Support in IronOCR?

Async Support allows developers to execute OCR tasks without blocking the main thread, ensuring the application remains responsive, especially when processing large documents or images.

How does multithreading improve OCR performance in IronOCR?

Multithreading optimizes system resources by leveraging all available threads across multiple cores, boosting the performance and responsiveness of OCR operations.

What is an OcrReadTask object in IronOCR?

An OcrReadTask object encapsulates OCR operations, providing developers with enhanced control and flexibility to efficiently manage text recognition tasks.

How do I initiate an asynchronous OCR operation in IronOCR?

You can initiate an asynchronous OCR operation using the ReadAsync() method, which allows OCR tasks to run in the background, freeing the main thread.

Can IronOCR handle OCR tasks on multiple cores?

Yes, IronOCR automatically uses all available cores for OCR tasks, optimizing processing speed and resource utilization.

Is a specialized API required for multithreading in IronOCR?

No, IronOCR simplifies development by automatically handling multithreading, eliminating the need for a specialized API.

How does ReadAsync() enhance application responsiveness?

ReadAsync() allows OCR tasks to be non-blocking, meaning the application can continue to perform other tasks while waiting for OCR operations to complete.

What is the benefit of using ReadAsync() over traditional OCR methods?

ReadAsync() provides a straightforward mechanism for asynchronous OCR, enhancing application efficiency without the need for complex task management.

Can IronOCR be used for both PDFs and images?

Yes, IronOCR can process both PDFs and images for text recognition, utilizing its multithreading and async capabilities.

Does IronOCR support non-blocking OCR operations?

Yes, by using async programming, IronOCR supports non-blocking OCR operations, allowing applications to remain agile and responsive.

Chipego
Software Engineer
Chipego has a natural skill for listening that helps him to comprehend customer issues, and offer intelligent solutions. He joined the Iron Software team in 2023, after studying a Bachelor of Science in Information Technology. IronPDF and IronOCR are the two products Chipego has been focusing on, but his knowledge of all products is growing daily, as he finds new ways to support customers. He enjoys how collaborative life is at Iron Software, with team members from across the company bringing their varied experience to contribute to effective, innovative solutions. When Chipego is away from his desk, he can often be found enjoying a good book or playing football.