How to use Async and Multithreading
Processing large volumes of text efficiently is a recurring challenge in OCR applications. This article shows how to use async support and multithreading with IronOCR and Tesseract. Asynchronous programming is non-blocking, so an application stays responsive while OCR tasks run. Multithreading uses parallelism across cores to speed up text recognition. The sections below show how to combine both techniques in an OCR-powered application.
How to use Async and Multithreading with Tesseract
- Download a C# library that supports Tesseract with async and multithreading
- Utilize multithreading managed by IronOCR
- Prepare the PDF document and image for reading
- Employ the OcrReadTask Object to take advantage of asynchronous concurrency
- Use the ReadAsync method for ease of use
Understanding Multithreading
In IronOCR, image processing and OCR reading are multithreaded out of the box, so developers do not need a specialized API. IronTesseract automatically uses all available threads across multiple cores, which simplifies development and speeds up OCR execution.
A multithreaded read therefore needs no extra code:
:path=/static-assets/ocr/content-code-examples/how-to/async-simple-multithreading.cs
using IronOcr;
using System;

var ocr = new IronTesseract();

using (var input = new OcrPdfInput(@"example.pdf"))
{
    // Read() is multithreaded internally; no extra threading code is needed
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
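' VB.NET equivalent
Imports IronOcr
Imports System

Module Program
    Sub Main()
        Dim ocr As New IronTesseract()

        Using input As New OcrPdfInput("example.pdf")
            ' Read() is multithreaded internally; no extra threading code is needed
            Dim result = ocr.Read(input)
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Module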
Understanding Async Support
In Optical Character Recognition (OCR), asynchronous programming, or "async," lets an application execute OCR tasks without blocking the main thread, so it remains responsive. When large documents or images are being processed, the application can keep handling other work while OCR operations are underway.
This section shows two ways to make OCR calls non-blocking in IronOCR: an OcrReadTask object and the ReadAsync method.
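Before turning to the IronOCR calls, the non-blocking idea itself can be sketched with plain .NET tasks. The snippet below does not use IronOCR at all: `SlowRecognize` is a hypothetical stand-in that only sleeps and returns a string, and it is there purely to show the caller staying free while slow work runs in the background.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Start the slow work on a thread-pool thread; Task.Run returns immediately.
Task<string> pending = Task.Run(() => SlowRecognize("page-1"));

// The calling thread is free while the background work runs.
Console.WriteLine("Still responsive...");

// Execution resumes here once the result is ready.
string text = await pending;
Console.WriteLine(text);

// Hypothetical stand-in for a slow OCR call: it sleeps, then returns text.
static string SlowRecognize(string name)
{
    Thread.Sleep(500);
    return $"recognized text of {name}";
}
```

The same shape (start the work, do something else, then await the result) is what the IronOCR examples in this section follow.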
Using An OcrReadTask Object
An OcrReadTask object encapsulates an OCR operation, giving you control over when the read starts and when its result is collected. This is useful both when orchestrating complex document processing and when fine-tuning the responsiveness of an OCR-powered application. The example wraps a Read call in an OcrReadTask, starts it, performs other work, and then retrieves the result.
:path=/static-assets/ocr/content-code-examples/how-to/async-ocrtask.cs
using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();
OcrPdfInput largePdf = new OcrPdfInput("chapter1.pdf");

// Wrap the blocking Read call so it can be handed to an OcrReadTask
Func<OcrResult> reader = () =>
{
    return ocr.Read(largePdf);
};

OcrReadTask readTask = new OcrReadTask(reader.Invoke);

// Start the OCR task asynchronously
readTask.Start();

// Continue with other tasks while OCR is in progress
DoOtherTasks();

// Wait for the OCR task to complete and retrieve the result
OcrResult result = await Task.Run(() => readTask.Result);
Console.Write($"##### OCR RESULTS ###### \n {result.Text}");

// Release the input and the task once the result has been read
largePdf.Dispose();
readTask.Dispose();

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}
' VB.NET equivalent
Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module Program
    Sub Main()
        ' VB.NET has no Async Main, so block on the async entry point here
        MainAsync().GetAwaiter().GetResult()
    End Sub

    Private Async Function MainAsync() As Task
        Dim ocr As New IronTesseract()
        Dim largePdf As New OcrPdfInput("chapter1.pdf")
        Dim reader As Func(Of OcrResult) = Function() ocr.Read(largePdf)
        Dim readTask As New OcrReadTask(AddressOf reader.Invoke)

        ' Start the OCR task asynchronously
        readTask.Start()

        ' Continue with other tasks while OCR is in progress
        DoOtherTasks()

        ' Wait for the OCR task to complete and retrieve the result
        Dim result As OcrResult = Await Task.Run(Function() readTask.Result)
        Console.Write("##### OCR RESULTS ###### " & vbLf & $" {result.Text}")

        largePdf.Dispose()
        readTask.Dispose()
    End Function

    Private Sub DoOtherTasks()
        ' Simulate other tasks being performed while OCR is in progress
        Console.WriteLine("Performing other tasks...")
        Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
    End Sub
End Module
Use Async Methods
ReadAsync() provides a straightforward way to start an OCR operation asynchronously, with no manual threading or task management. The main thread is not blocked while OCR runs, so the application remains responsive.
:path=/static-assets/ocr/content-code-examples/how-to/async-read-async.cs
using IronOcr;
using System;
using System.Threading;
using System.Threading.Tasks;

IronTesseract ocr = new IronTesseract();

using (OcrPdfInput largePdf = new OcrPdfInput("PDFs/example.pdf"))
{
    // Start OCR without awaiting yet, so other work can overlap with it
    var readTask = ocr.ReadAsync(largePdf);

    DoOtherTasks();

    // Await the result only when it is actually needed
    var result = await readTask;
    Console.Write($"##### OCR RESULTS ###### \n {result.Text}");
}

static void DoOtherTasks()
{
    // Simulate other tasks being performed while OCR is in progress
    Console.WriteLine("Performing other tasks...");
    Thread.Sleep(2000); // Simulating work for 2000 milliseconds
}
' VB.NET equivalent
Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module Program
    Sub Main()
        ' VB.NET has no Async Main, so block on the async entry point here
        MainAsync().GetAwaiter().GetResult()
    End Sub

    Private Async Function MainAsync() As Task
        Dim ocr As New IronTesseract()

        Using largePdf As New OcrPdfInput("PDFs/example.pdf")
            ' Start OCR without awaiting yet, so other work can overlap with it
            Dim readTask = ocr.ReadAsync(largePdf)

            DoOtherTasks()

            ' Await the result only when it is actually needed
            Dim result = Await readTask
            Console.Write("##### OCR RESULTS ###### " & vbLf & $" {result.Text}")
        End Using
    End Function

    Private Sub DoOtherTasks()
        ' Simulate other tasks being performed while OCR is in progress
        Console.WriteLine("Performing other tasks...")
        Thread.Sleep(2000) ' Simulating work for 2000 milliseconds
    End Sub
End Module
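When several documents need to be read, the same ReadAsync call can be combined with the standard `Task.WhenAll` method. The following is a minimal sketch rather than a recommended pattern: the file names `chapter2.pdf` and `chapter3.pdf` and the helper `ReadTextAsync` are hypothetical, and creating one IronTesseract instance per document is a cautious assumption, not a documented requirement. Because each read is already multithreaded internally, running reads concurrently may or may not improve throughput on a given machine.

```csharp
using IronOcr;
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical input files; replace with real paths.
string[] paths = { "chapter1.pdf", "chapter2.pdf", "chapter3.pdf" };

// Start one asynchronous read per document, then wait for all of them.
// Task.WhenAll returns the results in the same order as the inputs.
string[] texts = await Task.WhenAll(paths.Select(p => ReadTextAsync(p)));

for (int i = 0; i < paths.Length; i++)
{
    Console.WriteLine($"{paths[i]}: {texts[i].Length} characters recognized");
}

// Helper for this sketch (not part of IronOCR): reads one PDF, returns its text.
static async Task<string> ReadTextAsync(string path)
{
    // One engine per document is an assumption made for safety in this sketch.
    var ocr = new IronTesseract();
    using (var input = new OcrPdfInput(path))
    {
        var result = await ocr.ReadAsync(input);
        return result.Text;
    }
}
```

Each input is disposed as soon as its own read completes, so the helper keeps resource lifetimes local to a single document.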
Conclusion
In summary, IronOCR's built-in multithreading, combined with straightforward methods such as ReadAsync(), simplifies the handling of large volumes of text data. Together they keep applications responsive and efficient during text recognition.