Guide to using IronOCR Computer Vision

Introduction

IronOCR uses OpenCV-based computer vision to detect the areas of an image that contain text. This is useful for images with a lot of noise, images with text in many different places, and images where the text is warped. IronOCR first determines where the text regions are and then uses Tesseract to attempt to read those regions.


IronOCR.ComputerVision Installation via NuGet Package

The OpenCV-based methods that perform computer vision are exposed through the regular IronOCR NuGet package, but using them also requires installing the IronOcr.ComputerVision package. You will be prompted to download it if it is not already installed. The package name depends on your platform:

  • Windows: IronOcr.ComputerVision.Windows
  • Linux: IronOcr.ComputerVision.Linux
  • macOS: IronOcr.ComputerVision.MacOS
  • macOS ARM: IronOcr.ComputerVision.MacOS.ARM

Install using the NuGet Package Manager, or paste the following into the Package Manager Console (shown here for Windows; substitute the package name for your platform):

Install-Package IronOcr.ComputerVision.Windows

This command will provide the necessary assemblies to use IronOCR Computer Vision with our model file.
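
Because the package name depends on the operating system and CPU architecture, it can help to confirm which one matches the machine you are building on. The following is a minimal sketch that maps the current runtime to the package names listed above using the standard .NET RuntimeInformation API. PackagePicker, PackageFor, and PackageForCurrentMachine are hypothetical helper names, not part of IronOCR.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PackagePicker
{
    // Maps an OS platform and CPU architecture to one of the package names listed above.
    public static string PackageFor(OSPlatform os, Architecture arch)
    {
        if (os == OSPlatform.Windows) return "IronOcr.ComputerVision.Windows";
        if (os == OSPlatform.Linux) return "IronOcr.ComputerVision.Linux";
        if (os == OSPlatform.OSX)
        {
            return arch == Architecture.Arm64
                ? "IronOcr.ComputerVision.MacOS.ARM"
                : "IronOcr.ComputerVision.MacOS";
        }
        throw new PlatformNotSupportedException($"No package listed for {os}");
    }

    // Detects the platform the current process is running on.
    public static string PackageForCurrentMachine()
    {
        foreach (var os in new[] { OSPlatform.Windows, OSPlatform.Linux, OSPlatform.OSX })
        {
            if (RuntimeInformation.IsOSPlatform(os))
                return PackageFor(os, RuntimeInformation.OSArchitecture);
        }
        throw new PlatformNotSupportedException("Unrecognized operating system");
    }
}
```

Calling PackagePicker.PackageForCurrentMachine() returns the name to pass to Install-Package on the current machine.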

Functionality and API

Code examples are included further down in this tutorial. Here is a general overview of the methods that are currently available:

  • FindTextRegion: Detects regions that contain text elements and instructs Tesseract to search for text only within the area where text was detected.
  • FindMultipleTextRegions: Detects areas that contain text elements and divides the page into separate images based on text regions.
  • GetTextRegions: Scans the image and returns the detected text regions as a `List` of crop areas.

Code Examples

FindTextRegion

FindTextRegion uses computer vision to detect regions that contain text elements on every page of an OcrInput object.

C#:
using System;
using IronOcr;

// Create an instance of the IronTesseract OCR engine
var ocr = new IronTesseract();

try
{
    // Create an OcrInput object to hold the image(s) for processing
    using var input = new OcrInput();

    // Load an image into the OcrInput object
    // Ensure that the file path is correct and the image exists at the given location
    input.AddImage("/path/to/file.png"); // Update the path to the correct image location

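    // Use computer vision to detect the text region on each page,
    // so that Tesseract only searches inside that area
    input.FindTextRegion();

    // Perform OCR on the input image and store the result
    OcrResult result = ocr.Read(input);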

    // Extract the recognized text from the OCR result
    string resultText = result.Text;

    // Output the result text or use it further in the application
    Console.WriteLine(resultText);
}
catch (Exception ex)
{
    // Handle exceptions, e.g., file not found, OCR error, etc.
    Console.WriteLine("An error occurred: " + ex.Message);
}
VB.NET:
Imports System
Imports IronOcr

' Create an instance of the IronTesseract OCR engine
Dim ocr = New IronTesseract()

Try
	' Create an OcrInput object to hold the image(s) for processing
	Using input As New OcrInput()
		' Load an image into the OcrInput object
		' Ensure that the file path is correct and the image exists at the given location
		input.AddImage("/path/to/file.png") ' Update the path to the correct image location

		' Use computer vision to detect the text region on each page,
		' so that Tesseract only searches inside that area
		input.FindTextRegion()

		' Perform OCR on the input image and store the result
		Dim result As OcrResult = ocr.Read(input)

		' Extract the recognized text from the OCR result
		Dim resultText As String = result.Text

		' Output the result text or use it further in the application
		Console.WriteLine(resultText)
	End Using
Catch ex As Exception
	' Handle exceptions, e.g., file not found, OCR error, etc.
	Console.WriteLine("An error occurred: " & ex.Message)
End Try

It can optionally be called with custom parameters:

C#:
// Import System for Console and the IronOcr library for OCR functionality
using System;
using IronOcr;

// Create a new instance of IronTesseract which serves as the OCR engine
var ocr = new IronTesseract();

// Utilize a using statement to ensure the OcrInput object is disposed of properly
// OcrInput manages the image(s) we want for text recognition
using (var input = new OcrInput())
{
    // Add an image file to the OcrInput. Replace the path with your specific image file path
    input.AddImage("/path/file.png");

    // Enhance OCR accuracy by finding a text region with specified settings.
    // Parameter names follow the FindMultipleTextRegions example in this guide:
    // - Scale: How much the image should be resized for OCR processing
    // - DilationAmount: The degree of line thickening in the image
    // - Binarize: Convert the image to black and white
    // - Invert: Invert the colors of the image
    input.FindTextRegion(Scale: 2.0, DilationAmount: 20, Binarize: true, Invert: true);

    // Use the OCR engine to read the text from the specified image
    OcrResult result = ocr.Read(input);

    // Extract and store the recognized text from the OCR result
    string resultText = result.Text;

    // Output the recognized text to the console
    Console.WriteLine(resultText);
}
VB.NET:
' Import System for Console and the IronOcr library for OCR functionality
Imports System
Imports IronOcr

' Create a new instance of IronTesseract which serves as the OCR engine
Dim ocr = New IronTesseract()

' Utilize a using statement to ensure the OcrInput object is disposed of properly
' OcrInput manages the image(s) we want for text recognition
Using input = New OcrInput()
	' Add an image file to the OcrInput. Replace the path with your specific image file path
	input.AddImage("/path/file.png")

	' Enhance OCR accuracy by finding a text region with specified settings.
	' Parameter names follow the FindMultipleTextRegions example in this guide:
	' - Scale: How much the image should be resized for OCR processing
	' - DilationAmount: The degree of line thickening in the image
	' - Binarize: Convert the image to black and white
	' - Invert: Invert the colors of the image
	input.FindTextRegion(Scale:=2.0, DilationAmount:=20, Binarize:=True, Invert:=True)

	' Use the OCR engine to read the text from the specified image
	Dim result As OcrResult = ocr.Read(input)

	' Extract and store the recognized text from the OCR result
	Dim resultText As String = result.Text

	' Output the recognized text to the console
	Console.WriteLine(resultText)
End Using
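
To make the dilation, binarize, and invert settings above more concrete, here is a conceptual sketch of what these operations generally do to a grayscale image stored as a 2D byte array. It is not IronOCR's implementation. PreprocessingSketch and its methods are hypothetical names used only for illustration, and the fixed threshold of 128 is an assumption.

```csharp
using System;

public static class PreprocessingSketch
{
    // Binarize: pixels at or above the threshold become white (255), all others black (0).
    public static byte[,] Binarize(byte[,] gray, byte threshold = 128)
    {
        int height = gray.GetLength(0), width = gray.GetLength(1);
        var output = new byte[height, width];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                output[y, x] = gray[y, x] >= threshold ? (byte)255 : (byte)0;
        return output;
    }

    // Invert: swap light and dark, so dark text on a light page becomes light on dark.
    public static byte[,] Invert(byte[,] image)
    {
        int height = image.GetLength(0), width = image.GetLength(1);
        var output = new byte[height, width];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                output[y, x] = (byte)(255 - image[y, x]);
        return output;
    }

    // Dilate: every pixel takes the brightest value found within `amount` pixels
    // in each direction, which thickens light strokes and merges nearby ones.
    public static byte[,] Dilate(byte[,] image, int amount)
    {
        // A non-positive amount leaves the image unchanged in this sketch.
        if (amount <= 0) return (byte[,])image.Clone();

        int height = image.GetLength(0), width = image.GetLength(1);
        var output = new byte[height, width];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                byte brightest = 0;
                for (int ny = Math.Max(0, y - amount); ny <= Math.Min(height - 1, y + amount); ny++)
                    for (int nx = Math.Max(0, x - amount); nx <= Math.Min(width - 1, x + amount); nx++)
                        brightest = Math.Max(brightest, image[ny, nx]);
                output[y, x] = brightest;
            }
        return output;
    }
}
```

In this sketch, dilating by 1 turns a single light pixel into a 3×3 block. The same effect is what lets individual characters merge into one connected blob that can be outlined as a text region.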

An overload can also return the detected text region as a Rectangle:

C#:
using IronOcr;
using System.Linq;

// Index of the page to analyze
int pageIndex = 0;

// Create a new instance of OcrInput to manage input images for OCR processing
using var input = new OcrInput();

// Load the image into the OcrInput instance
// Replace "/path/file.png" with the actual path to your image
input.AddImage("/path/file.png");

// Select the page to analyze from the loaded input
var selectedPage = input.Pages.ElementAt(pageIndex);

// Assumption: the Rectangle-returning overload is called on a single page.
// 'Scale' resizes the image before detection
// 'Binarize' converts the image to black and white to improve text detection
var textCropArea = selectedPage.FindTextRegion(Scale: 2.0, Binarize: true);
VB.NET:
Imports IronOcr
Imports System.Linq

' Index of the page to analyze
Dim pageIndex As Integer = 0

' Create a new instance of OcrInput to manage input images for OCR processing
Dim input = New OcrInput()

' Load the image into the OcrInput instance
' Replace "/path/file.png" with the actual path to your image
input.AddImage("/path/file.png")

' Select the page to analyze from the loaded input
Dim selectedPage = input.Pages.ElementAt(pageIndex)

' Assumption: the Rectangle-returning overload is called on a single page.
' 'Scale' resizes the image before detection
' 'Binarize' converts the image to black and white to improve text detection
Dim textCropArea = selectedPage.FindTextRegion(Scale:=2.0, Binarize:=True)
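
Once a text region is available as a Rectangle, a common next step is to add a small margin around it and make sure it still fits inside the image before cropping. The sketch below uses System.Drawing.Rectangle as a stand-in; the Rectangle type returned by your IronOCR version may differ, and PadAndClamp is a hypothetical helper rather than an IronOCR method.

```csharp
using System.Drawing;

public static class RegionPadding
{
    // Grows the region by `margin` pixels on every side,
    // then clips the result to the bounds of the image.
    public static Rectangle PadAndClamp(Rectangle region, int margin, int imageWidth, int imageHeight)
    {
        Rectangle padded = Rectangle.Inflate(region, margin, margin);
        Rectangle imageBounds = new Rectangle(0, 0, imageWidth, imageHeight);
        return Rectangle.Intersect(padded, imageBounds);
    }
}
```

For example, padding a region at (5, 5) of size 20×10 by 10 pixels inside a 100×100 image yields a rectangle at (0, 0) of size 35×25, because the left and top margins are clipped at the image edge.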

FindMultipleTextRegions

FindMultipleTextRegions takes all pages of an OcrInput object, uses computer vision to detect areas that contain text elements, and divides the input into separate images based on text regions:

C#:
using IronOcr;

// Initialize an IronTesseract instance.
// IronTesseract is a wrapper around the Tesseract engine, tailored for .NET, to perform OCR operations.
var ocr = new IronTesseract();

// Create an OcrInput instance for input image(s).
// OcrInput allows you to load images to perform OCR on them.
using var input = new OcrInput();

// Load an image from the given file path.
// Ensure that the file path is correct and that the image exists at this path.
input.AddImage("/path/file.png");

// Detect the separate text regions on each page and split the input
// into one image per region before reading.
input.FindMultipleTextRegions();

// Perform OCR on the input images and get the result.
// OcrResult holds the text extracted from the images.
OcrResult result = ocr.Read(input);

// Store the recognized text into a string variable.
// This string contains the text extracted from the input image.
string resultText = result.Text;

// The extracted text from the image is now stored in 'resultText' and can be used as needed.
VB.NET:
Imports IronOcr

' Initialize an IronTesseract instance.
' IronTesseract is a wrapper around the Tesseract engine, tailored for .NET, to perform OCR operations.
Dim ocr = New IronTesseract()

' Create an OcrInput instance for input image(s).
' OcrInput allows you to load images to perform OCR on them.
Dim input = New OcrInput()

' Load an image from the given file path.
' Ensure that the file path is correct and that the image exists at this path.
input.AddImage("/path/file.png")

' Detect the separate text regions on each page and split the input
' into one image per region before reading.
input.FindMultipleTextRegions()

' Perform OCR on the input images and get the result.
' OcrResult holds the text extracted from the images.
Dim result As OcrResult = ocr.Read(input)

' Store the recognized text into a string variable.
' This string contains the text extracted from the input image.
Dim resultText As String = result.Text

' The extracted text from the image is now stored in 'resultText' and can be used as needed.

It can optionally be called with custom parameters:

C#:
// Import the IronOcr library, which provides OCR functionality for image text extraction
using IronOcr;

// Create an instance of the IronTesseract class to perform OCR operations
var ocr = new IronTesseract();

// Initialize an OcrInput object within a using statement to ensure proper disposal
using var input = new OcrInput();

// Load at least one image into the OcrInput instance
// Make sure to provide the correct path to your image file
input.AddImage("/path/to/your/image/file.png");

// The 'FindMultipleTextRegions' method allows for detecting multiple regions of text within the image.
// Parameters:
// - Scale: The scale used to resize the image (2.0 is double size, if needed for better recognition)
// - DilationAmount: Affects the dilation step in pre-processing (-1 turns it off, usually 0 or positive)
// - Binarize: Converts the image to black and white if true
// - Invert: Inverts colors if true (false retains original color)
input.FindMultipleTextRegions(Scale: 2.0, DilationAmount: 0, Binarize: true, Invert: false);

// Perform OCR on the input image and store the result
OcrResult result = ocr.Read(input);

// Extract the text from the OCR result
string resultText = result.Text;

// Output the extracted text (optional)
// Uncomment the following line to display the OCR result in the console
// Console.WriteLine(resultText);
VB.NET:
' Import the IronOcr library, which provides OCR functionality for image text extraction
Imports IronOcr

' Create an instance of the IronTesseract class to perform OCR operations
Dim ocr = New IronTesseract()

' Initialize an OcrInput object to hold the image(s) for processing
Dim input = New OcrInput()

' Load at least one image into the OcrInput instance
' Make sure to provide the correct path to your image file
input.AddImage("/path/to/your/image/file.png")

' The 'FindMultipleTextRegions' method allows for detecting multiple regions of text within the image.
' Parameters:
' - Scale: The scale used to resize the image (2.0 is double size, if needed for better recognition)
' - DilationAmount: Affects the dilation step in pre-processing (-1 turns it off, usually 0 or positive)
' - Binarize: Converts the image to black and white if true
' - Invert: Inverts colors if true (false retains original color)
input.FindMultipleTextRegions(Scale:= 2.0, DilationAmount:= 0, Binarize:= True, Invert:= False)

' Perform OCR on the input image and store the result
Dim result As OcrResult = ocr.Read(input)

' Extract the text from the OCR result
Dim resultText As String = result.Text

' Output the extracted text (optional)
' Uncomment the following line to display the OCR result in the console
' Console.WriteLine(resultText)

Another overload of FindMultipleTextRegions takes a single OCR page and returns a list of OCR pages, one for each text region on it:

C#:
using IronOcr;
using System.Linq;

// Initialize the page index to specify which page to extract text regions from
int pageIndex = 0;

// Create an OcrInput object to hold images for processing
using var input = new OcrInput();

// Load at least one image into the OcrInput object.
// Replace "/path/file.png" with the actual path to your image file.
input.AddImage("/path/file.png");

// Retrieve the specified page from the loaded images
var selectedPage = input.Pages.ElementAt(pageIndex);

// Split the selected page into multiple pages, one for each detected text region.
// The result is a list of OCR pages.
var textRegionsOnPage = selectedPage.FindMultipleTextRegions();
VB.NET:
Imports IronOcr
Imports System.Linq

' Initialize the page index to specify which page to extract text regions from
Dim pageIndex As Integer = 0

' Create an OcrInput object to hold images for processing
Dim input = New OcrInput()

' Load at least one image into the OcrInput object.
' Replace "/path/file.png" with the actual path to your image file.
input.AddImage("/path/file.png")

' Retrieve the specified page from the loaded images
Dim selectedPage = input.Pages.ElementAt(pageIndex)

' Split the selected page into multiple pages, one for each detected text region.
' The result is a list of OCR pages.
Dim textRegionsOnPage = selectedPage.FindMultipleTextRegions()

GetTextRegions

GetTextRegions returns a list of crop areas where text was detected on a page:

C#:
// Necessary namespaces for OCR and console output
using System; // Required for Console
using IronOcr; // Required for OCR operations
using System.Linq; // Provides LINQ query capabilities

// Initialize the page index variable to specify which page we want to process
int pageIndex = 0;

// Create a new instance of OcrInput to load image files.
// 'using' ensures that system resources are managed efficiently by disposing the OcrInput object when done.
using var input = new OcrInput();

// Load at least one image into the OcrInput instance.
// AddImage() method is used with the path to the image file.
input.AddImage("/path/to/your/file.png"); // Make sure to set the correct path to your image.

// Retrieve the desired page from the input using the specified pageIndex.
// Pages is an IEnumerable of pages in the input; ElementAtOrDefault returns the default value (null) instead of throwing when the index is out of range.
var selectedPage = input.Pages.ElementAtOrDefault(pageIndex);

// Check if the selected page is null to avoid exceptions if the index is invalid.
if (selectedPage != null)
{
    // Get all text regions from the selected page.
    // This returns a list of crop areas (rectangles) where text was detected on that page.
    var regions = selectedPage.GetTextRegions();
    
    // Further processing of text regions can be performed here, like extracting text or analyzing text coordinates.
    Console.WriteLine($"Number of text regions detected: {regions.Count}");
}
else
{
    // When the pageIndex is out of range, notify the user.
    Console.WriteLine("The specified page index is out of range.");
}

// Note: Ensure that the IronOcr library and other dependencies are correctly installed and referenced in your project.
VB.NET:
' Necessary namespaces for OCR and console output
Imports System ' Required for Console
Imports IronOcr ' Required for OCR operations
Imports System.Linq ' Provides LINQ query capabilities

' Initialize the page index variable to specify which page we want to process
Dim pageIndex As Integer = 0

' Create a new instance of OcrInput to load image files.
' Dispose the OcrInput object when done to release system resources.
Dim input = New OcrInput()

' Load at least one image into the OcrInput instance.
' AddImage() method is used with the path to the image file.
input.AddImage("/path/to/your/file.png") ' Make sure to set the correct path to your image.

' Retrieve the desired page from the input using the specified pageIndex.
' Pages is an IEnumerable of pages in the input; ElementAtOrDefault returns Nothing instead of throwing when the index is out of range.
Dim selectedPage = input.Pages.ElementAtOrDefault(pageIndex)

' Check if the selected page is null to avoid exceptions if the index is invalid.
If selectedPage IsNot Nothing Then
	' Get all text regions from the selected page.
	' This returns a list of crop areas (rectangles) where text was detected on that page.
	Dim regions = selectedPage.GetTextRegions()

	' Further processing of text regions can be performed here, like extracting text or analyzing text coordinates.
	Console.WriteLine($"Number of text regions detected: {regions.Count}")
Else
	' When the pageIndex is out of range, notify the user.
	Console.WriteLine("The specified page index is out of range.")
End If

' Note: Ensure that the IronOcr library and other dependencies are correctly installed and referenced in your project.
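
GetTextRegions returns the detected regions as a list, but this guide does not state what order they come in or how small a region can be. If your application needs a predictable reading order, one option is to filter and sort the rectangles yourself. The sketch below again uses System.Drawing.Rectangle as a stand-in for the returned crop areas, and InReadingOrder is a hypothetical helper, not an IronOCR method.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

public static class RegionOrdering
{
    // Drops regions whose area is below minArea, then orders the rest
    // top-to-bottom and, for regions with the same top edge, left-to-right.
    public static List<Rectangle> InReadingOrder(IEnumerable<Rectangle> regions, int minArea = 0)
    {
        return regions
            .Where(r => r.Width * r.Height >= minArea)
            .OrderBy(r => r.Top)
            .ThenBy(r => r.Left)
            .ToList();
    }
}
```

Sorting strictly by the top edge is a simplification: regions on the same visual line whose top edges differ by a pixel or two would need a tolerance to be grouped together.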

Frequently Asked Questions

What is IronOCR Computer Vision and how does it work?

IronOCR uses OpenCV to perform computer vision tasks to detect areas of text in an image. This is useful for images with noise, different text placements, or warped text.

How do I install this ComputerVision component?

The computer vision methods are exposed through the regular IronOCR NuGet package, but they require the IronOcr.ComputerVision package for your operating system: IronOcr.ComputerVision.Windows, IronOcr.ComputerVision.Linux, IronOcr.ComputerVision.MacOS, or IronOcr.ComputerVision.MacOS.ARM.

What is the function of the FindTextRegion method?

The FindTextRegion method detects regions in an image that contain text, guiding Tesseract to only search for text within those areas.

How can I use the FindMultipleTextRegions method?

FindMultipleTextRegions divides an input image into separate images based on detected text regions, allowing for individual text extraction from each region.

What does the GetTextRegions method do?

GetTextRegions returns a list of crop areas, or rectangles, where text was detected in a page, allowing for further processing or analysis.

Can I use custom parameters with the text region detection method?

Yes. FindTextRegion and FindMultipleTextRegions can be called with custom parameters that control scaling, dilation, binarization, and color inversion.

Is it possible to retrieve text regions as rectangles?

Yes. An overload of FindTextRegion returns the detected text region as a Rectangle.

What is the usage example for detecting text regions?

A typical example creates an OcrInput object, adds an image by path, calls FindTextRegion, and then uses IronTesseract to read the detected text.

How do overload methods enhance the text region division function?

Overload methods for FindMultipleTextRegions allow you to divide an OCR page into separate pages for each detected text region, facilitating detailed text extraction.

What are some code examples provided for these methods?

The tutorial provides code examples for FindTextRegion, FindMultipleTextRegions, and GetTextRegions in both C# and VB.NET.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.