Guide to using IronOCR Computer Vision

Introduction

IronOCR uses OpenCV-based computer vision to detect the areas of an image that contain text. This is useful for noisy images, images with text in many different places, and images where the text is warped. IronOCR first determines where the text regions are and then uses Tesseract to read those regions.

IronOcr.ComputerVision Installation via NuGet Package

The OpenCV-based Computer Vision methods are exposed through the regular IronOCR NuGet package, but calling them also requires the IronOcr.ComputerVision package for your platform. You will be prompted to download this package if it is not already installed. Choose the package that matches your operating system:

  • Windows: IronOcr.ComputerVision.Windows
  • Linux: IronOcr.ComputerVision.Linux
  • macOS: IronOcr.ComputerVision.MacOS
  • macOS ARM: IronOcr.ComputerVision.MacOS.ARM

Install using the NuGet Package Manager, or paste the following into the Package Manager Console (shown here for Windows; substitute the package name for your platform):

Install-Package IronOcr.ComputerVision.Windows

This command will provide the necessary assemblies to use IronOCR Computer Vision with our model file.
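
If a build script or setup tool needs to work out which of the packages listed above applies, the mapping can be sketched with the standard .NET runtime APIs. This is only an illustration: PackageFor and CurrentOs are hypothetical helper names, not part of IronOCR, and the snippet only prints the install command rather than installing anything.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: map an OS/architecture pair to a package name from the list above.
static string PackageFor(OSPlatform os, Architecture arch)
{
    if (os == OSPlatform.Windows) return "IronOcr.ComputerVision.Windows";
    if (os == OSPlatform.Linux) return "IronOcr.ComputerVision.Linux";
    if (os == OSPlatform.OSX)
        return arch == Architecture.Arm64
            ? "IronOcr.ComputerVision.MacOS.ARM"
            : "IronOcr.ComputerVision.MacOS";
    throw new PlatformNotSupportedException($"No package listed for {os}");
}

// Hypothetical helper: detect the current operating system.
static OSPlatform CurrentOs()
{
    foreach (var os in new[] { OSPlatform.Windows, OSPlatform.Linux, OSPlatform.OSX })
    {
        if (RuntimeInformation.IsOSPlatform(os)) return os;
    }
    throw new PlatformNotSupportedException("Unrecognized operating system");
}

// Print the Package Manager Console command for the machine running this code.
Console.WriteLine($"Install-Package {PackageFor(CurrentOs(), RuntimeInformation.OSArchitecture)}");
```

Keeping the mapping in a function that takes the platform as arguments makes it easy to check every branch without running on each operating system.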

Functionality and API

Code examples are included further down in this tutorial. Here is a general overview of the methods that are currently available:

Method | Explanation
FindTextRegion | Detects regions that contain text elements and instructs Tesseract to search for text only within the area where text was detected.
FindMultipleTextRegions | Detects areas that contain text elements and divides the page into separate images based on text regions.
GetTextRegions | Scans the image and returns the detected text regions as a list of crop areas.

Code Examples

FindTextRegion

FindTextRegion uses computer vision to detect regions that contain text elements on every page of an OcrInput object.

C#:

using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Restrict OCR to the area where text was detected
input.FindTextRegion();
OcrResult result = ocr.Read(input);
string resultText = result.Text;

VB.NET:

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Restrict OCR to the area where text was detected
    input.FindTextRegion()
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using

It can optionally be called with custom parameters:

C#:

using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Same detection step, with every optional parameter set explicitly
input.FindTextRegion(Scale: 2.0, DilationAmount: 20, Binarize: true, Invert: true);
OcrResult result = ocr.Read(input);
string resultText = result.Text;

VB.NET:

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Same detection step, with every optional parameter set explicitly
    input.FindTextRegion(Scale:=2.0, DilationAmount:=20, Binarize:=True, Invert:=True)
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using

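An overload can also return the detected text region as a Rectangle. The snippet below calls FindTextRegion with a subset of the custom parameters and does not capture a return value: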

C#:

using IronOcr;

using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Only some of the optional parameters are supplied here
input.FindTextRegion(Scale: 2.0, Binarize: true);

VB.NET:

Imports IronOcr

Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Only some of the optional parameters are supplied here
    input.FindTextRegion(Scale:=2.0, Binarize:=True)
End Using

FindMultipleTextRegions

FindMultipleTextRegions takes all pages of an OcrInput object, uses computer vision to detect areas that contain text elements, and divides the input into separate images based on text regions:

C#:

using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Split the input into one image per detected text region
input.FindMultipleTextRegions();
OcrResult result = ocr.Read(input);
string resultText = result.Text;

VB.NET:

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Split the input into one image per detected text region
    input.FindMultipleTextRegions()
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using

It can optionally be called with custom parameters:

C#:

using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Same split step, with every optional parameter set explicitly
input.FindMultipleTextRegions(Scale: 2.0, DilationAmount: -1, Binarize: true, Invert: false);
OcrResult result = ocr.Read(input);
string resultText = result.Text;

VB.NET:

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Same split step, with every optional parameter set explicitly
    input.FindMultipleTextRegions(Scale:=2.0, DilationAmount:=-1, Binarize:=True, Invert:=False)
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using

Another overload of FindMultipleTextRegions is called on a single OCR page and returns a list of OCR pages, one for each text region found on it:

C#:

using IronOcr;
using System.Collections.Generic;
using System.Linq;

int pageIndex = 0;
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Pick one page and split it into one page per detected text region
var selectedPage = input.GetPages().ElementAt(pageIndex);
List<OcrInputPage> textRegionsOnPage = selectedPage.FindMultipleTextRegions();

VB.NET:

Imports IronOcr
Imports System.Collections.Generic
Imports System.Linq

Dim pageIndex As Integer = 0
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Pick one page and split it into one page per detected text region
    Dim selectedPage = input.GetPages().ElementAt(pageIndex)
    Dim textRegionsOnPage As List(Of OcrInputPage) = selectedPage.FindMultipleTextRegions()
End Using
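
The per-page overload can also be applied to every page in turn. The following is a minimal sketch that uses only the calls shown in this tutorial; the two image paths are placeholders, and it assumes that each LoadImage call adds one page to the input.

```csharp
using System;
using System.Collections.Generic;
using IronOcr;

using var input = new OcrInput();
// Placeholder paths: replace with real image files
input.LoadImage("/path/page1.png");
input.LoadImage("/path/page2.png");

int pageNumber = 0;
foreach (var page in input.GetPages())
{
    pageNumber++;
    // One OcrInputPage is returned per text region detected on this page
    List<OcrInputPage> regions = page.FindMultipleTextRegions();
    Console.WriteLine($"Page {pageNumber}: {regions.Count} text region(s)");
}
```

Counting regions per page is a quick way to check whether detection is splitting a page as expected before running a full OCR read.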

GetTextRegions

GetTextRegions returns a list of crop areas where text was detected in a page:

C#:

using IronOcr;
using IronSoftware.Drawing;
using System.Collections.Generic;
using System.Linq;

int pageIndex = 0;
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");

// Pick one page and list the areas where text was detected
var selectedPage = input.GetPages().ElementAt(pageIndex);
var regions = selectedPage.GetTextRegions();

VB.NET:

Imports IronOcr
Imports IronSoftware.Drawing
Imports System.Collections.Generic
Imports System.Linq

Dim pageIndex As Integer = 0
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")

    ' Pick one page and list the areas where text was detected
    Dim selectedPage = input.GetPages().ElementAt(pageIndex)
    Dim regions = selectedPage.GetTextRegions()
End Using
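
Once a list of crop areas is available, a common follow-up is to drop very small detections and put the rest in reading order. The sketch below shows that step on its own, with no IronOCR dependency. It uses plain tuples as a stand-in for the rectangle type that GetTextRegions returns, and InReadingOrder and the sample coordinates are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keep regions of at least minArea pixels,
// ordered top-to-bottom and then left-to-right.
static List<(int X, int Y, int Width, int Height)> InReadingOrder(
    IEnumerable<(int X, int Y, int Width, int Height)> regions, int minArea) =>
    regions
        .Where(r => r.Width * r.Height >= minArea)
        .OrderBy(r => r.Y)
        .ThenBy(r => r.X)
        .ToList();

// Made-up sample data standing in for detected text regions
var detected = new List<(int X, int Y, int Width, int Height)>
{
    (400, 50, 200, 40),
    (10, 300, 500, 60),
    (10, 50, 300, 40),
    (700, 700, 4, 4), // tiny speck, likely noise
};

foreach (var r in InReadingOrder(detected, minArea: 100))
{
    Console.WriteLine($"x={r.X}, y={r.Y}, {r.Width}x{r.Height}");
}
```

With this sample data the two regions on the row at y=50 come first, left to right, followed by the region at y=300; the 4x4 speck is filtered out. Sorting strictly by the top edge is a simplification: regions on the same visual line whose top edges differ by a few pixels would need a tolerance.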

Frequently Asked Questions

What is IronOCR Computer Vision and how does it work?

IronOCR uses OpenCV to perform computer vision tasks to detect areas of text in an image. This is useful for images with noise, different text placements, or warped text.

How do I install the ComputerVision component?

IronOcr.ComputerVision is installed via NuGet alongside the regular IronOCR package. Install the package that matches your operating system, such as IronOcr.ComputerVision.Windows or IronOcr.ComputerVision.Linux.

What is the function of the FindTextRegion method?

The FindTextRegion method detects regions in an image that contain text, guiding Tesseract to only search for text within those areas.

How can I use the FindMultipleTextRegions method?

FindMultipleTextRegions divides an input image into separate images based on detected text regions, allowing for individual text extraction from each region.

What does the GetTextRegions method do?

GetTextRegions returns a list of crop areas, or rectangles, where text was detected in a page, allowing for further processing or analysis.

Can I use custom parameters with the text region detection method?

Yes, FindTextRegion and FindMultipleTextRegions can be called with custom parameters. The examples in this tutorial use Scale, DilationAmount, Binarize, and Invert.

Is it possible to retrieve text regions as rectangles?

Yes, by using the overload FindTextRegionsAsRectangles, you can retrieve detected text regions as Rectangle objects.

What is the usage example for detecting text regions?

A typical use of FindTextRegion involves creating an OcrInput object, loading an image into it, calling FindTextRegion, and then using IronTesseract to read the detected text.

How do overload methods enhance the text region division function?

Overload methods for FindMultipleTextRegions allow you to divide an OCR page into separate pages for each detected text region, facilitating detailed text extraction.

What are some code examples provided for these methods?

The tutorial provides code examples for FindTextRegion, FindMultipleTextRegions, and GetTextRegions, each shown in C# and VB.NET.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.