Guide to using IronOCR Computer Vision

Introduction

IronOCR uses OpenCV-based computer vision to detect the areas of an image that contain text. This is useful for images that contain a lot of noise, images with text in many different places, and images where text is warped. IronOCR first determines where text regions exist and then uses Tesseract to attempt to read those regions.

IronOCR.ComputerVision Installation via NuGet Package

The OpenCV methods that perform computer vision in IronOCR are visible in the regular IronOCR NuGet package.

Using these methods requires installing the IronOcr.ComputerVision NuGet package into the solution. If it is not installed, you are prompted to download it. Choose the package that matches your platform:

  • Windows: IronOcr.ComputerVision.Windows
  • Linux: IronOcr.ComputerVision.Linux
  • macOS: IronOcr.ComputerVision.MacOS
  • macOS ARM: IronOcr.ComputerVision.MacOS.ARM

Install using the NuGet Package Manager or paste the following in the Package Manager Console:

PM> Install-Package IronOcr.ComputerVision.Windows

This will provide the necessary assemblies to use IronOCR Computer Vision with our model file.

Functionality and API

Code examples are included further down in this tutorial. Here is a general overview of the methods currently available:

Method | Explanation
FindTextRegion | Detects the region that contains text elements and instructs Tesseract to search for text only within that area.
FindMultipleTextRegions | Detects areas that contain text elements and divides the page into separate images based on text regions.
GetTextRegions | Scans the image and returns a list of text regions as `List<CropRectangle>`.

Code Examples

FindTextRegion

FindTextRegion uses computer vision to detect regions that contain text elements on every page of an OcrInput object.

C#:

using IronOcr;

var ocr = new IronTesseract();

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    // Detect the text region so that OCR only searches within it
    input.FindTextRegion();
    OcrResult result = ocr.Read(input);
    string resultText = result.Text;
}

VB:

Imports IronOcr

Dim ocr = New IronTesseract()

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	' Detect the text region so that OCR only searches within it
	input.FindTextRegion()
	Dim result As OcrResult = ocr.Read(input)
	Dim resultText As String = result.Text
End Using
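To judge whether region detection helps for a particular image, the call above can be compared with a plain read of the same file. The following is a minimal sketch that uses only the calls already shown in this tutorial. The file name `noisy-scan.png` is a hypothetical placeholder, and comparing character counts is only a rough indicator chosen for illustration, not a measure of accuracy.

```csharp
using System;
using IronOcr;

// Hypothetical input file; pass a real image path as the first argument.
string path = args.Length > 0 ? args[0] : "noisy-scan.png";

var ocr = new IronTesseract();

// Baseline: read the whole image without region detection.
string plainText;
using (var input = new OcrInput(path))
{
    plainText = ocr.Read(input).Text;
}

// Same image, with OCR restricted to the detected text region.
string regionText;
using (var input = new OcrInput(path))
{
    input.FindTextRegion();
    regionText = ocr.Read(input).Text;
}

// Rough comparison only: inspect both strings to decide which is better.
Console.WriteLine($"Whole image:   {plainText.Length} characters");
Console.WriteLine($"Detected area: {regionText.Length} characters");
```

This sketch uses top-level statements, so it assumes C# 9 or later and a project that references both IronOcr and the platform-specific IronOcr.ComputerVision package.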

FindTextRegion can optionally be called with custom parameters:

C#:

using IronOcr;

var ocr = new IronTesseract();

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    // Same call with explicit detection settings instead of the defaults
    input.FindTextRegion(Scale: 2.0, DilationAmount: 20, Binarize: true, Invert: true);
    OcrResult result = ocr.Read(input);
    string resultText = result.Text;
}

VB:

Imports IronOcr

Dim ocr = New IronTesseract()

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	' Same call with explicit detection settings instead of the defaults
	input.FindTextRegion(Scale:= 2.0, DilationAmount:= 20, Binarize:= True, Invert:= True)
	Dim result As OcrResult = ocr.Read(input)
	Dim resultText As String = result.Text
End Using

An overload also returns the detected text region as a CropRectangle:

C#:

using IronOcr;

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    // The returned CropRectangle describes the detected text region
    CropRectangle textRegion = input.FindTextRegion(Scale: 2.0, Binarize: true);
}

VB:

Imports IronOcr

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	' The returned CropRectangle describes the detected text region
	Dim textRegion As CropRectangle = input.FindTextRegion(Scale:= 2.0, Binarize:= True)
End Using

FindMultipleTextRegions

FindMultipleTextRegions takes all pages of an OcrInput object, uses computer vision to detect areas that contain text elements, and divides the input into separate images based on text regions:

C#:

using IronOcr;

var ocr = new IronTesseract();

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    // Split every page into separate images, one per detected text region
    input.FindMultipleTextRegions();
    OcrResult result = ocr.Read(input);
    string resultText = result.Text;
}

VB:

Imports IronOcr

Dim ocr = New IronTesseract()

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	' Split every page into separate images, one per detected text region
	input.FindMultipleTextRegions()
	Dim result As OcrResult = ocr.Read(input)
	Dim resultText As String = result.Text
End Using

FindMultipleTextRegions can optionally be called with custom parameters:

C#:

using IronOcr;

var ocr = new IronTesseract();

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    // Same call with explicit detection settings instead of the defaults
    input.FindMultipleTextRegions(Scale: 2.0, DilationAmount: -1, Binarize: true, Invert: false);
    OcrResult result = ocr.Read(input);
    string resultText = result.Text;
}

VB:

Imports IronOcr

Dim ocr = New IronTesseract()

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	' Same call with explicit detection settings instead of the defaults
	input.FindMultipleTextRegions(Scale:= 2.0, DilationAmount:= -1, Binarize:= True, Invert:= False)
	Dim result As OcrResult = ocr.Read(input)
	Dim resultText As String = result.Text
End Using

Another overload of FindMultipleTextRegions takes an OCR page and returns a list of OCR pages, one for each text region on it:

C#:

using System.Collections.Generic;
using IronOcr;

int pageIndex = 0;

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    var selectedPage = input.Pages[pageIndex];
    // One OcrInput.Page per text region detected on the selected page
    List<OcrInput.Page> textRegionsOnPage = selectedPage.FindMultipleTextRegions();
}

VB:

Imports System.Collections.Generic
Imports IronOcr

Dim pageIndex As Integer = 0

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	Dim selectedPage = input.Pages(pageIndex)
	' One OcrInput.Page per text region detected on the selected page
	Dim textRegionsOnPage As List(Of OcrInput.Page) = selectedPage.FindMultipleTextRegions()
End Using

GetTextRegions

GetTextRegions returns a list of crop areas where text was detected on a page:

C#:

using System.Collections.Generic;
using IronOcr;

int pageIndex = 0;

// path: location of the image file to read (defined elsewhere)
using (var input = new OcrInput(path))
{
    var selectedPage = input.Pages[pageIndex];
    // Crop areas where text was detected on the selected page
    List<CropRectangle> regions = selectedPage.GetTextRegions();
}

VB:

Imports System.Collections.Generic
Imports IronOcr

Dim pageIndex As Integer = 0

' path: location of the image file to read (defined elsewhere)
Using input = New OcrInput(path)
	Dim selectedPage = input.Pages(pageIndex)
	' Crop areas where text was detected on the selected page
	Dim regions As List(Of CropRectangle) = selectedPage.GetTextRegions()
End Using
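Once a list of regions is available, it is often convenient to discard very small detections and put the rest in reading order before further processing. The sketch below shows one possible way to do that. It deliberately does not touch IronOCR: `Region` is a hypothetical stand-in type defined here, and copying values into it from a CropRectangle assumes that type exposes X, Y, Width and Height members, which this tutorial does not confirm. The `minArea` threshold of 100 pixels is an arbitrary illustrative value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a detected region; this is not an IronOCR type.
public record Region(int X, int Y, int Width, int Height);

public static class RegionOrdering
{
    // Drops regions whose area is below minArea pixels, then sorts the rest
    // top-to-bottom and, for equal Y values, left-to-right.
    public static List<Region> InReadingOrder(IEnumerable<Region> regions, int minArea = 100)
    {
        return regions
            .Where(r => r.Width * r.Height >= minArea)
            .OrderBy(r => r.Y)
            .ThenBy(r => r.X)
            .ToList();
    }

    public static void Main()
    {
        // Made-up coordinates for illustration only.
        var detected = new List<Region>
        {
            new Region(300, 40, 200, 30),
            new Region(20, 400, 500, 60),
            new Region(10, 40, 250, 30),
            new Region(5, 5, 3, 3), // tiny speck, filtered out as likely noise
        };

        foreach (var r in InReadingOrder(detected))
        {
            Console.WriteLine($"{r.X},{r.Y} {r.Width}x{r.Height}");
        }
    }
}
```

Sorting by exact Y value is a simplification: regions on the same visual line rarely share an identical top edge, so a real pipeline would probably group regions into rows with a tolerance first. The record syntax requires C# 9 or later.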