Guide to using IronOCR Computer Vision
Introduction
IronOCR uses OpenCV-based computer vision to detect the areas of an image that contain text. This is useful for noisy images, images with text in many different places, and images where the text is warped. IronOCR first determines where the text regions are and then uses Tesseract to attempt to read those regions.
IronOCR.ComputerVision Installation via NuGet Package
The OpenCV methods that perform computer vision in IronOCR are included in the regular IronOCR NuGet package.
Using these methods also requires installing the IronOcr.ComputerVision NuGet package for your platform. You will be prompted to download this package if it is not already installed.
- Windows: IronOcr.ComputerVision.Windows
- Linux: IronOcr.ComputerVision.Linux
- macOS: IronOcr.ComputerVision.MacOS
- macOS ARM: IronOcr.ComputerVision.MacOS.ARM
Install using the NuGet Package Manager or paste the following in the Package Manager Console:
Install-Package IronOcr.ComputerVision.Windows
This command will provide the necessary assemblies to use IronOCR Computer Vision with our model file.
Functionality and API
Code examples are included further down in this tutorial. Here is a general overview of the methods that are currently available:
Method | Explanation |
---|---|
FindTextRegion | Detect regions which contain text elements and instruct Tesseract to only search for text within the area in which text was detected. |
FindMultipleTextRegions | Detect areas which contain text elements and divide the page into separate images based on text regions. |
GetTextRegions | Scans the image and returns the detected text regions as a list of crop rectangles. |
Code Examples
FindTextRegion
Calling FindTextRegion uses computer vision to detect regions that contain text elements on every page of an OcrInput object.
using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
input.FindTextRegion();
OcrResult result = ocr.Read(input);
string resultText = result.Text;
' VB.NET equivalent
Imports IronOcr
Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    input.FindTextRegion()
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using
It can optionally be called with custom parameters:
using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
input.FindTextRegion(Scale: 2.0, DilationAmount: 20, Binarize: true, Invert: true);
OcrResult result = ocr.Read(input);
string resultText = result.Text;
' VB.NET equivalent
Imports IronOcr
Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    input.FindTextRegion(Scale:=2.0, DilationAmount:=20, Binarize:=True, Invert:=True)
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using
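Because the best parameter values depend on the image, it can help to try a few settings and compare the results. The following is a minimal sketch, not a recommended procedure: the candidate DilationAmount values are arbitrary assumptions, and it relies on OcrResult.Confidence as a rough quality signal.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Hypothetical candidate values; tune them for your own images
int[] dilationAmounts = { 10, 20, 40 };

foreach (int dilation in dilationAmounts)
{
    // Use a fresh OcrInput per attempt so each detection starts from the original image
    using var input = new OcrInput();
    input.LoadImage("/path/file.png");
    input.FindTextRegion(Scale: 2.0, DilationAmount: dilation, Binarize: true, Invert: true);

    OcrResult result = ocr.Read(input);

    // Report the overall confidence and how much text was recognized
    Console.WriteLine($"DilationAmount={dilation}: confidence {result.Confidence:F1}, {result.Text.Length} characters");
}
```

A higher confidence does not guarantee a better crop, so it is worth inspecting the recognized text as well.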
An overload of FindTextRegion can also return the detected text region as a Rectangle:
using IronOcr;
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
input.FindTextRegion(Scale: 2.0, Binarize: true);
' VB.NET equivalent
Imports IronOcr
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    input.FindTextRegion(Scale:=2.0, Binarize:=True)
End Using
FindMultipleTextRegions
FindMultipleTextRegions takes all pages of an OcrInput object, uses computer vision to detect areas that contain text elements, and divides the input into separate images based on those text regions:
using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
input.FindMultipleTextRegions();
OcrResult result = ocr.Read(input);
string resultText = result.Text;
' VB.NET equivalent
Imports IronOcr
Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    input.FindMultipleTextRegions()
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using
It can optionally be called with custom parameters:
using IronOcr;
var ocr = new IronTesseract();
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
input.FindMultipleTextRegions(Scale: 2.0, DilationAmount: -1, Binarize: true, Invert: false);
OcrResult result = ocr.Read(input);
string resultText = result.Text;
' VB.NET equivalent
Imports IronOcr
Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    input.FindMultipleTextRegions(Scale:=2.0, DilationAmount:=-1, Binarize:=True, Invert:=False)
    Dim result As OcrResult = ocr.Read(input)
    Dim resultText As String = result.Text
End Using
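Since FindMultipleTextRegions splits the input into separate images, it may be useful to look at the recognized text piece by piece rather than as one string. The sketch below assumes that each detected region is read as its own page of the OcrResult; that mapping is an assumption for illustration and is not stated in this tutorial.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();
using var input = new OcrInput();

// Load at least one image
input.LoadImage("/path/file.png");
input.FindMultipleTextRegions();

OcrResult result = ocr.Read(input);

// Assumption: each detected text region appears as a separate page in the result
foreach (var page in result.Pages)
{
    Console.WriteLine($"--- Page {page.PageNumber} ---");
    Console.WriteLine(page.Text);
}
```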
Another overload of FindMultipleTextRegions takes an OCR page and returns a list of OCR pages, one for each text region on it:
using IronOcr;
using System.Collections.Generic;
using System.Linq;
int pageIndex = 0;
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
var selectedPage = input.GetPages().ElementAt(pageIndex);
List<OcrInputPage> textRegionsOnPage = selectedPage.FindMultipleTextRegions();
' VB.NET equivalent
Imports IronOcr
Imports System.Collections.Generic
Imports System.Linq
Dim pageIndex As Integer = 0
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    Dim selectedPage = input.GetPages().ElementAt(pageIndex)
    Dim textRegionsOnPage As List(Of OcrInputPage) = selectedPage.FindMultipleTextRegions()
End Using
GetTextRegions
GetTextRegions returns a list of crop areas where text was detected on a page:
using IronOcr;
using IronSoftware.Drawing;
using System.Collections.Generic;
using System.Linq;
int pageIndex = 0;
using var input = new OcrInput();
// Load at least one image
input.LoadImage("/path/file.png");
var selectedPage = input.GetPages().ElementAt(pageIndex);
var regions = selectedPage.GetTextRegions();
' VB.NET equivalent
Imports IronOcr
Imports IronSoftware.Drawing
Imports System.Collections.Generic
Imports System.Linq
Dim pageIndex As Integer = 0
Using input As New OcrInput()
    ' Load at least one image
    input.LoadImage("/path/file.png")
    Dim selectedPage = input.GetPages().ElementAt(pageIndex)
    Dim regions = selectedPage.GetTextRegions()
End Using
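The returned regions can then be inspected before deciding how to process them. As a minimal sketch, the loop below prints the position and size of each region. It assumes that each item exposes X, Y, Width and Height properties, as rectangle types typically do; check the type returned in your installed version.

```csharp
using System;
using System.Linq;
using IronOcr;

using var input = new OcrInput();

// Load at least one image
input.LoadImage("/path/file.png");

var selectedPage = input.GetPages().ElementAt(0);
var regions = selectedPage.GetTextRegions();

Console.WriteLine($"Detected {regions.Count} text region(s)");

// Assumption: each region exposes X, Y, Width and Height
foreach (var region in regions)
{
    Console.WriteLine($"x={region.X}, y={region.Y}, width={region.Width}, height={region.Height}");
}
```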
Frequently Asked Questions
What is this product and how does it use computer vision?
IronOCR uses OpenCV to perform computer vision tasks to detect areas of text in an image. This is useful for images with noise, different text placements, or warped text.
How do I install this ComputerVision component?
The computer vision methods ship with the regular IronOCR NuGet package, but using them also requires the IronOcr.ComputerVision package for your operating system, such as IronOcr.ComputerVision.Windows or IronOcr.ComputerVision.Linux.
What is the function of the FindTextRegion method?
The FindTextRegion method detects regions in an image that contain text, guiding Tesseract to only search for text within those areas.
How can I use the FindMultipleTextRegions method?
FindMultipleTextRegions divides an input image into separate images based on detected text regions, allowing for individual text extraction from each region.
What does the GetTextRegions method do?
GetTextRegions returns a list of crop areas, or rectangles, where text was detected in a page, allowing for further processing or analysis.
Can I use custom parameters with the text region detection method?
Yes, FindTextRegion can be called with custom parameters such as Scale, DilationAmount, Binarize, and Invert to tune how text regions are detected.
Is it possible to retrieve text regions as rectangles?
Yes, an overload of FindTextRegion can return the detected text region as a Rectangle, and GetTextRegions returns the list of crop areas detected on a page.
What is the usage example for detecting text regions?
An example usage of FindTextRegion involves creating an OcrInput object, loading an image into it, calling FindTextRegion, and using IronTesseract to read the detected text.
How do overload methods enhance the text region division function?
Overload methods for FindMultipleTextRegions allow you to divide an OCR page into separate pages for each detected text region, facilitating detailed text extraction.
What are some code examples provided for these methods?
The tutorial provides code examples for FindTextRegion, FindMultipleTextRegions, and GetTextRegions, demonstrating their application in C# and VB.NET.