Estonian OCR in C# and .NET
IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Estonian. It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Estonian
This package contains the following OCR languages for .NET:
- Estonian
- EstonianBest
- EstonianFast
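The three entries above can be selected through the Language property used later in this document. The following is a minimal sketch, assuming each entry is exposed as a same-named member of OcrLanguage (only OcrLanguage.Estonian is confirmed by the example further down) and assuming the usual Tesseract convention that "Best" models favor accuracy while "Fast" models favor speed, which this page does not state:

```csharp
using IronOcr;

var ocr = new IronTesseract();

// Assumption: the pack's entries map to OcrLanguage.Estonian,
// OcrLanguage.EstonianBest and OcrLanguage.EstonianFast.
// Swap in whichever variant suits the accuracy/speed trade-off you need.
ocr.Language = OcrLanguage.EstonianBest;
```

If one of these member names does not resolve in your installed version, check the members of OcrLanguage that the language pack adds.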
Download
Estonian Language Pack [eesti]
Installation
First, install the Estonian OCR package into your .NET project.
Install-Package IronOcr.Languages.Estonian
Code Example
This C# code example reads Estonian text from an image or PDF document.
// Import the IronOcr namespace
using IronOcr;

// Create a new instance of the IronTesseract class
var Ocr = new IronTesseract();

// Set the OCR language to Estonian
Ocr.Language = OcrLanguage.Estonian;

// Load the image or PDF from which text needs to be extracted
using (var Input = new OcrInput(@"images\Estonian.png"))
{
    // Perform OCR to read text from the specified input
    var Result = Ocr.Read(Input);

    // Extract all the recognized text from the OCR result
    var AllText = Result.Text;
}
' VB.NET version of the same example
' Import the IronOcr namespace
Imports IronOcr

' Create a new instance of the IronTesseract class
Dim Ocr = New IronTesseract()

' Set the OCR language to Estonian
Ocr.Language = OcrLanguage.Estonian

' Load the image or PDF from which text needs to be extracted
Using Input = New OcrInput("images\Estonian.png")
    ' Perform OCR to read text from the specified input
    Dim Result = Ocr.Read(Input)

    ' Extract all the recognized text from the OCR result
    Dim AllText = Result.Text
End Using
Explanation of the Code:
- IronTesseract: The primary class that IronOCR provides for performing OCR operations.
- Ocr.Language: Setting this property defines which language is used during OCR. Here, it is set to Estonian.
- OcrInput: Specifies the image or PDF document to read. In this example it is constructed from a file path.
- Ocr.Read(Input): This method processes the specified input and performs OCR on it.
- Result.Text: This property contains all the text that was recognized and extracted from the image or PDF document.
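The pieces described above can be combined into a small console program. This is a sketch rather than a definitive implementation: it uses only the IronOCR calls already shown in this document, reuses the same illustrative path (images\Estonian.png), and adds a file-existence check and console output that are not part of the original example.

```csharp
using System;
using System.IO;
using IronOcr;

// Illustrative path taken from the example above; replace with your own file
const string imagePath = @"images\Estonian.png";

// Added for this sketch: stop early if the input file is missing
if (!File.Exists(imagePath))
{
    Console.Error.WriteLine($"Input file not found: {imagePath}");
    return;
}

// Configure the OCR engine for Estonian
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Estonian;

// OcrInput is disposable, so wrap it in a using block
using (var input = new OcrInput(imagePath))
{
    // Run OCR and print the recognized text
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Checking for the file up front keeps a missing-path mistake separate from any problem raised by the OCR engine itself.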