Estonian OCR in C# and .NET

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Estonian. It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Estonian

This package contains the following OCR languages for .NET:

  • Estonian
  • EstonianBest
  • EstonianFast

Download

Estonian Language Pack [eesti]

Installation

First, install the Estonian OCR package into your .NET project.

Install-Package IronOcr.Languages.Estonian

Code Example

This C# code example reads Estonian text from an image or PDF document.

// Import the IronOcr namespace
using IronOcr;

// Create a new instance of the IronTesseract class
var Ocr = new IronTesseract();

// Set the OCR language to Estonian
Ocr.Language = OcrLanguage.Estonian;

// Load the image or PDF from which text needs to be extracted
using (var Input = new OcrInput(@"images\Estonian.png"))
{
    // Perform OCR to read text from the specified input
    var Result = Ocr.Read(Input);

    // Extract all the recognized text from the OCR result
    var AllText = Result.Text;
}

The same example in VB.NET:

' Import the IronOcr namespace
Imports IronOcr

' Create a new instance of the IronTesseract class
Dim Ocr = New IronTesseract()

' Set the OCR language to Estonian
Ocr.Language = OcrLanguage.Estonian

' Load the image or PDF from which text needs to be extracted
Using Input = New OcrInput("images\Estonian.png")
	' Perform OCR to read text from the specified input
	Dim Result = Ocr.Read(Input)

	' Extract all the recognized text from the OCR result
	Dim AllText = Result.Text
End Using

Explanation of the Code:

  • IronTesseract: This is a primary class provided by IronOCR to perform OCR operations.
  • Ocr.Language: By setting this property, we define which language should be used during OCR. Here, it is set to Estonian.
  • OcrInput: This is used to specify the image or PDF document that we want to read from. It takes a file path as an input.
  • Ocr.Read(Input): This method processes the specified input and performs OCR on it.
  • Result.Text: This property contains all the text that has been successfully recognized and extracted from the image or PDF document.
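
The package also lists EstonianBest and EstonianFast alongside the standard Estonian model. The sketch below shows one way to switch between them. It assumes these variants are exposed as OcrLanguage.EstonianBest and OcrLanguage.EstonianFast, and that, as their names suggest, the "Best" model favors accuracy while the "Fast" model favors speed. The preferAccuracy flag is a hypothetical name used only for illustration.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Hypothetical switch: true when accuracy matters more than processing time
bool preferAccuracy = true;

// Assumption: the pack's variants are available as OcrLanguage members
ocr.Language = preferAccuracy
    ? OcrLanguage.EstonianBest
    : OcrLanguage.EstonianFast;

// Same input path as in the example above
using (var input = new OcrInput(@"images\Estonian.png"))
{
    var result = ocr.Read(input);

    // Print the recognized text
    Console.WriteLine(result.Text);
}
```

Running both variants on a representative sample of your own documents is a reasonable way to decide which trade-off suits a given workload.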