Latin Alphabet OCR in C# and .NET

126 More Languages

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including the Latin Alphabet.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines for both speed and accuracy.

Contents of IronOcr.Languages.LatinAlphabet

This package contains 64 OCR languages for .NET:

  • LatinAlphabet
  • LatinAlphabetBest
  • LatinAlphabetFast
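
As a rough sketch of how these three names might be used once the package is installed: the example below assumes each one is exposed as an `OcrLanguage` member, in the same way as `OcrLanguage.LatinAlphabet`. The idea that the `Best` and `Fast` variants trade speed against accuracy is inferred from their names, not stated on this page.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: LatinAlphabetBest and LatinAlphabetFast are OcrLanguage members,
// like OcrLanguage.LatinAlphabet. Swap the member to try a different variant.
ocr.Language = OcrLanguage.LatinAlphabetBest;

// The image path is a placeholder; point it at your own file.
using (var input = new OcrInput(@"images\LatinAlphabet.png"))
{
    Console.WriteLine(ocr.Read(input).Text);
}
```

Running the same image through each variant and comparing output and timing is probably the simplest way to decide which one suits a given workload.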

Download

Latin Alphabet Language Pack [latine]

Installation

First, install the Latin Alphabet OCR package into your .NET project.

Install-Package IronOCR.Languages.LatinAlphabet

Code Example

This C# code example reads Latin Alphabet text from an image or PDF document.

// Install the IronOcr.Languages.LatinAlphabet package first
using System;
using IronOcr;

var Ocr = new IronTesseract(); // Initialize IronTesseract instance

// Set the OCR language to LatinAlphabet
Ocr.Language = OcrLanguage.LatinAlphabet;

// Define the input image or PDF you want to read
using (var Input = new OcrInput(@"images\LatinAlphabet.png"))
{
    // Perform OCR reading on the input
    var Result = Ocr.Read(Input);

    // Extract the recognized text
    var AllText = Result.Text;

    // Output the recognized text
    Console.WriteLine(AllText);
}
' VB.NET version of the same example
' Install the IronOcr.Languages.LatinAlphabet package first
Imports System
Imports IronOcr

Module Program
	Sub Main()
		Dim Ocr = New IronTesseract() ' Initialize IronTesseract instance

		' Set the OCR language to LatinAlphabet
		Ocr.Language = OcrLanguage.LatinAlphabet

		' Define the input image or PDF you want to read
		Using Input = New OcrInput("images\LatinAlphabet.png")
			' Perform OCR reading on the input
			Dim Result = Ocr.Read(Input)

			' Extract the recognized text
			Dim AllText = Result.Text

			' Output the recognized text
			Console.WriteLine(AllText)
		End Using
	End Sub
End Module

Explanation

  1. IronTesseract Initialization: An instance of IronTesseract is initialized, which will handle the OCR processing.

  2. Language Setting: The OCR language is set to LatinAlphabet, one of the languages provided by the IronOcr.Languages.LatinAlphabet package.

  3. Input Specification: An OcrInput object is created, specifying the path to the image or PDF from which text will be extracted.

  4. OCR Execution: The Read method of the IronTesseract instance is called to process the OcrInput. This returns a result object (stored in the Result variable) containing the extracted text.

  5. Text Extraction: The Text property of that result object is used to access the recognized text.

  6. Output: The recognized text is printed to the console for verification.

Ensure that the file path in OcrInput correctly points to your image or PDF file to avoid file not found exceptions.
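
One way to catch a wrong path early is to validate it before constructing the OcrInput. The helper below is a minimal sketch using only the .NET standard library; `OcrPathCheck` and `RequireExistingFile` are hypothetical names introduced for illustration and are not part of IronOCR.

```csharp
using System;
using System.IO;

public static class OcrPathCheck
{
    // Hypothetical helper (not an IronOCR API): returns the absolute path
    // when the file exists, otherwise throws with the resolved path in the message.
    public static string RequireExistingFile(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
            throw new ArgumentException("Input path is empty.", nameof(path));

        // Relative paths are resolved against the current working directory,
        // which is a common reason a file that "exists" is not found at run time.
        string fullPath = Path.GetFullPath(path);

        if (!File.Exists(fullPath))
            throw new FileNotFoundException($"OCR input not found: {fullPath}", fullPath);

        return fullPath;
    }
}
```

The returned path can then be passed to the OcrInput constructor in place of the literal string. Because the exception message contains the fully resolved path, it is easier to see when a relative path such as images\LatinAlphabet.png is being resolved against an unexpected working directory.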