Arabic OCR in C# and .NET

IronOCR is a C# library that lets .NET developers read text from images and PDF documents in 126 languages, including Arabic.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Arabic

This package contains 6 OCR languages for .NET:

  • Arabic
  • ArabicBest
  • ArabicFast
  • ArabicAlphabet
  • ArabicAlphabetBest
  • ArabicAlphabetFast

Download

Arabic Language Pack [العربية]

Installation

First, install the Arabic OCR package into your .NET project.

Install-Package IronOcr.Languages.Arabic

Code Example

This C# code example reads Arabic text from an image or PDF document.

// Import the IronOcr namespace to use its classes.
using IronOcr;

// Create a new instance of the IronTesseract class.
var Ocr = new IronTesseract();

// Set the OCR language to Arabic.
Ocr.Language = OcrLanguage.Arabic;

// Use a using statement to ensure that resources are disposed of correctly.
using (var Input = new OcrInput(@"images\Arabic.png"))
{
    // Perform OCR on the input image or document.
    var Result = Ocr.Read(Input);

    // Retrieve all recognized text from the document.
    var AllText = Result.Text;

    // Optionally, you can output the text to the console or use it otherwise.
    // Console.WriteLine(AllText);
}

The same example in VB.NET:

' Import the IronOcr namespace to use its classes.
Imports IronOcr

Module Program
	Sub Main()
		' Create a new instance of the IronTesseract class.
		Dim Ocr = New IronTesseract()

		' Set the OCR language to Arabic.
		Ocr.Language = OcrLanguage.Arabic

		' A Using block ensures that the input's resources are disposed of correctly.
		Using Input = New OcrInput("images\Arabic.png")
			' Perform OCR on the input image or document.
			Dim Result = Ocr.Read(Input)

			' Retrieve all recognized text from the document.
			Dim AllText = Result.Text

			' Optionally, you can output the text to the console or use it otherwise.
			' Console.WriteLine(AllText)
		End Using
	End Sub
End Module
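
The package contents list includes Best and Fast variants alongside the standard Arabic model. The sketch below shows how one of those variants might be selected. It assumes that OcrLanguage exposes a member named after each entry in that list (OcrLanguage.ArabicBest here, which the example above does not show), and that the naming follows Tesseract's usual convention, where "best" models favor accuracy and "fast" models favor speed. Treat both points as assumptions to check against the installed package rather than as confirmed behavior.

```csharp
// Sketch only: selects a variant from the Arabic language pack.
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: this member name mirrors the "ArabicBest" entry in the
// package contents list. Swap in OcrLanguage.ArabicFast (also assumed)
// if throughput matters more than accuracy.
ocr.Language = OcrLanguage.ArabicBest;

// Same input pattern as the main example; the path is a placeholder.
using (var input = new OcrInput(@"images\Arabic.png"))
{
    var result = ocr.Read(input);

    // Print the recognized text so the variants can be compared by eye.
    Console.WriteLine(result.Text);
}
```

Running the same image through the standard and alternative variants and comparing both the output and the elapsed time is a simple way to decide which model suits a given document set.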