Arabic OCR in C# and .NET

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Arabic.

It is an advanced version of Tesseract, built exclusively for .NET developers, that regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Arabic

This pack contains the following OCR language variants for .NET:

  • Arabic
  • ArabicBest
  • ArabicFast
  • ArabicAlphabet
  • ArabicAlphabetBest
  • ArabicAlphabetFast

Download

Arabic Language Pack [العربية]

Installation

First, install the Arabic OCR package in your .NET project.

Install-Package IronOcr.Languages.Arabic

Code Example

This C# code example reads Arabic text from an image or PDF document.

// Import the IronOcr namespace to use its classes.
using IronOcr;

// Create a new instance of the IronTesseract class.
var ocr = new IronTesseract();

// Set the OCR language to Arabic.
ocr.Language = OcrLanguage.Arabic;

// Use a using statement to ensure that resources are disposed of correctly.
using (var input = new OcrInput(@"images\Arabic.png"))
{
    // Perform OCR on the input image or document.
    var result = ocr.Read(input);

    // Retrieve all recognized text from the document.
    var allText = result.Text;

    // Optionally, you can output the text to the console or use it otherwise.
    // Console.WriteLine(allText);
}
The same example in VB.NET:

' Import the IronOcr namespace to use its classes.
Imports IronOcr

Module Program
    Sub Main()
        ' Create a new instance of the IronTesseract class.
        Dim ocr As New IronTesseract()

        ' Set the OCR language to Arabic.
        ocr.Language = OcrLanguage.Arabic

        ' Use a Using block to ensure that resources are disposed of correctly.
        Using input As New OcrInput("images\Arabic.png")
            ' Perform OCR on the input image or document.
            Dim result = ocr.Read(input)

            ' Retrieve all recognized text from the document.
            Dim allText = result.Text

            ' Optionally, you can output the text to the console or use it otherwise.
            ' Console.WriteLine(allText)
        End Using
    End Sub
End Module
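The pack also lists Best and Fast variants, which the examples above do not use. The following is a minimal sketch of how one of them might be selected. It assumes that the OcrLanguage enum exposes members named after the pack contents (ArabicBest, ArabicFast) and that the suffixes follow the usual Tesseract convention of a larger, more accurate model versus a smaller, quicker one. The preferSpeed flag and the PickArabicModel helper are hypothetical names introduced for illustration and are not part of IronOCR.

```csharp
using System;
using System.IO;
using IronOcr;

// Hypothetical switch (not an IronOCR setting): true favors speed, false favors accuracy.
bool preferSpeed = false;

var ocr = new IronTesseract();

// Assumption: the enum members mirror the variant names listed in the pack contents.
ocr.Language = PickArabicModel(preferSpeed);

// Same sample path as the example above; the read is skipped if the file is absent.
const string imagePath = @"images\Arabic.png";
if (File.Exists(imagePath))
{
    using (var input = new OcrInput(imagePath))
    {
        Console.WriteLine(ocr.Read(input).Text);
    }
}

// Hypothetical helper: maps the speed preference to one of the pack's variants.
static OcrLanguage PickArabicModel(bool preferSpeed) =>
    preferSpeed ? OcrLanguage.ArabicFast : OcrLanguage.ArabicBest;
```

Keeping the choice in one small helper means the rest of the OCR code stays the same whichever variant is used, so the two can be compared on the same input.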