Catalan OCR in C# and .NET

IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Catalan.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Catalan

This package contains the following OCR language variants for .NET:

  • Catalan
  • CatalanBest
  • CatalanFast
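
As a minimal sketch of how one of these variants might be selected: this assumes the OcrLanguage enum exposes members named after the entries listed above (CatalanBest and CatalanFast). The accuracy versus speed trade-off suggested by the names is an assumption, not something this page confirms.

```csharp
using IronOcr;

class CatalanVariantExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Assumption: the variant names listed above map directly to OcrLanguage members.
        // "Best" presumably favors accuracy; "Fast" presumably favors speed.
        ocr.Language = OcrLanguage.CatalanBest;
    }
}
```

Swapping in OcrLanguage.CatalanFast or OcrLanguage.Catalan would follow the same pattern.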

Download

Catalan Language Pack [català]

Installation

First, install the Catalan OCR package in your .NET project.

Install-Package IronOcr.Languages.Catalan

Code Example

This C# code example reads Catalan text from an image or PDF document.

// Import System for Console, and the IronOcr namespace for OCR functionality
using System;
using IronOcr;

class CatalanOcrExample
{
    static void Main()
    {
        // Create a new instance of the IronTesseract class
        var Ocr = new IronTesseract();

        // Set the language for OCR processing to Catalan
        Ocr.Language = OcrLanguage.Catalan;

        // Define the input image or PDF from which you want to read the text
        using (var Input = new OcrInput(@"images\Catalan.png"))
        {
            // Perform OCR reading on the input
            var Result = Ocr.Read(Input);

            // Retrieve all recognized text
            var AllText = Result.Text;

            // Output the recognized text
            Console.WriteLine(AllText);
        }
    }
}

The same example in VB.NET:

' Import the IronOcr namespace to use its OCR functionality
Imports IronOcr

Friend Class CatalanOcrExample
	Shared Sub Main()
		' Create a new instance of the IronTesseract class
		Dim Ocr = New IronTesseract()

		' Set the language for OCR processing to Catalan
		Ocr.Language = OcrLanguage.Catalan

		' Define the input image or PDF from which you want to read the text
		Using Input = New OcrInput("images\Catalan.png")
			' Perform OCR reading on the input
			Dim Result = Ocr.Read(Input)

			' Retrieve all recognized text
			Dim AllText = Result.Text

			' Output the recognized text
			Console.WriteLine(AllText)
		End Using
	End Sub
End Class

In this code:

  • We create an instance of IronTesseract to handle OCR operations.
  • Ocr.Language is set to OcrLanguage.Catalan, so the OCR engine processes the input using the Catalan language model.
  • We use OcrInput to specify the file path of the image or PDF document.
  • The Read method is called on the Ocr object, and the OCR reading results are stored in the variable Result.
  • Finally, Result.Text contains the recognized text, which is printed to the console.
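
The steps above could also be wrapped in a reusable method. The following is a sketch only: ReadCatalanText is a hypothetical helper name, not part of IronOCR, and the file-existence check is an added assumption about how a caller might want missing input handled. It uses only the IronOCR calls already shown in the example.

```csharp
using System;
using System.IO;
using IronOcr;

class CatalanOcrHelper
{
    // Hypothetical helper (not part of IronOCR): runs the steps above for one file.
    static string ReadCatalanText(string path)
    {
        // Fail early with a clear message if the input file is missing
        if (!File.Exists(path))
            throw new FileNotFoundException("Input file not found.", path);

        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Catalan;

        // Dispose the input once reading is complete
        using (var input = new OcrInput(path))
        {
            return ocr.Read(input).Text;
        }
    }

    static void Main(string[] args)
    {
        // Use the first command-line argument, or fall back to the sample path used above
        string path = args.Length > 0 ? args[0] : @"images\Catalan.png";
        Console.WriteLine(ReadCatalanText(path));
    }
}
```

Keeping the OCR setup in one method means the language setting lives in a single place if it later needs to change.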