Panjabi OCR in C# and .NET

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Panjabi. It is an advanced fork of Tesseract, built exclusively for .NET developers, and regularly outperforms other Tesseract engines for both speed and accuracy.

Contents of IronOcr.Languages.Panjabi

This package contains the following OCR language variants for .NET:

  • Panjabi
  • PanjabiBest
  • PanjabiFast
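
As a rough sketch of how one of these variants might be selected, the snippet below assumes that each entry in the list is exposed as a member of `OcrLanguage` under the same name, and that the "Best" and "Fast" suffixes follow the usual Tesseract convention of trading speed for accuracy. Neither assumption is confirmed by this page, so check the enum in your installed version before relying on it.

```csharp
using System;
using IronOcr;

class VariantExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Assumption: PanjabiBest and PanjabiFast exist on OcrLanguage
        // with the same names as in the package contents list.
        ocr.Language = OcrLanguage.PanjabiBest; // or OcrLanguage.PanjabiFast

        // Same input path as the main code example on this page.
        using (var input = new OcrInput(@"images\Panjabi.png"))
        {
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```

Running the same image through each variant and comparing the output is probably the simplest way to decide which one suits a given set of documents.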

Download

Panjabi Language Pack [ਪੰਜਾਬੀ]

Installation

First, install the Panjabi OCR package into your .NET project.

Install-Package IronOcr.Languages.Panjabi

Code Example

This C# code example reads Panjabi text from an image or PDF document.

// Import the IronOcr namespace
using IronOcr;

class Program
{
    static void Main()
    {
        // Create a new instance of the IronTesseract class
        var Ocr = new IronTesseract();

        // Set the language to Panjabi
        Ocr.Language = OcrLanguage.Panjabi;

        // Define the input image or PDF file
        using (var Input = new OcrInput(@"images\Panjabi.png"))
        {
            // Perform OCR on the input file
            var Result = Ocr.Read(Input);

            // Extract and store the recognized text from the OCR result
            var AllText = Result.Text;
        }
    }
}

The same example in VB.NET:

' Import the IronOcr namespace
Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Create a new instance of the IronTesseract class
		Dim Ocr = New IronTesseract()

		' Set the language to Panjabi
		Ocr.Language = OcrLanguage.Panjabi

		' Define the input image or PDF file
		Using Input = New OcrInput("images\Panjabi.png")
			' Perform OCR on the input file
			Dim Result = Ocr.Read(Input)

			' Extract and store the recognized text from the OCR result
			Dim AllText = Result.Text
		End Using
	End Sub
End Class

Explanation

  • IronTesseract: This is the main class provided by IronOCR for OCR operations.
  • Ocr.Language: We specify which language the OCR engine should use. Here, it is set to Panjabi.
  • OcrInput: This class is used to specify the input file (image or PDF) on which OCR needs to be performed.
  • Ocr.Read(): This method performs the actual OCR task and returns a result containing the extracted text.
  • Result.Text: This contains the extracted text after OCR is performed on the input file.

This example demonstrates how to use the IronOCR library to extract Panjabi text from images or PDF documents in a .NET application.
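
The code example stores the recognized text in a variable but does nothing further with it. A minimal sketch of one possible next step is to write `Result.Text` to a UTF-8 text file so the Gurmukhi characters are preserved. The output file name `Panjabi.txt` is a hypothetical choice for illustration; the IronOCR calls are the same ones used in the example above.

```csharp
using System.IO;
using System.Text;
using IronOcr;

class SaveTextExample
{
    static void Main()
    {
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Panjabi;

        using (var input = new OcrInput(@"images\Panjabi.png"))
        {
            var result = ocr.Read(input);

            // Hypothetical output path, chosen for illustration only.
            // UTF-8 is specified explicitly so the Panjabi text round-trips.
            File.WriteAllText("Panjabi.txt", result.Text, Encoding.UTF8);
        }
    }
}
```

If the text is printed to a console instead, setting `Console.OutputEncoding = Encoding.UTF8` beforehand may be needed for the script to display correctly, depending on the terminal and its font.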