Sanskrit OCR in C# and .NET

IronOCR is a C# software component that allows .NET coders to read text from images and PDF documents in 126 languages, including Sanskrit.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Sanskrit

This package contains 3 OCR language variants for .NET:

  • Sanskrit
  • SanskritBest
  • SanskritFast
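
As a rough sketch of how one of these variants might be chosen once the package is installed: the three entries above appear to correspond to members of OcrLanguage, and the Best and Fast suffixes presumably mirror Tesseract's tessdata_best (more accurate, slower) and tessdata_fast (quicker, less accurate) models. The member names OcrLanguage.SanskritBest and OcrLanguage.SanskritFast and the preferAccuracy flag are assumptions made for illustration, so check them against the installed package.

```csharp
using IronOcr;

class VariantSelection
{
    static void Main()
    {
        // Hypothetical switch: favor accuracy over speed
        bool preferAccuracy = true;

        var ocr = new IronTesseract
        {
            // Assumed member names, taken from the variants listed above
            Language = preferAccuracy
                ? OcrLanguage.SanskritBest
                : OcrLanguage.SanskritFast
        };

        // Show which variant was selected
        System.Console.WriteLine($"Selected OCR language: {ocr.Language}");
    }
}
```

The rest of the workflow (building an OcrInput and calling Read) would be unchanged; only the Language value differs.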

Download

Sanskrit Language Pack [संस्कृतम्]

Installation

The first step is to install the Sanskrit OCR package into your .NET project from the NuGet Package Manager Console:

Install-Package IronOcr.Languages.Sanskrit

Code Example

This C# code example reads Sanskrit text from an image or PDF document.

// Import the IronOcr namespace
using IronOcr;

class Program
{
    static void Main()
    {
        // Create an instance of IronTesseract for OCR operations
        var Ocr = new IronTesseract
        {
            // Set the OCR language to Sanskrit
            Language = OcrLanguage.Sanskrit
        };

        // Define the input image or PDF file containing Sanskrit text
        using (var Input = new OcrInput(@"images\Sanskrit.png"))
        {
            // Perform OCR to read the text from the Input
            var Result = Ocr.Read(Input);

            // Capture the extracted text
            var AllText = Result.Text;

            // Print the extracted text to the console
            System.Console.WriteLine(AllText);
        }
    }
}

The same example in VB.NET:

' Import the IronOcr namespace
Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Create an instance of IronTesseract for OCR operations
		Dim Ocr = New IronTesseract With {.Language = OcrLanguage.Sanskrit}

		' Define the input image or PDF file containing Sanskrit text
		Using Input = New OcrInput("images\Sanskrit.png")
			' Perform OCR to read the text from the Input
			Dim Result = Ocr.Read(Input)

			' Capture the extracted text
			Dim AllText = Result.Text

			' Print the extracted text to the console
			System.Console.WriteLine(AllText)
		End Using
	End Sub
End Class
  • This example demonstrates how to configure IronTesseract to perform OCR on a Sanskrit image or PDF.
  • The Ocr.Read() method processes the input and extracts the text content, which is accessible via the Result.Text property.
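
Devanagari text is easy to garble on output, so a follow-up sketch may help. It assumes the same sample image as above, sets the console to UTF-8 before printing, and writes the recognized text to a UTF-8 file. The output file name sanskrit-output.txt is hypothetical. The Confidence property on the result is assumed to be available in the installed version, and its scale is not stated here, so the value is printed as-is.

```csharp
using System;
using System.IO;
using System.Text;
using IronOcr;

class SanskritOcrToFile
{
    static void Main()
    {
        // Devanagari output needs a Unicode-capable console encoding
        Console.OutputEncoding = Encoding.UTF8;

        var ocr = new IronTesseract { Language = OcrLanguage.Sanskrit };

        // Same sample path as the example above
        using (var input = new OcrInput(@"images\Sanskrit.png"))
        {
            var result = ocr.Read(input);

            // Assumed property: overall recognition confidence for this read
            Console.WriteLine($"Confidence: {result.Confidence}");

            // Hypothetical output path; UTF-8 preserves the Devanagari script
            File.WriteAllText("sanskrit-output.txt", result.Text, Encoding.UTF8);
        }
    }
}
```

Writing through File.WriteAllText with an explicit encoding keeps the sketch independent of any IronOCR-specific save helpers.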