Sundanese OCR in C# and .NET

IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Sundanese.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Sundanese

This package contains three OCR language models for .NET:

  • Sundanese
  • SundaneseBest
  • SundaneseFast
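As a rough illustration of how the three entries above might be selected in code, the sketch below switches the engine to the Best variant. It assumes the variants are exposed as OcrLanguage.SundaneseBest and OcrLanguage.SundaneseFast (names taken from the list above) and that, as the names suggest, they trade speed against accuracy. The class name VariantExample is hypothetical.

```csharp
using IronOcr;

class VariantExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Assumption: the package list above maps to OcrLanguage members of the same name.
        // Swap in OcrLanguage.SundaneseFast to favor speed instead.
        ocr.Language = OcrLanguage.SundaneseBest;

        using (var input = new OcrInput(@"images\Sundanese.png"))
        {
            // Read the image and print whatever text was recognized
            System.Console.WriteLine(ocr.Read(input).Text);
        }
    }
}
```

Only the Language assignment differs from the standard Sundanese setup, so comparing variants on your own images should be a one-line change.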

Download

Sundanese Language Pack [Basa Sunda]

Installation

First, install the Sundanese OCR package in your .NET project.

Install-Package IronOCR.Languages.Sundanese
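Install-Package runs in the Visual Studio Package Manager Console. If you work from a terminal instead, the .NET CLI equivalent should look like the following, assuming the same NuGet package ID and that the command is run from the project directory:

```shell
# Adds the Sundanese language pack to the project in the current directory
dotnet add package IronOcr.Languages.Sundanese
```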

Code Example

This C# code example reads Sundanese text from an image or PDF document.

// Import the IronOcr namespace
using IronOcr;

class Program
{
    static void Main()
    {
        // Create a new instance of the IronTesseract class
        var Ocr = new IronTesseract();

        // Specify the language the OCR engine should use
        Ocr.Language = OcrLanguage.Sundanese;

        // Initialize the OCR input with an image file containing Sundanese text
        using (var Input = new OcrInput(@"images\Sundanese.png"))
        {
            // Process the input and get the result
            var Result = Ocr.Read(Input);

            // Extract all recognized text from the result
            var AllText = Result.Text;

            // Output the recognized text
            System.Console.WriteLine(AllText);
        }
    }
}

The same example in VB.NET:

' Import the IronOcr namespace
Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Create a new instance of the IronTesseract class
		Dim Ocr = New IronTesseract()

		' Specify the language the OCR engine should use
		Ocr.Language = OcrLanguage.Sundanese

		' Initialize the OCR input with an image file containing Sundanese text
		Using Input = New OcrInput("images\Sundanese.png")
			' Process the input and get the result
			Dim Result = Ocr.Read(Input)

			' Extract all recognized text from the result
			Dim AllText = Result.Text

			' Output the recognized text
			System.Console.WriteLine(AllText)
		End Using
	End Sub
End Class

Explanation

  • We first import the IronOcr namespace to use its OCR functionality.
  • An instance of IronTesseract is created, which acts as our main OCR engine.
  • We set the Language property to OcrLanguage.Sundanese to specify that the engine should expect to read Sundanese text.
  • We create an OcrInput object to specify the image file source for our OCR engine.
  • The Read method processes the input and attempts to recognize text.
  • Recognized text is stored in the AllText variable and subsequently printed to the console.

This setup recognizes Sundanese text in images using the IronOCR library in a .NET environment.
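The example above reads an image file. For the PDF case, a minimal sketch might look like the following. It assumes your IronOCR version provides OcrInput.AddPdf; newer releases may name the method LoadPdf instead, so check the API reference for your version. The file path and the class name PdfExample are hypothetical.

```csharp
using IronOcr;

class PdfExample
{
    static void Main()
    {
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Sundanese;

        using (var input = new OcrInput())
        {
            // Assumption: AddPdf is the PDF loader in this IronOCR version;
            // newer releases may expose it as LoadPdf.
            input.AddPdf(@"documents\Sundanese.pdf"); // hypothetical path

            // Process the PDF pages and print the recognized text
            var result = ocr.Read(input);
            System.Console.WriteLine(result.Text);
        }
    }
}
```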