Sinhala OCR in C# and .NET
IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Sinhala.
It is an advanced fork of Tesseract, built exclusively for .NET developers, and regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Sinhala
This package contains 6 OCR language variants for .NET:
- Sinhala
- SinhalaBest
- SinhalaFast
- SinhalaAlphabet
- SinhalaAlphabetBest
- SinhalaAlphabetFast
Download
Sinhala Language Pack [සිංහල]
Installation
First, install the Sinhala OCR package into your .NET project.
Install-Package IronOcr.Languages.Sinhala
Code Example
This C# code example reads Sinhala text from an image or PDF document.
// Import System for Console and the IronOcr namespace for the OCR classes
using System;
using IronOcr;

class SinhalaOcrExample
{
    static void Main()
    {
        // Initialize the IronTesseract OCR engine
        var Ocr = new IronTesseract();

        // Set the OCR engine to use the Sinhala language
        Ocr.Language = OcrLanguage.Sinhala;

        // Define the input image or PDF file
        using (var Input = new OcrInput(@"images\Sinhala.png"))
        {
            // Perform OCR on the input
            var Result = Ocr.Read(Input);

            // Retrieve the recognized text
            var AllText = Result.Text;

            // Output the recognized text
            Console.WriteLine(AllText);
        }
    }
}
' Import System for Console and the IronOcr namespace for the OCR classes
Imports System
Imports IronOcr

Friend Class SinhalaOcrExample
    Shared Sub Main()
        ' Initialize the IronTesseract OCR engine
        Dim Ocr = New IronTesseract()

        ' Set the OCR engine to use the Sinhala language
        Ocr.Language = OcrLanguage.Sinhala

        ' Define the input image or PDF file
        Using Input = New OcrInput("images\Sinhala.png")
            ' Perform OCR on the input
            Dim Result = Ocr.Read(Input)

            ' Retrieve the recognized text
            Dim AllText = Result.Text

            ' Output the recognized text
            Console.WriteLine(AllText)
        End Using
    End Sub
End Class
Explanation:
- IronTesseract: This is the main OCR engine class used for text recognition.
- Language: Specifies the language of the text to be recognized; in this case, Sinhala.
- OcrInput: Represents the input file (image or PDF) where text recognition needs to be performed.
- Read: Executes the OCR process on the input file and returns a result object that holds the recognized text.
- Result.Text: Contains the OCR-recognized text from the input file, which can be used for further processing or display.
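The package contents listed earlier include Best and Fast variants alongside the default Sinhala model. As a minimal sketch, assuming each variant is exposed as an OcrLanguage member with the same name as in that list (for example OcrLanguage.SinhalaBest), switching models should only require changing the Language property. The image path is the same placeholder used in the example above.

```csharp
using System;
using IronOcr;

class SinhalaVariantExample
{
    static void Main()
    {
        var Ocr = new IronTesseract();

        // Assumption: the variant names from the package list map directly
        // to OcrLanguage members, e.g. SinhalaBest or SinhalaFast.
        Ocr.Language = OcrLanguage.SinhalaBest;

        // Placeholder path; replace with your own image or PDF
        using (var Input = new OcrInput(@"images\Sinhala.png"))
        {
            var Result = Ocr.Read(Input);
            Console.WriteLine(Result.Text);
        }
    }
}
```

The Best and Fast suffixes likely mirror Tesseract's tessdata_best and tessdata_fast model sets, which favor accuracy and speed respectively. Treat that as an assumption and compare the variants on your own documents before settling on one.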