Tamil OCR in C# and .NET
IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Tamil.
It is an advanced fork of Tesseract built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Tamil
This package contains 6 OCR language variants for .NET:
- Tamil
- TamilBest
- TamilFast
- TamilAlphabet
- TamilAlphabetBest
- TamilAlphabetFast
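The variant names above appear to map onto members of the OcrLanguage enumeration, so switching variants should be a one-line change. The sketch below assumes that the package is already installed, that OcrLanguage.TamilBest exists under that name (this page only shows OcrLanguage.Tamil in use), and that the Best variant favors accuracy over speed, as Tesseract's "best" models do. The image path is a placeholder.

```csharp
// Sketch only: assumes IronOcr.Languages.Tamil is installed and that
// OcrLanguage.TamilBest is the enum member for the "TamilBest" variant.
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: "Best" favors accuracy, "Fast" favors speed.
ocr.Language = OcrLanguage.TamilBest;

using (var input = new OcrInput(@"images\Tamil.png")) // placeholder path
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

If throughput matters more than accuracy, the same sketch would presumably work with the TamilFast variant in place of TamilBest.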
Download
Tamil Language Pack [தமிழ்]
Installation
The first step is to install the Tamil OCR package into your .NET project.
Install-Package IronOCR.Languages.Tamil
Code Example
This C# code example reads Tamil text from an image or PDF document.
// Ensure IronOCR.Languages.Tamil package is installed
using System;
using IronOcr;

var Ocr = new IronTesseract();

// Set the language to Tamil for OCR processing
Ocr.Language = OcrLanguage.Tamil;

using (var Input = new OcrInput(@"images\Tamil.png"))
{
    // Perform OCR on the input image
    var Result = Ocr.Read(Input);

    // Get the recognized text
    var AllText = Result.Text;

    // Display the recognized text (for example purposes)
    Console.WriteLine(AllText);
}
' Ensure IronOCR.Languages.Tamil package is installed
Imports IronOcr

Dim Ocr As New IronTesseract()

' Set the language to Tamil for OCR processing
Ocr.Language = OcrLanguage.Tamil

Using Input = New OcrInput("images\Tamil.png")
    ' Perform OCR on the input image
    Dim Result = Ocr.Read(Input)

    ' Get the recognized text
    Dim AllText = Result.Text

    ' Display the recognized text (for example purposes)
    Console.WriteLine(AllText)
End Using
- The IronTesseract class is used to initialize and set up the OCR engine.
- The Ocr.Language property specifies the language pack to use for OCR.
- The OcrInput class is used with the path to the image file containing Tamil text.
- The Ocr.Read() method processes the image and extracts the text.
- Finally, the recognized text is stored in AllText and can be utilized as needed.
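As one way to use the recognized text, the steps above can be wrapped in a small helper that also writes the result to a text file. This is a minimal sketch: the TamilOcrHelper class, the ReadTamilToFile method, and both paths are hypothetical names introduced for illustration. It relies only on the IronOCR calls already shown on this page, plus File.WriteAllText from System.IO.

```csharp
using System.IO;
using System.Text;
using IronOcr;

// Hypothetical helper for illustration; not part of IronOCR.
public static class TamilOcrHelper
{
    // Reads Tamil text from an image and saves it as UTF-8 so that
    // Tamil characters are preserved when written to disk.
    public static string ReadTamilToFile(string imagePath, string outputPath)
    {
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.Tamil;

        using (var input = new OcrInput(imagePath))
        {
            var result = ocr.Read(input);
            File.WriteAllText(outputPath, result.Text, Encoding.UTF8);
            return result.Text;
        }
    }
}
```

A call might look like TamilOcrHelper.ReadTamilToFile(@"images\Tamil.png", "Tamil.txt"). Writing with an explicit UTF-8 encoding is a deliberate choice here, since a legacy code page could corrupt Tamil script on some systems.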