Swahili OCR in C# and .NET
IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Swahili. It is an advanced fork of Tesseract, built exclusively for .NET developers, and regularly outperforms other Tesseract engines for both speed and accuracy.
Contents of IronOcr.Languages.Swahili
This package contains 3 OCR language variants for .NET:
- Swahili
- SwahiliBest
- SwahiliFast
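As a rough guide to choosing between these variants: in Tesseract-based engines, "Best" models generally trade speed for accuracy, and "Fast" models do the opposite. The sketch below assumes the package is already installed (see Installation) and that each variant is exposed as an OcrLanguage member with the same name as in the list above. Check these names against your installed IronOCR version.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: the enum member names match the variant list above.
// SwahiliBest: typically slower, aimed at higher accuracy.
// SwahiliFast: typically quicker, aimed at throughput.
ocr.Language = OcrLanguage.SwahiliBest;

// Same sample path as the main code example
using (var input = new OcrInput(@"images\Swahili.png"))
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Swapping in OcrLanguage.SwahiliFast on the same input is a simple way to compare the two on your own documents.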
Download
Swahili Language Pack [Kiswahili]
Installation
First, install the Swahili OCR package into your .NET project using the NuGet Package Manager Console:
Install-Package IronOcr.Languages.Swahili
Code Example
This C# code example reads Swahili text from an image or PDF document.
using System;
using IronOcr;

var Ocr = new IronTesseract();

// Set the OCR language to Swahili
Ocr.Language = OcrLanguage.Swahili;

// Create an OCR input for the image or PDF file
using (var Input = new OcrInput(@"images\Swahili.png"))
{
    // Perform OCR on the input image
    var Result = Ocr.Read(Input);

    // Retrieve the recognized text
    var AllText = Result.Text;

    // Output the recognized text to the console (optional)
    Console.WriteLine(AllText);
}
Imports IronOcr

Dim Ocr = New IronTesseract()

' Set the OCR language to Swahili
Ocr.Language = OcrLanguage.Swahili

' Create an OCR input for the image or PDF file
Using Input = New OcrInput("images\Swahili.png")
    ' Perform OCR on the input image
    Dim Result = Ocr.Read(Input)

    ' Retrieve the recognized text
    Dim AllText = Result.Text

    ' Output the recognized text to the console (optional)
    Console.WriteLine(AllText)
End Using
Explanation:
- Using IronOcr Namespace: We include the IronOcr namespace, which provides classes and methods for OCR operations.
- Initialize OCR Engine: We create an instance of IronTesseract, which is the OCR engine. Setting its language to Swahili allows it to recognize Swahili text.
- OCR Input: The OcrInput class is used to specify the file (image or PDF) from which we want to extract text.
- OCR Reading: The Read method processes the input and returns an OcrResult object containing the recognized text.
- Output: The recognized text is stored in AllText, which can be used as needed. In this example, it is printed to the console for demonstration purposes.
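Beyond the plain text, the OcrResult object can be inspected in more detail. The following is a minimal sketch, assuming the Confidence and Paragraphs members behave as described in the IronOCR API reference. Treat these member names as assumptions to verify against your installed version.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Swahili;

// Same sample path as the main code example
using (var input = new OcrInput(@"images\Swahili.png"))
{
    var result = ocr.Read(input);

    // Assumed member: overall confidence score for the whole read
    Console.WriteLine($"Confidence: {result.Confidence}");

    // Assumed member: recognized paragraphs, each exposing its own text
    foreach (var paragraph in result.Paragraphs)
    {
        Console.WriteLine(paragraph.Text);
    }
}
```

A low confidence value on a scanned page is usually a hint to try a cleaner source image or a different language variant before relying on the extracted text.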