Czech OCR in C# and .NET

IronOCR is a C# software component allowing .NET coders to read text from images and PDF documents in 126 languages, including Czech.

It is an advanced fork of Tesseract built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Czech

This package contains three Czech OCR language variants for .NET:

  • Czech
  • CzechBest
  • CzechFast
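
The three entries above follow the usual Tesseract naming, where "Best" models favor accuracy and "Fast" models favor speed. The sketch below shows how a variant might be selected. It assumes the variants are exposed as OcrLanguage.CzechBest and OcrLanguage.CzechFast alongside OcrLanguage.Czech, which this page does not confirm, and it reuses the sample image path from the code example further down. The class name CzechVariantExample is hypothetical.

```csharp
using System;
using IronOcr;

class CzechVariantExample
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Assumption: the pack exposes its variants as OcrLanguage.Czech,
        // OcrLanguage.CzechBest and OcrLanguage.CzechFast.
        // Swap in OcrLanguage.CzechFast if speed matters more than accuracy.
        ocr.Language = OcrLanguage.CzechBest;

        // Same input pattern as the main code example on this page
        using (var input = new OcrInput(@"images\Czech.png"))
        {
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```

Whether the slower variant is worth it depends on the input; trying both on a few representative scans is a reasonable way to decide.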

Download

Czech Language Pack [čeština]

Installation

First, install the Czech OCR package in your .NET project.

Install-Package IronOCR.Languages.Czech

Code Example

This C# code example reads Czech text from an image or PDF document.

// Import System (for Console) and the IronOcr namespace
using System;
using IronOcr;

class Program
{
    static void Main()
    {
        // Create a new IronTesseract instance
        var Ocr = new IronTesseract();

        // Set the OCR language to Czech
        Ocr.Language = OcrLanguage.Czech;

        // Define the input image or PDF and perform OCR
        using (var Input = new OcrInput(@"images\Czech.png"))
        {
            // Read the input and perform OCR
            var Result = Ocr.Read(Input);

            // Extract all recognized text
            var AllText = Result.Text;

            // Output the recognized text to the console
            Console.WriteLine(AllText);
        }
    }
}

The same example in VB.NET:

' Import System (for Console) and the IronOcr namespace
Imports System
Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Create a new IronTesseract instance
		Dim Ocr = New IronTesseract()

		' Set the OCR language to Czech
		Ocr.Language = OcrLanguage.Czech

		' Define the input image or PDF and perform OCR
		Using Input = New OcrInput("images\Czech.png")
			' Read the input and perform OCR
			Dim Result = Ocr.Read(Input)

			' Extract all recognized text
			Dim AllText = Result.Text

			' Output the recognized text to the console
			Console.WriteLine(AllText)
		End Using
	End Sub
End Class
  • The code above shows how to configure and use the IronTesseract class to perform OCR on an image or PDF.
  • The IronOCR.Languages.Czech package must be installed in your project for the code to run correctly.
  • The OcrInput class loads the image from the specified path, and Ocr.Read() performs the OCR operation.
  • Result.Text contains the recognized text, which in this case is printed to the console.
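
Czech text contains diacritics (č, ř, ž, ů and so on), so it can help to be explicit about encoding when printing or saving the recognized text. The following is a minimal sketch using only standard .NET APIs. The sample string stands in for Result.Text, and the class CzechTextOutput, the helper SaveAsUtf8 and the output file name are hypothetical names chosen for illustration, not part of IronOCR.

```csharp
using System;
using System.IO;
using System.Text;

public class CzechTextOutput
{
    // Hypothetical helper: writes the text to a UTF-8 file (no byte order mark)
    // and returns the content read back from disk.
    public static string SaveAsUtf8(string text, string path)
    {
        File.WriteAllText(path, text, new UTF8Encoding(false));
        return File.ReadAllText(path, Encoding.UTF8);
    }

    public static void Main()
    {
        // Stand-in for Result.Text from the OCR example
        string allText = "Příliš žluťoučký kůň úpěl ďábelské ódy";

        // Ask the console to emit UTF-8 so the diacritics are not replaced
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(allText);

        // Save the text to a temporary file and read it back
        string path = Path.Combine(Path.GetTempPath(), "czech-ocr-output.txt");
        string roundTripped = SaveAsUtf8(allText, path);
        Console.WriteLine(roundTripped == allText);
    }
}
```

In a real project, the sample string would be replaced by the Result.Text value from the OCR call, and the output path by a location of your choosing.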