Read Multi-Language PDF Example

IronOCR provides about 125 language packs, but only English is installed by default. The rest can be downloaded from NuGet. All available language packs are listed in the languages section of the IronOCR website.

The following example shows how to use multiple languages in IronOCR to extract text from a PDF file.

using IronOcr;
using System;

// Create an instance of IronTesseract for OCR processing
var ocrTesseract = new IronTesseract();

// Add Russian as a secondary language (English remains the primary language)
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian);

// Use a 'using' block to ensure proper disposal of resources for reading the PDF
using (var pdfInput = new OcrInput(@"example.pdf"))
{
    // Perform OCR on the provided PDF input
    var result = ocrTesseract.Read(pdfInput);

    // Output the extracted text to the console
    Console.WriteLine(result.Text);
}

The same example in VB.NET:

Imports IronOcr
Imports System

' Create an instance of IronTesseract for OCR processing
Dim ocrTesseract = New IronTesseract()

' Add Russian as a secondary language (English remains the primary language)
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian)

' Use a 'Using' block to ensure proper disposal of resources for reading the PDF
Using pdfInput = New OcrInput("example.pdf")
	' Perform OCR on the provided PDF input
	Dim result = ocrTesseract.Read(pdfInput)

	' Output the extracted text to the console
	Console.WriteLine(result.Text)
End Using

You can add any number of secondary languages using the AddSecondaryLanguage method. Note that each additional language may reduce speed and performance. Languages are prioritized in the order they are added, so the first one added has the highest priority.
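As a minimal sketch of the ordering rule, the snippet below adds two secondary languages in sequence. It assumes the Russian and Japanese language packs are already installed from NuGet, that AddSecondaryLanguage is called directly on the IronTesseract instance, and that example.pdf is a placeholder file name.

```csharp
using IronOcr;
using System;

var ocrTesseract = new IronTesseract();

// English stays the primary language by default.
// Secondary languages are added in priority order:
// Russian is added first, so it ranks above Japanese.
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian);
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Japanese);

// "example.pdf" is a placeholder path (assumption)
using (var pdfInput = new OcrInput(@"example.pdf"))
{
    var result = ocrTesseract.Read(pdfInput);
    Console.WriteLine(result.Text);
}
```

Because each extra language may slow the read, it is usually worth adding only the languages you actually expect to find in the document.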

Read Multi-Language Image Example

The primary language is set to English by default. To change the primary language, set the Language property to the desired language. Afterward, you can also add secondary languages.

using IronOcr;
using System;

// This code performs Optical Character Recognition (OCR) on an image file using the IronOcr library.
// It reads text from an image, specifically focusing on the Russian language as the primary language,
// with Japanese as a secondary language for OCR processing.

// Instantiate an IronTesseract object
var ocrTesseract = new IronTesseract();

// Set primary language to Russian
ocrTesseract.Language = OcrLanguage.Russian;

// Add a secondary language for OCR to consider; in this case, Japanese
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Japanese);

// Using statement ensures proper disposal of the imageInput object
// Load the image from which text needs to be extracted
using (var imageInput = new OcrInput(@"example.png"))
{
    // Perform OCR on the image
    OcrResult result = ocrTesseract.Read(imageInput);

    // Output the extracted text to the console
    Console.WriteLine(result.Text);
}

The same example in VB.NET:

Imports IronOcr
Imports System

' This code performs Optical Character Recognition (OCR) on an image file using the IronOcr library.
' It reads text from an image, with Russian as the primary language
' and Japanese as a secondary language for OCR processing.

' Instantiate an IronTesseract object
Dim ocrTesseract = New IronTesseract()

' Set primary language to Russian
ocrTesseract.Language = OcrLanguage.Russian

' Add a secondary language for OCR to consider; in this case, Japanese
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Japanese)

' Load the image from which text needs to be extracted
' The Using block ensures proper disposal of the imageInput object
Using imageInput = New OcrInput("example.png")
	' Perform OCR on the image
	Dim result As OcrResult = ocrTesseract.Read(imageInput)

	' Output the extracted text to the console
	Console.WriteLine(result.Text)
End Using

If everything is set up correctly, you can expect results like the one below.

[Image: Russian and Japanese]

Conclusion

In brief, IronOCR, backed by the Tesseract engine, extracts text from documents in multiple languages. Whether you are processing PDFs with text in various languages or working with multilingual content in images, you set a primary language, add any secondary languages you need, and read the input.

Frequently Asked Questions

What is IronOCR?

IronOCR is a tool that uses the Tesseract Engine to perform Optical Character Recognition (OCR) on text in various languages and scripts.

How can I get started with IronOCR for multiple languages?

To get started with IronOCR, download the C# library from NuGet, prepare your PDF or image for reading, install additional language packs, and use the AddSecondaryLanguage method to enable the desired languages.

How do I add multiple languages to IronOCR?

You can add multiple languages to IronOCR by using the AddSecondaryLanguage method. Language packs can be installed via NuGet.

What is the default language for IronOCR?

The default language for IronOCR is English. You can change the primary language by setting the Language property to your desired language.

Can I change the primary language in IronOCR?

Yes, you can change the primary language by setting the Language property to the desired language.

Does adding multiple languages affect IronOCR performance?

Yes, adding multiple languages can affect the speed and performance of IronOCR. The priority of the language depends on the order in which it is added.

How can I read a multi-language PDF using IronOCR?

To read a multi-language PDF, initialize the IronTesseract OCR engine, set the primary language, add secondary languages, and then use the OCR engine to read the PDF.

What is the process for reading a multi-language image with IronOCR?

To read a multi-language image, initialize the IronTesseract OCR engine, set the primary language, add secondary languages, and then use the OCR engine to read the image.

How many language packs does IronOCR provide?

IronOCR provides approximately 125 language packs, although only English is installed by default.

Where can I find more information about available language packs for IronOCR?

You can view all available language packs on the IronOCR website, under the languages section.

Kannapat Udonpant
Software Engineer
Before becoming a Software Engineer, Kannapat completed an Environmental Resources PhD at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he leveraged his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the code used in IronPDF. In addition to peer learning, Kannapat enjoys the social aspect of working at Iron Software. When he's not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.