How to use Multiple Languages with Tesseract

Kannapat Udonpant
Updated: July 28, 2025

IronOCR is an Optical Character Recognition (OCR) library that extracts text from a wide range of languages and scripts. It uses the Tesseract engine to provide a reliable and easy-to-use OCR tool. This article explains how IronOCR handles text in multiple languages through Tesseract, whether you are an experienced developer looking for a multilingual OCR solution or simply curious about how it works.

How to use Multiple Languages with Tesseract

1. Download a C# library for reading multiple languages
2. Prepare the PDF document and image for reading
3. Install additional language packs via NuGet
4. Use the AddSecondaryLanguage method to enable the desired languages
5. Set the Language property to change the default language

Read Multi-Language PDF Example

IronOcr provides about 125 language packs; however, only English is installed by default. The rest can be downloaded from NuGet. You can have a look at all the available language packs here.

The following example shows how to use multiple languages in IronOcr to extract text from a PDF file.
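Because only English is installed by default, the Russian pack used in this example has to be added to the project before the code can run. A minimal sketch, assuming the .NET CLI is available and that the packs are published on NuGet under the `IronOcr.Languages.<Language>` naming pattern (the package IDs below are assumptions; confirm the exact ID and version in the package listing):

```shell
# Add the core OCR library (assumed package ID: IronOcr)
dotnet add package IronOcr

# Add the Russian language pack (assumed package ID: IronOcr.Languages.Russian)
dotnet add package IronOcr.Languages.Russian
```

The same pattern would apply to any other language passed to AddSecondaryLanguage or assigned to the Language property.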
```cs
using IronOcr;
using System;

// Instantiate IronTesseract (English remains the primary language by default)
IronTesseract ocrTesseract = new IronTesseract();

// Set secondary language to Russian
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian);

// Add PDF
using var pdfInput = new OcrPdfInput(@"example.pdf");

// Perform OCR
OcrResult result = ocrTesseract.Read(pdfInput);

// Output extracted text to console
Console.WriteLine(result.Text);
```

```vb
Imports IronOcr
Imports System

Module Program
    Sub Main()
        ' Instantiate IronTesseract (English remains the primary language by default)
        Dim ocrTesseract As New IronTesseract()

        ' Set secondary language to Russian
        ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian)

        ' Add PDF; the Using block disposes the input when reading is done
        Using pdfInput As New OcrPdfInput("example.pdf")
            ' Perform OCR
            Dim result As OcrResult = ocrTesseract.Read(pdfInput)

            ' Output extracted text to console
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Module
```

You can add any number of secondary languages using the AddSecondaryLanguage method. Note, however, that each additional language may reduce OCR speed and performance. Priority follows the order in which the languages are added: the first one added has the highest priority.

Read Multi-Language Image Example

The primary language is set to English by default. To change the primary language, set the Language property to the desired language. Afterward, you can also add secondary languages.
```cs
// Example code for reading a multi-language image with IronOCR
using IronOcr;
using System;

// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();

// Set primary language to Russian
ocrTesseract.Language = OcrLanguage.Russian;

// Set secondary language to Japanese
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Japanese);

// Add image
using var imageInput = new OcrImageInput(@"example.png");

// Perform OCR
OcrResult result = ocrTesseract.Read(imageInput);

// Output extracted text to console
Console.WriteLine(result.Text);
```

```vb
' Example code for reading a multi-language image with IronOCR
Imports IronOcr
Imports System

Module Program
    Sub Main()
        ' Instantiate IronTesseract
        Dim ocrTesseract As New IronTesseract()

        ' Set primary language to Russian
        ocrTesseract.Language = OcrLanguage.Russian

        ' Set secondary language to Japanese
        ocrTesseract.AddSecondaryLanguage(OcrLanguage.Japanese)

        ' Add image; the Using block disposes the input when reading is done
        Using imageInput As New OcrImageInput("example.png")
            ' Perform OCR
            Dim result As OcrResult = ocrTesseract.Read(imageInput)

            ' Output extracted text to console
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Module
```

If the language packs are installed and the input is readable, the recognized text in both languages is printed to the console.

Conclusion

In brief, IronOCR, backed by the Tesseract engine, extracts text from documents in multiple languages. Whether you are processing PDFs with text in various languages or working with multilingual content in images, IronOCR simplifies recognizing and extracting that text: set the primary language, add secondary languages in priority order, and read the input.
Frequently Asked Questions

How can I use multiple languages in OCR processing?
To use multiple languages in OCR processing with IronOCR, download the library from NuGet, prepare your document, install additional language packs, and use the AddSecondaryLanguage method to enable other languages.

How do I extract text from a multi-language PDF?
You can extract text from a multi-language PDF by initializing the IronTesseract OCR engine, setting the primary language, adding secondary languages using the AddSecondaryLanguage method, and processing the PDF to read its content.

Is it possible to recognize text in multiple languages within an image?
Yes, with IronOCR, you can recognize text in multiple languages within an image by setting the primary language and adding secondary languages to the OCR engine before processing the image.

How does adding multiple languages affect OCR performance?
Adding multiple languages in IronOCR can impact the speed and performance of the OCR process. The order of added languages determines their priority, with the first added language having higher priority.

How can I change the default language in IronOCR?
You can change the default language in IronOCR by setting the Language property to your desired language before processing your documents or images.

How many language packs does IronOCR support?
IronOCR supports approximately 125 language packs, although only the English language pack is installed by default. Additional language packs can be downloaded via NuGet.

How do I install additional language packs in IronOCR?
To install additional language packs in IronOCR, use the NuGet Package Manager to download the desired language packs and include them in your project.

Can IronOCR read text in languages with different scripts?
Yes, IronOCR can read text in various languages with different scripts by utilizing the Tesseract engine and enabling relevant language packs.

What is the benefit of using IronOCR for multilingual content?
IronOCR provides a versatile solution for recognizing and extracting text from multilingual content, making it well suited to developers dealing with documents or images containing text in multiple languages.

Kannapat Udonpant
Software Engineer

Before becoming a Software Engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he leveraged his C# skills to join Iron Software's engineering ...

Reviewed by Jeffrey T. Fritz
Principal Program Manager - .NET Community Team

Jeff is also a Principal Program Manager for the .NET and Visual Studio teams. He is the executive producer of the .NET Conf virtual conference series and hosts 'Fritz and Friends', a live stream for developers that airs twice weekly, where he talks tech and writes code together with viewers. Jeff writes workshops and presentations and plans content for the largest Microsoft developer events, including Microsoft Build, Microsoft Ignite, .NET Conf, and the Microsoft MVP Summit.