
Configurations & Languages

Fine-tune the OCR engine for your documents. Get granular control over the balance between performance and accuracy, with built-in support for 125+ languages.

OCR Detailed Configurations

1. Normal OCR Configurations

Fine-tune the core Tesseract engine with granular control over dozens of parameters. This offers deep customization for advanced users looking to optimize performance for specific document types, languages, or quality challenges.

Learn how to: C# Tesseract OCR Configuration Variables
using IronOcr;

IronTesseract ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        ReadBarCodes = false,          // Do not look for barcodes
        RenderHocr = true,             // Render hOCR output
        TesseractVariables = null,     // No extra Tesseract variables
        WhiteListCharacters = null,    // No whitelist
        BlackListCharacters = "`ë|^",  // Characters to exclude from results
    },
    MultiThreaded = false,
    Language = OcrLanguage.English,
    EnableTesseractConsoleMessages = true, // Defaults to false
};
C#
2. OCR Configurations for Advanced Reading

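Configure character whitelists, blacklists, and the reading language for the advanced OCR reading methods. Advanced reading supports a smaller set of languages: English, Japanese, Korean, and LatinAlphabet.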

using IronOcr;

IronTesseract ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        // Whitelist alphanumeric characters and common punctuation
        WhiteListCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,-?!$() /",
        // Blacklist uncommon characters 
        BlackListCharacters = "`ë|^",
    },

    // Advanced reading supports English, Japanese, Korean, and LatinAlphabet
    Language = OcrLanguage.English
};
C#
3. Fast OCR Configuration

Optimize the OCR engine for maximum speed. By adjusting settings to prioritize performance over absolute accuracy, you can rapidly process huge volumes of documents where speed is the critical factor.

Learn how to: Faster Tesseract OCR for .NET
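A speed-oriented setup might look like the sketch below. It reuses only the `ReadBarCodes` and `RenderHocr` options shown earlier on this page and switches both off so the engine does less work per page. `OcrLanguage.EnglishFast` is assumed here to be the speed-optimized counterpart of `OcrLanguage.EnglishBest`, and `sample.png` is a placeholder path; check both against your installed version and language packs.

```csharp
using System;
using IronOcr;

// Assumption: OcrLanguage.EnglishFast is the speed-optimized English pack
// (the counterpart of OcrLanguage.EnglishBest used elsewhere on this page).
var ocr = new IronTesseract
{
    Language = OcrLanguage.EnglishFast,
    Configuration = new TesseractConfiguration
    {
        // Turn off features that are not needed for plain-text extraction
        ReadBarCodes = false,
        RenderHocr = false,
    },
};

// "sample.png" is a placeholder; replace it with a real image path
using var input = new OcrInput();
input.LoadImage(@"sample.png");

// Run OCR and print the recognized text
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```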

Languages

1. 125+ OCR Languages Supported

Achieve high accuracy in 125+ international languages. Our packs include robust support for non-Latin scripts (e.g., Arabic, Chinese, Hebrew) and languages with diacritics. Easily add or switch languages with a single line of code.

Learn how to: Master Multi-Language OCR with Iron OCR: English & Japanese
using IronOcr;

var ocr = new IronTesseract();

// Set the OCR language to Simplified Chinese
ocr.Language = OcrLanguage.ChineseSimplified;

using (var input = new OcrInput())
{
    // Load the image to read ("sample.png" is a placeholder path)
    input.LoadImage(@"sample.png");

    var result = ocr.Read(input);

    // Store the recognized text in a string
    string testResult = result.Text;
}
C#
2. Multi-Language Reading

Accurately extract text from documents that contain multiple languages on the same page. IronOcr automatically detects and switches between the languages you specify, so there is no need to process each language's content separately.

Learn how to: Use Multiple Languages with Tesseract
using IronOcr;

// Instantiate IronTesseract with English as the primary language
IronTesseract ocrTesseract = new IronTesseract()
{
    Language = OcrLanguage.EnglishBest,
};

// Set secondary language to Russian
ocrTesseract.AddSecondaryLanguage(OcrLanguage.Russian);

// Add PDF
using var pdfInput = new OcrPdfInput(@"example.pdf");

// Perform OCR
OcrResult result = ocrTesseract.Read(pdfInput);

// Output extracted text to console
Console.WriteLine(result.Text);
C#
3. Custom Language Reading

Go beyond the built-in language packs by providing your own trained language data. Achieve high accuracy on documents with rare languages, specialized fonts, or unique character sets.

Learn how to: Use Custom Language Files
using IronOcr;

var ocrTesseract = new IronTesseract();

// Load a custom-trained Tesseract language file (.traineddata)
ocrTesseract.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");

// Load the image and run OCR with the custom language
using var ocrInput = new OcrInput();
ocrInput.LoadImage(@"sample.png");
var ocrResult = ocrTesseract.Read(ocrInput);
Console.WriteLine(ocrResult.Text);
C#
Nuget Downloads 5,058,051 | Version: 2025.11 just released