Additional OCR Language Packs
IronOCR supports 125 international languages, but only English is installed within IronOCR as standard.
Additional language packs can be added to your C#, VB or ASP.NET project via NuGet, or as DLLs that you download and add as project references.
Code Examples
International Language Example
Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified;
using (var input = new OcrInput())
{
    // Add an image to be processed
    input.AddImage("img/chinese.gif");
    // Optional: Enhance the input by deskewing or denoising the image
    // input.Deskew();
    // input.DeNoise();
    // Process the image and retrieve the result
    var result = ocr.Read(input);
    // Store the recognized text in a string
    string testResult = result.Text;
    // Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt");
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified
Using input = New OcrInput()
    ' Add an image to be processed
    input.AddImage("img/chinese.gif")
    ' Optional: Enhance the input by deskewing or denoising the image
    ' input.Deskew()
    ' input.DeNoise()
    ' Process the image and retrieve the result
    Dim result = ocr.Read(input)
    ' Store the recognized text in a string
    Dim testResult As String = result.Text
    ' Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt")
End Using
Vertically Written Language Example
These dictionaries are tuned for vertically written text. Use the 'Vertical' variant of the Korean or Japanese OcrLanguage.
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Custom Language Example
Use any Tesseract .traineddata language file that you have downloaded or trained yourself.
using IronOcr;
var ocr = new IronTesseract();
// Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Multiple Language Example
Read more than one language at a time by adding secondary languages.
Install-Package IronOcr.Languages.Arabic
using IronOcr;
var ocr = new IronTesseract();
// Set the primary language to English
ocr.Language = OcrLanguage.English;
// Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages
using (var input = new OcrInput(@"images\multi-lang.pdf"))
{
    // Process the PDF and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the primary language to English
ocr.Language = OcrLanguage.English
' Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic)
' Add any number of languages
Using input = New OcrInput("images\multi-lang.pdf")
    ' Process the PDF and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Faster Language Example
Dictionaries tuned for speed. Use the 'Fast' variant of any OcrLanguage.
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Higher Accuracy Language Example
Dictionaries tuned for accuracy, at the cost of much slower processing. Use the 'Best' variant of any OcrLanguage.
Install-Package IronOcr.Languages.French
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
How To Install OCR Language Packs
Additional OCR language packs are available for download below. Either:
- Install the NuGet package. Search NuGet for IronOcr Languages.
- Or download the "ocrdata" file and add it to your .NET project in any folder you like. Set CopyToOutputDirectory = CopyIfNewer.
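As a rough illustration of the two options above, the project file entries might look like the fragment below. This is only a sketch under stated assumptions: the `ocrdata` folder name and the floating `Version="*"` are placeholders chosen for the example, not values taken from IronOCR's documentation. The package name is the one used in the Chinese Simplified example. In MSBuild, the Visual Studio setting "Copy if newer" is written as `PreserveNewest`.

```xml
<!-- Sketch of a .csproj fragment; folder name and version are hypothetical -->
<ItemGroup>
  <!-- Option 1: reference a language pack from NuGet -->
  <PackageReference Include="IronOcr.Languages.ChineseSimplified" Version="*" />
</ItemGroup>
<ItemGroup>
  <!-- Option 2: copy downloaded language files to the build output ("Copy if newer") -->
  <None Update="ocrdata\**\*">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

Only one of the two item groups is needed for a given language. Pinning an explicit package version instead of `*` is usually the safer choice for reproducible builds.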
Download OCR Language Packs
- Ancient Greek Language Pack Ἑλληνική Zip NuGet
- Assamese Language Pack অসমীয়া Zip NuGet
- Breton Language Pack brezhoneg Zip NuGet
- Canadian Aboriginal Alphabet Language Pack Canadian First Nations Zip NuGet
- Cebuano Language Pack Bisaya Zip NuGet
- Cherokee Language Pack ᏣᎳᎩ ᎦᏬᏂᎯᏍᏗ Zip NuGet
- Cyrillic Language Pack Cyrillic scripts Zip NuGet
- Devanagari Language Pack Nagari Zip NuGet
- Divehi Language Pack ދިވެހި Zip NuGet
- Dzongkha Language Pack རྫོང་ཁ Zip NuGet
- Faroese Language Pack føroyskt Zip NuGet
- Filipino Language Pack The Philippines Zip NuGet
- Financial Language Pack Spreadsheets & Numbers Zip NuGet
- Fraktur Language Pack Generic Fraktur Zip NuGet
- Frankish Language Pack Frenkisk Zip NuGet
- Gurmukhi Alphabet Language Pack Gurmukhī Zip NuGet
- Hangul Language Pack Hangul Alphabet Zip NuGet
- Inuktitut Language Pack ᐃᓄᒃᑎᑐᑦ Zip NuGet
- Javanese Language Pack basa Jawa Zip NuGet
- Latin Alphabet Language Pack latine Zip NuGet
- Malay Language Pack bahasa Melayu Zip NuGet
- Malayalam Language Pack മലയാളം Zip NuGet
- Maori Language Pack te reo Māori Zip NuGet
- MICR Language Pack Magnetic Ink Character Recognition Zip NuGet
- Middle English Language Pack English (1100-1500 AD) Zip NuGet
- Middle French Language Pack Moyen Français Zip NuGet
- Myanmar Language Pack Burmese Zip NuGet
- Northern Kurdish Language Pack Kurmanji Zip NuGet
- Norwegian Language Pack Norsk Zip NuGet
- Occitan Language Pack occitan Zip NuGet
- Quechua Language Pack Runa Simi Zip NuGet
- Sanskrit Language Pack संस्कृतम् Zip NuGet
- Scottish Gaelic Language Pack Gàidhlig Zip NuGet
- Syriac Language Pack Syrian Zip NuGet
- Tibetan Language Pack Tibetan Standard Zip NuGet
- Tonga Language Pack faka Tonga Zip NuGet
- Yoruba Language Pack Yorùbá Zip NuGet
Help
If the language you are looking to read is not available in the list above, please get in touch with us. Many other languages are available on request.
Priority on production resources is given to IronOCR licensees, so please also consider licensing IronOCR for access to your desired language pack.