Additional OCR Language Packs
IronOCR supports 125 international languages, but only English is installed by default.
Additional language packs can easily be added to your C#, VB, or ASP.NET project via NuGet, or as DLLs that can be downloaded and added as project references.
Code Examples
International Language Example
Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified;
using (var input = new OcrInput())
{
    // Add an image to be processed
    input.AddImage("img/chinese.gif");
    // Optional: Enhance the input by deskewing or denoising the image
    // input.Deskew();
    // input.DeNoise();
    // Process the image and retrieve the result
    var result = ocr.Read(input);
    // Store the recognized text in a string
    string testResult = result.Text;
    // Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt");
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified
Using input = New OcrInput()
    ' Add an image to be processed
    input.AddImage("img/chinese.gif")
    ' Optional: Enhance the input by deskewing or denoising the image
    ' input.Deskew()
    ' input.DeNoise()
    ' Process the image and retrieve the result
    Dim result = ocr.Read(input)
    ' Store the recognized text in a string
    Dim testResult As String = result.Text
    ' Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt")
End Using
Vertically Written Language Example
Dictionaries tuned for vertically written languages. Use the 'Vertical' variant of the Korean and Japanese OcrLanguage values.
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Custom Language Example
Use any Tesseract .traineddata language file that you have downloaded or trained yourself.
using IronOcr;
var ocr = new IronTesseract();
// Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Multiple Languages Example
Use more than one language at the same time.
Install-Package IronOcr.Languages.Arabic
using IronOcr;
var ocr = new IronTesseract();
// Set the primary language to English
ocr.Language = OcrLanguage.English;
// Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages
using (var input = new OcrInput(@"images\multi-lang.pdf"))
{
    // Process the PDF and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the primary language to English
ocr.Language = OcrLanguage.English
' Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic)
' Add any number of languages
Using input = New OcrInput("images\multi-lang.pdf")
    ' Process the PDF and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Faster Language Example
Dictionaries tuned for speed. Use the 'Fast' variant of any OcrLanguage.
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
Higher Accuracy Language Example
Dictionaries tuned for accuracy, but with much slower results. Use the 'Best' variant of any OcrLanguage.
Install-Package IronOcr.Languages.French
using IronOcr;
var ocr = new IronTesseract();
// Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest;
using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr
Dim ocr = New IronTesseract()
' Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest
Using input = New OcrInput("images\image.png")
    ' Process the image and get the OCR result
    Dim result = ocr.Read(input)
    ' Output the recognized text to the console
    Console.WriteLine(result.Text)
End Using
How to Install OCR Language Packs
Additional OCR language packs are available for download below. Either:
- Install the NuGet package. Search NuGet for IronOcr languages.
- Or download the "ocrdata" file and add it to your .NET project in any folder you like. Set CopyToOutputDirectory = CopyIfNewer.
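When the project file is edited by hand, the "Copy if newer" setting is written as PreserveNewest. The fragment below is a minimal sketch, assuming an SDK-style .csproj; the folder and file name (ocrdata\example.ocrdata) are hypothetical placeholders for whichever language file you downloaded.

```xml
<!-- Hypothetical path: replace with the folder and file name of your downloaded language pack -->
<ItemGroup>
  <None Update="ocrdata\example.ocrdata">
    <!-- Equivalent to "Copy if newer" in the Visual Studio Properties window -->
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```

With this in place, the file should be copied next to the compiled application on build whenever the source copy is newer than the one already in the output folder.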
Download OCR Language Packs
- Ancient Greek Language Pack Ἑλληνική Zip NuGet
- Assamese Language Pack অসমীয়া Zip NuGet
- Breton Language Pack brezhoneg Zip NuGet
- Canadian Aboriginal Alphabet Language Pack Canadian First Nations Zip NuGet
- Cebuano Language Pack Bisaya Zip NuGet
- Cherokee Language Pack ᏣᎳᎩ ᎦᏬᏂᎯᏍᏗ Zip NuGet
- Cyrillic Language Pack Cyrillic scripts Zip NuGet
- Devanagari Language Pack Nagari Zip NuGet
- Faroese Language Pack føroyskt Zip NuGet
- Filipino Language Pack The Philippines Zip NuGet
- Financial Language Pack Spreadsheets & Numbers Zip NuGet
- Fraktur Language Pack Generic Fraktur Zip NuGet
- Frankish Language Pack Frenkisk Zip NuGet
- Gurmukhi Alphabet Language Pack Gurmukhī Zip NuGet
- Hangul Language Pack Hangul Alphabet Zip NuGet
- Inuktitut Language Pack ᐃᓄᒃᑎᑐᑦ Zip NuGet
- Javanese Language Pack basa Jawa Zip NuGet
- Latin Alphabet Language Pack latine Zip NuGet
- Malay Language Pack bahasa Melayu Zip NuGet
- Malayalam Language Pack മലയാളം Zip NuGet
- Maori Language Pack te reo Māori Zip NuGet
- MICR Language Pack Magnetic Ink Character Recognition Zip NuGet
- Middle English Language Pack English (1100-1500 AD) Zip NuGet
- Middle French Language Pack Moyen Français Zip NuGet
- Myanmar Language Pack Burmese Zip NuGet
- Northern Kurdish Language Pack Kurmanji Zip NuGet
- Norwegian Language Pack Norsk Zip NuGet
- Occitan Language Pack occitan Zip NuGet
- Quechua Language Pack Runa Simi Zip NuGet
- Sanskrit Language Pack संस्कृतम् Zip NuGet
- Scottish Gaelic Language Pack Gàidhlig Zip NuGet
- Syriac Language Pack Syrian Zip NuGet
- Tibetan Language Pack Tibetan Standard Zip NuGet
- Tonga Language Pack faka Tonga Zip NuGet
Help
If the language you are looking for is not in the list above, please contact us. Many other languages are available on request.
Priority on production resources is given to IronOCR licensees, so please also consider licensing IronOCR to get access to the language pack you need.

