如何在 IronOCR 中使用 OCR 语言包

附加 OCR 语言包

This article was translated from English: Does it need improvement?
Translated
View the article in English

IronOCR 支持 125 种国际语言,但 IronOCR 默认只安装了英语

可以通过 NuGet 或 DLL 文件轻松地将其他语言包添加到您的 C#、VB 或ASP.NET项目中,下载这些 DLL 文件并将其添加为项目引用。

代码示例

国际语言示例

Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified;

using (var input = new OcrInput())
{
    // Add an image to be processed
    input.AddImage("img/chinese.gif");

    // Optional: Enhance the input by deskewing or denoising the image
    // input.Deskew();
    // input.DeNoise();

    // Process the image and retrieve the result
    var result = ocr.Read(input);

    // Store the recognized text in a string
    string testResult = result.Text;

    // Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt");
}
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified;

using (var input = new OcrInput())
{
    // Add an image to be processed
    input.AddImage("img/chinese.gif");

    // Optional: Enhance the input by deskewing or denoising the image
    // input.Deskew();
    // input.DeNoise();

    // Process the image and retrieve the result
    var result = ocr.Read(input);

    // Store the recognized text in a string
    string testResult = result.Text;

    // Save the recognized text to a file since the console might not display Unicode characters properly
    result.SaveAsTextFile("chinese.txt");
}
Imports IronOcr

Private ocr = New IronTesseract()
' Set the OCR to use Chinese Simplified
ocr.Language = OcrLanguage.ChineseSimplified

Using input = New OcrInput()
	' Add an image to be processed
	input.AddImage("img/chinese.gif")

	' Optional: Enhance the input by deskewing or denoising the image
	' input.Deskew();
	' input.DeNoise();

	' Process the image and retrieve the result
	Dim result = ocr.Read(input)

	' Store the recognized text in a string
	Dim testResult As String = result.Text

	' Save the recognized text to a file since the console might not display Unicode characters properly
	result.SaveAsTextFile("chinese.txt")
End Using
$vbLabelText   $csharpLabel

竖式书写语言示例

专为竖写语言设计的词典。 使用韩语和日语 OCR 语言的"垂直"变体。

using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr

Private ocr = New IronTesseract()
' Set the OCR to use Japanese Vertical language
ocr.Language = OcrLanguage.JapaneseVertical

Using input = New OcrInput("images\image.png")
	' Process the image and get the OCR result
	Dim result = ocr.Read(input)
	' Output the recognized text to the console
	Console.WriteLine(result.Text)
End Using
$vbLabelText   $csharpLabel

自定义语言示例

适用于您下载或自行训练的任何 Tesseract .traineddata 语言文件。

using IronOcr;

var ocr = new IronTesseract();

// Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
using IronOcr;

var ocr = new IronTesseract();

// Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr

Private ocr = New IronTesseract()

' Use a custom Tesseract language file
ocr.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")

Using input = New OcrInput("images\image.png")
	' Process the image and get the OCR result
	Dim result = ocr.Read(input)
	' Output the recognized text to the console
	Console.WriteLine(result.Text)
End Using
$vbLabelText   $csharpLabel

多语言示例

同时学习多种语言。

Install-Package IronOcr.Languages.Arabic
using IronOcr;

var ocr = new IronTesseract();

// Set the primary language to English
ocr.Language = OcrLanguage.English;
// Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages

using (var input = new OcrInput(@"images\multi-lang.pdf"))
{
    // Process the PDF and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
using IronOcr;

var ocr = new IronTesseract();

// Set the primary language to English
ocr.Language = OcrLanguage.English;
// Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic);
// Add any number of languages

using (var input = new OcrInput(@"images\multi-lang.pdf"))
{
    // Process the PDF and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr

Private ocr = New IronTesseract()

' Set the primary language to English
ocr.Language = OcrLanguage.English
' Add Arabic as a secondary language
ocr.AddSecondaryLanguage(OcrLanguage.Arabic)
' Add any number of languages

Using input = New OcrInput("images\multi-lang.pdf")
	' Process the PDF and get the OCR result
	Dim result = ocr.Read(input)
	' Output the recognized text to the console
	Console.WriteLine(result.Text)
End Using
$vbLabelText   $csharpLabel

更快的语言示例

专为快速使用而设计的词典。 使用任何 OcrLanguage 的"快速"版本。

using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr

Private ocr = New IronTesseract()
' Set the OCR to use the fast variant of English
ocr.Language = OcrLanguage.EnglishFast

Using input = New OcrInput("images\image.png")
	' Process the image and get the OCR result
	Dim result = ocr.Read(input)
	' Output the recognized text to the console
	Console.WriteLine(result.Text)
End Using
$vbLabelText   $csharpLabel

更高精度的详细语言示例

词典经过优化,准确性更高,但检索速度较慢。 使用任何 OcrLanguage 的"最佳"版本。

Install-Package IronOcr.Languages.French
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
using IronOcr;

var ocr = new IronTesseract();
// Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest;

using (var input = new OcrInput(@"images\image.png"))
{
    // Process the image and get the OCR result
    var result = ocr.Read(input);
    // Output the recognized text to the console
    Console.WriteLine(result.Text);
}
Imports IronOcr

Private ocr = New IronTesseract()
' Set the OCR to use the best variant of French
ocr.Language = OcrLanguage.FrenchBest

Using input = New OcrInput("images\image.png")
	' Process the image and get the OCR result
	Dim result = ocr.Read(input)
	' Output the recognized text to the console
	Console.WriteLine(result.Text)
End Using
$vbLabelText   $csharpLabel

如何安装 OCR 语言包

其他 OCR 语言包可供下载,详情请见下方。 任何一个

  • 安装 NuGet 包。 在 NuGet 上搜索 IronOCR 语言。 或者下载"ocrdata"文件,并将其添加到您 .NET 项目中任意文件夹下。设置CopyToOutputDirectory = CopyIfNewer

下载 OCR 语言包

帮助

如果您要阅读的语言不在上面的列表中,请 [联系我们](https://ironsoftware.com/contact-us/)。

如有需要,我们还提供其他多种语言版本。

优先考虑 IronOCR 许可证持有者的生产资源,因此请考虑 [为 IronOCR 申请许可证](/csharp/ocr/licensing/),以访问您所需的语言包。