使用 IronOCR 的自定义 OCR 语言包

Curtis Chau

已更新:2026年4月21日

Translated

View the article in English

如何为 IronOCR 创建自定义语言包？

创建自定义语言包需要从字体训练一个新的 Tesseract 4 LSTM 语言文件/词典。

网上有很多教程讲解了完成此操作所需的步骤。这个过程并不简单，但好在有详细的文档记录。

作为入门指南，我们建议先观看Gabriel Garcia （与我们无任何关联）的YouTube 教程以及他们链接的GitHub 存储库。

翻译完成后，输出文件将为 .traineddata 文件。

随后可在 IronOCR 中按以下方式引用 .traineddata 文件：

文档： IronOCR 自定义语言

using IronOcr;

class Program
{
    static void Main()
    {
        // Initialize the IronTesseract OCR engine
        var Ocr = new IronTesseract();

        // Load your custom Tesseract language file (trained .traineddata file)
        Ocr.UseCustomTesseractLanguageFile("mydir/custom.traineddata");  //<--- your new font

        // Multiple fonts can be used by calling the method multiple times with different files

        // Load an image into the OCR Input for processing
        using (var Input = new OcrInput(@"images\image.png"))
        {
            // Perform OCR on the input image
            var Result = Ocr.Read(Input);

            // Output the recognized text to the console
            Console.WriteLine(Result.Text);
        }
    }
}

using IronOcr;

class Program
{
    static void Main()
    {
        // Initialize the IronTesseract OCR engine
        var Ocr = new IronTesseract();

        // Load your custom Tesseract language file (trained .traineddata file)
        Ocr.UseCustomTesseractLanguageFile("mydir/custom.traineddata");  //<--- your new font

        // Multiple fonts can be used by calling the method multiple times with different files

        // Load an image into the OCR Input for processing
        using (var Input = new OcrInput(@"images\image.png"))
        {
            // Perform OCR on the input image
            var Result = Ocr.Read(Input);

            // Output the recognized text to the console
            Console.WriteLine(Result.Text);
        }
    }
}

Imports IronOcr

Friend Class Program
	Shared Sub Main()
		' Initialize the IronTesseract OCR engine
		Dim Ocr = New IronTesseract()

		' Load your custom Tesseract language file (trained .traineddata file)
		Ocr.UseCustomTesseractLanguageFile("mydir/custom.traineddata") '<--- your new font

		' Multiple fonts can be used by calling the method multiple times with different files

		' Load an image into the OCR Input for processing
		Using Input = New OcrInput("images\image.png")
			' Perform OCR on the input image
			Dim Result = Ocr.Read(Input)

			' Output the recognized text to the console
			Console.WriteLine(Result.Text)
		End Using
	End Sub
End Class

$vbLabelText $csharpLabel