Chinese Simplified OCR in C# and .NET

IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Chinese Simplified.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.

Contents of IronOcr.Languages.Chinese

This package contains 12 OCR language variants for .NET:

  • ChineseSimplified
  • ChineseSimplifiedBest
  • ChineseSimplifiedFast
  • ChineseSimplifiedVertical
  • ChineseSimplifiedVerticalBest
  • ChineseSimplifiedVerticalFast
  • ChineseTraditional
  • ChineseTraditionalBest
  • ChineseTraditionalFast
  • ChineseTraditionalVertical
  • ChineseTraditionalVerticalBest
  • ChineseTraditionalVerticalFast
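
Choosing among these variants is a one-line configuration change. The fragment below is a minimal sketch, assuming each name in the list above is exposed as a member of the `OcrLanguage` enum once the package is installed. The reading of the suffixes is also an assumption, based on Tesseract's own "best" and "fast" model sets: Best favors accuracy, Fast favors speed, and Vertical targets text set in vertical columns.

```csharp
using IronOcr;

var ocr = new IronTesseract();

// Assumption: the variant names listed above map directly to OcrLanguage members.
// Pick the variant that matches the text layout and the accuracy/speed trade-off.
ocr.Language = OcrLanguage.ChineseSimplifiedBest;        // horizontal text, accuracy first
// ocr.Language = OcrLanguage.ChineseSimplifiedFast;     // horizontal text, speed first
// ocr.Language = OcrLanguage.ChineseSimplifiedVertical; // vertically set text
```

If a variant name does not resolve in your installed version, check the package's `OcrLanguage` members rather than relying on this sketch.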

Download

Chinese Simplified Language Pack [中文 (Zhōngwén)]

Installation

First, install the Chinese Simplified OCR package in your .NET project.

PM> Install-Package IronOcr.Languages.Chinese

Code Example

This C# code example reads Chinese Simplified text from an image or PDF document.

// PM> Install-Package IronOcr.Languages.Chinese
using IronOcr;

var Ocr = new IronTesseract();
// Select the Chinese Simplified language pack
Ocr.Language = OcrLanguage.ChineseSimplified;
using (var Input = new OcrInput(@"images\Chinese.png"))
{
    // Run OCR and collect the recognized text
    var Result = Ocr.Read(Input);
    var AllText = Result.Text;
}
' PM> Install-Package IronOcr.Languages.Chinese
Imports IronOcr

Dim Ocr = New IronTesseract()
' Select the Chinese Simplified language pack
Ocr.Language = OcrLanguage.ChineseSimplified
Using Input = New OcrInput("images\Chinese.png")
	' Run OCR and collect the recognized text
	Dim Result = Ocr.Read(Input)
	Dim AllText As String = Result.Text
End Using