Yoruba OCR in C# and .NET

IronOCR is a C# software component that allows .NET coders to read text from images and PDF documents in 126 languages, including Yoruba.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines for both speed and accuracy.

Contents of IronOcr.Languages.Yoruba

This package contains the following OCR language variants for .NET:

  • Yoruba
  • YorubaBest
  • YorubaFast
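
The three entries above are variants of the same language. As a minimal sketch, a variant might be selected as shown below. It assumes that each variant is exposed as a member of OcrLanguage under the name listed (this page only shows OcrLanguage.Yoruba in use) and that the image path is a placeholder. In Tesseract's own data sets, "best" models generally favour accuracy and "fast" models favour speed; treat that as a rule of thumb rather than a guarantee for this package.

```csharp
// PM> Install-Package IronOcr.Languages.Yoruba
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: YorubaBest and YorubaFast are OcrLanguage members
// named exactly as in the list above.
ocr.Language = OcrLanguage.YorubaBest;   // or OcrLanguage.YorubaFast

using (var input = new OcrInput(@"images\Yoruba.png")) // placeholder path
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Switching between variants only changes the Language assignment, so it is easy to compare them on your own documents.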

Download

Yoruba Language Pack [Yorùbá]
* Download as Zip
* Install with NuGet: https://www.nuget.org/packages/IronOcr.Languages.Yoruba/

Installation

The first thing we have to do is install our Yoruba OCR package into your .NET project.

PM> Install-Package IronOcr.Languages.Yoruba

Code Example

This C# code example reads Yoruba text from an image or PDF document.

//PM> Install-Package IronOcr.Languages.Yoruba
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;
using (var Input = new OcrInput(@"images\Yoruba.png"))
{
    var Result = Ocr.Read(Input);
    var AllText = Result.Text;
}

'PM> Install-Package IronOcr.Languages.Yoruba
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba
Using Input = New OcrInput("images\Yoruba.png")
    Dim Result = Ocr.Read(Input)
    Dim AllText = Result.Text
End Using

Why Choose IronOCR?

IronOCR is an easy-to-install, complete and well-documented .NET software library.

Choose IronOCR to achieve 99.8%+ OCR accuracy without using any external web services, ongoing fees or sending confidential documents over the internet.

Why C# developers choose IronOCR over vanilla Tesseract:

  • Install as a single DLL or NuGet package
  • Includes Tesseract 5, 4 and 3 engines out of the box.
  • 99.8% accuracy significantly outperforms regular Tesseract.
  • Blazing speed and multithreading
  • MVC, WebApp, Desktop, Console & Server Application compatible
  • No EXEs or C++ code to work with
  • Full PDF OCR support
  • Performs OCR on almost any image file or PDF
  • Full .NET Core, Standard and Framework support
  • Deploy on Windows, Mac, Linux, Azure, Docker, Lambda, AWS
  • Read barcodes and QR codes
  • Export OCR as XHTML
  • Export OCR to searchable PDF documents
  • Multithreading support
  • 126 international languages, all managed via NuGet or OcrData files
  • Extract images, coordinates, statistics and fonts. Not just text.
  • Can be used to redistribute Tesseract OCR inside commercial & proprietary applications.

IronOCR shines when working with real-world images and imperfect documents such as photographs, or low-resolution scans which may have digital noise or imperfections.

Other free OCR libraries for the .NET platform, such as other .NET Tesseract APIs and web services, do not perform as well on these real-world use cases.

OCR with Tesseract 5 - Start Coding in C#

The code sample below shows how easy it is to read text from an image using C# or VB .NET.

OneLiner

string Text = new IronTesseract().Read(@"img\Screenshot.png").Text;
Dim Text As String = (New IronTesseract()).Read("img\Screenshot.png").Text

Configurable Hello World

// PM> Install-Package IronOcr.Languages.Yoruba
using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;
using (var Input = new OcrInput())
{
    Input.AddImage("images/sample.jpeg");
    //... you can add any number of images
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

' PM> Install-Package IronOcr.Languages.Yoruba
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba
Using Input = New OcrInput()
    Input.AddImage("images/sample.jpeg")
    '... you can add any number of images
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

C# PDF OCR

The same approach can be used to extract text from any PDF document.

using System;
using System.Linq;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;
using (var input = new OcrInput())
{
    input.AddPdf("example.pdf", "password");
    // We can also select specific PDF page numbers to OCR

    var Result = Ocr.Read(input);

    Console.WriteLine(Result.Text);
    Console.WriteLine($"{Result.Pages.Count()} Pages");
    // 1 page for every page of the PDF
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba
Using input = New OcrInput()
    input.AddPdf("example.pdf", "password")
    ' We can also select specific PDF page numbers to OCR

    Dim Result = Ocr.Read(input)

    Console.WriteLine(Result.Text)
    Console.WriteLine($"{Result.Pages.Count()} Pages")
    ' 1 page for every page of the PDF
End Using

OCR for MultiPage TIFFs

IronOCR reads TIFF files, including multi-page documents. A TIFF can also be converted directly into a PDF file with searchable text.

using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput())
{
    Input.AddMultiFrameTiff("multi-frame.tiff");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput()
    Input.AddMultiFrameTiff("multi-frame.tiff")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Barcodes and QR

A unique feature of IronOCR is that it can read barcodes and QR codes from documents while it scans for text. Instances of the OcrResult.OcrBarcode class give the developer detailed information about each scanned barcode.

using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Configuration.ReadBarCodes = true;

using (var input = new OcrInput())
{
    input.AddImage("img/Barcode.png");
    var Result = Ocr.Read(input);
    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
        // type and location properties are also exposed
    }
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Configuration.ReadBarCodes = True

Using input = New OcrInput()
    input.AddImage("img/Barcode.png")
    Dim Result = Ocr.Read(input)
    For Each Barcode In Result.Barcodes
        Console.WriteLine(Barcode.Value)
        ' type and location properties are also exposed
    Next Barcode
End Using

OCR on Specific Regions of Images

All of IronOCR's scanning and reading methods let us specify exactly which part of a page or pages we wish to read text from. This is very useful when we are looking at standardized forms, and it can save a great deal of time and improve efficiency.

To use crop regions, we will need to add a reference to System.Drawing so that we can use the System.Drawing.Rectangle object.

using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput())
{
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    // Dimensions are in px

    Input.Add("document.png", ContentArea);

    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput()
    Dim ContentArea = New System.Drawing.Rectangle() With {
        .X = 215,
        .Y = 1250,
        .Height = 280,
        .Width = 1335
    }
    ' Dimensions are in px

    Input.Add("document.png", ContentArea)

    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

OCR for Low Quality Scans

The IronOCR OcrInput class can fix scans that regular Tesseract cannot read.

using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
    Input.DeNoise(); // fixes digital noise and poor scanning
    Input.Deskew(); // fixes rotation and perspective
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput("img\Potter.LowQuality.tiff")
    Input.DeNoise() ' fixes digital noise and poor scanning
    Input.Deskew() ' fixes rotation and perspective
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Export OCR Results as a Searchable PDF

Converts images to a PDF with copyable text strings. The result can be indexed by search engines and databases.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput())
{
    Input.Title = "Quarterly Report";
    Input.AddImage("image1.jpeg");
    Input.AddImage("image2.png");
    Input.AddImage("image3.gif");

    var Result = Ocr.Read(Input);
    Result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput()
    Input.Title = "Quarterly Report"
    Input.AddImage("image1.jpeg")
    Input.AddImage("image2.png")
    Input.AddImage("image3.gif")

    Dim Result = Ocr.Read(Input)
    Result.SaveAsSearchablePdf("searchable.pdf")
End Using

TIFF to Searchable PDF Conversion

Convert a TIFF document (or any group of image files) directly into a searchable PDF which can be indexed by intranet, website and Google search engines.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput())
{
    Input.AddMultiFrameTiff("example.tiff");
    // Read the TIFF and write the result straight to a searchable PDF
    Ocr.Read(Input).SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput()
    Input.AddMultiFrameTiff("example.tiff")
    ' Read the TIFF and write the result straight to a searchable PDF
    Ocr.Read(Input).SaveAsSearchablePdf("searchable.pdf")
End Using

Export OCR Results as HTML

OCR image to XHTML conversion.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;
using (var Input = new OcrInput())
{
    Input.Title = "Html Title";
    Input.AddImage("image1.jpeg");
    var Result = Ocr.Read(Input);
    Result.SaveAsHocrFile("results.html");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba
Using Input = New OcrInput()
    Input.Title = "Html Title"
    Input.AddImage("image1.jpeg")
    Dim Result = Ocr.Read(Input)
    Result.SaveAsHocrFile("results.html")
End Using

OCR Image Enhancement Filters

IronOCR provides unique filters for OcrInput objects to improve OCR performance.

Image Enhancement Code Example

Makes OCR input images higher quality to produce better, faster OCR results.

using System;
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;

using (var Input = new OcrInput(@"LowQuality.jpeg"))
{
    Input.DeNoise(); // fixes digital noise and poor scanning
    Input.Deskew(); // fixes rotation and perspective
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba

Using Input = New OcrInput("LowQuality.jpeg")
    Input.DeNoise() ' fixes digital noise and poor scanning
    Input.Deskew() ' fixes rotation and perspective
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

List of OCR Image Filters

Input filters built into IronOCR to enhance OCR performance include:

  • OcrInput.Rotate(double degrees) - Rotates images by a number of degrees clockwise. For anti-clockwise rotation, use negative numbers.
  • OcrInput.Binarize() - This image filter turns every pixel black or white with no middle ground. May improve OCR performance in cases of very low contrast between text and background.
  • OcrInput.ToGrayScale() - This image filter turns every pixel into a shade of grayscale. Unlikely to improve OCR accuracy but may improve speed.
  • OcrInput.Contrast() - Increases contrast automatically. This filter often improves OCR speed and accuracy in low-contrast scans.
  • OcrInput.DeNoise() - Removes digital noise. This filter should only be used where noise is expected.
  • OcrInput.Invert() - Inverts every color. E.g. white becomes black and black becomes white.
  • OcrInput.Dilate() - Advanced morphology. Dilation adds pixels to the boundaries of objects in an image. Opposite of Erode.
  • OcrInput.Erode() - Advanced morphology. Erosion removes pixels on object boundaries. Opposite of Dilate.
  • OcrInput.Deskew() - Rotates an image so it is the right way up and orthogonal. This is very useful for OCR because Tesseract's tolerance for skewed scans can be as low as 5 degrees.
  • OcrInput.DeepCleanBackgroundNoise() - Heavy background noise removal. Only use this filter when extreme document background noise is known, because it also risks reducing OCR accuracy on clean documents and is very CPU expensive.
  • OcrInput.EnhanceResolution - Enhances the resolution of low-quality images. This filter is not often needed because OcrInput.MinimumDPI and OcrInput.TargetDPI will automatically catch and resolve low-resolution inputs.
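
Several of the filters listed above can be applied to the same OcrInput before reading. The following is a minimal sketch rather than a recommended recipe: the file path is a placeholder, and the choice and order of filters are assumptions made for illustration. Each call uses a filter name from the list above with no arguments.

```csharp
// PM> Install-Package IronOcr.Languages.Yoruba
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Yoruba;

using (var input = new OcrInput(@"images\low-contrast-scan.png")) // placeholder path
{
    // Illustrative order only: straighten, clean up, then sharpen the
    // separation between text and background.
    input.Deskew();    // rotate the page so text lines are horizontal
    input.DeNoise();   // only worthwhile when noise is actually expected
    input.Contrast();  // raise contrast automatically
    input.Binarize();  // force every pixel to black or white

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Filters cost processing time, so it is usually worth adding them one at a time and checking whether the output actually improves on your own scans.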

CleanBackgroundNoise. This setting is somewhat time-consuming; however, it allows the library to automatically clean digital noise, paper crumples, and other imperfections within a digital image which would otherwise make it unreadable by other OCR libraries.

EnhanceContrast is a setting which causes IronOCR to automatically increase the contrast of text against the background of an image, increasing the accuracy of OCR and generally improving OCR performance and speed.

EnhanceResolution is a setting which automatically detects low-resolution images (those below a minimum DPI), upscales them, and then sharpens all of the text so that it can be read accurately by the OCR library. Although this operation is time-consuming in itself, it generally reduces the overall time for an OCR operation on an image.

Language. IronOCR supports 126 international language packs, and the language setting can be used to select one or more languages to be applied for an OCR operation.

Strategy. IronOCR supports two strategies. We may choose either a fast and less accurate scan of a document, or an advanced strategy which uses artificial intelligence models to automatically improve the accuracy of OCR text by looking at the statistical relationship of words to one another in a sentence.

ColorSpace is a setting whereby we can choose to OCR in grayscale or color. Generally, grayscale is the best option. However, when text and background have a similar hue but very different color, a full-color color space will sometimes provide better results.

DetectWhiteTextOnDarkBackgrounds. Generally, all OCR libraries expect to see black text on white backgrounds. This setting allows IronOCR to automatically detect negatives, or dark pages with white text, and read them.

InputImageType. This setting allows the developer to tell the OCR library whether it is looking at a full document or a snippet, such as a screenshot.

RotateAndStraighten is an advanced setting which gives IronOCR the unique ability to read documents which are not only rotated but may also contain perspective, such as photographs of text documents.

ReadBarcodes is a useful feature which allows IronOCR to automatically read barcodes and QR codes on pages as it reads text, without adding a large additional time burden.

ColorDepth. This setting determines how many bits per pixel the OCR library will use to determine the depth of a color. A higher color depth may increase OCR quality, but it will also increase the time required for the OCR operation to complete.

126 Language Packs

IronOCR supports 126 international languages via language packs which are distributed as DLLs. They can be downloaded from this website or from the NuGet Package Manager.

Languages include German, French, English, Chinese, Japanese and many more. Specialist language packs exist for passport MRZ, MICR checks, financial data, license plates and many more. You can also use any Tesseract ".traineddata" file, including ones you create yourself.
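
As a hedged sketch of the custom ".traineddata" case mentioned above: the method name UseCustomTesseractLanguageFile is an assumption about the installed IronOCR version (this page does not show it), and both file paths are placeholders. Check the object reference for your version before relying on it.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: this method exists in your IronOCR version and accepts
// the path to a Tesseract .traineddata file. Placeholder path.
ocr.UseCustomTesseractLanguageFile(@"tessdata\custom.traineddata");

using (var input = new OcrInput(@"images\sample.png")) // placeholder path
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```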

Language Example

Using other OCR languages.

// PM> Install-Package IronOcr.Languages.Arabic
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Arabic;

using (var input = new OcrInput())
{
    input.AddImage("img/arabic.gif");
    // Add image filters if needed
    // In this case, even though the input is very low quality,
    // IronTesseract can read what conventional Tesseract cannot.

    var Result = Ocr.Read(input);

    // The console can't easily print Arabic on Windows.
    // Let's save to disk instead.
    Result.SaveAsTextFile("arabic.txt");
}

' PM> Install-Package IronOcr.Languages.Arabic
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Arabic

Using input = New OcrInput()
    input.AddImage("img/arabic.gif")
    ' Add image filters if needed
    ' In this case, even though the input is very low quality,
    ' IronTesseract can read what conventional Tesseract cannot.

    Dim Result = Ocr.Read(input)

    ' The console can't easily print Arabic on Windows.
    ' Let's save to disk instead.
    Result.SaveAsTextFile("arabic.txt")
End Using

Multiple Language Example

It is also possible to OCR using multiple languages at the same time. This can really help to capture English-language metadata and URLs in Unicode documents.

// PM> Install-Package IronOcr.Languages.ChineseSimplified
// PM> Install-Package IronOcr.Languages.Yoruba
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.ChineseSimplified;
Ocr.AddSecondaryLanguage(OcrLanguage.Yoruba);

// We can add any number of languages

using (var input = new OcrInput())
{
    input.Add("multi-language.pdf");
    var Result = Ocr.Read(input);
    Result.SaveAsTextFile("results.txt");
}

' PM> Install-Package IronOcr.Languages.ChineseSimplified
' PM> Install-Package IronOcr.Languages.Yoruba
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.ChineseSimplified
Ocr.AddSecondaryLanguage(OcrLanguage.Yoruba)

' We can add any number of languages

Using input = New OcrInput()
    input.Add("multi-language.pdf")
    Dim Result = Ocr.Read(input)
    Result.SaveAsTextFile("results.txt")
End Using

Detailed OCR Results Objects

IronOCR returns an OCR result object for each OCR operation. Generally, developers only use the Text property of this object to get the text scanned from the image. However, the OCR results DOM is much more advanced than this.

using IronOcr;
using System.Drawing; // Add an assembly reference

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Yoruba;
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm;
Ocr.Configuration.ReadBarCodes = true; // Important

using (var Input = new OcrInput(@"images\sample.tiff"))
{
    OcrResult Result = Ocr.Read(Input);
    var Pages = Result.Pages;
    var Words = Pages[0].Words;
    var Barcodes = Result.Barcodes;
    // Explore here to find a massive, detailed API:
    // - Pages, Blocks, Paragraphs, Lines, Words, Chars
    // - Image export, font coordinates, statistical data
}

Imports IronOcr
Imports System.Drawing ' Add an assembly reference

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Yoruba
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm
Ocr.Configuration.ReadBarCodes = True ' Important

Using Input = New OcrInput("images\sample.tiff")
    Dim Result As OcrResult = Ocr.Read(Input)
    Dim Pages = Result.Pages
    Dim Words = Pages(0).Words
    Dim Barcodes = Result.Barcodes
    ' Explore here to find a massive, detailed API:
    ' - Pages, Blocks, Paragraphs, Lines, Words, Chars
    ' - Image export, font coordinates, statistical data
End Using
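
To make the results DOM a little more concrete, the sketch below walks every page and prints each recognised word. Result.Pages and the Words collection appear in the example above; the Text property on each word is an assumption about the result object model, and the file path is a placeholder.

```csharp
// PM> Install-Package IronOcr.Languages.Yoruba
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Yoruba;

using (var input = new OcrInput(@"images\sample.tiff")) // placeholder path
{
    OcrResult result = ocr.Read(input);

    int wordCount = 0;
    foreach (var page in result.Pages)
    {
        foreach (var word in page.Words)
        {
            // Assumption: each word object exposes its recognised text as Text.
            Console.WriteLine(word.Text);
            wordCount++;
        }
    }

    Console.WriteLine($"{wordCount} words read");
}
```

The same nested-loop pattern should extend to the other levels named in the comments above (blocks, paragraphs, lines, characters), subject to the member names in the object reference.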

Performance

IronOCR works out of the box with no need to performance-tune or heavily modify input images.

Speed is blazing: IronOcr.2020+ is up to 10 times faster and makes over 250% fewer errors than previous builds.

Learn More

To learn more about OCR in C#, VB, F#, or any other .NET language, please read our community tutorials, which give real-world examples of how IronOCR can be used and may show the nuances of how to get the best out of this library.

A full object reference for .NET developers is also available.