Haitian OCR in C# and .NET

IronOCR is a C# software component that lets .NET coders read text from images and PDF documents in 126 languages, including Haitian.

It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines for both speed and accuracy.

Contents of IronOcr.Languages.Haitian

This package contains the following OCR language models for .NET:

  • Haitian
  • HaitianBest
  • HaitianFast

Download

Haitian Language Pack [Kreyòl ayisyen]

Installation

The first thing to do is install the Haitian OCR package into your .NET project.

Install-Package IronOCR.Languages.Haitian

Code Example

This C# code example reads Haitian text from an image or a PDF document.

// Install the IronOcr languages package for Haitian support
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

// Load the image input
using (var Input = new OcrInput(@"images\Haitian.png"))
{
    // Read the image input
    var Result = Ocr.Read(Input);

    // Retrieve the text from the OCR result
    var AllText = Result.Text;

    // Print the extracted text
    Console.WriteLine(AllText);
}

' Install the IronOcr languages package for Haitian support
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

' Load the image input
Using Input = New OcrInput("images\Haitian.png")
	' Read the image input
	Dim Result = Ocr.Read(Input)

	' Retrieve the text from the OCR result
	Dim AllText = Result.Text

	' Print the extracted text
	Console.WriteLine(AllText)
End Using

Why Choose IronOCR?

IronOCR is an easy-to-install, complete and well-documented .NET software library.

Choose IronOCR to achieve 99.8%+ OCR accuracy without using any external web services, paying ongoing fees or sending confidential documents over the internet.

Why C# developers choose IronOCR over vanilla Tesseract:

  • Installs as a single DLL or NuGet package
  • Includes the Tesseract 5, 4 and 3 engines out of the box
  • 99.8% accuracy, significantly outperforming regular Tesseract
  • Blazing speed and multithreading
  • Compatible with MVC, WebApp, Desktop, Console and Server applications
  • No EXEs or C++ code to work with
  • Full PDF OCR support
  • Performs OCR on almost any image file or PDF
  • Full .NET Core, Standard and Framework support
  • Deploys on Windows, Mac, Linux, Azure, Docker, Lambda, AWS
  • Reads barcodes and QR codes
  • Exports OCR as XHTML
  • Exports OCR to searchable PDF documents
  • Multithreading support
  • 126 international languages, all managed via NuGet or OcrData files
  • Extracts images, coordinates, statistics and fonts, not just text
  • Can be used to redistribute Tesseract OCR inside commercial and proprietary applications

IronOCR shines when working with real-world images and imperfect documents such as photographs, or low-resolution scans that may contain digital noise or imperfections.

Other free OCR libraries for the .NET platform, such as other .NET Tesseract APIs and web services, do not perform as well on these real-world use cases.

OCR with Tesseract 5 - Start Coding in C#

The code samples below show how easy it is to read text from an image using C# or VB .NET.

OneLiner

string Text = new IronTesseract().Read(@"img\Screenshot.png").Text;
Dim Text As String = (New IronTesseract()).Read("img\Screenshot.png").Text

Configurable Hello World

// PM> Install-Package IronOCR.Languages.Haitian
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    Input.AddImage("images/sample.jpeg");
    // ... you can add any number of images

    var Result = Ocr.Read(Input);

    // Print the resulting text
    Console.WriteLine(Result.Text);
}

' PM> Install-Package IronOCR.Languages.Haitian
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Input.AddImage("images/sample.jpeg")
	' ... you can add any number of images

	Dim Result = Ocr.Read(Input)

	' Print the resulting text
	Console.WriteLine(Result.Text)
End Using

C# PDF OCR

The same approach can be used to extract text from any PDF document.

using IronOcr;
using System.Linq; // Provides the Count() extension method

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var input = new OcrInput())
{
    input.AddPdf("example.pdf", "password"); // Add the PDF document and its password, if any

    // We can also select specific PDF page numbers to OCR

    var Result = Ocr.Read(input);

    // Print the text and the page count of the PDF result
    Console.WriteLine(Result.Text);
    Console.WriteLine($"{Result.Pages.Count()} Pages");
    // 1 page for every page of the PDF
}

Imports IronOcr
Imports System.Linq ' Provides the Count() extension method

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using input = New OcrInput()
	input.AddPdf("example.pdf", "password") ' Add the PDF document and its password, if any

	' We can also select specific PDF page numbers to OCR

	Dim Result = Ocr.Read(input)

	' Print the text and the page count of the PDF result
	Console.WriteLine(Result.Text)
	Console.WriteLine($"{Result.Pages.Count()} Pages")
	' 1 page for every page of the PDF
End Using

OCR for MultiPage TIFFs

IronOCR reads the TIFF file format, including multi-page documents. A TIFF can also be converted directly to a PDF file with searchable text.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    Input.AddMultiFrameTiff("multi-frame.tiff"); // Add the multi-page TIFF document

    var Result = Ocr.Read(Input);

    // Print the text extracted from the multi-page TIFF
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Input.AddMultiFrameTiff("multi-frame.tiff") ' Add the multi-page TIFF document

	Dim Result = Ocr.Read(Input)

	' Print the text extracted from the multi-page TIFF
	Console.WriteLine(Result.Text)
End Using

Barcodes and QR

A unique feature of IronOCR is that it can read barcodes and QR codes from documents while it is scanning for text. Instances of the OcrResult.OcrBarcode class give the developer detailed information about each scanned barcode.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Configuration.ReadBarCodes = true;

using (var input = new OcrInput())
{
    input.AddImage("img/Barcode.png"); // Add the barcode image

    var Result = Ocr.Read(input);

    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
        // Type and location properties are also exposed
    }
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Configuration.ReadBarCodes = True

Using input = New OcrInput()
	input.AddImage("img/Barcode.png") ' Add the barcode image

	Dim Result = Ocr.Read(input)

	For Each Barcode In Result.Barcodes
		Console.WriteLine(Barcode.Value)
		' Type and location properties are also exposed
	Next Barcode
End Using

OCR on Specific Areas of Images

All of IronOCR's scanning and reading methods let us specify exactly which part of a page or pages we wish to read text from. This is very useful when we are looking at standardized forms, and it can save a great deal of time and improve efficiency.

To use crop regions, we need to add a system reference to System.Drawing so that we can use the System.Drawing.Rectangle object.

using IronOcr;
using System.Drawing; // Required for defining rectangle

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    var ContentArea = new Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    // Dimensions are in pixels

    Input.Add("document.png", ContentArea);

    var Result = Ocr.Read(Input);

    // Show the extracted text from the specified region
    Console.WriteLine(Result.Text);
}

Imports IronOcr
Imports System.Drawing ' Required for defining rectangle

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Dim ContentArea = New Rectangle() With {
		.X = 215,
		.Y = 1250,
		.Height = 280,
		.Width = 1335
	}
	' Dimensions are in pixels

	Input.Add("document.png", ContentArea)

	Dim Result = Ocr.Read(Input)

	' Show the extracted text from the specified region
	Console.WriteLine(Result.Text)
End Using
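
Because crop regions are given in pixels, it can help to check a region against the page size before passing it to the OCR engine. The following is a minimal sketch of that check, not part of IronOCR: the `CropRegion` class, the `ClampToPage` helper and the 1600 x 2200 page size are assumptions made for illustration, and only `System.Drawing.Rectangle` is used.

```csharp
using System;
using System.Drawing; // Rectangle

// Hypothetical page size in pixels; in practice this comes from the loaded image.
var pageBounds = new Rectangle(0, 0, 1600, 2200);

// The region used in the crop example: X = 215, Y = 1250, Width = 1335, Height = 280.
var contentArea = new Rectangle(215, 1250, 1335, 280);

// A region that spills past the right-hand edge of the page.
var tooWide = new Rectangle(1500, 100, 400, 200);

Console.WriteLine(CropRegion.ClampToPage(contentArea, pageBounds));
Console.WriteLine(CropRegion.ClampToPage(tooWide, pageBounds));

// Hypothetical helper, not an IronOCR type.
static class CropRegion
{
    // Returns the part of 'region' that lies inside 'page'.
    // Throws when the two rectangles share no area at all.
    public static Rectangle ClampToPage(Rectangle region, Rectangle page)
    {
        Rectangle clamped = Rectangle.Intersect(region, page);
        if (clamped.Width <= 0 || clamped.Height <= 0)
        {
            throw new ArgumentException("Crop region lies outside the page.", nameof(region));
        }
        return clamped;
    }
}
```

A region that already fits is returned unchanged, so the clamped rectangle can be handed to the input exactly as `ContentArea` is in the example.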

OCR for Low-Quality Scans

The OcrInput class can fix scans that normal Tesseract cannot read.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
    Input.DeNoise(); // Fix digital noise and poor optics
    Input.Deskew(); // Fix rotation and perspective

    var Result = Ocr.Read(Input);

    // Print extracted text from low quality image
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput("img\Potter.LowQuality.tiff")
	Input.DeNoise() ' Fix digital noise and poor optics
	Input.Deskew() ' Fix rotation and perspective

	Dim Result = Ocr.Read(Input)

	' Print extracted text from low quality image
	Console.WriteLine(Result.Text)
End Using

Exporting OCR Results as a Searchable PDF

Image to PDF with copyable text strings. The output can be indexed by search engines and databases.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    Input.Title = "Quarterly Report"; // Assigning Title
    Input.AddImage("image1.jpeg");
    Input.AddImage("image2.png");
    Input.AddImage("image3.gif");

    var Result = Ocr.Read(Input);

    // Save OCR results as a searchable PDF
    Result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Input.Title = "Quarterly Report" ' Assigning Title
	Input.AddImage("image1.jpeg")
	Input.AddImage("image2.png")
	Input.AddImage("image3.gif")

	Dim Result = Ocr.Read(Input)

	' Save OCR results as a searchable PDF
	Result.SaveAsSearchablePdf("searchable.pdf")
End Using

TIFF to Searchable PDF Conversion

Convert a TIFF document (or any group of image files) directly into a searchable PDF that can be indexed by intranets, websites and the Google search engine.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    Input.AddMultiFrameTiff("example.tiff");
    // OCR the TIFF and save the result as a searchable PDF
    Ocr.Read(Input).SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Input.AddMultiFrameTiff("example.tiff")
	' OCR the TIFF and save the result as a searchable PDF
	Ocr.Read(Input).SaveAsSearchablePdf("searchable.pdf")
End Using

Exporting OCR Results as HTML

OCR image to XHTML conversion.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput())
{
    Input.Title = "Html Title"; // Title for the HTML document
    Input.AddImage("image1.jpeg");

    var Result = Ocr.Read(Input);

    // Save OCR results as an HTML file
    Result.SaveAsHocrFile("results.html");
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput()
	Input.Title = "Html Title" ' Title for the HTML document
	Input.AddImage("image1.jpeg")

	Dim Result = Ocr.Read(Input)

	' Save OCR results as an HTML file
	Result.SaveAsHocrFile("results.html")
End Using

OCR Image Enhancement Filters

IronOCR provides unique filters for OcrInput objects to improve OCR performance.

Image Enhancement Code Example

Higher-quality OCR input images produce better, faster OCR results.

using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;

using (var Input = new OcrInput(@"LowQuality.jpeg"))
{
    Input.DeNoise(); // Fix digital noise and poor optics
    Input.Deskew(); // Fix rotation and perspective

    var Result = Ocr.Read(Input);

    // Print the extracted text
    Console.WriteLine(Result.Text);
}

Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian

Using Input = New OcrInput("LowQuality.jpeg")
	Input.DeNoise() ' Fix digital noise and poor optics
	Input.Deskew() ' Fix rotation and perspective

	Dim Result = Ocr.Read(Input)

	' Print the extracted text
	Console.WriteLine(Result.Text)
End Using

List of OCR Image Filters

The input filters built into IronOCR to enhance OCR performance include:

  • OcrInput.Rotate(double degrees) - Rotates images by a number of degrees clockwise. For anti-clockwise rotation, use negative numbers.
  • OcrInput.Binarize() - Turns every pixel black or white with no middle ground. May improve OCR performance where the contrast between text and background is very low.
  • OcrInput.ToGrayScale() - Turns every pixel into a shade of gray. Unlikely to improve OCR accuracy, but may improve speed.
  • OcrInput.Contrast() - Increases contrast automatically. This filter often improves OCR speed and accuracy in low-contrast scans.
  • OcrInput.DeNoise() - Removes digital noise. This filter should only be used where noise is expected.
  • OcrInput.Invert() - Inverts every color, e.g. white becomes black and black becomes white.
  • OcrInput.Dilate() - Advanced morphology. Dilation adds pixels to the boundaries of objects in an image. Opposite of Erode.
  • OcrInput.Erode() - Advanced morphology. Erosion removes pixels from the boundaries of objects. Opposite of Dilate.
  • OcrInput.Deskew() - Rotates an image so that it is the right way up and orthogonal. This is very useful for OCR because Tesseract's tolerance for skewed scans can be as low as 5 degrees.
  • OcrInput.DeepCleanBackgroundNoise() - Heavy background noise removal. Use this filter only when extreme background noise is known to be present, because it risks reducing OCR accuracy on clean documents and is very CPU-expensive.
  • OcrInput.EnhanceResolution - Enhances the resolution of low-quality images. This filter is not often needed because OcrInput.MinimumDPI and OcrInput.TargetDPI automatically catch and resolve low-resolution inputs.
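
To make the pixel-level filters in the list more concrete, here is a small self-contained sketch of what grayscale conversion, binarization and inversion do to pixel values. It does not call IronOCR and does not claim to match its internals: the `PixelFilters` class, the threshold of 128 and the luminance weights (the common ITU-R BT.601 ones) are assumptions chosen for illustration.

```csharp
using System;

// Six illustrative RGB pixels (hypothetical data, not read from a real image).
byte[][] rgb =
{
    new byte[] { 255, 255, 255 }, new byte[] { 200, 200, 200 }, new byte[] { 30, 30, 30 },
    new byte[] { 0, 0, 0 },       new byte[] { 120, 130, 110 }, new byte[] { 250, 10, 10 },
};

// Grayscale: one luminance byte per pixel.
var gray = new byte[rgb.Length];
for (int i = 0; i < rgb.Length; i++)
{
    gray[i] = PixelFilters.ToGray(rgb[i][0], rgb[i][1], rgb[i][2]);
}

byte[] binary = PixelFilters.Binarize(gray, 128);
byte[] inverted = PixelFilters.Invert(binary);

Console.WriteLine("Gray:     " + string.Join(", ", gray));
Console.WriteLine("Binary:   " + string.Join(", ", binary));
Console.WriteLine("Inverted: " + string.Join(", ", inverted));

// Hypothetical helper class, not an IronOCR type.
static class PixelFilters
{
    // Weighted luminance of one RGB pixel (BT.601 weights).
    public static byte ToGray(byte r, byte g, byte b)
    {
        return (byte)Math.Round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    // Every pixel becomes pure black (0) or pure white (255).
    public static byte[] Binarize(byte[] gray, byte threshold)
    {
        var result = new byte[gray.Length];
        for (int i = 0; i < gray.Length; i++)
        {
            result[i] = gray[i] >= threshold ? (byte)255 : (byte)0;
        }
        return result;
    }

    // White becomes black and black becomes white.
    public static byte[] Invert(byte[] gray)
    {
        var result = new byte[gray.Length];
        for (int i = 0; i < gray.Length; i++)
        {
            result[i] = (byte)(255 - gray[i]);
        }
        return result;
    }
}
```

A fixed threshold is the simplest possible choice; real binarization filters often pick the threshold adaptively per image or per region.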

CleanBackgroundNoise. This setting is somewhat time-consuming; however, it allows the library to automatically clean digital noise, paper crumples and other imperfections within a digital image that would otherwise make it unreadable by other OCR libraries.

EnhanceContrast is a setting that causes IronOCR to automatically increase the contrast of text against the background of an image, increasing the accuracy of OCR and generally increasing its performance and speed.

EnhanceResolution is a setting that automatically detects low-resolution images (those under 275 DPI), upscales the image and then sharpens all of the text so that it can be read reliably by an OCR library. Although this operation is itself time-consuming, it generally reduces the overall time for an OCR operation on an image.
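
The decision described for EnhanceResolution can be sketched as a small calculation: if the input is below the 275 DPI threshold, compute the factor needed to reach a target resolution. This is only an illustration of the arithmetic, not IronOCR code. The `DpiScaling` class and the 300 DPI target are assumptions; only the 275 DPI threshold comes from the text.

```csharp
using System;

// A hypothetical 150 DPI scan of 1240 x 1754 pixels.
double factor = DpiScaling.ScaleFactor(150);
var (width, height) = DpiScaling.ScaledSize(1240, 1754, factor);
Console.WriteLine($"Scale by {factor}: {width} x {height} pixels");

// Hypothetical helper class, not an IronOCR type.
static class DpiScaling
{
    // Returns the upscale factor for an image, or 1.0 when it is already at or above
    // the minimum. The default target of 300 DPI is an assumption for illustration.
    public static double ScaleFactor(int currentDpi, int minimumDpi = 275, int targetDpi = 300)
    {
        if (currentDpi <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(currentDpi));
        }
        return currentDpi < minimumDpi ? (double)targetDpi / currentDpi : 1.0;
    }

    // Pixel dimensions after applying the scale factor.
    public static (int Width, int Height) ScaledSize(int width, int height, double factor)
    {
        return ((int)Math.Round(width * factor), (int)Math.Round(height * factor));
    }
}
```

Upscaling multiplies the pixel count by the square of the factor, which is why the step costs time even though it tends to make the subsequent OCR pass faster.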

Language. IronOCR supports 126 international language packs, and the language setting can be used to select one or more languages to apply to an OCR operation.

Strategy. IronOCR supports two strategies. We may choose either a fast, less accurate scan of a document, or an advanced strategy that uses artificial intelligence models to automatically improve the accuracy of OCR text by looking at the statistical relationship of words to one another in a sentence.

ColorSpace is a setting with which we can choose to OCR in grayscale or color. Generally, grayscale is the best option. However, when text and background have a similar hue but very different colors, a full-color color space will give better results.

DetectWhiteTextOnDarkBackgrounds. Generally, all OCR libraries expect to see black text on white backgrounds. This setting allows IronOCR to automatically detect negatives, or dark pages with white text, and read them.
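
One simple way to picture negative detection is a brightness check: if the average pixel of a grayscale page is dark, the page is probably light text on a dark background and can be inverted before OCR. The sketch below shows that heuristic only. It is an assumption made for illustration and is not how IronOCR is documented to work; the `Polarity` class and the threshold of 128 are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical grayscale page: mostly dark pixels with a few bright "text" pixels.
byte[] darkPage = { 10, 12, 240, 8, 15, 250, 9, 11 };

byte[] normalized = Polarity.EnsureDarkTextOnLight(darkPage);
Console.WriteLine(Polarity.IsMostlyDark(darkPage));
Console.WriteLine(string.Join(", ", normalized));

// Hypothetical helper class, not an IronOCR type.
static class Polarity
{
    // True when the mean brightness is below the threshold.
    public static bool IsMostlyDark(byte[] gray, byte threshold = 128)
    {
        return gray.Length > 0 && gray.Average(p => (double)p) < threshold;
    }

    // Inverts dark pages so that text ends up dark on a light background;
    // light pages are returned as an unchanged copy.
    public static byte[] EnsureDarkTextOnLight(byte[] gray)
    {
        return IsMostlyDark(gray)
            ? gray.Select(p => (byte)(255 - p)).ToArray()
            : (byte[])gray.Clone();
    }
}
```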

InputImageType. This setting allows the developer to tell the OCR library whether it is looking at a full document or a snippet.

RotateAndStraighten is an advanced setting that gives IronOCR the unique ability to read documents that are not only rotated but may also contain perspective, such as photographs of text documents.

ReadBarcodes is a useful feature that allows IronOCR to automatically read barcodes and QR codes on pages as it reads text, without adding a large additional time burden.

ColorDepth. This setting determines how many bits per pixel the OCR library uses to represent the depth of a color. A higher color depth may increase OCR quality, but it also increases the time required for the OCR operation to complete.
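
The cost side of color depth is easy to estimate: each extra bit per pixel doubles the number of values a pixel can take and adds proportionally to the raw data the engine has to process. The sketch below works through that arithmetic for a hypothetical page of 2480 x 3508 pixels (roughly A4 at 300 DPI). The `ColorDepthMath` class is an assumption for illustration, and row padding used by real image formats is ignored.

```csharp
using System;

foreach (int bits in new[] { 1, 8, 24, 32 })
{
    long values = ColorDepthMath.DistinctValues(bits);
    long bytes = ColorDepthMath.RawImageBytes(2480, 3508, bits);
    Console.WriteLine($"{bits,2} bits per pixel: {values} values, {bytes} bytes uncompressed");
}

// Hypothetical helper class, not an IronOCR type.
static class ColorDepthMath
{
    // Number of distinct values one pixel can hold: 2^bitsPerPixel.
    public static long DistinctValues(int bitsPerPixel)
    {
        if (bitsPerPixel < 1 || bitsPerPixel > 62)
        {
            throw new ArgumentOutOfRangeException(nameof(bitsPerPixel));
        }
        return 1L << bitsPerPixel;
    }

    // Uncompressed size in bytes, rounded up to a whole byte.
    public static long RawImageBytes(int width, int height, int bitsPerPixel)
    {
        return ((long)width * height * bitsPerPixel + 7) / 8;
    }
}
```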

126 Language Packs

IronOCR supports 126 international languages via language packs that are distributed as DLLs, which can be downloaded from this website or through the NuGet Package Manager.

Languages include German, French, English, Chinese, Japanese and many more. Specialist language packs exist for passport MRZ, MICR checks, financial data, license plates and more. You can also use any Tesseract ".traineddata" file, including ones you create yourself.

Language Example

Using other OCR languages.

// PM> Install-Package IronOcr.Languages.Arabic
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Arabic;

using (var input = new OcrInput())
{
    input.AddImage("img/arabic.gif");
    // Add image filters if needed
    // In this case, even though the input is of very low quality,
    // IronTesseract can read what conventional Tesseract cannot.

    var Result = Ocr.Read(input);

    // The console can't easily print Arabic on Windows.
    // Let's save to disk instead.
    Result.SaveAsTextFile("arabic.txt");
}

' PM> Install-Package IronOcr.Languages.Arabic
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Arabic

Using input = New OcrInput()
	input.AddImage("img/arabic.gif")
	' Add image filters if needed
	' In this case, even though the input is of very low quality,
	' IronTesseract can read what conventional Tesseract cannot.

	Dim Result = Ocr.Read(input)

	' The console can't easily print Arabic on Windows.
	' Let's save to disk instead.
	Result.SaveAsTextFile("arabic.txt")
End Using

Multiple Language Example

It is also possible to OCR using multiple languages at the same time. This can really help to capture metadata and URLs in Unicode documents.

// PM> Install-Package IronOcr.Languages.ChineseSimplified
using IronOcr;

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.ChineseSimplified;
Ocr.AddSecondaryLanguage(OcrLanguage.Haitian);

// We can add any number of languages

using (var input = new OcrInput())
{
    input.Add("multi-language.pdf");
    var Result = Ocr.Read(input);

    Result.SaveAsTextFile("results.txt");
}

' PM> Install-Package IronOcr.Languages.ChineseSimplified
Imports IronOcr

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.ChineseSimplified
Ocr.AddSecondaryLanguage(OcrLanguage.Haitian)

' We can add any number of languages

Using input = New OcrInput()
	input.Add("multi-language.pdf")
	Dim Result = Ocr.Read(input)

	Result.SaveAsTextFile("results.txt")
End Using

Detailed OCR Result Objects

IronOCR returns an OCR result object for each OCR operation. Generally, developers only use the Text property of this object to get the text scanned from the image. However, the OCR results DOM is far more detailed than this.

using IronOcr;
using System.Drawing; // Add Assembly Reference

var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Haitian;
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm;
Ocr.Configuration.ReadBarCodes = true; // Important

using (var Input = new OcrInput(@"images\sample.tiff"))
{
    OcrResult Result = Ocr.Read(Input);
    var Pages = Result.Pages;
    var Words = Pages[0].Words;
    var Barcodes = Result.Barcodes;
    // Explore here to find a massive, detailed API:
    // - Pages, Blocks, Paragraphs, Lines, Words, Chars
    // - Image export, font coordinates, statistical data
}

Imports IronOcr
Imports System.Drawing ' Add Assembly Reference

Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Haitian
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm
Ocr.Configuration.ReadBarCodes = True ' Important

Using Input = New OcrInput("images\sample.tiff")
	Dim Result As OcrResult = Ocr.Read(Input)
	Dim Pages = Result.Pages
	Dim Words = Pages(0).Words
	Dim Barcodes = Result.Barcodes
	' Explore here to find a massive, detailed API:
	' - Pages, Blocks, Paragraphs, Lines, Words, Chars
	' - Image export, font coordinates, statistical data
End Using

Performance

IronOCR works out of the box with no need to performance-tune or heavily modify input images.

Speed is blazing: IronOcr.2020+ is up to 10 times faster and makes over 250% fewer errors than previous builds.

Learn More

To learn more about OCR in C#, VB, F#, or any other .NET language, please read our community tutorials, which give real-world examples of how IronOCR can be used and show the nuances of getting the best out of this library.

A full object reference for .NET developers is also available.