A Comparison Between IronOCR and Dynamsoft OCR

Kannapat Udonpant

June 13, 2022

Optical Character Recognition (OCR) is a data capture process that involves recognizing and digitizing written and printed text. It is a computer technology that uses image analysis to convert digital photos of printed text into letters and numbers that other programs, such as word processors, can use. The text is converted into character codes so that it can be searched and edited on a computer.

While the past was a world in which all documents were physical, and the future may be a society in which all documents are digital, the present is in transition. Physical and digital documents coexist in this transitional stage, so technologies such as OCR are crucial for converting back and forth.

Document recovery, data entry, and accessibility are just a few of the applications of OCR. Most OCR applications are based on scanned papers, although photos are occasionally used as well. OCR is a valuable time-saver, because retyping the material is often the only alternative. Here are some examples of how OCR is used:

Recovering editable text files from scanned documents, including faxes.
Scanning books to create searchable and editable e-books.
Searching and modifying text in screenshots.
Reading books aloud to visually impaired people with text-to-speech technology.

These are just a few applications of OCR, but they demonstrate the versatility of the technology across a wide range of industries. Almost all employees in all companies rely heavily on documents every day, so business use is an important consideration in the development of OCR systems.

In this article, we will compare two of the most powerful OCR libraries:

IronOCR
Dynamsoft OCR

IronOCR and Dynamsoft OCR are two .NET OCR libraries that support converting scanned images and OCR processing of PDF documents. With just a few lines of code, you can convert images into searchable text. You can also retrieve individual words, letters, and paragraphs.

IronOCR - Standout Features

IronOCR offers the unique ability to detect, read, and interpret text from images and PDF documents that were not scanned precisely. IronOCR provides the simplest method for extracting text from documents and photos, even if it is not always the fastest, because it automatically sharpens and corrects low-quality scans by reducing skew, distortion, background noise, and perspective problems while improving resolution and contrast.

IronOCR lets developers submit scanned images with one or more pages, and it returns all of the text, barcodes, and QR information. A set of classes in the OCR library adds OCR capabilities to web, desktop, or console applications. JPG, PNG, TIFF, PDF, GIF, and BMP are just a few of the formats that C# and .NET apps can use as input.

The Optical Character Recognition (OCR) engine of IronOCR can read text set in many common fonts, italics, weights, and underlining. Cropping classes allow OCR to work quickly and precisely. When you work with multi-page documents, IronOCR's multithreading engine speeds up OCR.

IronOCR Features

To manage Tesseract, we use IronOCR because it is unique in the following respects:

Works out of the box in pure .NET
Does not require Tesseract to be installed on your machine
Uses the latest engines: Tesseract 5 (as well as Tesseract 4 and 3)
Is available for any .NET project: .NET Framework 4.5+, .NET Standard 2+, .NET Core 2 and 3, and .NET 5
Offers improved accuracy and speed over traditional Tesseract
Supports Xamarin, Mono, Azure, and Docker
Manages the complex Tesseract dictionary system using NuGet packages
Supports PDFs, multi-frame TIFFs, and all major image formats without configuration
Can correct low-quality and skewed scans to get the best results from Tesseract

Dynamsoft OCR - Features

The Dynamsoft .NET OCR library is a .NET component that provides fast and reliable optical character recognition. It is used to build .NET desktop applications in C# or VB.NET. With its basic OCR APIs, you can easily write code that converts otherwise unusable text in PDF files or photos into digital text for editing, searching, archiving, and more.

Images can be acquired from scanners and other TWAIN-compatible devices in the following ways:

Native, buffered memory, and disk file image transfer mechanisms are all supported.
Batch scanning is possible with the automatic document feeder (ADF).
Common device capabilities can be changed using TWAIN properties.
IfAutoFeed, IfAutoScan, Resolution, BitDepth, Brightness, Contrast, Unit, Duplex, and other capabilities can be changed.
Blank page detection is supported.
Scanner profiles can be modified and saved.

Capture images from UVC- and WIA-compatible webcams:

Display a live video feed while capturing photos from a selected webcam.
Adjust the camera settings: brightness, contrast, hue, saturation, sharpness, gamma, white balance, backlight compensation, gain, color enable, zoom, focus, exposure, iris, pan, tilt, and roll.

Robust image loading and viewing:

Images can be loaded in BMP, JPEG, PNG, TIFF, and multi-page TIFF formats.
Zooming in and out of images is supported.
Images can be retrieved from a local drive, an FTP server, an HTTP server, or a database.
BMP, JPEG, PNG, and TIFF image decoding is provided by one of the most comprehensive sets of .NET imaging components.

Saving and uploading/downloading:

Images can be read and written through a file stream.
Captured images can be saved as BMP, JPEG, PNG, TIFF, or multi-page TIFF to a local drive, a web server, or a database.
RLE, G3/G4, LZW, PackBits, and TIFF compression are all supported.
HTTPS uploads and downloads are supported.
BMP, JPEG, PNG, and TIFF image encoding is provided by one of the most comprehensive sets of .NET imaging components.
Newly acquired images can be appended to existing TIFF files.

Reading Text from Scanned PDFs or Other Images in ASP.NET (Optical Character Recognition)

In today's fast-paced world, customers want their work done quickly, and customers with urgent projects contact us frequently. If a project involves scanning papers that contain images, our technology can easily recognize the content of an image and convert it to text. Optical character recognition (OCR) saves your business time and money while reducing data entry errors.

Using IronOCR

IronOCR uses the IronOcr.IronTesseract class to perform its OCR conversions.

In this simple example, we use the IronOcr.IronTesseract class to read text from an image and automatically return the result as a string.

// PM> Install-Package IronOcr
using IronOcr;
var Result = new IronTesseract().Read(@"img\Screenshot.png");
Console.WriteLine(Result.Text);

' PM> Install-Package IronOcr
Imports IronOcr
Dim Result = (New IronTesseract()).Read("img\Screenshot.png")
Console.WriteLine(Result.Text)

As a result, the following paragraph is read with 100 percent accuracy:

IronOCR Simple Example

In this simple example, we test the accuracy of our C# OCR library by reading text from a PNG image. This is a very basic test, but things will get more complicated as the tutorial continues.

The quick brown fox jumps over the lazy dog

Although it may seem simple on the surface, there is sophisticated behavior behind it: scanning the image for alignment, quality, and resolution; examining its properties; optimizing the OCR engine; and finally reading the text as a human would.

OCR is a difficult task for a machine, and its reading speed can be comparable to that of a human. In other words, OCR is not a fast process. In this case, however, it is completely accurate.

C# OCR Application Results Accuracy

In most real-world scenarios, developers want their projects to run as fast as possible. In this scenario, we suggest using the OcrInput and IronTesseract classes of the IronOCR add-on namespace instead.

With OcrInput, you can set the exact characteristics of an OCR job, for example:

Using any of several image formats, including JPEG, TIFF, GIF, BMP, and PNG
Importing PDF documents in whole or in part
Enhancing the contrast, resolution, and size of the image
Correcting rotation, scan noise, digital noise, skew, and negative images

IronTesseract

Choose from hundreds of prepackaged languages and dialects
Use the Tesseract 5, 4, or 3 OCR engines out of the box
Specify the type of document being read: a screenshot, a snippet, or a full document
Read barcodes
Output OCR results as archivable PDFs, hOCR HTML, a DOM, or strings

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.tiff"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.tiff")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

We can apply this even to a medium-quality scan with 100% accuracy.

As you can see, reading text (and, if desired, barcodes) from a scanned image such as a TIFF was fairly easy. The accuracy of this OCR job is 100 percent.

Next, let's try a much lower-quality scan of the same page, with low DPI, a lot of distortion and digital noise, and damage to the original paper.

This is where IronOCR really shines compared to other OCR libraries such as Tesseract. We will find that other OCR projects shy away from discussing OCR on real-world scanned images and instead use unrealistically "perfect" test cases that were created digitally to achieve 100% OCR accuracy.

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
    Input.Deskew(); // removes rotation and perspective
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.LowQuality.tiff")
    Input.Deskew() ' removes rotation and perspective
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Without adding Input.Deskew() to straighten the image, we get an accuracy of 52.5%. That is not good enough.

Adding Input.Deskew() brings us to 99.8% accuracy, which is nearly as accurate as OCR on a high-quality scan.
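
The article does not say how these accuracy percentages were measured. One common approach, assumed here purely for illustration, is to compare the OCR output with a hand-checked transcription at the character level. The sketch below uses Python's standard difflib module; the function name char_accuracy and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def char_accuracy(expected: str, actual: str) -> float:
    """Return the share of expected characters that also appear, in order, in the OCR output."""
    if not expected:
        # An empty reference is matched perfectly only by empty output.
        return 1.0 if not actual else 0.0
    matcher = SequenceMatcher(None, expected, actual, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(expected)


# Hypothetical ground truth and OCR output with one misread character.
truth = "The quick brown fox"
ocr_output = "The qu1ck brown fox"
print(f"{char_accuracy(truth, ocr_output):.1%}")
```

A word-level comparison would be stricter, since one wrong character then counts as a whole wrong word; which measure is appropriate depends on how the text is used afterwards.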

Using Dynamsoft OCR

We will present some code snippets for using Dynamic Web TWAIN for TWAIN scanning and client-side OCR in JavaScript.

Scanning Images

With Dynamic Web TWAIN's simple APIs, you can change the scan settings and acquire images from TWAIN scanners.

function acquireImage()
{
    DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex); //select an available TWAIN scanner

    //set scanning settings like pixel type, resolution, ADF etc.
    DWObject.IfShowUI = false; //don't show the user interface of the scanner
    DWObject.PixelType = 1; //scan in gray
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; //scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    //acquire images from scanners
    DWObject.AcquireImage();
}

Downloading the OCR Professional Module

To use the OCR Professional module for client-side OCR, you must include ocrpro.js in the head and download the OCR Pro DLL.

<script type="text/javascript" src="Resources/addon/dynamsoft.webtwain.addon.ocrpro.js"> </script>

Make edits to the .js file:

var CurrentPathName = unescape(location.pathname);
var CurrentPath = CurrentPathName.substring(0, CurrentPathName.lastIndexOf("/") + 1); // declared with var to avoid an implicit global
DWObject.Addon.OCRPro.Download(CurrentPath + "Resources/addon/OCRPro.zip", OnSuccess, OnFailure);

Recognize text using OCR

Using the JS OCR recognition API to extract text from scanned images is as simple as inserting the code below.

DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image

Reading Cropped Regions of Images

Both sets of software offer solutions for cropping images for OCR.

Reading cropped regions with IronOCR

Iron's fork of Tesseract OCR is adept at reading specific regions of images, as the following code sample shows.

We can use a System.Drawing.Rectangle to describe the exact region of an image to be read. Pixels are always the unit of measurement.

This is particularly handy for standardized forms, where only a portion of the content changes from case to case.

Reading only a region improves speed and avoids reading needless text. In this example, we will read a student's name from a central region of a standardized paper.

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    // a 41% improvement on speed
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    Input.AddImage("img/ComSci.png", ContentArea);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
    ' a 41% improvement on speed
    Dim ContentArea = New System.Drawing.Rectangle() With {
        .X = 215,
        .Y = 1250,
        .Height = 280,
        .Width = 1335
    }
    Input.AddImage("img/ComSci.png", ContentArea)
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

This yields a 41 percent boost in speed while also letting us be more specific. It is extremely valuable for .NET OCR applications that process similar, consistent documents such as invoices, receipts, checks, forms, and expense claims.

ContentAreas (OCR cropping) are also supported when reading PDFs.
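
To make the rectangle semantics concrete, here is a minimal sketch of what cropping by X, Y, Width, and Height means on a raw pixel grid. It is an illustration only, not IronOCR's implementation; the crop function and the tiny sample image are hypothetical, and the image is modeled as a plain list of rows.

```python
def crop(pixels, x, y, width, height):
    """Return the sub-grid covered by the rectangle (x, y, width, height), measured in pixels."""
    rows = len(pixels)
    cols = len(pixels[0]) if rows else 0
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("rectangle needs a non-negative origin and a positive size")
    if x + width > cols or y + height > rows:
        raise ValueError("rectangle exceeds the image bounds")
    # Keep rows y .. y+height-1, and within each row columns x .. x+width-1.
    return [row[x:x + width] for row in pixels[y:y + height]]


# Hypothetical 6x4 image whose pixel values encode their own position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
region = crop(image, x=2, y=1, width=3, height=2)
print(region)
```

Because the engine then only has to examine the pixels inside the rectangle, a smaller region generally means less work, which is consistent with the speed gain reported above.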

Reading cropped regions with Dynamsoft OCR

To begin, launch Visual Studio and build a new C# Windows Forms Application, or open an existing one.

We will need to include DynamicDotNetTWAIN.dll, DynamicOCR.dll, and the appropriate language package. To do so, navigate to Tools -> Choose Toolbox Items, then to the .NET Framework Components tab, click the Browse... button, and locate DynamicDotNetTWAIN.dll in "..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\v4.0" or v2.0 (depending on the .NET Framework version you are using). Click the OK button. The DynamicDotNetTwain component will then appear in the Toolbox (under the View menu), as illustrated in the accompanying image.

Right-click the project file in Solution Explorer and select Add -> Existing Item... Then, in the file type filter's drop-down list, select All Files. Navigate to "..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\OCRResources" and add its items to the project folder. The .NET TWAIN component can then be dragged and dropped onto the form.

This is the code for clicking the LoadImage button:

private void button1_Click(object sender, EventArgs e)
{
    OpenFileDialog filedlg = new OpenFileDialog();
    if (filedlg.ShowDialog() == DialogResult.OK)
    {
        // choose an image from your local disk and load it into Dynamic .NET TWAIN
        dynamicDotNetTwain1.LoadImage(filedlg.FileName);
    }
}

We can now attempt to OCR the loaded image and turn it into a searchable text file.

private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom)
{
    dynamicDotNetTwain1.OCRTessDataPath = "../../"; // the path of the language package (tessdata)
    dynamicDotNetTwain1.OCRLanguage = "eng"; // the language type
    dynamicDotNetTwain1.OCRDllPath = "../../"; // the relative path of the OCR DLL file
    dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text;

    // OCR the selected area of the image
    byte[] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
    if (sbytes != null)
    {
        SaveFileDialog filedlg = new SaveFileDialog();
        filedlg.Filter = "Text File(*.txt)|*.txt"; // description and pattern must be separated by '|'
        if (filedlg.ShowDialog() == DialogResult.OK)
        {
            // save the OCR result as a text file (requires using System.IO)
            File.WriteAllBytes(filedlg.FileName, sbytes);
        }
        MessageBox.Show("OCR successful");
    }
    else
    {
        MessageBox.Show(dynamicDotNetTwain1.ErrorString);
    }
}

Private Sub button1_Click(ByVal sender As Object, ByVal e As EventArgs)
    Dim filedlg As New OpenFileDialog()
    If filedlg.ShowDialog() = DialogResult.OK Then
        ' choose an image from your local disk and load it into Dynamic .NET TWAIN
        dynamicDotNetTwain1.LoadImage(filedlg.FileName)
    End If
End Sub

Private Sub dynamicDotNetTwain1_OnImageAreaSelected(ByVal sImageIndex As Short, ByVal left As Integer, ByVal top As Integer, ByVal right As Integer, ByVal bottom As Integer)
    dynamicDotNetTwain1.OCRTessDataPath = "../../" ' the path of the language package (tessdata)
    dynamicDotNetTwain1.OCRLanguage = "eng" ' the language type
    dynamicDotNetTwain1.OCRDllPath = "../../" ' the relative path of the OCR DLL file
    dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text

    ' OCR the selected area of the image
    Dim sbytes() As Byte = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom)
    If sbytes IsNot Nothing Then
        Dim filedlg As New SaveFileDialog()
        filedlg.Filter = "Text File(*.txt)|*.txt" ' description and pattern must be separated by '|'
        If filedlg.ShowDialog() = DialogResult.OK Then
            ' save the OCR result as a text file (requires Imports System.IO)
            File.WriteAllBytes(filedlg.FileName, sbytes)
        End If
        MessageBox.Show("OCR successful")
    Else
        MessageBox.Show(dynamicDotNetTwain1.ErrorString)
    End If
End Sub

This is how the application looks.

Image Performance Tuning

The quality of the input image is the most crucial factor in the speed of an OCR task. The lower the background noise and the higher the DPI, the faster and more accurate the OCR output; a good target is around 200 DPI.
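
If an image file carries no resolution metadata, a rough DPI can be estimated from the pixel width and the physical width of the scanned page. The sketch below shows that arithmetic; the function name effective_dpi is hypothetical, and it assumes the image covers the full width of the page.

```python
def effective_dpi(pixel_width: int, physical_width_inches: float) -> float:
    """Return dots per inch for an image that spans a page of known physical width."""
    if physical_width_inches <= 0:
        raise ValueError("physical width must be positive")
    return pixel_width / physical_width_inches


# A US Letter page is 8.5 inches wide, so 1700 pixels across corresponds to 200 DPI.
print(effective_dpi(1700, 8.5))
```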

Image Processing Techniques for Dynamsoft OCR

We need OCR in a variety of situations, such as scanning a credit card number with a phone or extracting text from paper documents. OCR capabilities are included in Dynamsoft Label Recognition (DLR) and Dynamic Web TWAIN (DWT).

Although they generally do an excellent job, we can improve the results by using various image processing techniques.

Lighten/remove shadows

Poor illumination may affect the OCR result. To improve the outcome, we can whiten images or remove shadows from them.

Invert

Because the OCR engine is often trained on dark text, light text can be harder to detect and recognize.

It will be easier to recognize if we invert its color.

To perform the inversion, we can use the GrayscaleTransformationModes parameter in DLR.

Here are the JSON settings:

"GrayscaleTransformationModes": [
    {
        "Mode": "DLR_GTM_INVERTED"
    }
]
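
Conceptually, inverting an 8-bit grayscale image replaces every value v with 255 - v, so light text on a dark background becomes dark text on a light background. The sketch below illustrates that idea on a plain list of rows. It is not how DLR implements DLR_GTM_INVERTED, and the function name invert_gray is hypothetical.

```python
def invert_gray(pixels):
    """Invert 8-bit grayscale values: 0 (black) becomes 255 (white) and vice versa."""
    return [[255 - value for value in row] for row in pixels]


# Hypothetical one-row image: light text pixels (230) on a dark background (20).
light_on_dark = [[20, 230, 230, 20]]
print(invert_gray(light_on_dark))
```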

DLR .NET's reading result:

Rescale

If the letter height is too low, the OCR engine may not produce a good result. In general, the image should have a DPI of at least 300.

There is a ScaleUpModes parameter in DLR 1.1 that allows you to scale up letters. We may, of course, scale the image ourselves.
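
As a rough idea of what scaling the image ourselves involves, the sketch below enlarges a pixel grid by an integer factor with nearest-neighbor sampling, the simplest possible method. Real imaging libraries usually offer smoother interpolation, which tends to suit text better; the function name scale_up is hypothetical, and this is not DLR's ScaleUpModes implementation.

```python
def scale_up(pixels, factor=2):
    """Enlarge a pixel grid by an integer factor using nearest-neighbor sampling."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    scaled = []
    for row in pixels:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide_row = [value for value in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


print(scale_up([[1, 2], [3, 4]]))
```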

Reading the image directly yields the incorrect result:

After scaling up the image x2, the result is correct:

Deskew

It is fine if the text is slightly skewed. However, if it is heavily skewed, the result will suffer. To improve the outcome, we need to deskew the image.

To accomplish this, we can use the Hough Line Transform in OpenCV.

Here is the code to deskew the image above.

#coding=utf-8
import numpy as np
import cv2
import math
from PIL import Image

def deskew():
    src = cv2.imread("neg.jpg", cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((5, 5), np.uint8)
    erode_Img = cv2.erode(gray, kernel)
    eroDil = cv2.dilate(erode_Img, kernel) # erode and dilate
    showAndWaitKey("eroDil", eroDil)

    canny = cv2.Canny(eroDil, 50, 150) # edge detection
    showAndWaitKey("canny", canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90, minLineLength=100, maxLineGap=10) # Hough Lines Transform
    if lines is None: # HoughLinesP returns None when no line is found
        print("No lines detected")
        return
    drawing = np.zeros(src.shape[:], dtype=np.uint8)

    maxY = 0
    degree_of_bottomline = 0
    index = 0
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        if x1 == x2:
            continue # skip vertical lines, whose slope is undefined
        k = float(y1 - y2) / (x1 - x2)
        degree = np.degrees(math.atan(k))
        if index == 0:
            maxY = y1
            degree_of_bottomline = degree # take the degree of the line at the bottom
        else:
            if y1 > maxY:
                maxY = y1
                degree_of_bottomline = degree
        index = index + 1
    showAndWaitKey("houghP", drawing)

    img = Image.fromarray(src)
    rotateImg = img.rotate(degree_of_bottomline)
    rotateImg_cv = np.array(rotateImg)
    cv2.imshow("rotateImg", rotateImg_cv)
    cv2.imwrite("deskewed.jpg", rotateImg_cv)
    cv2.waitKey()

def showAndWaitKey(winName, img):
    cv2.imshow(winName, img)
    cv2.waitKey()

if __name__ == "__main__":
    deskew()
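
The heart of the script is the conversion from a line's two endpoints to a rotation angle. The sketch below isolates that step so it can be checked without OpenCV or an image file. The function name line_angle_degrees is hypothetical, and the handling of vertical lines is an assumption added for illustration.

```python
import math


def line_angle_degrees(x1, y1, x2, y2):
    """Return the angle of the line through two points, in degrees within (-90, 90]."""
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        raise ValueError("the two points must differ")
    if dx == 0:
        # A vertical line has no finite slope; report it as 90 degrees.
        return 90.0
    return math.degrees(math.atan(dy / dx))


# In image coordinates y grows downward, so a positive angle means the line
# descends toward the right of the picture.
print(line_angle_degrees(0, 0, 100, 5))
```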

Lines detected:

Deskewed:

Image Processing Techniques for IronOCR

The quality of the input image is less critical here because IronOCR excels at repairing defective documents (though this is time-consuming and will cause your OCR jobs to use more CPU cycles).

Choosing input image formats with less digital noise, such as TIFF or PNG, can also yield faster results than lossy image formats such as JPEG.

The image filters listed below can significantly enhance performance:

OcrInput.Rotate(double degrees) - Rotates images clockwise by a specified number of degrees. Negative values are used for anti-clockwise rotation.

OcrInput.Binarize() - This image filter makes every pixel either black or white, with no in-between. It may improve OCR performance in circumstances where the text-to-background contrast is very low.

OcrInput.ToGrayScale() - This image filter converts every pixel to a grayscale shade. It is unlikely to improve OCR accuracy, but it may increase speed.

OcrInput.Contrast() - Automatically increases contrast. In low-contrast scans, this filter frequently improves OCR speed and accuracy.

OcrInput.DeNoise() - Removes digital noise. This filter should be used only when noise is expected.

OcrInput.Invert() - Reverses all colors. For example, white becomes black and black becomes white.

OcrInput.Dilate() - Advanced morphology. Dilation is the process of adding pixels to the edges of objects in an image. (Erode's inverse)

OcrInput.Erode() - Advanced morphology. Erosion is the process of removing pixels from the edges of objects. (Dilate's inverse)

OcrInput.Deskew() - Rotates an image so that it is orthogonal and the right way up. Because Tesseract's tolerance for skewed scans can be as low as 5 degrees, this is quite useful for OCR.

OcrInput.DeepCleanBackgroundNoise() - Removes a lot of background noise. Only use this filter if you know there is a lot of background noise in the document, because it can reduce OCR accuracy on clear documents and is quite CPU intensive.

OcrInput.EnhanceResolution() - Improves the resolution of low-resolution images. This filter is rarely needed because OcrInput will detect and resolve low resolution automatically.
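
To give a feel for what two of these filters do conceptually, the sketch below converts RGB pixels to grayscale with the standard ITU-R BT.601 luma weights and then binarizes them against a fixed threshold. It is an illustration only, not IronOCR's implementation; the function names and the threshold of 128 are assumptions, and production filters typically choose the threshold adaptively.

```python
def to_grayscale(rgb_pixels):
    """Convert rows of (r, g, b) tuples to 8-bit gray values using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_pixels
    ]


def binarize(gray_pixels, threshold=128):
    """Map every gray value to pure black (0) or pure white (255)."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray_pixels]


# Hypothetical one-row image: white paper, pure red ink, black ink.
row = [[(255, 255, 255), (255, 0, 0), (0, 0, 0)]]
print(binarize(to_grayscale(row)))
```

With these weights pure red maps to a fairly dark gray, so colored ink ends up on the text side of the threshold rather than disappearing into the background.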

We may also want to tune IronTesseract for speed on higher-quality scans.

If speed is the priority, we can start from the configuration below and then turn features back on until the right balance is struck.

using System;
using IronOcr;

var Ocr = new IronTesseract();

// Configure for speed
// Skip characters that are not expected in the text
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\\";
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
// Use only the LSTM engine
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
// Use the fast English language pack
Ocr.Language = OcrLanguage.EnglishFast;

using (var Input = new OcrInput(@"img\Potter.tiff"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports System
Imports IronOcr

Module Program
	Sub Main()
		Dim Ocr = New IronTesseract()
		' Configure for speed
		Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\"
		Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto
		Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
		Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly
		Ocr.Language = OcrLanguage.EnglishFast
		Using Input = New OcrInput("img\Potter.tiff")
			Dim Result = Ocr.Read(Input)
			Console.WriteLine(Result.Text)
		End Using
	End Sub
End Module

Compared with the 100% baseline, this result is 99.8% accurate but 35% faster.

Licensing and Pricing

Dynamsoft Licensing and Pricing

Dynamsoft licenses are sold per year. All prices include one year of maintenance, which covers free software upgrades and premium support.

Dynamsoft offers two types of licenses:

Per client device license

The "One Client Device License" allows a same-origin application (same protocol, same host, and same port) to use the software's features from a single client device. A client device counts as inactive once it has not accessed any software capability for 90 consecutive days. The license seat of an inactive client device is freed immediately and becomes available to any other active client device. When you reach the maximum number of license seats, Dynamsoft grants an extra 10% of your client device allowance for emergency use. Once that additional allowance has been used up, no new client devices can access the software until license seats become available again. Exceeding your client device allowance has no effect on client devices that are already licensed.
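The seat rules above can be sketched as a small bookkeeping class. This is only a hypothetical model for reasoning about the terms, not Dynamsoft's implementation: the SeatPool class, its member names, and the choice to round the 10% emergency allowance down are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of the per-client-device seat rules; not Dynamsoft's code.
public class SeatPool
{
    // A device is inactive after 90 consecutive days without access.
    private static readonly TimeSpan InactiveAfter = TimeSpan.FromDays(90);

    // Last access time for every device that currently holds a seat.
    private readonly Dictionary<string, DateTime> _lastSeen =
        new Dictionary<string, DateTime>();

    private readonly int _purchasedSeats;

    public SeatPool(int purchasedSeats)
    {
        _purchasedSeats = purchasedSeats;
    }

    // Purchased seats plus the 10% emergency allowance.
    // Rounding down is an assumption; the license text does not specify it.
    public int HardCap
    {
        get { return _purchasedSeats + _purchasedSeats / 10; }
    }

    // Returns true if the device may use the software at the given time.
    public bool TryAccess(string deviceId, DateTime now)
    {
        // Free the seats of devices that have become inactive.
        var inactive = _lastSeen
            .Where(entry => now - entry.Value >= InactiveAfter)
            .Select(entry => entry.Key)
            .ToList();
        foreach (var id in inactive)
        {
            _lastSeen.Remove(id);
        }

        // Devices that already hold a seat keep working;
        // a new device needs a free seat below the hard cap.
        if (_lastSeen.ContainsKey(deviceId) || _lastSeen.Count < HardCap)
        {
            _lastSeen[deviceId] = now;
            return true;
        }
        return false;
    }
}
```

Under these assumptions, a pool with 10 purchased seats admits an eleventh distinct device through the emergency allowance and refuses a twelfth until a seat is freed, while the eleven devices already admitted continue to work.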

Per-server license

To deploy the application to a single server, a "One Server License" is required. Servers refer to both physical and virtual servers and include, but are not limited to, production servers, failover servers, development servers that are also used for testing, quality assurance servers, testing servers, and staging servers, all of which require a license. Additional licenses are not required for continuous integration servers (build servers) or localhost development servers. The per-server license is only valid for on-premises server installations, and not for cloud deployments.

Pricing for Dynamsoft OCR starts at USD 1,249/year.

IronOCR Licensing and Pricing

As developers, we all want to complete our projects with as little money and as few resources as possible, so budgeting is critical. Examine the chart to determine which license best suits your requirements and budget.

IronOCR provides licenses with a customizable number of developers, projects, and locations, allowing you to fulfill the needs of your project while only paying for the coverage you require.

IronOCR license keys enable you to publish your product without a watermark.

Licenses start from $749 and include one year of support and upgrades.

You can also use a trial license key to try IronOCR for free.

Conclusion

IronOCR for C# provides Tesseract OCR on Mac, Windows, Linux, Azure, and Docker. It supports .NET Framework 4.0 or above, .NET Standard 2.0+, .NET Core 2.0+, .NET 5, Mono for macOS and Linux, and Xamarin for macOS. IronOCR uses the latest Tesseract 5 engine to read text, barcodes, and QR codes from all major image and PDF formats. The library adds OCR functionality to your desktop, console, or web apps in minutes. It can also read PDFs and multi-page TIFFs, and any OCR scan can be saved as a searchable PDF document or as XHTML. Its data output options include plain text, barcode data, and an OCR result class encompassing paragraphs, lines, words, and characters. It is available in 125 languages, including Arabic, Chinese, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, and Spanish, and custom language packs can also be generated.

The Dynamic .NET TWAIN OCR add-on is a quick and reliable .NET component for Optical Character Recognition that you can use in WinForms and WPF applications written in C# or VB.NET. You can scan documents or capture photos from webcams using Dynamic .NET TWAIN's image capture module, and then run OCR on the images to convert their text to text files, searchable PDF files, or strings. Multiple Asian languages, as well as Arabic, are offered in addition to English.

IronOCR offers better licensing than Dynamsoft OCR: IronOCR licenses start at $749 and include one year of support and upgrades, while Dynamsoft starts at $1,249 per year with a free trial. IronOCR also offers licenses for multiple users, whereas Dynamsoft provides only one license per user.

While both products aim to offer the best performance for OCR reading of barcodes and for image-to-text conversion, IronOCR stands out for its handling of images that are in poor condition. It automatically applies its tuning methods to give you the best OCR results. IronOCR also uses Tesseract to deliver results with few or no errors.

Iron Software also offers its customers the option to purchase its entire suite of software in just two clicks. For the price of two of the components in the Iron Software suite, you can currently get all five components along with uninterrupted support.

Kannapat Udonpant

Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. During his studies, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he brought his C# skills to the engineering team at Iron Software, where he focuses on IronPDF. Kannapat values that his job lets him learn directly from the developer who writes most of the code used in IronPDF. In addition to peer learning, Kannapat enjoys the social aspect of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.
