A Comparison Between IronOCR and Dynamsoft OCR

Kannapat Udonpant

June 13, 2022

Optical character recognition, or OCR, is a data-entry process that recognizes and digitizes written or printed text. It is a type of computer technology that uses image analysis to convert digital photographs of printed text into letters and numbers that other programs, such as word processors, can use. The text is converted into character codes so that it can be searched and edited on a computer.

If the past was a world in which all documents were physical, and the future may be one in which all documents are digital, then the present is in flux. Physical and digital documents coexist in this transitional state, which is why technologies such as OCR are essential for converting between the two.

Document retrieval, data entry, and accessibility are just a few of the applications of OCR. Most OCR applications work from scanned documents, although photographs are occasionally used as well. OCR is a valuable time-saver, because retyping the documents is often the only other option. Here are some examples of OCR in use:

Editable text files can be recovered from scanned documents, including faxes.
Book scans can be used to create searchable, editable e-books.
Screenshots can be used to search and edit text.
Text-to-speech technology is used to read books aloud to visually impaired people.

These are only a few of the applications of OCR, but they demonstrate the versatility of this technology across a wide range of industries. Employees at almost every company work with documents daily, so business use is a key driver in the development of OCR systems.

In this article, we will compare two powerful OCR readers:
IronOCR
Dynamsoft OCR

IronOCR and Dynamsoft OCR are both .NET OCR libraries that support converting scanned images and running OCR on PDF documents. A few lines of code are enough to turn images into searchable text. You can also retrieve individual words, letters, and paragraphs.

IronOCR - Standout Features

IronOCR offers the ability to detect, read, and interpret text from images and PDF documents that were not scanned accurately. It provides a simple approach to extracting text from documents and photos, even if it is not always the fastest, because it automatically refines and corrects low-quality scans by reducing skew, distortion, background noise, and perspective problems while improving resolution and contrast.

IronOCR lets developers send it scanned images of one or more pages, and it returns all the text, barcodes, and QR information. A set of classes in the OCR library adds OCR functionality to web, desktop, or console applications. JPG, PNG, TIFF, PDF, GIF, and BMP are just some of the formats that can be used as input.

IronOCR's optical character recognition (OCR) engine can read text set in many common fonts, including italic, weighted, and underlined styles. Cropping classes let OCR work quickly and accurately. When you work with multipage documents, IronOCR's multithreaded engine speeds up OCR.

IronOCR Features

For managing Tesseract, we use IronOCR because it is distinctive in the following ways:

Works out of the box in pure .NET
Does not require Tesseract to be installed on your machine
Runs the latest engines: Tesseract 5 (as well as Tesseract 4 and 3)
Is available for any .NET project: .NET Framework 4.5+, .NET Standard 2+, .NET Core 2 and 3, and .NET 5
Improves accuracy and speed over traditional Tesseract
Supports Xamarin, Mono, Azure, and Docker
Manages the complex Tesseract dictionary system using NuGet packages
Supports PDFs, multi-frame TIFFs, and all major image formats without configuration
Can correct low-quality and skewed scans to get the best results from Tesseract

Dynamsoft OCR - Features

The Dynamsoft .NET OCR library is a .NET component that provides fast and reliable optical character recognition. It is used to build .NET desktop applications in C# or VB.NET. Using the basic OCR APIs, you can write code that converts otherwise unusable text in PDFs or photos into digital text for editing, searching, archiving, and more.

Images from scanners and other TWAIN-compatible devices can be acquired as follows:

Native, buffered-memory, and disk-file image transfer mechanisms are all supported.
Batch scanning is possible with an automatic document feeder (ADF).
TWAIN attributes can be used to change common device capabilities.
IfAutoFeed, IfAutoScan, Resolution, BitDepth, Brightness, Contrast, Unit, Duplex, and other settings can all be changed.
Blank page detection is supported.
Scanner profiles can be edited and saved.
Capture images from UVC- and WIA-compliant webcams:
Display a live video stream while capturing photos from a chosen webcam.
Customize camera settings: brightness, contrast, hue, saturation, sharpness, gamma, white balance, backlight compensation, gain, color enable, zoom, focus, exposure, iris, pan, tilt, and roll.
Robust image loading and viewing:
BMP, JPEG, PNG, TIFF, and multipage TIFF images can be loaded.
Zooming in and out of photos is supported.
Images can be retrieved from a local drive, an FTP server, an HTTP server, or a database.
BMP, JPEG, PNG, and TIFF images are decoded using one of the most comprehensive sets of .NET imaging components.
Save and upload/download:
Photos can be read from and written to a file stream.
Captured photos can be saved as BMP, JPEG, PNG, TIFF, or multipage TIFF to a local disk, a web server, or a database.
RLE, G3/G4, LZW, PackBits, and TIFF compression are all supported.
HTTPS transfers are supported.
BMP, JPEG, PNG, and TIFF image encoding is supported by one of the most comprehensive sets of .NET imaging components available.
Newly acquired photos can be appended to existing TIFF files.

Reading Text from Scanned PDFs or Other Images in ASP.NET (Optical Character Recognition)

In today's fast-moving world, clients want work done quickly. Clients with urgent projects contact us frequently. If a project involves digitizing documents that contain images, our technology can recognize the content of an image and convert it to text. Optical character recognition (OCR) saves your business time and money while reducing data-entry errors.

Using IronOCR

IronOCR uses the IronOcr.IronTesseract class to perform its OCR conversions.

In this basic example, we use the IronOcr.IronTesseract class to read text from an image and automatically return the result as a string.

// PM> Install-Package IronOcr
using IronOcr;
var Result = new IronTesseract().Read(@"img\Screenshot.png");
Console.WriteLine(Result.Text);

' PM> Install-Package IronOcr
Imports IronOcr
Dim Result = (New IronTesseract()).Read("img\Screenshot.png")
Console.WriteLine(Result.Text)

As a result, the following paragraph is read with complete accuracy:

IronOCR Simple Example

In this simple example, we test the accuracy of our C# OCR library when reading text from a PNG image. This is a very basic test, but things will get more complicated as the tutorial continues.

The quick brown fox jumps over the lazy dog

Although this may look simple on the surface, sophisticated work goes on behind the scenes: the image is analyzed for alignment, quality, and resolution, its attributes are examined, the OCR engine is optimized, and finally the text is read as a human would read it.

OCR is a difficult task for a machine, and reading speed can be comparable to a human's. In other words, OCR is not a fast procedure. In this case, however, the result is entirely correct.

In most real-world scenarios, developers want their projects to run as quickly as possible. In that case, we suggest using the OcrInput and IronTesseract classes from the IronOcr namespace.

OcrInput lets you define the exact characteristics of an OCR job, for example:

Using any of a range of image formats, including JPEG, TIFF, GIF, BMP, and PNG
Importing PDF documents in whole or in part
Enhancing image contrast, resolution, and size
Correcting rotation, scan noise, digital noise, skew, and negative images
IronTesseract:
Choose from hundreds of prepackaged languages and language variants
Use the Tesseract 5, 4, or 3 OCR engines out of the box
Specify the type of document being read, whether a screenshot, a snippet, or an entire document
Read barcodes
Output OCR results as searchable PDFs, hOCR HTML, a DOM, and strings

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.tiff")) {
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.tiff")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

We can use it even on a medium-quality scan with 100% accuracy.

As you can see, reading text (and, if desired, barcodes) from a scanned image such as a TIFF was fairly easy. The accuracy of this OCR job is 100%.

Next, we will try a much lower-quality scan of the same page, at low DPI and with a lot of distortion and digital noise, as well as damage to the original paper.

This is where IronOCR really shines compared with other OCR libraries such as Tesseract. We will find that other OCR projects avoid discussing OCR on real-world scanned images, relying instead on unrealistic, digitally created "perfect" test cases to achieve 100% OCR accuracy.

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
Input.Deskew(); // removes rotation and perspective
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput("img\Potter.LowQuality.tiff")
    Input.Deskew() ' removes rotation and perspective
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

Without adding Input.Deskew() to straighten the image, we get 52.5% accuracy. That is not good enough.

Adding Input.Deskew() brings us to 99.8% accuracy, which is almost as accurate as OCR on a high-quality scan.
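The article does not say how its accuracy percentages were measured. One common way to score OCR output, shown here purely as an illustrative assumption, is character-level accuracy: the edit distance between a reference transcript and the recognized text, divided by the reference length. The function names `levenshtein` and `character_accuracy` below are hypothetical helpers, not part of IronOCR.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def character_accuracy(expected: str, recognized: str) -> float:
    """Hypothetical metric: share of the reference text recovered, in percent."""
    if not expected:
        return 100.0 if not recognized else 0.0
    errors = levenshtein(expected, recognized)
    return max(0.0, 100.0 * (1 - errors / len(expected)))


if __name__ == "__main__":
    reference = "The quick brown fox"
    ocr_output = "The qu1ck brovvn fox"  # made-up OCR mistakes for illustration
    print(f"{character_accuracy(reference, ocr_output):.1f}%")
```

A metric like this makes it possible to compare the same page with and without a preprocessing step such as deskewing, provided a trusted reference transcript exists.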

Using Dynamsoft OCR

We will present a few code snippets that use Dynamic Web TWAIN to perform TWAIN scanning and client-side OCR in JavaScript.

Scanning Images

You can change scanning settings and acquire photos from TWAIN scanners using Dynamic Web TWAIN's simple APIs.

function acquireImage()
{
    DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex); //select an available TWAIN scanner

    //set scanning settings like pixel type, resolution, ADF etc.
    DWObject.IfShowUI = false; //don't show the user interface of the scanner
    DWObject.PixelType = 1; //scan in gray
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; //scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    //acquire images from scanners
    DWObject.AcquireImage();
}

Download the OCR Professional Module

To use the OCR Professional module for client-side OCR, you must include ocrpro.js in the page header and download the OCR Pro DLL.

Make the following edits to the .js file:

```js
var CurrentPathName = unescape(location.pathname);
var CurrentPath = CurrentPathName.substring(0, CurrentPathName.lastIndexOf("/") + 1);
DWObject.Addon.OCRPro.Download(CurrentPath + "Resources/addon/OCRPro.zip", OnSuccess, OnFailure);
```

Recognize text using OCR

Using the JS OCR recognition API to extract text from scanned images is as simple as inserting the code below.

DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image

Reading Cropped Regions of Images

Both sets of software offer solutions for cropping images for OCR.

Reading cropped regions with IronOCR

Iron's branch of Tesseract OCR is adept at reading specific regions of images, as the following code sample shows.

We can use a System.Drawing.Rectangle to describe the exact region of an image or document page to be read. Pixels are always the unit of measurement.

This is especially handy when dealing with a standardized form in which only a portion of the content changes from case to case.

Restricting OCR to a region improves speed and avoids reading needless text. In this example, we read a student's name from a central region of a standardized paper.

using IronOcr;
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
// a 41% improvement on speed
var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
Input.AddImage("img/ComSci.png", ContentArea);
var Result = Ocr.Read(Input);
Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
    ' a 41% improvement on speed
    Dim ContentArea = New System.Drawing.Rectangle() With {
        .X = 215,
        .Y = 1250,
        .Height = 280,
        .Width = 1335
    }
    Input.AddImage("img/ComSci.png", ContentArea)
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
End Using

This results in a 41 percent boost in speed, while also allowing us to be more specific. This is extremely valuable for .NET OCR applications involving documents that are comparable and consistent, including invoices, receipts, checks, forms, expense claims, and so on.

ContentAreas (OCR cropping) are also supported when reading PDFs.
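To make the idea of a content area concrete, here is a minimal, library-independent sketch of what cropping to a pixel rectangle means. It treats an image as a list of pixel rows; `Rect`, `crop`, and `area_fraction` are hypothetical names used only for illustration and do not reflect how IronOCR or Dynamsoft implement cropping internally.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """A pixel rectangle, mirroring the X, Y, Width, Height fields used above."""
    x: int
    y: int
    width: int
    height: int


def crop(pixels: List[List[int]], area: Rect) -> List[List[int]]:
    """Keep only the rows and columns that fall inside the rectangle."""
    rows = pixels[area.y:area.y + area.height]
    return [row[area.x:area.x + area.width] for row in rows]


def area_fraction(page_width: int, page_height: int, area: Rect) -> float:
    """Share of the page's pixels that the rectangle covers."""
    return (area.width * area.height) / (page_width * page_height)


if __name__ == "__main__":
    # A tiny 6x4 "image" whose pixel values encode their own row and column.
    page = [[row * 10 + col for col in range(6)] for row in range(4)]
    region = Rect(x=1, y=2, width=3, height=2)
    print(crop(page, region))
    print(area_fraction(6, 4, region))
```

The fewer pixels the engine has to examine, the less work it does, which is the intuition behind the speed-up reported above. The actual gain depends on the engine and the document, so the 41 percent figure should be read as the article's own measurement rather than something this sketch derives.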

Reading cropped regions with Dynamsoft OCR

To begin, launch Visual Studio and build a new C# Windows Forms Application, or open an existing one.

We will need to include DynamicDotNetTWAIN.dll, DynamicOCR.dll, and the appropriate language package. To do so, navigate to Tools -> Choose Toolbox Items, then to the .NET Framework Components tab, click the Browse... button, and locate DynamicDotNetTWAIN.dll in “..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\v4.0” or v2.0 (depending on the .NET Framework version you are using). Click the OK button. The DynamicDotNetTwain component will then appear in the Toolbox (available from the View menu), as illustrated in the accompanying image.

Right-click the project file in Solution Explorer and select Add-> Existing Item... Then, in the file type filter's drop-down list, select All Files. Navigate to “..\Program Files (x86)\Dynamsoft\Dynamic .NET TWAIN 4.3 Trial\Bin\OCRResources” to add items to the project folder. The .NET TWAIN component can then be dragged and dropped onto the form.

This is the code for clicking the LoadImage button:

private void button1_Click(object sender, EventArgs e)
{
    OpenFileDialog filedlg = new OpenFileDialog();
    if (filedlg.ShowDialog() == DialogResult.OK)
    {
        // choose an image from your local disk and load it into Dynamic .NET TWAIN
        dynamicDotNetTwain1.LoadImage(filedlg.FileName);
    }
}

We can now attempt to OCR the loaded image and turn it into a searchable text file.

// Requires: using System.IO; (for File and FileStream)
private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom)
{
    dynamicDotNetTwain1.OCRTessDataPath = "../../"; // the path of the language package (tessdata)
    dynamicDotNetTwain1.OCRLanguage = "eng"; // the language type
    dynamicDotNetTwain1.OCRDllPath = "../../"; // the relative path of the OCR DLL file
    dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text;

    // OCR the selected area of the image
    byte[] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
    if (sbytes != null)
    {
        SaveFileDialog filedlg = new SaveFileDialog();
        filedlg.Filter = "Text File(*.txt)|*.txt"; // description and pattern are separated by '|'
        if (filedlg.ShowDialog() == DialogResult.OK)
        {
            // save the OCR result as a text file
            using (FileStream fs = File.OpenWrite(filedlg.FileName))
            {
                fs.Write(sbytes, 0, sbytes.Length);
            }
        }
        MessageBox.Show("OCR successful");
    }
    else
    {
        MessageBox.Show(dynamicDotNetTwain1.ErrorString);
    }
}

Private Sub button1_Click(ByVal sender As Object, ByVal e As EventArgs)
    Dim filedlg As New OpenFileDialog()
    If filedlg.ShowDialog() = DialogResult.OK Then
        ' choose an image from your local disk and load it into Dynamic .NET TWAIN
        dynamicDotNetTwain1.LoadImage(filedlg.FileName)
    End If
End Sub

Private Sub dynamicDotNetTwain1_OnImageAreaSelected(ByVal sImageIndex As Short, ByVal left As Integer, ByVal top As Integer, ByVal right As Integer, ByVal bottom As Integer)
    dynamicDotNetTwain1.OCRTessDataPath = "../../" ' the path of the language package (tessdata)
    dynamicDotNetTwain1.OCRLanguage = "eng" ' the language type
    dynamicDotNetTwain1.OCRDllPath = "../../" ' the relative path of the OCR DLL file
    dynamicDotNetTwain1.OCRResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text

    ' OCR the selected area of the image
    Dim sbytes() As Byte = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom)
    If sbytes IsNot Nothing Then
        Dim filedlg As New SaveFileDialog()
        filedlg.Filter = "Text File(*.txt)|*.txt" ' description and pattern are separated by '|'
        If filedlg.ShowDialog() = DialogResult.OK Then
            ' save the OCR result as a text file
            Dim fs As FileStream = File.OpenWrite(filedlg.FileName)
            fs.Write(sbytes, 0, sbytes.Length)
            fs.Close()
        End If
        MessageBox.Show("OCR successful")
    Else
        MessageBox.Show(dynamicDotNetTwain1.ErrorString)
    End If
End Sub

This is how the application looks.

Image Performance Tuning

The quality of the input image is the most crucial factor in the speed of an OCR task. The lower the background noise and the higher the DPI (around 200 DPI is a good target), the faster and more accurate the OCR output.
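As a rough aid for checking whether a scan meets a DPI target, the arithmetic can be sketched as follows. DPI is simply pixels divided by physical inches. The helper names and the 8.5-inch page width are illustrative assumptions, not values taken from either library.

```python
def scan_dpi(pixel_length: int, physical_length_inches: float) -> float:
    """Dots per inch along one edge: pixels divided by physical inches."""
    if physical_length_inches <= 0:
        raise ValueError("physical length must be positive")
    return pixel_length / physical_length_inches


def upscale_factor(current_dpi: float, target_dpi: float) -> float:
    """How much to enlarge an image to reach the target DPI (never below 1.0)."""
    if current_dpi <= 0:
        raise ValueError("current DPI must be positive")
    return max(1.0, target_dpi / current_dpi)


if __name__ == "__main__":
    # Assumed example: a page 8.5 inches wide that was scanned 1700 pixels wide.
    dpi = scan_dpi(1700, 8.5)
    print(dpi)
    print(upscale_factor(dpi, 300))
```

Enlarging an image in software does not add real detail, so scanning at the target resolution in the first place is generally preferable where that is possible.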

Image Processing Techniques for Dynamsoft OCR

We need to use OCR in a variety of situations, such as scanning a credit card number with our phone or extracting text from paper documents. OCR capabilities are included in Dynamsoft Label Recognition (DLR) and Dynamic Web TWAIN (DWT).

Although they can do an excellent job in general, we can improve the results by using various image processing techniques.

Lighten/remove shadows

Poor illumination may have an impact on the OCR result. To improve the outcome, we can whiten photos or eliminate shadows from images.

Invert

Because the OCR engine is often trained on text in dark colors, text in light colors can be harder to discover and recognize.

It will be easier to recognize if we invert its color.

To perform the inversion, we can use the GrayscaleTransformationModes parameter in DLR.

Here are the JSON settings:

"GrayscaleTransformationModes": [
    {
        "Mode": "DLR_GTM_INVERTED"
    }
]
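For intuition, inverting an 8-bit grayscale image amounts to replacing each pixel value v with 255 - v, so light text on a dark background becomes dark text on a light background. The sketch below assumes an image stored as a list of rows of 0 to 255 values; it illustrates the operation only and is not how DLR applies DLR_GTM_INVERTED internally.

```python
from typing import List


def invert(gray: List[List[int]]) -> List[List[int]]:
    """Flip every 8-bit gray value: 0 (black) becomes 255 (white) and vice versa."""
    return [[255 - value for value in row] for row in gray]


if __name__ == "__main__":
    light_text_on_dark = [
        [10, 240, 10],
        [10, 240, 10],
    ]
    print(invert(light_text_on_dark))
```

Applying the function twice returns the original image, so the operation loses no information.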

DLR .net’s reading result:

Rescale

If the letter height is too low, the OCR engine may not produce a good result. In general, the image should have a DPI of at least 300.

There is a ScaleUpModes parameter in DLR 1.1 that allows you to scale up letters. We may, of course, scale the image ourselves.
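If we scale the image ourselves, the simplest approach is nearest-neighbor enlargement, where every pixel is repeated horizontally and vertically. The sketch below assumes an integer scale factor and an image stored as a list of pixel rows. It is an illustration only: ScaleUpModes in DLR may use a different interpolation method.

```python
from typing import List


def scale_up(gray: List[List[int]], factor: int) -> List[List[int]]:
    """Nearest-neighbor enlargement: repeat each pixel `factor` times in both directions."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    scaled = []
    for row in gray:
        wide_row = [value for value in row for _ in range(factor)]
        # Append independent copies so later edits to one row do not affect another.
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


if __name__ == "__main__":
    tiny = [[1, 2],
            [3, 4]]
    for row in scale_up(tiny, 2):
        print(row)
```

Smoother interpolation such as bilinear or bicubic usually gives cleaner letter edges than plain pixel repetition, at the cost of more computation.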

Reading the image directly yields the incorrect result:

After scaling up the image x2, the result is correct:

Deskew

It is fine if the text is slightly skewed. However, if it is heavily skewed, the result will be badly affected. To improve the outcome, we need to deskew the image.

To accomplish this, we can use the Hough Line Transform in OpenCV.

Here is the code to deskew the image above.

#coding=utf-8
import numpy as np
import cv2
import math
from PIL import Image

def showAndWaitKey(winName,img):
    # display an intermediate image and block until a key is pressed
    cv2.imshow(winName,img)
    cv2.waitKey()

def deskew():
    src = cv2.imread("neg.jpg",cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((5,5),np.uint8)
    erode_Img = cv2.erode(gray,kernel)
    eroDil = cv2.dilate(erode_Img,kernel) # erode and dilate
    showAndWaitKey("eroDil",eroDil)

    canny = cv2.Canny(eroDil,50,150) # edge detection
    showAndWaitKey("canny",canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90,minLineLength=100,maxLineGap=10) # Hough Lines Transform
    if lines is None:
        print("No lines detected; nothing to deskew")
        return
    drawing = np.zeros(src.shape, dtype=np.uint8)

    maxY = None
    degree_of_bottomline = 0
    for line in lines:
        x1, y1, x2, y2 = (int(v) for v in line[0])
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        if x1 == x2:
            continue # vertical line: the slope is undefined, so skip it
        k = float(y1-y2)/(x1-x2)
        degree = np.degrees(math.atan(k))
        if maxY is None or y1 > maxY: # take the degree of the line at the bottom
            maxY = y1
            degree_of_bottomline = degree
    showAndWaitKey("houghP",drawing)

    # rotate the original image by the angle of the bottom-most line
    img = Image.fromarray(src)
    rotateImg = img.rotate(float(degree_of_bottomline))
    rotateImg_cv = np.array(rotateImg)
    cv2.imshow("rotateImg",rotateImg_cv)
    cv2.imwrite("deskewed.jpg",rotateImg_cv)
    cv2.waitKey()

if __name__ == "__main__":
    deskew()

Lines detected:

Deskewed:

Image Processing Techniques for IronOCR

The quality of the input image matters less here, because IronOCR excels at repairing defective documents (though this is time-consuming and will cause your OCR jobs to use more CPU cycles).

Choosing input image formats with less digital noise, such as TIFF or PNG, can also result in speedier outcomes than lossy image formats, such as JPEG.

The image filters listed below can significantly enhance performance:

OcrInput.Rotate(double degrees): Rotates images clockwise by a specified number of degrees. Use negative values for anti-clockwise rotation.

OcrInput.Binarize(): Makes every pixel either black or white, with no in-between. It may improve OCR performance where the text-to-background contrast is very low.

OcrInput.ToGrayScale(): Converts every pixel to a grayscale shade. It is unlikely to improve OCR accuracy, but it may increase speed.

OcrInput.Contrast(): Automatically increases contrast. In low-contrast scans, this filter frequently improves OCR speed and accuracy.

OcrInput.DeNoise(): Removes digital noise. This filter should be used only when noise is expected.

OcrInput.Invert(): Reverses all colors. For example, white becomes black and black becomes white.

OcrInput.Dilate(): Advanced morphology. Dilation adds pixels to the edges of objects in an image. It is the opposite of Erode.

OcrInput.Erode(): Advanced morphology. Erosion removes pixels from the edges of objects in an image. It is the opposite of Dilate.

OcrInput.Deskew(): Rotates an image so that it is orthogonal and the right way up. Because Tesseract's tolerance for skewed scans can be as low as 5 degrees, this is very useful for OCR.

OcrInput.DeepCleanBackgroundNoise(): Removes heavy background noise. Only use this filter if you know the document contains a lot of background noise, because it can reduce OCR accuracy on clean documents and is quite CPU intensive.

OcrInput.EnhanceResolution: Improves the resolution of low-resolution images. This filter is rarely needed because OcrInput detects and resolves low resolution automatically.
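
To picture what the thresholding and morphology filters in this list do, it can help to run them on a tiny grid of pixel values. The following is a minimal sketch in plain Python, not IronOCR code and not its implementation: the function names, the fixed threshold of 128, and the 3x3 neighbourhood are all assumptions chosen for illustration.

```python
# Illustrative only: plain-Python versions of four filter ideas from the list
# above. This is NOT IronOCR code; every name here is hypothetical.

def binarize(gray, threshold=128):
    """Map every pixel to 0 (black) or 255 (white)."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def invert(gray):
    """Reverse all shades: white becomes black and black becomes white."""
    return [[255 - px for px in row] for row in gray]


def _morph(binary, pick):
    """Replace each pixel with pick() of its 3x3 neighbourhood (clipped at borders)."""
    height, width = len(binary), len(binary[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(pick(neighbours))
        out.append(row)
    return out


def dilate(binary):
    """Grow white (255) objects by one pixel along their edges."""
    return _morph(binary, max)


def erode(binary):
    """Shrink white (255) objects by one pixel along their edges."""
    return _morph(binary, min)


if __name__ == "__main__":
    # A 5x5 "scan": a bright 3x3 block on a dark background.
    scan = [
        [10, 20, 30, 20, 10],
        [20, 200, 210, 190, 20],
        [30, 220, 240, 200, 30],
        [20, 210, 205, 195, 20],
        [10, 20, 30, 20, 10],
    ]
    mask = binarize(scan)
    for row in erode(mask):
        print(row)
```

On this sample, eroding the binarized block leaves only its centre pixel white, while dilating it fills the whole grid, which is why the two operations are described as opposites.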

With IronTesseract, we may want to speed up OCR on higher-quality scans.

If we're looking for speed, we might start here and subsequently turn features back on until the proper balance is struck.

using IronOcr;
var Ocr = new IronTesseract();
// Configure for speed
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\\";
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
Ocr.Language = OcrLanguage.EnglishFast;
using (var Input = new OcrInput(@"img\Potter.tiff"))
{
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Imports IronOcr
Dim Ocr = New IronTesseract()
' Configure for speed
Ocr.Configuration.BlackListCharacters = "~`$#^*_}{][\"
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly
Ocr.Language = OcrLanguage.EnglishFast
Using Input = New OcrInput("img\Potter.tiff")
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using

This result is 99.8% accurate compared to the baseline of 100%, but 35% faster.

Licensing and Pricing

Dynamsoft Licensing and Pricing

Dynamsoft licenses are sold per year. All rates include one year of maintenance, which covers free software upgrades and premium support.

Dynamsoft offers two types of licenses:

Per client device license

The "One Client Device License" provides access to a same-origin Application (same protocol, same host, and same port) to use the software's features from a single client device. An inactive client device is one that has not accessed any software capability for 90 days in a row. An inactive client device's license seat will be instantly freed and made available for usage by any other active client device. When you reach the maximum number of license seats allowed, Dynamsoft will give you an extra 10% of your client device allowance for emergency use. Once the additional client device allowance has been depleted, no new client devices can access and use the software until there are available license seats again. Please keep in mind that exceeding your client device allowance has no effect on any client devices that have already been licensed.
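
The seat rules described above (a 90-day inactivity window and an extra 10% emergency allowance) can be modelled in a few lines. This is a rough sketch under stated assumptions, not Dynamsoft's licensing code: the class and method names are hypothetical, the emergency allowance is assumed to round down, and a device is assumed to become inactive on exactly the 90th day without access.

```python
from datetime import date, timedelta

INACTIVE_AFTER = timedelta(days=90)  # from the text: 90 days in a row without access
EMERGENCY_PERCENT = 10               # from the text: extra 10% for emergency use


class SeatPool:
    """Hypothetical model of per-client-device license seats."""

    def __init__(self, seats):
        self.seats = seats
        # Assumption: the emergency allowance is rounded down to whole seats.
        self.limit = seats + seats * EMERGENCY_PERCENT // 100
        self.last_seen = {}  # device id -> date of last access

    def _release_inactive(self, today):
        """Free the seats of devices that have been idle for 90 days or more."""
        self.last_seen = {
            device: seen
            for device, seen in self.last_seen.items()
            if today - seen < INACTIVE_AFTER
        }

    def access(self, device, today):
        """Return True if the device may use the software today."""
        self._release_inactive(today)
        already_licensed = device in self.last_seen
        if already_licensed or len(self.last_seen) < self.limit:
            self.last_seen[device] = today
            return True
        return False  # no free seats: new devices must wait


if __name__ == "__main__":
    pool = SeatPool(seats=10)
    start = date(2024, 1, 1)
    results = [pool.access(f"device-{i}", start) for i in range(12)]
    print(results)
```

With 10 purchased seats, the first 11 devices are admitted (10 plus one emergency seat) and the twelfth is refused, while devices that already hold a seat keep working.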

Per-server license

To deploy the application to a single server, a "One Server License" is required. Servers refer to both physical and virtual servers and include, but are not limited to, production servers, failover servers, development servers that are also used for testing, quality assurance servers, testing servers, and staging servers, all of which require a license. Additional licenses are not required for continuous integration servers (build servers) or localhost development servers. The per-server license is only valid for on-premises server installations, and not for cloud deployments.

Pricing for Dynamsoft OCR starts at USD 1,249/year.

IronOCR Licensing and Pricing

As developers, we all want to complete our projects with as little money and as few resources as possible, so budgeting is critical. Examine the chart to determine which license best suits your requirements and budget.

IronOCR provides licenses with a customizable number of developers, projects, and locations, allowing you to fulfill the needs of your project while only paying for the coverage you require.

IronOCR licensing keys enable you to publish your product without a watermark.

Licenses start from $749 and include one year of support and upgrades.

You can also use a trial license key to try IronOCR for free.

Conclusion

IronOCR for C# makes Tesseract OCR available on Mac, Windows, Linux, Azure, and Docker. It supports cross-platform development on .NET Framework 4.0 or above, .NET Standard 2.0+, .NET Core 2.0+, .NET 5, Mono for macOS and Linux, and Xamarin for macOS. IronOCR also uses the latest Tesseract 5 engine to read text, barcodes, and QR codes from all major image and PDF formats. In minutes, this library adds OCR functionality to your desktop, console, or web apps. It can also read PDFs and multi-page TIFFs, and any OCR scan can be saved as a searchable PDF document or XHTML. Its data output options include plain text, barcode data, and an OCR result class encompassing paragraphs, lines, words, and characters. It is available in 125 languages, including Arabic, Chinese, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, and Spanish, and custom language packs can also be generated.

The Dynamic .NET TWAIN OCR add-on is a quick and reliable .NET component for Optical Character Recognition that you can use in WinForms and WPF applications written in C# or VB .NET. You can scan documents or capture photos from webcams using Dynamic .NET TWAIN's image capture module, and then conduct OCR on the images to convert the text in the images to text, searchable PDF files, or strings. Multiple Asian languages, as well as Arabic, are offered in addition to English.

IronOCR offers more flexible licensing than Dynamsoft OCR: IronOCR starts at $749 and includes one year of support and upgrades, while Dynamsoft starts at USD 1,249 per year with a free trial. IronOCR also offers licenses that cover multiple developers, while Dynamsoft licenses each client device or server separately.

While both products aim to offer the best performance for barcode reading and image-to-text conversion, IronOCR stands out because it performs well even on images in poor condition. It automatically applies its tuning methods to give you the best OCR results. IronOCR also uses Tesseract to deliver results with few or no errors.

Iron Software is also offering its customers and users the option to grab its entire suite of software in just two clicks. This means that for the price of two of the components in the Iron Software suite, you can currently get all five components and uninterrupted support.

Kannapat Udonpant

Software Engineer

Before becoming a software engineer, Kannapat earned a PhD in environmental resources from Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join the Iron Software engineering team, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the code used in IronPDF. Beyond peer learning, Kannapat enjoys the social aspect of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.
