跳至页脚内容
与其他组件比较

IronOCR 和 Dynamsoft OCR 之间的比较

Optical Character Recognition, or OCR, is a data entry process that involves the recognition and digitization of text, both written and printed. It is a type of computer technology that employs image analysis to convert digital photographs of printed text into letters and numbers that can be used by other programs such as word processors. The text is converted to character codes so that it may be searched and changed on a computer.

While the past was a world in which all documents were physical, and the future may be a society in which all documents are digital, the present is in flux. Physical and digital documents coexist in this transitional state — therefore technologies like OCR are critical for back-and-forth conversion.

Document recovery, data entry, and accessibility are just a few of the applications for OCR. The majority of OCR applications come from scanned papers, although photographs are also occasionally employed. OCR is a valuable time-saver because retyping the material is often the only other option. The following are some examples of how OCR can be used:

  • Editable text files can be recovered from scanned documents, including faxes.
  • Using book scans to create searchable and editable e-books.
  • Using screenshot photos to search for and change text.
  • Text-to-speech technology is used to read books to visually challenged people.

While these are just a few of the applications for OCR, they demonstrate the technology's versatility in a wide range of industries. Almost all employees in all firms rely substantially on documents on a daily basis, hence business usage is a key consideration in OCR system development.

In this article, we will be comparing the two most powerful OCR readers:

  • IronOCR
  • Dynamsoft OCR

IronOCR and Dynamsoft OCR are two .NET OCR libraries that support the conversion of scanned images and the OCR processing of PDF documents. You can transform images into searchable text with just a few lines of code. You can also retrieve individual words, letters, and paragraphs.

IronOCR — the outstanding features

IronOCR offers the unique ability to detect, read, and interpret text from pictures and PDF documents that haven't been scanned precisely. IronOCR offers the simplest approach to extracting text from documents and photos, even if it isn't always the fastest, because it automatically sharpens and corrects low-quality scans, reducing skew, distortion, background noise, and perspective issues, while also improving resolution and contrast.

IronOCR allows developers to send it single or multiple page scanned images, and it will return all the text, barcodes, and QR information. A set of classes in the OCR library adds OCR capability to web-based, desktop, or console applications. Tesseract OCR C#, as well as net apps JPG, PNG, TIFF, PDF, GIF, and BMP, are just a few of the formats that can be used as input.

IronOCR's Optical Character Recognition (OCR) engine can read text prepared using many common fonts, italics, weights, and underlines. Cropping classes make it possible for the OCR to work swiftly and precisely. When working with multipage documents, IronOCR's multithreaded engine speeds up OCR.

IronOCR features

For Tesseract management, we utilize IronOCR because it is unique in the following ways:

  • Works straight out of the box in pure .NET
  • Does not require Tesseract to be installed on your machine
  • Runs the latest engines: Tesseract 5 (as well as Tesseract 4 & 3)
  • Is available for any .NET project: .NET Framework 4.5 +, .NET Standard 2 + and .NET Core 2, 3 & .NET 5
  • Has improved accuracy and speed over traditional Tesseract
  • Supports Xamarin, Mono, Azure, and Docker
  • It manages the complex Tesseract dictionary system using NuGet packages
  • Supports PDFS, MultiFrame Tiffs, and all major image formats without configuration
  • Can correct low-quality and skewed scans to get the best results from Tesseract.

Dynamsoft OCR — features

The Dynamsoft.NET OCR library is a .NET component that provides rapid and reliable optical character recognition. It is used to create .NET desktop applications in C# or VB.NET. You can simply create code to convert the useless text in PDF or photos to digital text for editing, searching, archiving, and more using the basic OCR APIs.

Images from scanners and other TWAIN-compliant devices can be acquired in the following ways:

  • Native, buffered memory and disk file image transfer mechanisms are all supported.
  • With the auto document feeder, batch scanning is possible (ADF).
  • TWAIN attributes can be used to modify common device functionalities.
  • IfAutoFeed, IfAutoScan, Resolution, BitDepth, Brightness, Contrast, Unit, Duplex, and other features can all be changed.
  • Supports the detection of blank pages.
  • Allows you to change and save scanner profiles.

Capture images from webcams that are UVC and WIA compliant:

  • Show a live video feed while capturing photos from a chosen webcam.
  • Customize the camera's settings: Brightness, Contrast, Hue, Saturation, Sharpness, Gamma, White Balance, Backlight Compensation, Gain, Color Enable, Zoom, Focus, Exposure, Iris, Pan, Tilt, Roll.

Robust Image Loading/Viewing

  • Images in BMP, JPEG, PNG, TIFF, and multi-page TIFF formats can be loaded.
  • Zooming in and out on photos is supported.
  • Images can be retrieved from a local drive, an FTP server, an HTTP server, or a database.
  • Image decoding for BMP, JPEG, PNG, and TIFF using one of the most comprehensive sets of .NET imaging components

Save and Upload/Download

  • Allows you to read and write photos over a file stream.
  • Supports saving captured photos as BMP, JPEG, PNG, TIFF, or multi-page TIFF to a local drive, a web server, or a database.
  • RLE, G3/G4, LZW, PackBits, and TIFF compression are all supported.
  • HTTPS uploads and downloads are supported.
  • One of the most extensive sets of.NET imaging components available supports BMP, JPEG, PNG, and TIFF image encoding.
  • Allows you to attach newly obtained photos to existing TIFF files.

Read Text from Scanned PDFs or Other Images in ASP.NET (Optical Character Recognition)

Customers want work to be completed quickly in today's fast-paced world. Customers with urgent projects contact us frequently. Our technology can simply recognize the content of an image and convert it to text if the project entails scanning papers that contain images. Optical Character Recognition (OCR) saves your firm time and money while also reducing data entry errors.

Using IronOCR

IronOCR makes use of the IronOcr.IronTesseract class to perform its OCR conversions.

We use the IronOcr.IronTesseract class to read text from an image and automatically return its result as a string in this basic example.

// PM> Install-Package IronOcr
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create a new instance of the IronTesseract class
        var ironOcr = new IronTesseract();

        // Read the text from the image
        var result = ironOcr.Read(@"img\Screenshot.png");

        // Output the text to the console
        Console.WriteLine(result.Text);
    }
}
// PM> Install-Package IronOcr
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create a new instance of the IronTesseract class
        var ironOcr = new IronTesseract();

        // Read the text from the image
        var result = ironOcr.Read(@"img\Screenshot.png");

        // Output the text to the console
        Console.WriteLine(result.Text);
    }
}
' PM> Install-Package IronOcr
Imports IronOcr

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Create a new instance of the IronTesseract class
		Dim ironOcr = New IronTesseract()

		' Read the text from the image
		Dim result = ironOcr.Read("img\Screenshot.png")

		' Output the text to the console
		Console.WriteLine(result.Text)
	End Sub
End Class
$vbLabelText   $csharpLabel

As a result, the following paragraph is 100 percent accurate:

IronOCR Simple Example

In this simple example we will test the accuracy of our C# OCR library to read text from a PNG
Image. This is a very basic test, but things will get more complicated as the tutorial continues.

The quick brown fox jumps over the lazy dog

Although it may appear simple on the surface, there is sophisticated behavior going on behind the surface: scanning the image for alignment, quality, and resolution, looking at its attributes, optimizing the OCR engine, and finally reading the text as a human might.

OCR is a difficult task for a machine to do, and reading speeds may be comparable to that of a human. To put it another way, OCR is not a quick procedure. In this case, though, it is absolutely correct.

C# OCR application results accuracy

In most real-world scenarios, developers will want their projects to run as quickly as possible. In this scenario, we propose that you utilize the IronOCR add-ons namespace's OcrInput and IronTesseract classes instead.

You can set the exact features of an OCR job with OcrInput, such as:

  • JPEG, TIFF, GIF, BMP, and PNG are just a few of the image formats that can be used
  • Importing PDF documents in their entirety or in portions
  • Enhancing the contrast, resolution, and size of the image
  • Rotation, scan noise, digital noise, skew, and negative picture correction

IronTesseract

Choose from hundreds of pre-packaged languages and dialects

  • Use Tesseract 5, 4, or 3 OCR engines right away
  • If we're looking at a screenshot, a snippet, or the entire document, specify the document type
  • Recognize barcodes
  • Searchable PDFs, Hocr HTML, a DOM, and Strings are all options for OCR results
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        // Use the OcrInput class to read from an image file
        using (var input = new OcrInput(@"img\Potter.tiff"))
        {
            // Perform the OCR operation
            var result = ocr.Read(input);

            // Output the recognized text to the console
            Console.WriteLine(result.Text);
        }
    }
}
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        // Use the OcrInput class to read from an image file
        using (var input = new OcrInput(@"img\Potter.tiff"))
        {
            // Perform the OCR operation
            var result = ocr.Read(input);

            // Output the recognized text to the console
            Console.WriteLine(result.Text);
        }
    }
}
Imports IronOcr

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Create an instance of IronTesseract
		Dim ocr = New IronTesseract()

		' Use the OcrInput class to read from an image file
		Using input = New OcrInput("img\Potter.tiff")
			' Perform the OCR operation
			Dim result = ocr.Read(input)

			' Output the recognized text to the console
			Console.WriteLine(result.Text)
		End Using
	End Sub
End Class
$vbLabelText   $csharpLabel

We can use this even on a medium-quality scan with 100% accuracy.

C# OCR Scan From Tiff Example

As you can see, reading text (and, if desired, barcodes) from a scanned image such as a TIFF was rather easy. The accuracy of this OCR job is 100 percent.

Next, we will try a much lower quality scan of the same page, at a low DPI and with lots of distortion and digital noise, as well as damage to the original paper.

C# OCR Low Resolution Scan with Digital Noise

This is where IronOCR truly shines in comparison to other OCR libraries such as Tesseract, and we'll find that other OCR projects avoid discussing the use of OCR on real-world scanned images rather than unrealistically 'perfect' test cases created digitally in order to achieve 100% OCR accuracy.

using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        // Use the OcrInput class to read from a low-quality image file
        using (var input = new OcrInput(@"img\Potter.LowQuality.tiff"))
        {
            // Deskew the image to improve accuracy
            input.Deskew();

            // Perform the OCR operation
            var result = ocr.Read(input);

            // Output the recognized text to the console
            Console.WriteLine(result.Text);
        }
    }
}
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        // Use the OcrInput class to read from a low-quality image file
        using (var input = new OcrInput(@"img\Potter.LowQuality.tiff"))
        {
            // Deskew the image to improve accuracy
            input.Deskew();

            // Perform the OCR operation
            var result = ocr.Read(input);

            // Output the recognized text to the console
            Console.WriteLine(result.Text);
        }
    }
}
Imports IronOcr

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Create an instance of IronTesseract
		Dim ocr = New IronTesseract()

		' Use the OcrInput class to read from a low-quality image file
		Using input = New OcrInput("img\Potter.LowQuality.tiff")
			' Deskew the image to improve accuracy
			input.Deskew()

			' Perform the OCR operation
			Dim result = ocr.Read(input)

			' Output the recognized text to the console
			Console.WriteLine(result.Text)
		End Using
	End Sub
End Class
$vbLabelText   $csharpLabel

Without adding Input.Deskew() to straighten the image we get a 52.5% accuracy. This is not good enough.

Adding Input.Deskew() brings us to 99.8% accuracy which is almost as accurate as the OCR of a high-quality scan.

Using Dynamsoft OCR

We will present some code snippets for using Dynamic Web TWAIN to do TWAIN scanning and client-side OCR in JavaScript.

Scan Images

You may change scanning settings and acquire photos from TWAIN scanners using Dynamic Web TWAIN's simple APIs.

function acquireImage() {
    // Select an available TWAIN scanner
    DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex);

    // Set scanning settings like pixel type, resolution, ADF, etc.
    DWObject.IfShowUI = false; // Do not show the user interface of the scanner
    DWObject.PixelType = 1; // Scan in grayscale
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; // Scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    // Acquire images from scanners
    DWObject.AcquireImage();
}
function acquireImage() {
    // Select an available TWAIN scanner
    DWObject.SelectSourceByIndex(document.getElementById("source").selectedIndex);

    // Set scanning settings like pixel type, resolution, ADF, etc.
    DWObject.IfShowUI = false; // Do not show the user interface of the scanner
    DWObject.PixelType = 1; // Scan in grayscale
    DWObject.Resolution = 300;
    DWObject.IfFeederEnabled = true; // Scan from auto feeder
    DWObject.IfDuplexEnabled = false;
    DWObject.IfDisableSourceAfterAcquire = true;

    // Acquire images from scanners
    DWObject.AcquireImage();
}
JAVASCRIPT

Download the OCR Professional Module

To use the OCR Professional module for client-side OCR, you will need to include ocrpro.js in the head and also download the OCR Pro DLL.

<script type="text/javascript" src="Resources/addon/dynamsoft.webtwain.addon.ocrpro.js"></script>
<script type="text/javascript" src="Resources/addon/dynamsoft.webtwain.addon.ocrpro.js"></script>
HTML

Make edits to the .js file:

// Define base path
var CurrentPathName = unescape(location.pathname);
var CurrentPath = CurrentPathName.substring(0, CurrentPathName.lastIndexOf("/") + 1);

// Download the OCR Pro add-on
DWObject.Addon.OCRPro.Download(CurrentPath + "Resources/addon/OCRPro.zip", OnSuccess, OnFailure);
// Define base path
var CurrentPathName = unescape(location.pathname);
var CurrentPath = CurrentPathName.substring(0, CurrentPathName.lastIndexOf("/") + 1);

// Download the OCR Pro add-on
DWObject.Addon.OCRPro.Download(CurrentPath + "Resources/addon/OCRPro.zip", OnSuccess, OnFailure);
JAVASCRIPT

Recognize text using OCR

Using the JS OCR recognition API to extract text from scanned images is as simple as inserting the code below.

// Recognize text from the image at index 0
DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image
// Recognize text from the image at index 0
DWObject.Addon.OCRPro.Recognize(0, GetOCRProInfo, GetErrorInfo); // 0 is the index of the image
JAVASCRIPT

Reading Cropped Regions of Images

Both sets of software offer solutions for cropping images for OCR.

Reading cropped regions with IronOCR

Iron's branch of Tesseract OCR is adept at reading specific regions of images, as shown in the following code sample.

We use System.Drawing.Rectangle to describe the exact region of an image to be read in pixels.

When dealing with a standardized form that is filled out, and only a portion of the content changes from case to case, this can be really handy.

Scanning a Section of a Page: We use System.Drawing.Rectangle to designate a region to read from a document, improving speed and accuracy.

C# OCR Scan From Tiff Example
C# OCR Scan From Tiff Example
using IronOcr;
using System.Drawing;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Define content area of interest
            var contentArea = new Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };

            // Add the specific region to the input
            input.AddImage("img/ComSci.png", contentArea);

            // Perform OCR operation
            var result = ocr.Read(input);

            // Output recognized text to console
            Console.WriteLine(result.Text);
        }
    }
}
using IronOcr;
using System.Drawing;

class Program
{
    static void Main(string[] args)
    {
        // Create an instance of IronTesseract
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Define content area of interest
            var contentArea = new Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };

            // Add the specific region to the input
            input.AddImage("img/ComSci.png", contentArea);

            // Perform OCR operation
            var result = ocr.Read(input);

            // Output recognized text to console
            Console.WriteLine(result.Text);
        }
    }
}
Imports IronOcr
Imports System.Drawing

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Create an instance of IronTesseract
		Dim ocr = New IronTesseract()

		Using input = New OcrInput()
			' Define content area of interest
			Dim contentArea = New Rectangle() With {
				.X = 215,
				.Y = 1250,
				.Height = 280,
				.Width = 1335
			}

			' Add the specific region to the input
			input.AddImage("img/ComSci.png", contentArea)

			' Perform OCR operation
			Dim result = ocr.Read(input)

			' Output recognized text to console
			Console.WriteLine(result.Text)
		End Using
	End Sub
End Class
$vbLabelText   $csharpLabel

This results in a 41 percent boost in speed, while also allowing us to be more specific. This is extremely valuable for .NET OCR applications involving documents that are comparable and consistent, including invoices, receipts, checks, forms, expense claims, and so on.

When reading PDFs, ContentAreas (OCR cropping) is also supported.

Reading cropped regions with Dynamsoft OCR

To begin, launch Visual Studio and build a new C# Windows Forms Application, or open an existing one. We will need to include DynamicDotNetTWAIN.dll, DynamicOCR.dll, and the appropriate language package.

  1. Navigate to Tools -> Choose Toolbox Items, then to the .NET Framework Components tab, click the Browse button, and locate DynamicDotNetTWAIN.dll.

  2. Right-click the project file in Solution Explorer and select Add-> Existing Item... then add the necessary items from the OCR resources directory.

This is the code for clicking the LoadImage button:

private void button1_Click(object sender, EventArgs e) 
{
    OpenFileDialog filedlg = new OpenFileDialog(); 
    if (filedlg.ShowDialog() == DialogResult.OK) 
    { 
        dynamicDotNetTwain1.LoadImage(filedlg.FileName);
        // Choose an image from your local disk and load it into Dynamic .NET TWAIN
    } 
}

private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom) 
{
    dynamicDotNetTwain1.OCRTessDataPath = "../../"; 
    dynamicDotNetTwain1.OCRLanguage = "eng";
    OcrResultFormat ocrResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text;

    byte [] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
    // OCR the selected area of the image

    if (sbytes != null) 
    { 
        SaveFileDialog filedlg = new SaveFileDialog(); 
        filedlg.Filter = "Text File(*.txt)| *.txt"; 
        if (filedlg.ShowDialog() == DialogResult.OK) 
        {
            FileStream fs = File.OpenWrite(filedlg.FileName); 
            fs.Write(sbytes, 0, sbytes.Length);
            // Save the OCR result as a text file
            fs.Close(); 
        }
        MessageBox.Show("OCR successful");
    } 
    else 
    {
        MessageBox.Show(dynamicDotNetTwain1.ErrorString); 
    }
}
private void button1_Click(object sender, EventArgs e) 
{
    OpenFileDialog filedlg = new OpenFileDialog(); 
    if (filedlg.ShowDialog() == DialogResult.OK) 
    { 
        dynamicDotNetTwain1.LoadImage(filedlg.FileName);
        // Choose an image from your local disk and load it into Dynamic .NET TWAIN
    } 
}

private void dynamicDotNetTwain1_OnImageAreaSelected(short sImageIndex, int left, int top, int right, int bottom) 
{
    dynamicDotNetTwain1.OCRTessDataPath = "../../"; 
    dynamicDotNetTwain1.OCRLanguage = "eng";
    OcrResultFormat ocrResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text;

    byte [] sbytes = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom);
    // OCR the selected area of the image

    if (sbytes != null) 
    { 
        SaveFileDialog filedlg = new SaveFileDialog(); 
        filedlg.Filter = "Text File(*.txt)| *.txt"; 
        if (filedlg.ShowDialog() == DialogResult.OK) 
        {
            FileStream fs = File.OpenWrite(filedlg.FileName); 
            fs.Write(sbytes, 0, sbytes.Length);
            // Save the OCR result as a text file
            fs.Close(); 
        }
        MessageBox.Show("OCR successful");
    } 
    else 
    {
        MessageBox.Show(dynamicDotNetTwain1.ErrorString); 
    }
}
Private Sub button1_Click(ByVal sender As Object, ByVal e As EventArgs)
	Dim filedlg As New OpenFileDialog()
	If filedlg.ShowDialog() = DialogResult.OK Then
		dynamicDotNetTwain1.LoadImage(filedlg.FileName)
		' Choose an image from your local disk and load it into Dynamic .NET TWAIN
	End If
End Sub

Private Sub dynamicDotNetTwain1_OnImageAreaSelected(ByVal sImageIndex As Short, ByVal left As Integer, ByVal top As Integer, ByVal right As Integer, ByVal bottom As Integer)
	dynamicDotNetTwain1.OCRTessDataPath = "../../"
	dynamicDotNetTwain1.OCRLanguage = "eng"
	Dim ocrResultFormat As OcrResultFormat = Dynamsoft.DotNet.TWAIN.OCR.ResultFormat.Text

	Dim sbytes() As Byte = dynamicDotNetTwain1.OCR(dynamicDotNetTwain1.CurrentImageIndexInBuffer, left, top, right, bottom)
	' OCR the selected area of the image

	If sbytes IsNot Nothing Then
		Dim filedlg As New SaveFileDialog()
		filedlg.Filter = "Text File(*.txt)| *.txt"
		If filedlg.ShowDialog() = DialogResult.OK Then
			Dim fs As FileStream = File.OpenWrite(filedlg.FileName)
			fs.Write(sbytes, 0, sbytes.Length)
			' Save the OCR result as a text file
			fs.Close()
		End If
		MessageBox.Show("OCR successful")
	Else
		MessageBox.Show(dynamicDotNetTwain1.ErrorString)
	End If
End Sub
$vbLabelText   $csharpLabel

This is how the application looks:

Demo App of Zone OCR using Dynamic .NET TWAIN OCR SDK

Image Performance Tuning

The quality of the input image is the most crucial determinant in the speed of an OCR task. The lower the background noise and the higher the dpi, with a great goal value of around 200 dpi, the faster and more accurate the OCR output.

Image Processing Techniques for Dynamsoft OCR

We need to use OCR in a variety of situations, such as scanning a credit card number with our phone or extracting text from paper documents. OCR capabilities are included in Dynamsoft Label Recognition (DLR) and Dynamic Web TWAIN (DWT).

Although they can do an excellent job in general, we can improve the results by using various image processing techniques.

Lighten/remove shadows

Poor illumination may have an impact on the OCR result. To improve the outcome, we can whiten photos or eliminate shadows from images.

Invert

Because the OCR engine is often trained on text in dark colors, text in light colors can be harder to discover and recognize.

Light text

It will be easier to recognize if we invert its color

Light text inverted

To perform the inversion, we can use the GrayscaleTransformationModes parameter in DLR.

Here are the JSON settings:

"GrayscaleTransformationModes": [
    {
        "Mode": "DLR_GTM_INVERTED"
    }
]

DLR .net’s reading result:

Light text result

Rescale

If the letter height is too low, the OCR engine may not produce a good result. In general, the image should have a DPI of at least 300.

There is a ScaleUpModes parameter in DLR 1.1 that allows you to scale up letters. We may, of course, scale the image ourselves.

Reading the image directly yields the incorrect result:

1x image

After scaling up the image x2, the result is correct:

2x image

Deskew

It is fine if the text is a little distorted. However, if it is overly skewed, the outcome will be adversely altered. To improve the outcome, we need to crop the image.

To accomplish this, we can use the Hough Line Transform in OpenCV and Python.

Here is the code to deskew the image above:

import cv2
import numpy as np
import math
from PIL import Image

def deskew():
    src = cv2.imread("neg.jpg", cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((5,5), np.uint8)
    erode_img = cv2.erode(gray, kernel)
    eroDil = cv2.dilate(erode_img, kernel)
    show_and_wait_key("eroDil", eroDil)

    canny = cv2.Canny(eroDil, 50, 150)
    show_and_wait_key("canny", canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90, minLineLength=100, maxLineGap=10)
    drawing = np.zeros(src.shape[:], dtype=np.uint8)

    maxY = 0
    degree_of_bottomline = 0
    index = 0
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        k = float(y1-y2)/(x1-x2)
        degree = np.degrees(math.atan(k))
        if index == 0:
            maxY = y1
            degree_of_bottomline = degree
        else:        
            if y1 > maxY:
                maxY = y1
                degree_of_bottomline = degree
        index += 1
    show_and_wait_key("houghP", drawing)

    img = Image.fromarray(src)
    rotate_img = img.rotate(degree_of_bottomline)
    rotate_img_cv = np.array(rotate_img)
    cv2.imshow("rotateImg", rotate_img_cv)
    cv2.imwrite("deskewed.jpg", rotate_img_cv)
    cv2.waitKey()

def show_and_wait_key(win_name, img):
    cv2.imshow(win_name, img)
    cv2.waitKey()

if __name__ == "__main__":
    deskew()
import cv2
import numpy as np
import math
from PIL import Image

def deskew():
    src = cv2.imread("neg.jpg", cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((5,5), np.uint8)
    erode_img = cv2.erode(gray, kernel)
    eroDil = cv2.dilate(erode_img, kernel)
    show_and_wait_key("eroDil", eroDil)

    canny = cv2.Canny(eroDil, 50, 150)
    show_and_wait_key("canny", canny)

    lines = cv2.HoughLinesP(canny, 0.8, np.pi / 180, 90, minLineLength=100, maxLineGap=10)
    drawing = np.zeros(src.shape[:], dtype=np.uint8)

    maxY = 0
    degree_of_bottomline = 0
    index = 0
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(drawing, (x1, y1), (x2, y2), (0, 255, 0), 1, lineType=cv2.LINE_AA)
        k = float(y1-y2)/(x1-x2)
        degree = np.degrees(math.atan(k))
        if index == 0:
            maxY = y1
            degree_of_bottomline = degree
        else:        
            if y1 > maxY:
                maxY = y1
                degree_of_bottomline = degree
        index += 1
    show_and_wait_key("houghP", drawing)

    img = Image.fromarray(src)
    rotate_img = img.rotate(degree_of_bottomline)
    rotate_img_cv = np.array(rotate_img)
    cv2.imshow("rotateImg", rotate_img_cv)
    cv2.imwrite("deskewed.jpg", rotate_img_cv)
    cv2.waitKey()

def show_and_wait_key(win_name, img):
    cv2.imshow(win_name, img)
    cv2.waitKey()

if __name__ == "__main__":
    deskew()
PYTHON

Lines detected:

Lines detected

Deskewed:

Deskewed image

Image Processing Techniques for IronOCR

The quality of the input image is not important here because IronOCR excels at repairing defective documents (though this is time-consuming and will cause your OCR jobs to use more CPU cycles).

Choosing input image formats with less digital noise, such as TIFF or PNG, can also result in speedier outcomes than lossy image formats, such as JPEG.

The image filters listed below can significantly enhance performance:

  • OcrInput.Rotate(double degrees) — Rotates images clockwise by a specified number of degrees. Negative integers are used for anti-clockwise rotation.
  • OcrInput.Binarize() — This image filter makes every pixel either black or white, with no in-between. It may improve OCR performance in circumstances where the text-to-background contrast is very low.
  • OcrInput.ToGrayScale() — This image filter converts every pixel to a grayscale shade. It is unlikely to improve OCR accuracy, but it may increase speed.
  • OcrInput.Contrast() — Automatically increases contrast. In low-contrast scans, this filter frequently improves OCR speed and accuracy.
  • OcrInput.DeNoise() — This filter should be used only when noise is expected.
  • OcrInput.Invert() — Reverses all colors. For example, white becomes black: black becomes white.
  • OcrInput.Dilate() — Advanced morphology. Dilation is the process of adding pixels to the edges of objects in an image. (Inverse of Erode)
  • OcrInput.Erode() — An advanced morphology function. Erosion is the process of removing pixels from the edges of objects. (Inverse of Dilate)
  • OcrInput.Deskew() — Rotates an image so that it is orthogonal and the right way up. Because Tesseract tolerance for skewed scans can be as low as 5 degrees, this is quite useful for OCR.
  • DeepCleanBackgroundNoise() — Removes a lot of background noise. Only use this filter if you know there is a lot of background noise in the document because it can reduce OCR accuracy on clear documents and is quite CPU intensive.
  • OcrInput.EnhanceResolution — Improves the resolution of low-resolution photos. Because of OcrInput, this filter is rarely used. OcrInput will detect and resolve low resolution automatically.

We may want to use IronTesseract to speed up OCR on higher-quality scans. If we're looking for speed, we might start here and subsequently turn features back on until the proper balance is struck.

using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        var ocr = new IronTesseract();

// Configuration for speed tuning
        ocr.Configuration.BlackListCharacters = "~`$#^*_}{][|\\";
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
        ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
        ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
        ocr.Language = OcrLanguage.EnglishFast;

        using (var input = new OcrInput(@"img\Potter.tiff"))
        {
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        var ocr = new IronTesseract();

// Configuration for speed tuning
        ocr.Configuration.BlackListCharacters = "~`$#^*_}{][|\\";
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
        ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
        ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly;
        ocr.Language = OcrLanguage.EnglishFast;

        using (var input = new OcrInput(@"img\Potter.tiff"))
        {
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
Imports IronOcr

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		Dim ocr = New IronTesseract()

' Configuration for speed tuning
		ocr.Configuration.BlackListCharacters = "~`$#^*_}{][|\"
		ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto
		ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
		ocr.Configuration.EngineMode = TesseractEngineMode.LstmOnly
		ocr.Language = OcrLanguage.EnglishFast

		Using input = New OcrInput("img\Potter.tiff")
			Dim result = ocr.Read(input)
			Console.WriteLine(result.Text)
		End Using
	End Sub
End Class
$vbLabelText   $csharpLabel

This result is 99.8% accurate compared to the baseline of 100% — but 35% faster.

Licensing and Pricing

Dynamsoft Licensing and Pricing

Per year license. All rates include one year of maintenance, which includes free software upgrades and premium support.

Dynamsoft offers two types of licenses:

Per client device license

The "One Client Device License" provides access to a same-origin Application (same protocol, same host, and same port) to use the software's features from a single client device. An inactive client device is one that has not accessed any software capability for 90 days in a row. An inactive client device's license seat will be instantly freed and made available for usage by any other active client device. When you reach the maximum number of license seats allowed, Dynamsoft will give you an extra 10% of your client device allowance for emergency use. Once the additional client device allowance has been depleted, no new client devices can access and use the software until there are available license seats again. Please keep in mind that exceeding your client device allowance has no effect on any client devices that have already been licensed.

Per-server license

To deploy the application to a single server, a "One Server License" is required. Servers refer to both physical and virtual servers and include, but are not limited to, production servers, failover servers, development servers that are also used for testing, quality assurance servers, testing servers, and staging servers, all of which require a license. Additional licenses are not required for continuous integration servers (build servers) or localhost development servers. The per-server license is only valid for on-premises server installations, and not for cloud deployments.

Pricing for Dynamsoft OCR starts at USD 1,249/year.

IronOCR Licensing and Pricing

As developers, we all want to accomplish our projects with the least amount of money and resources possible — budgeting is critical. Examine the chart to determine which license is best suited to your requirements and budget.

IronOCR provides licenses with a customizable number of developers, projects, and locations, allowing you to fulfill the needs of your project while only paying for the coverage you require.

IronOCR licensing keys enable you to publish your product without a watermark.

Licenses start from $799 and include one year of support and upgrades.

You can also use a trial license key to try IronOCR for free.

Conclusion

Tesseract OCR on Mac, Windows, Linux, Azure OCR, and Docker are all available with IronOCR for C#. .NET Framework 4.0 or above is required, .NET Standard 2.0+, .NET Core 2.0+, .NET 5, Mono for macOS and Linux, and Xamarin for macOS are all examples of cross-platform development. IronOCR also uses the latest Tesseract 5 engine to read text, barcodes, and QR codes from all major image and PDF formats. In minutes, this library adds OCR functionality to your desktop, console, or web apps! The OCR can also read PDFs and multi-page TIFFs, and it can be saved as a searchable PDF document or XHTML in any OCR Scan. Plain text, barcode data, and an OCR result class encompassing paragraphs, lines, words, and characters are among its data output choices. It is available in 125 languages, including Arabic, Chinese, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, and Spanish, but keep in mind that bespoke language packs can also be generated.

The Dynamic .NET TWAIN OCR add-on is a quick and reliable .NET component for Optical Character Recognition that you can use in WinForms and WPF applications written in C# or VB .NET. You can scan documents or capture photos from webcams using Dynamic .NET TWAIN's image capture module, and then conduct OCR on the images to convert the text in the images to text, searchable PDF files, or strings. Multiple Asian languages, as well as Arabic, are offered in addition to English.

IronOCR offers better licensing than Dynamsoft OCR; IronOcr starts at $799 with one year free, while Dynamsoft starts at $1249 with a free trial. IronOCR also offers licenses for multiple users, while with Dynamsoft, you only get one license per user.

While both sets of software aim at offering the best performance in terms of OCR readings of barcodes, image to text, and image to text, IronOCR stands out in that it shines its light even on images that are in pretty bad shape. It automatically puts in place its sophisticated tuning methods to give you the best OCR results. IronOCR also makes use of Tesseract to give you optimal results with little or no errors.

Iron Software is also offering its customers and users the option to grab its entire suite of software in just two clicks. This means that for the price of two of the components in the Iron Software suite, you can currently get all five components and uninterrupted support.

请注意Dynamsoft OCR is a registered trademark of its respective owner. This site is not affiliated with, endorsed by, or sponsored by Dynamsoft OCR. All product names, logos, and brands are property of their respective owners. Comparisons are for informational purposes only and reflect publicly available information at the time of writing.

常见问题解答

我如何使用C#将文本图像转换为数字格式?

您可以使用C#中的IronOCR将文本图像转换为数字格式。IronOCR提供了自动锐化和纠正低质量扫描的方法,非常适合将各种图像格式转换为文本。

处理低质量扫描IronOCR有哪些优点?

IronOCR通过降低倾斜、失真和背景噪声,改善分辨率和对比度,自动增强低质量扫描,从而确保更高的文本识别准确性。

哪个OCR库更适合跨平台应用程序?

IronOCR适合跨平台应用程序,因为它支持Xamarin和Azure等环境,为在不同平台工作开发人员提供了灵活性。

IronOCR支持哪些图像格式?

IronOCR支持多种图像格式,使其在不同OCR应用中具有多功能性。它可以处理图像和PDF文档,为处理各种输入源提供灵活性。

OCR技术能否在企业的文档管理中提供帮助?

是的,OCR技术可以通过数字化物理文档,使其可以搜索和编辑,显著促进文档管理。这减少了手动数据输入,减少了错误,并提高了可访问性。

Dynamsoft OCR如何处理设备的图像获取?

Dynamsoft OCR支持从符合TWAIN标准的设备和网络摄像头获取图像,允许批量扫描和修改扫描仪属性以实现高效图像处理。

与其他库相比,IronOCR的定价选项是什么?

IronOCR提供可自定义的许可证,能够根据开发人员数量、项目和位置量身定制,与某些其他库相比,更具灵活性和成本效益的选择。

使用OCR技术时遇到的一些常见问题是什么?

OCR技术的常见问题包括处理低分辨率图像、失真和不同的文本字体。然而,像IronOCR这样的库拥有内置功能来解决这些问题,提高OCR的准确性。

Kannaopat Udonpant
软件工程师
在成为软件工程师之前,Kannapat 在日本北海道大学完成了环境资源博士学位。在攻读学位期间,Kannapat 还成为了车辆机器人实验室的成员,隶属于生物生产工程系。2022 年,他利用自己的 C# 技能加入 Iron Software 的工程团队,专注于 IronPDF。Kannapat 珍视他的工作,因为他可以直接从编写大多数 IronPDF 代码的开发者那里学习。除了同行学习外,Kannapat 还喜欢在 Iron Software 工作的社交方面。不撰写代码或文档时,Kannapat 通常可以在他的 PS5 上玩游戏或重温《最后生还者》。