
How to Create OCR Software in C#

Optical Character Recognition (OCR) is a technology that transforms various document formats, including scanned paper documents, PDFs, digital files, or images of printed text taken with a digital camera, into editable and searchable machine-encoded text data.

IronOCR is an OCR engine library that gives developers powerful OCR functionality. In this article, we explore how to perform OCR with IronOCR through a series of code examples.

What is IronOCR?

IronOCR is a powerful .NET library designed to facilitate optical character recognition (OCR) within C# and VB.NET applications. Leveraging advanced algorithms and machine learning techniques, IronOCR can accurately extract text and content from images, PDFs, and scanned documents, making it easier to process, search, and analyze such files programmatically.

With its straightforward API and extensive features, developers can integrate OCR capabilities into their applications to automate data extraction, document processing, data entry, and content management tasks. Whether you're working with invoices, reports, searchable PDFs, or any other text-rich business documents, IronOCR offers a reliable way to handle OCR requirements efficiently.

Getting Started with IronOCR

Before diving into the code examples, you need to install IronOCR via NuGet Package Manager. You can install IronOCR by running the following command in the Package Manager Console:

Install-Package IronOcr

Performing OCR with IronOCR

Basic Text Recognition

To perform basic text recognition using IronOCR, you can use the following code snippet:

using IronOcr;
using System;

class Program
{
    static void Main()
    {
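        // Create the OCR engine
        var ocrTesseract = new IronTesseract();

        // Load the image to be read; the using block disposes the input afterwards
        using (var ocrInput = new OcrInput("ocr.png"))
        {
            // Run OCR and print the recognized text
            var ocrResult = ocrTesseract.Read(ocrInput);
            string recognizedText = ocrResult.Text;
            Console.WriteLine(recognizedText);
        }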
    }
}

The equivalent VB.NET code:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocrTesseract = New IronTesseract()
		Using ocrInput As New OcrInput("ocr.png")
			Dim ocrResult = ocrTesseract.Read(ocrInput)
			Dim recognizedText As String = ocrResult.Text
			Console.WriteLine(recognizedText)
		End Using
	End Sub
End Class

This code uses IronOCR to perform optical character recognition (OCR) on an image file named "ocr.png". It creates an IronTesseract object, loads the image into an OcrInput object, and passes that input to Read, which runs the OCR.

The recognized text is then taken from the result's Text property, stored in recognizedText, and printed to the console.

Output
- LOGO SHOP
- LOREM IPSUM
- DOLOR SITAMET CONSECTETUR
- ADIPISCING ELIT
- 1 LOREM IPSUM $3.20
- 2 ORNARE MALESUADA $9.50
- 3 PORTA FERMENTUM $5.90
- 4 SODALES ARCU $6.00
- 5 ELEIFEND $9.00
- 6 SEMNISIMASSA $0.50
- 7 DUIS FAMES DIS $7.60
- 8 FACILISIRISUS $810
- TOTAL AMOUNT $49.80
- CASH $50.00
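
The sample output shows a typical OCR slip: item 8 reads $810, yet the other seven items sum to $41.70 and the printed total is $49.80, which implies $8.10. A simple consistency check on the recognized text can surface this kind of error. The following is a minimal sketch, not part of IronOCR: the ReceiptCheck class, its method names, and the regular expressions are assumptions made for illustration, and it assumes the text arrives one receipt line per line, without list markers.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not an IronOCR type): checks that the item prices
// found in recognized receipt text add up to the printed total.
public static class ReceiptCheck
{
    // Matches item lines such as "1 LOREM IPSUM $3.20": index, description, amount.
    private static readonly Regex ItemLine =
        new Regex(@"^\s*(\d+)\s+(.+?)\s+\$(\d+(?:\.\d+)?)\s*$");

    // Matches the total line, e.g. "TOTAL AMOUNT $49.80".
    private static readonly Regex TotalLine =
        new Regex(@"^\s*TOTAL AMOUNT\s+\$(\d+(?:\.\d+)?)\s*$");

    public static List<(string Name, decimal Price)> ParseItems(string text)
    {
        var items = new List<(string Name, decimal Price)>();
        foreach (var line in text.Split('\n'))
        {
            var m = ItemLine.Match(line);
            if (m.Success)
            {
                var price = decimal.Parse(m.Groups[3].Value, CultureInfo.InvariantCulture);
                items.Add((m.Groups[2].Value, price));
            }
        }
        return items;
    }

    public static decimal? ParseTotal(string text)
    {
        foreach (var line in text.Split('\n'))
        {
            var m = TotalLine.Match(line);
            if (m.Success)
                return decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
        }
        return null; // no total line found
    }

    // True only when a total line exists and equals the sum of the item prices.
    public static bool TotalsAgree(string text)
    {
        var total = ParseTotal(text);
        return total.HasValue && ParseItems(text).Sum(i => i.Price) == total.Value;
    }

    public static void Main()
    {
        string text = string.Join("\n", new[]
        {
            "1 LOREM IPSUM $3.20",
            "2 ORNARE MALESUADA $9.50",
            "3 PORTA FERMENTUM $5.90",
            "4 SODALES ARCU $6.00",
            "5 ELEIFEND $9.00",
            "6 SEMNISIMASSA $0.50",
            "7 DUIS FAMES DIS $7.60",
            "8 FACILISIRISUS $810",
            "TOTAL AMOUNT $49.80"
        });

        // Prints False: "$810" makes the item sum differ from the total.
        Console.WriteLine(ReceiptCheck.TotalsAgree(text));
    }
}
```

In a real application, the string passed to TotalsAgree would be the Text property of the OCR result, and a failed check could trigger a re-scan or a manual review.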

Advanced OCR Options

IronOCR provides various options that enable you to customize the OCR process according to your image files and requirements. For example, you can specify the OCR language, adjust the image preprocessing settings, or enable text cleaning. Here's an example that demonstrates some of these advanced options:

using IronOcr;
using System;

class Program
{
    static void Main()
    {
        var ocr = new IronTesseract();
        using var ocrInput = new OcrInput();
        ocrInput.LoadImage(@"images\image.png");

        // Set OCR language to English
        ocr.Language = OcrLanguage.English;

        // Remove image noise and enhance the resolution
        ocrInput.DeNoise();
        ocrInput.EnhanceResolution(225);

        var result = ocr.Read(ocrInput);
        if (!string.IsNullOrEmpty(result.Text))
        {
            Console.WriteLine($"Recognized Text: {result.Text}");
        }
    }
}

The equivalent VB.NET code:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocr = New IronTesseract()
		' Using disposes the input when done, matching the C# "using var"
		Using ocrInput As New OcrInput()
			ocrInput.LoadImage("images\image.png")

			' Set OCR language to English
			ocr.Language = OcrLanguage.English

			' Remove image noise and enhance the resolution
			ocrInput.DeNoise()
			ocrInput.EnhanceResolution(225)

			Dim result = ocr.Read(ocrInput)
			If Not String.IsNullOrEmpty(result.Text) Then
				Console.WriteLine($"Recognized Text: {result.Text}")
			End If
		End Using
	End Sub
End Class

The code uses IronOCR to perform OCR on an image file "image.png" located in the "images" folder. It sets the OCR language to English, cleans the image noise, and enhances its resolution. The recognized text from the image is extracted and then printed to the console.

How to Create OCR Software Demo in C#: Figure 1

Barcode Reading

IronOCR also supports barcode reading, so your software can extract barcode information from images. Here's a code example that demonstrates how to read a barcode using IronOCR:

using IronOcr;
using System;

class Program
{
    static void Main()
    {
        var ocrTesseract = new IronTesseract();
        ocrTesseract.Configuration.ReadBarCodes = true;

        using var ocrInput = new OcrInput();
        ocrInput.LoadImage(@"images\imageWithBarcode.png");

        var ocrResult = ocrTesseract.Read(ocrInput);
        foreach (var barcode in ocrResult.Barcodes)
        {
            Console.WriteLine(barcode.Value);
        }
    }
}

The equivalent VB.NET code:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocrTesseract = New IronTesseract()
		ocrTesseract.Configuration.ReadBarCodes = True

		' Using disposes the input when done, matching the C# "using var"
		Using ocrInput As New OcrInput()
			ocrInput.LoadImage("images\imageWithBarcode.png")

			Dim ocrResult = ocrTesseract.Read(ocrInput)
			For Each barcode In ocrResult.Barcodes
				Console.WriteLine(barcode.Value)
			Next barcode
		End Using
	End Sub
End Class

The code uses IronOCR to detect and read barcodes from an image file "imageWithBarcode.png" in the "images" folder. It configures IronOCR to enable barcode reading by setting ReadBarCodes to true. The detected barcode values are then printed to the console.
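
Values read from an image can be misread, so it can be worth validating them before use. The following is a minimal sketch of the standard EAN-13 check-digit test. It is not an IronOCR feature: the BarcodeCheck class and IsValidEan13 method are hypothetical names, and the sketch assumes the barcode value is available as a string and that the symbology is EAN-13, neither of which this article states.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not an IronOCR type): validates an EAN-13 check digit.
public static class BarcodeCheck
{
    public static bool IsValidEan13(string value)
    {
        // An EAN-13 code is exactly 13 decimal digits.
        if (value == null || value.Length != 13 || !value.All(c => c >= '0' && c <= '9'))
            return false;

        // Weight the first 12 digits 1, 3, 1, 3, ... from the left and sum them.
        int sum = 0;
        for (int i = 0; i < 12; i++)
        {
            int digit = value[i] - '0';
            sum += (i % 2 == 0) ? digit : digit * 3;
        }

        // The check digit brings the weighted sum up to a multiple of 10.
        int expected = (10 - sum % 10) % 10;
        return expected == value[12] - '0';
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEan13("4006381333931")); // prints True
        Console.WriteLine(IsValidEan13("4006381333932")); // prints False (wrong check digit)
    }
}
```

A check like this could be applied to each value inside the foreach loop, skipping or flagging codes that fail.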

How to Create OCR Software Demo in C#: Figure 2

PDF Text Extraction

IronOCR can also extract text from PDFs and scanned documents. Here's a code example that demonstrates how to extract text from a PDF file using IronOCR:

using IronOcr;
using System;

class Program
{
    static void Main()
    {
        var ocrTesseract = new IronTesseract();
        using var ocrInput = new OcrInput();

        // OCR the entire document
        ocrInput.LoadPdf("Email_Report.pdf");

        // Alternatively, OCR only selected pages of a password-protected PDF:
        // int[] pages = { 1, 2, 3, 4, 5 };
        // ocrInput.LoadPdfPages("example.pdf", pages, Password: "password");

        var ocrResult = ocrTesseract.Read(ocrInput);
        Console.WriteLine(ocrResult.Text);
    }
}

The equivalent VB.NET code:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocrTesseract = New IronTesseract()
		' Using disposes the input when done, matching the C# "using var"
		Using ocrInput As New OcrInput()
			' OCR the entire document
			ocrInput.LoadPdf("Email_Report.pdf")

			' Alternatively, OCR only selected pages of a password-protected PDF:
			' Dim pages() As Integer = { 1, 2, 3, 4, 5 }
			' ocrInput.LoadPdfPages("example.pdf", pages, Password:= "password")

			Dim ocrResult = ocrTesseract.Read(ocrInput)
			Console.WriteLine(ocrResult.Text)
		End Using
	End Sub
End Class

The code uses IronOCR to perform OCR processing on a PDF document named "Email_Report.pdf". It can OCR the entire document using LoadPdf, or specific pages from "example.pdf" using LoadPdfPages with a password. The recognized text from the OCR operation is printed to the console.
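
When the page selection comes from user input, it usually arrives as a string such as "1-3,5" rather than as an array. The following is a minimal sketch of turning such a string into the int[] that the example passes to LoadPdfPages. The PageRange class and its Parse method are hypothetical and not part of IronOCR. The sketch passes page numbers through exactly as written, so whether they should be zero-based or one-based is left to the library's documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an IronOCR type): parses "1-3,5" into { 1, 2, 3, 5 }.
public static class PageRange
{
    public static int[] Parse(string spec)
    {
        // SortedSet removes duplicates and keeps the pages in ascending order.
        var pages = new SortedSet<int>();

        foreach (var part in spec.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var bounds = part.Split('-');
            if (bounds.Length > 2)
                throw new FormatException($"Invalid page range: '{part}'");

            // A single number is treated as a range of one page.
            int first = int.Parse(bounds[0].Trim());
            int last = bounds.Length == 2 ? int.Parse(bounds[1].Trim()) : first;
            if (first > last)
                throw new FormatException($"Invalid page range: '{part}'");

            for (int p = first; p <= last; p++)
                pages.Add(p);
        }

        return pages.ToArray();
    }

    public static void Main()
    {
        int[] pages = Parse("1-3,5");
        Console.WriteLine(string.Join(", ", pages)); // prints 1, 2, 3, 5
    }
}
```

The resulting array can then be used in place of the hard-coded pages array in the example.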

How to Create OCR Software Demo in C#: Figure 3

Conclusion

IronOCR is a powerful .NET library that offers advanced OCR capabilities, making it easy for developers to perform OCR tasks in their applications. In this article, we explored basic text recognition, advanced OCR options, barcode reading, and PDF text extraction using IronOCR, with code examples for each.

If you're working on a .NET project and need to integrate OCR functionality, IronOCR is definitely worth considering when looking at different OCR engines. Its ease of use, speed, flexibility, and extensive documentation make it a popular choice among developers for OCR automation tasks.

So why not give IronOCR a try and see how it can simplify your own OCR project development process? It may be the best OCR engine for your projects.

IronOCR offers a free trial license. Paid licenses start from $799 USD and let you continue to get the most out of IronOCR in your projects.

To learn more about IronOCR, visit the IronOCR website.

Frequently Asked Questions

How can I perform OCR in C#?

You can perform OCR in C# using the IronOCR library. First, install it through the NuGet Package Manager with the command Install-Package IronOcr. Then use the IronTesseract object to read text from images or PDFs and convert it into editable text.

What steps are involved in extracting text from an image using C#?

To extract text from an image in C#, use IronOCR by creating an instance of the IronTesseract object. Load your image, such as 'ocr.png', into an OcrInput and call the Read() method to process the image and extract the text.

Can I customize the OCR process with IronOCR?

Yes, IronOCR lets you customize the OCR process by setting options such as the OCR language, enabling image preprocessing for noise reduction, and adjusting the resolution to improve accuracy.

Is it possible to read barcodes with IronOCR?

Yes, IronOCR supports barcode reading. You can configure it to detect and extract barcode information from images by enabling the barcode reading feature (ReadBarCodes) in your OCR configuration.

How do I extract text from a PDF using C#?

Using IronOCR, you can extract text from PDF files in C#. You can OCR the entire document or only specific pages by loading the PDF into an OcrInput and passing it to the Read() method of IronTesseract to extract the text.

What makes IronOCR a recommended option for developers?

IronOCR is recommended for developers because of its comprehensive OCR features, ease of use, fast processing, and flexibility. It integrates into .NET projects and enables efficient automation of OCR tasks.

Are there licensing options available for IronOCR?

IronOCR offers several licensing options, starting with a free trial. Developers can choose among different licenses to keep using the full capabilities of IronOCR in their applications.

Where can I find code examples for using IronOCR?

You can find code examples for using IronOCR in the article 'OCR Software Demo in C# (Developer Tutorial)'. The examples demonstrate basic text recognition, advanced OCR options, and barcode reading.

Kannapat Udonpant
Software Engineer
Before becoming a Software Engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing the degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Engineering ...