
How to Get Text from a Screenshot in C#

You may have wondered what "OCR on a screenshot" means, or how to turn a screenshot of text into editable digital text that can be saved as a .txt or .doc file. This article walks through one way to do it.

OCR (Optical Character Recognition) is the process of recognizing text in an image. Many OCR tools exist; this article uses IronOCR to extract text from screenshots in C#.

1. IronOCR

IronOCR is a software library for the C# and VB.NET programming languages, designed to enable developers to add OCR (Optical Character Recognition) capabilities to their applications. The library can be used to recognize text in images and convert it into machine-readable text. The library is built on the Tesseract OCR engine, which is considered one of the most accurate OCR engines available.

IronOCR reads text from images in common formats such as PNG, JPG, and TIFF, and from both digital and scanned PDF files. It can recognize multiple languages, lets you choose the OCR language, and can handle images that have been rotated or skewed. Its API is small and can be called from C# or VB.NET code, so OCR can be added to an application with a few lines.

IronOCR is a good option for developers who want to add OCR functionality to their applications: it is easy to integrate, fast, accurate, and actively maintained. IronOCR itself is a commercially licensed library; it is the underlying Tesseract engine that is open source.

2. IronOCR Features

IronOCR provides a wide range of features to help developers integrate OCR functionality into their applications. Some of the key features of IronOCR include:

  1. Multi-language support: IronOCR can recognize text in more than 125 languages, including English, Spanish, German, French, Italian, and Chinese.
  2. Automatic detection of text orientation: IronOCR can automatically detect the orientation of text in an image, even if the image has been rotated or skewed.
  3. Support for a wide range of input formats: IronOCR can read text from image files such as PNG, JPG, and TIFF, as well as from PDF documents.
  4. Customizable recognition settings: Developers can customize the recognition settings to improve recognition accuracy for specific types of images or use cases.
  5. Multi-page documents: IronOCR can recognize text from scanned documents and PDFs with multiple pages.
  6. Fast recognition and high accuracy: IronOCR uses the Tesseract OCR engine, which is one of the most accurate and widely used OCR engines available.
  7. Easy-to-use API: IronOCR provides a simple, easy-to-use API that can be called from C# or VB.NET code, which makes it easy to integrate OCR functionality into any application.

Overall, IronOCR is a powerful tool that provides a wide range of features to help developers add OCR functionality to their applications.
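
As a rough illustration of the language and recognition-settings features listed above, the sketch below sets a language and adjusts two engine options before reading an image. The property names (Language, Configuration.PageSegmentationMode, Configuration.BlackListCharacters) are believed to match recent IronOCR releases but should be checked against the installed version; the blacklist string and the file name ocr.png are illustrative assumptions.

```csharp
using IronOcr;
using System;

class Program
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Pick the recognition language. Languages other than English are
        // assumed to need their own language pack installed first.
        ocr.Language = OcrLanguage.English;

        // Let the engine decide how to split the page into text blocks.
        ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;

        // Characters that should never appear in the output (illustrative choice).
        ocr.Configuration.BlackListCharacters = "~`$#^*_}{][|\\";

        // Read the image with the settings above and print the text.
        var result = ocr.Read("ocr.png");
        Console.WriteLine(result.Text);
    }
}
```

Settings like these are usually worth changing only after a default read gives poor results on a particular kind of image.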

3. Creating a New Project in Visual Studio

Open Visual Studio, go to the File menu, select New > Project, and choose the Console Application template.

Enter the project name and choose a location in the appropriate text boxes. Select the required .NET version, as in the screenshot below, and then click the Create button:

Figure 1: Creating a New Project in Visual Studio

Visual Studio will now generate the structure for the console application. Once finished, it opens the Program.cs file, in which you can write and run source code.

Figure 2: The Program.cs file, generated from Visual Studio's New Project Wizard

Now we can add the IronOCR library and test the program.

4. Install IronOCR

IronOCR can be added to a C# .NET project in several ways. Here, we'll cover one of them: installing it with the NuGet Package Manager.

In Visual Studio, go to Tools > NuGet Package Manager > Package Manager Console.

Figure 3: The NuGet Package Manager UI

A new console will appear at the bottom of the Visual Studio window. Type the following command in the console and press Enter.

Install-Package IronOcr

IronOCR will be installed in just a few seconds.
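
One quick way to confirm that the package reference resolved is a tiny program that touches a type from the library. This is only a sketch: it assumes the package exposes the IronTesseract class used later in this article, and it prints whatever version the referenced assembly reports.

```csharp
using IronOcr;
using System;

class Program
{
    static void Main()
    {
        // If the IronOcr package was not restored, this file will not compile.
        // Read the version of the assembly that defines IronTesseract.
        var version = typeof(IronTesseract).Assembly.GetName().Version;

        Console.WriteLine($"IronOCR assembly version: {version}");
    }
}
```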

5. Using IronOCR to Perform OCR on a Screenshot

IronOCR can recognize text in screenshots and convert it into digital, editable text. To perform OCR on a screenshot, capture it, save it as an image file (ocr.png in this example), and run the code below to print the recognized text to the console.

using IronOcr;
using System;

class Program
{
    static void Main()
    {
        // Create an instance of IronTesseract, the core OCR engine
        var ocr = new IronTesseract();

        // Perform OCR on the specified image file
        var result = ocr.Read("ocr.png");

        // Output the recognized text to the console
        Console.WriteLine(result.Text);
    }
}

The same example in VB.NET:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		' Create an instance of IronTesseract, the core OCR engine
		Dim ocr = New IronTesseract()

		' Perform OCR on the specified image file
		Dim result = ocr.Read("ocr.png")

		' Output the recognized text to the console
		Console.WriteLine(result.Text)
	End Sub
End Class

Input Image file

Figure 4: Sample screenshot used for input

Text Output

- IRONOCR for NET
- The C# OCR Library
- OCR for C# to scan and read images & PDFs
- NET OCR library with 125+ global language packs
- Output as text, structured data, or searchable PDFs
- Supports NET 6, 5, Core, Standard, Framework
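
The example in this section reads a screenshot that was already saved as ocr.png. If the screenshot should be captured from code as well, one possible approach on Windows is to copy the screen into a bitmap with System.Drawing and pass the saved file to IronOCR. This is a sketch under several assumptions: it is Windows-only, newer .NET versions typically need the System.Drawing.Common package, and the 1920 x 1080 capture region and the file name screenshot.png are placeholders to adjust for your display.

```csharp
using IronOcr;
using System;
using System.Drawing;
using System.Drawing.Imaging;

class Program
{
    static void Main()
    {
        // Assumed capture region: the top-left 1920 x 1080 pixels of the desktop.
        var region = new Size(1920, 1080);

        using (var bitmap = new Bitmap(region.Width, region.Height))
        {
            using (var graphics = Graphics.FromImage(bitmap))
            {
                // Copy pixels starting at screen (0, 0) into the bitmap at (0, 0).
                graphics.CopyFromScreen(0, 0, 0, 0, region);
            }

            // Save the capture as a PNG file (hypothetical file name).
            bitmap.Save("screenshot.png", ImageFormat.Png);
        }

        // Run OCR on the saved capture, exactly as with any other image file.
        var ocr = new IronTesseract();
        var result = ocr.Read("screenshot.png");
        Console.WriteLine(result.Text);
    }
}
```

Saving to a file first keeps the OCR call identical to the one shown earlier; display scaling can change the pixel size of the capture, so the region may need tuning.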

6. Using IronOCR to Perform OCR on a Specific Zone

IronOCR allows you to perform OCR on specific zones within an image. This is useful when the image contains multiple regions of text and you only want to recognize the text within one of them. The following example reads only the 350 x 150 pixel region at the top-left corner of the image.

using IronOcr;
using IronSoftware.Drawing;
using System;

class Program
{
    static void Main()
    {
        var ocrTesseract = new IronTesseract();

        using (var ocrInput = new OcrInput())
        {
            // Define the rectangle to crop the image for OCR
            var contentArea = new CropRectangle(x: 0, y: 0, width: 350, height: 150);

            // Add the image with the specified cropping area
            ocrInput.AddImage("ocr.png", contentArea);

            // Perform the OCR operation on the defined area
            var ocrResult = ocrTesseract.Read(ocrInput);

            // Output the recognized text
            Console.WriteLine(ocrResult.Text);
        }
    }
}

The same example in VB.NET:

Imports IronOcr
Imports IronSoftware.Drawing
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocrTesseract = New IronTesseract()

		Using ocrInput As New OcrInput()
			' Define the rectangle to crop the image for OCR
			Dim contentArea = New CropRectangle(x:= 0, y:= 0, width:= 350, height:= 150)

			' Add the image with the specified cropping area
			ocrInput.AddImage("ocr.png", contentArea)

			' Perform the OCR operation on the defined area
			Dim ocrResult = ocrTesseract.Read(ocrInput)

			' Output the recognized text
			Console.WriteLine(ocrResult.Text)
		End Using
	End Sub
End Class

Output

- IRONOCR for NET
- The C# OCR Library
- OCR for C# to scan and read images & PDFs
- NET OCR library with 125+ global language packs
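
When an image has several regions of interest, the same calls can be repeated per zone. The sketch below reuses the CropRectangle and AddImage calls from the example in this section and loops over two zones; the zone coordinates are hypothetical and would need to match the layout of your own screenshot.

```csharp
using IronOcr;
using IronSoftware.Drawing;
using System;

class Program
{
    static void Main()
    {
        var ocrTesseract = new IronTesseract();

        // Hypothetical zones: a header strip and the block directly beneath it.
        var zones = new[]
        {
            new CropRectangle(x: 0, y: 0, width: 350, height: 75),
            new CropRectangle(x: 0, y: 75, width: 350, height: 75),
        };

        for (int i = 0; i < zones.Length; i++)
        {
            // Use a fresh input per zone so each read covers only one region.
            using (var ocrInput = new OcrInput())
            {
                ocrInput.AddImage("ocr.png", zones[i]);

                var ocrResult = ocrTesseract.Read(ocrInput);
                Console.WriteLine($"Zone {i + 1}: {ocrResult.Text}");
            }
        }
    }
}
```

Reading zones separately keeps the text of each region apart, which is convenient when different regions feed different fields.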

7. Using IronOCR to Perform OCR on an Image and Save the Text

To perform OCR on an image and save the recognized text in a .txt file, you can use the following code.

using IronOcr;
using System;

class Program
{
    static void Main()
    {
        var ocr = new IronTesseract();
        using (var input = new OcrInput("ocr.png"))
        {
            // Perform OCR on the image
            var result = ocr.Read(input);

            // Save the recognized text to a .txt file
            result.SaveAsTextFile("output.txt");
        }
    }
}

The same example in VB.NET:

Imports IronOcr
Imports System

Friend Class Program
	Shared Sub Main()
		Dim ocr = New IronTesseract()
		Using input = New OcrInput("ocr.png")
			' Perform OCR on the image
			Dim result = ocr.Read(input)

			' Save the recognized text to a .txt file
			result.SaveAsTextFile("output.txt")
		End Using
	End Sub
End Class

The contents of the output file are shown below:

Figure 5: Contents of the generated output.txt file
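
The recognized text is also available as a plain string, so it can be written with standard .NET file I/O instead of SaveAsTextFile. The sketch below does that and additionally writes a searchable PDF; SaveAsSearchablePdf is believed to exist on the result object in recent IronOCR releases, and both output file names are placeholders.

```csharp
using IronOcr;
using System;
using System.IO;

class Program
{
    static void Main()
    {
        var ocr = new IronTesseract();
        using (var input = new OcrInput("ocr.png"))
        {
            var result = ocr.Read(input);

            // Plain .NET file I/O: write the recognized text to a file.
            File.WriteAllText("output-copy.txt", result.Text);

            // Write a PDF in which the recognized text can be searched
            // (assumed to be available in the installed IronOCR version).
            result.SaveAsSearchablePdf("output.pdf");

            Console.WriteLine($"Wrote {result.Text.Length} characters.");
        }
    }
}
```

Going through File.WriteAllText gives full control over the path and any post-processing of the text before it is saved.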

8. Learn More

Read the Image Text Extraction tutorial for more information about how to perform OCR on images.

IronOCR is part of a suite of five .NET libraries designed to work with different types of documents. You can purchase all five libraries for the price of just two licenses.

Frequently Asked Questions

How can I extract text from a screenshot using OCR in C#?

You can use IronOCR in C# to extract text from a screenshot: its simple API converts the screenshot into editable digital text. First, install IronOCR through NuGet in Visual Studio, then use the code examples provided by IronOCR to perform OCR on your screenshot image.

What is Optical Character Recognition (OCR)?

Optical Character Recognition (OCR) is a technology that converts different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into editable and searchable data. IronOCR is a C# library that makes it easier to add OCR to applications.

Can IronOCR handle multiple languages for OCR?

Yes, IronOCR supports text recognition in more than 125 languages, which makes it versatile for international applications. It provides options for setting the language so that text is extracted accurately.

Which image formats does IronOCR support for OCR?

IronOCR supports several input formats for OCR, including PNG, JPG, TIFF, and PDF. This flexibility lets developers work with a wide range of image sources without converting formats manually.

How does text orientation affect OCR accuracy?

Text orientation can significantly affect OCR accuracy. IronOCR automatically detects and corrects the orientation of text in images, so that rotated or skewed text is recognized and converted accurately into digital form.

How do I install IronOCR in a C# project?

To install IronOCR in a C# project, use the NuGet Package Manager in Visual Studio. Search for IronOCR and install it into your project to start using its OCR capabilities for extracting text from images.

What are the advantages of using IronOCR for text recognition?

IronOCR offers several advantages, including robust multi-language support, automatic correction of text orientation, support for multiple image formats, and customizable settings for improving recognition accuracy. Its simple API makes it easy to integrate into C# applications.

Is IronOCR suitable for recognizing text in specific zones of an image?

Yes, IronOCR lets developers define specific zones within an image on which to perform OCR, enabling targeted text extraction. This feature is useful when only part of the image contains the relevant text.

What are some common OCR troubleshooting tips?

Common OCR troubleshooting tips include making sure the image is clear and high-resolution, checking the text orientation, making sure the correct language is configured, and updating to the latest version of IronOCR for best performance.
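
A minimal sketch of the image-quality and orientation checks: read once, look at the reported confidence, and retry with preprocessing filters if it is low. The Deskew and DeNoise filter methods and the Confidence property are believed to exist in recent IronOCR releases; the threshold of 80 assumes confidence is reported on a 0 to 100 scale and is an arbitrary starting point.

```csharp
using IronOcr;
using System;

class Program
{
    // Assumed cut-off on an assumed 0-100 scale; tune against your own images.
    const double MinConfidence = 80.0;

    static void Main()
    {
        var ocr = new IronTesseract();

        string text;
        double confidence;

        // First pass: read the image without any preprocessing.
        using (var input = new OcrInput("ocr.png"))
        {
            var result = ocr.Read(input);
            text = result.Text;
            confidence = result.Confidence;
        }
        Console.WriteLine($"First pass confidence: {confidence:F1}");

        if (confidence < MinConfidence)
        {
            // Second pass: straighten and clean the image before reading again.
            using (var cleaned = new OcrInput("ocr.png"))
            {
                cleaned.Deskew();
                cleaned.DeNoise();

                var retry = ocr.Read(cleaned);
                Console.WriteLine($"Second pass confidence: {retry.Confidence:F1}");

                // Keep whichever pass the engine was more confident about.
                if (retry.Confidence > confidence)
                {
                    text = retry.Text;
                }
            }
        }

        Console.WriteLine(text);
    }
}
```

Filters cost processing time, so applying them only when the first pass looks weak keeps clean screenshots fast.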

How can I convert OCR results into a .txt or .doc file?

With IronOCR, you can convert OCR results into a .txt or .doc file by extracting the text from the image and saving it using standard C# file I/O operations. This lets you create editable documents from image-based text.

Kannapat Udonpant
Software Engineer
Before becoming a Software Engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing the degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Engineering ...