How to Read Screenshots using IronOCR

Screenshots are a quick way to capture information and share it with colleagues and peers. However, extracting text from screenshots has often proven difficult because of their varied dimensions and the noise introduced when they are taken. This has made screenshots a less effective medium for OCR.

IronOCR addresses this issue with specialized methods such as ReadScreenshot, which is optimized for reading screenshots and extracting information from them. It also accepts common file formats.

In this guide, we'll quickly demonstrate how to use IronOCR for screenshot text recognition, walking through examples and the properties of the result object.

To use this function, you must also install the IronOcr.Extension.AdvancedScan package.

Read Screenshots Example

To read a screenshot with IronOCR, we use the ReadScreenshot method, which takes an OcrInput as its parameter. This method is better optimized for screenshots than the library's standard Read counterpart.

Please note

  • The method currently supports English, Chinese, Japanese, Korean, and languages written in Latin-based alphabets.
  • Using advanced scan on .NET Framework requires the project to run on x64 architecture.
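The x64 requirement can be verified at startup before any advanced scan call is made. The following is a minimal sketch, assuming .NET Framework 4.7.1 or later (where RuntimeInformation is available). PlatformCheck is a hypothetical helper introduced for illustration and is not part of IronOCR. The example uses a classic Main method so that it compiles with the default C# language version on .NET Framework.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not part of IronOCR) that isolates the architecture test.
public static class PlatformCheck
{
    // Returns true only for the x64 process architecture.
    public static bool IsX64(Architecture architecture)
    {
        return architecture == Architecture.X64;
    }
}

public static class Program
{
    public static void Main()
    {
        // Architecture of the currently running process, not of the machine.
        Architecture current = RuntimeInformation.ProcessArchitecture;

        if (PlatformCheck.IsX64(current))
        {
            Console.WriteLine("Process architecture is x64; the requirement is met.");
        }
        else
        {
            Console.WriteLine("Process architecture is " + current +
                "; change the project's platform target to x64 first.");
        }
    }
}
```

If the check fails, the usual remedy is to change the platform target in the project's build settings from Any CPU or x86 to x64.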

Input

Below is our input for the code example; we'll demonstrate the versatility of this method by mixing different text fonts and sizes.

[Image: input screenshot]

Code

using IronOcr;
using System;
using System.Linq;

// Instantiate OCR engine
var ocr = new IronTesseract();

using (var inputScreenshot = new OcrInput())
{
    // Load the image into the OCR input
    inputScreenshot.AddImage("screenshotOCR.png");

    // Perform OCR with the screenshot-optimized method
    // (requires the IronOcr.Extension.AdvancedScan package)
    OcrPhotoResult result = ocr.ReadScreenShot(inputScreenshot);

    // Output the OCR result text
    Console.WriteLine("Extracted Text:");
    Console.WriteLine(result.Text);

    // Check if any text regions were detected
    if (result.TextRegions.Any())
    {
        // Output the x-coordinate of the first detected text region
        Console.WriteLine("X-Coordinate of the First Text Region:");
        Console.WriteLine(result.TextRegions.First().Rectangle.X);

        // Output the width of the last detected text region
        Console.WriteLine("Width of the Last Text Region:");
        Console.WriteLine(result.TextRegions.Last().Rectangle.Width);
    }
    else
    {
        Console.WriteLine("No text regions were detected.");
    }

    // Output the confidence level of the OCR result
    Console.WriteLine("Confidence Level:");
    Console.WriteLine(result.Confidence);
}

Imports IronOcr
Imports System
Imports System.Linq

' Instantiate OCR engine
Dim ocr = New IronTesseract()

Using inputScreenshot = New OcrInput()
	' Load the image into the OCR input
	inputScreenshot.AddImage("screenshotOCR.png")

	' Perform OCR with the screenshot-optimized method
	' (requires the IronOcr.Extension.AdvancedScan package)
	Dim result As OcrPhotoResult = ocr.ReadScreenShot(inputScreenshot)

	' Output the OCR result text
	Console.WriteLine("Extracted Text:")
	Console.WriteLine(result.Text)

	' Check if any text regions were detected
	If result.TextRegions.Any() Then
		' Output the x-coordinate of the first detected text region
		Console.WriteLine("X-Coordinate of the First Text Region:")
		Console.WriteLine(result.TextRegions.First().Rectangle.X)

		' Output the width of the last detected text region
		Console.WriteLine("Width of the Last Text Region:")
		Console.WriteLine(result.TextRegions.Last().Rectangle.Width)
	Else
		Console.WriteLine("No text regions were detected.")
	End If

	' Output the confidence level of the OCR result
	Console.WriteLine("Confidence Level:")
	Console.WriteLine(result.Confidence)
End Using

Output

[Image: console output]

As the console output above shows, the method extracted all of the text in the screenshot. Let's take a closer look at the properties of OcrPhotoResult.

  • Text: The text extracted from the OCR input.
  • Confidence: A double property that indicates the statistical accuracy confidence on a scale from 0 to 1, where 1 is the highest confidence level.
  • TextRegions: An array of TextRegion objects that describe the areas where text was found on the screenshot. By default, each TextRegion derives from the Rectangle class in the IronOCR models and includes the x and y coordinates, as well as the height and width of the rectangle.
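To illustrate how these values might be used after a read, here is a minimal sketch that formats the confidence as a percentage, applies a threshold, and computes the area that encloses every detected region. It does not call IronOCR. Box and ScreenshotResultHelpers are hypothetical stand-ins introduced for this example, the sample numbers are made up, and the 0.8 threshold is an arbitrary assumption. In a real project, the inputs would be copied from the confidence value and the region rectangles of the result.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for a detected text area (not an IronOCR type).
public readonly struct Box
{
    public Box(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }

    public int X { get; }
    public int Y { get; }
    public int Width { get; }
    public int Height { get; }
}

// Hypothetical helpers for post-processing values copied out of an OCR result.
public static class ScreenshotResultHelpers
{
    // Formats a 0-to-1 confidence as a percentage with one decimal place.
    public static string ToPercent(double confidence)
    {
        return (confidence * 100).ToString("0.0", CultureInfo.InvariantCulture) + "%";
    }

    // True when the confidence meets the caller-chosen threshold.
    public static bool IsReliable(double confidence, double threshold = 0.8)
    {
        return confidence >= threshold;
    }

    // Smallest box that encloses every region, or null when the list is empty.
    public static Box? Union(IReadOnlyList<Box> regions)
    {
        if (regions.Count == 0) return null;

        int left = regions.Min(r => r.X);
        int top = regions.Min(r => r.Y);
        int right = regions.Max(r => r.X + r.Width);
        int bottom = regions.Max(r => r.Y + r.Height);
        return new Box(left, top, right - left, bottom - top);
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample values standing in for a real result's confidence and regions.
        double confidence = 0.875;
        var regions = new List<Box> { new Box(10, 20, 100, 30), new Box(50, 80, 200, 40) };

        Console.WriteLine("Confidence: " + ScreenshotResultHelpers.ToPercent(confidence)); // Confidence: 87.5%
        Console.WriteLine("Reliable: " + ScreenshotResultHelpers.IsReliable(confidence));  // Reliable: True

        Box? bounds = ScreenshotResultHelpers.Union(regions);
        if (bounds.HasValue)
        {
            Box b = bounds.Value;
            // Prints: Text spans x=10, y=20, width=240, height=100
            Console.WriteLine("Text spans x=" + b.X + ", y=" + b.Y +
                ", width=" + b.Width + ", height=" + b.Height);
        }
    }
}
```

Keeping the helpers independent of the OCR types makes them easy to unit test, at the cost of a small mapping step from the real result.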

Frequently Asked Questions

What is IronOCR?

IronOCR is a C# OCR library that provides methods for reading and extracting text from images, such as screenshots, using Tesseract.

How can I read text from a screenshot using IronOCR?

To read text from a screenshot using IronOCR, you need to use the `ReadScreenshot` method, which is optimized for extracting information from screenshots.

What file formats does IronOCR's ReadScreenshot method accept?

The `ReadScreenshot` method in IronOCR accepts common file formats for processing screenshots.

What additional package is required to use IronOCR's advanced scan features?

To use the advanced scan features of IronOCR, you need to install the IronOcr.Extension.AdvancedScan package from NuGet.

What languages does the IronOCR ReadScreenshot method support?

The `ReadScreenshot` method supports languages including English, Chinese, Japanese, Korean, and other Latin-based alphabets.

What is the confidence property in IronOCR's OcrPhotoResult?

The confidence property in `OcrPhotoResult` is a double value that indicates the statistical accuracy of the extracted text, with a scale from 0 to 1.

How do I start using IronOCR in my C# project?

To use IronOCR in your C# project, first install the library from NuGet, load your screenshot image into an OcrInput, and call the `ReadScreenshot` method to extract text.

What are TextRegion objects in IronOCR?

TextRegion objects in IronOCR represent areas where text is found on the screenshot, including their coordinates and dimensions.

Curtis Chau
Technical Writer

Curtis Chau holds a Bachelor’s degree in Computer Science (Carleton University) and specializes in front-end development with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT), exploring innovative ways to integrate hardware and software. In his free time, he enjoys gaming and building Discord bots, combining his love for technology with creativity.