How to Extract Read Results

The read or OCR result encompasses a wealth of information pertaining to detected paragraphs, lines, words, and individual characters. For each of these elements, the result provides a comprehensive set of details.

For each element, it provides the text content, precise X and Y coordinates, dimensions (width and height), text direction (Left to Right or Top to Bottom), and location in a CropRectangle object.

Get started with IronOCR

Start using IronOCR in your project today with a free trial.

First Step:
green arrow pointer



Data in OcrResult

The result value doesn't only contain the extracted text but also provides information about pages, paragraphs, lines, words, characters, and barcodes discovered in the PDF and image document by IronOcr. You can access this information from the returned OcrResult object using the Read method.

:path=/static-assets/ocr/content-code-examples/how-to/read-results-output-information.cs
using IronOcr;
using System;
using static IronOcr.OcrResult;

// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();

// Add image
using var imageInput = new OcrImageInput("sample.jpg");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);

// Retrieve list of detected paragraphs
Paragraph[] paragraphs = ocrResult.Paragraphs;

// Output information to console
Console.WriteLine($"Text: {paragraphs[0].Text}");
Console.WriteLine($"X: {paragraphs[0].X}");
Console.WriteLine($"Y: {paragraphs[0].Y}");
Console.WriteLine($"Width: {paragraphs[0].Width}");
Console.WriteLine($"Height: {paragraphs[0].Height}");
Console.WriteLine($"Text direction: {paragraphs[0].TextDirection}");
Imports IronOcr
Imports System
Imports IronOcr.OcrResult

' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()

' Add image
Private imageInput = New OcrImageInput("sample.jpg")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)

' Retrieve list of detected paragraphs
Private paragraphs() As Paragraph = ocrResult.Paragraphs

' Output information to console
Console.WriteLine($"Text: {paragraphs(0).Text}")
Console.WriteLine($"X: {paragraphs(0).X}")
Console.WriteLine($"Y: {paragraphs(0).Y}")
Console.WriteLine($"Width: {paragraphs(0).Width}")
Console.WriteLine($"Height: {paragraphs(0).Height}")
Console.WriteLine($"Text direction: {paragraphs(0).TextDirection}")
$vbLabelText   $csharpLabel
Data in OcrResult

For each part of the text, like paragraphs, lines, words, and individual characters, we provide the following information:

  • Text: The actual text as a string.
  • X: The position from the left edge of the page in pixels.
  • Y: The position from the top edge of the page in pixels.
  • Width: The width in pixels.
  • Height: The height in pixels.
  • Text Direction: The direction in which the text was read, like 'Left to Right' or 'Top to Bottom.'
  • Location: A rectangle showing where this text is on the page in pixels.

Paragraph, Line, Word, and Character Comparison

Below is the comparison of the detected paragraphs, lines, words, and characters.

Highlight paragraph
Highlight line
Highlight word
Highlight character

Barcode and QR Code

That’s correct! IronOcr can read barcodes and QR codes. While the feature may not be as robust as IronBarcode, IronOcr does provide support for common barcode types. To enable barcode detection, set the Configuration.ReadBarCodes property to true.

Additionally, valuable information can be extracted from the detected barcode, including its format, value, coordinates (x, y), height, width, and location as IronSoftware.Drawing.Rectangle object. This Rectangle class in IronDrawing allows for precise positioning on the document.

:path=/static-assets/ocr/content-code-examples/how-to/read-results-barcodes.cs
using IronOcr;
using System;
using static IronOcr.OcrResult;

// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();

// Enable barcodes detection
ocrTesseract.Configuration.ReadBarCodes = true;

// Add image
using OcrInput ocrInput = new OcrInput();
ocrInput.LoadPdf("sample.pdf");

// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(ocrInput);

// Output information to console
foreach(var barcode in ocrResult.Barcodes)
{
    Console.WriteLine("Format = " + barcode.Format);
    Console.WriteLine("Value = " + barcode.Value);
    Console.WriteLine("X = " + barcode.X);
    Console.WriteLine("Y = " + barcode.Y);
}
Console.WriteLine(ocrResult.Text);
Imports IronOcr
Imports System
Imports IronOcr.OcrResult

' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()

' Enable barcodes detection
ocrTesseract.Configuration.ReadBarCodes = True

' Add image
Using ocrInput As New OcrInput()
	ocrInput.LoadPdf("sample.pdf")
	
	' Perform OCR
	Dim ocrResult As OcrResult = ocrTesseract.Read(ocrInput)
	
	' Output information to console
	For Each barcode In ocrResult.Barcodes
		Console.WriteLine("Format = " & barcode.Format)
		Console.WriteLine("Value = " & barcode.Value)
		Console.WriteLine("X = " & barcode.X)
		Console.WriteLine("Y = " & barcode.Y)
	Next barcode
	Console.WriteLine(ocrResult.Text)
End Using
$vbLabelText   $csharpLabel

Output

Detect barcodes

Frequently Asked Questions

How do I extract text elements from images and PDFs using C#?

You can extract text elements from images and PDFs using IronOCR by utilizing its `Read` method, which performs Optical Character Recognition (OCR) to obtain details about paragraphs, lines, words, and characters, including their text content, coordinates, and dimensions.

What is the process to get started with OCR in .NET C#?

To begin with OCR in .NET C#, download the IronOCR library from NuGet, prepare your image or PDF document, and use the `Read` method to obtain an `OcrResult` object, which contains detailed information about the extracted text and document structure.

Can IronOCR detect and extract barcode information?

Yes, IronOCR can detect and extract barcode information by setting the `Configuration.ReadBarCodes` property to true, allowing you to retrieve data such as the barcode's format, value, and its position within the document.

What types of document elements can IronOCR detect?

IronOCR can detect various document elements including pages, paragraphs, lines, words, and individual characters, as well as barcodes and QR codes, providing a comprehensive analysis of the document's structure.

How can I configure IronOCR to read text in different directions?

IronOCR is capable of reading text in multiple directions, such as 'Left to Right' or 'Top to Bottom', by analyzing the direction property within the `OcrResult` object.

What is the `CropRectangle` object in IronOCR?

The `CropRectangle` object in IronOCR defines the location and size of text elements on a page in terms of coordinates and dimensions, helping in precise text identification and extraction.

How do I use the IronOCR `Read` method to analyze documents?

To use the `Read` method in IronOCR, create an instance of the IronOCR engine, load your target document, and execute the `Read` method to obtain OCR results, which can be used to access text data and document properties.

How does IronOCR handle the detection of QR codes?

IronOCR handles the detection of QR codes by enabling barcode reading through the `Configuration.ReadBarCodes` setting, which allows it to extract QR code data including its format, value, and location.

What is the role of `OcrResult` in text extraction?

The `OcrResult` object plays a crucial role in text extraction by holding the extracted text and accompanying details such as the position, dimensions, and direction of text elements, as well as barcode information.

How can I ensure accurate text extraction with IronOCR?

To ensure accurate text extraction with IronOCR, make sure to provide high-quality input documents and properly configure settings like `Configuration.ReadBarCodes` for barcode detection to optimize OCR performance.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.