How to Extract Read Results
The read or OCR result encompasses a wealth of information pertaining to detected paragraphs, lines, words, and individual characters. For each of these elements, the result provides a comprehensive set of details.
For each element, it provides the text content, precise X and Y coordinates, dimensions (width and height), text direction (Left to Right or Top to Bottom), and location in a CropRectangle object.
How to Extract Read Results
- Download a C# library for accessing read results
- Prepare the target image and PDF document
- Use the
Read
method to perform OCR on the imported document - Access the X, Y, width, height, and text direction of the result
- Check the detected paragraphs, lines, words, and character comparisons
Install with NuGet
Install-Package IronOcr
Download DLL
Manually install into your project
Install with NuGet
Install-Package IronOcr
Download DLL
Manually install into your project
Start using IronPDF in your project today with a free trial.
Check out IronOCR on Nuget for quick installation and deployment. With over 8 million downloads, it's transforming OCR with C#.
Install-Package IronOcr
Consider installing the IronOCR DLL directly. Download and manually install it for your project or GAC form: IronOcr.zip
Manually install into your project
Download DLLData in OcrResult
The result value doesn't only contain the extracted text but also provides information about pages, paragraphs, lines, words, characters, and barcodes discovered in the PDF and image document by IronOcr. You can access this information from the returned OcrResult object using the Read
method.
:path=/static-assets/ocr/content-code-examples/how-to/read-results-output-information.cs
using IronOcr;
using System;
using static IronOcr.OcrResult;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Add image
using var imageInput = new OcrImageInput("sample.jpg");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(imageInput);
// Retrieve list of detected paragraphs
Paragraph[] paragraphs = ocrResult.Paragraphs;
// Output information to console
Console.WriteLine($"Text: {paragraphs[0].Text}");
Console.WriteLine($"X: {paragraphs[0].X}");
Console.WriteLine($"Y: {paragraphs[0].Y}");
Console.WriteLine($"Width: {paragraphs[0].Width}");
Console.WriteLine($"Height: {paragraphs[0].Height}");
Console.WriteLine($"Text direction: {paragraphs[0].TextDirection}");
Imports IronOcr
Imports System
Imports IronOcr.OcrResult
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Add image
Private imageInput = New OcrImageInput("sample.jpg")
' Perform OCR
Private ocrResult As OcrResult = ocrTesseract.Read(imageInput)
' Retrieve list of detected paragraphs
Private paragraphs() As Paragraph = ocrResult.Paragraphs
' Output information to console
Console.WriteLine($"Text: {paragraphs(0).Text}")
Console.WriteLine($"X: {paragraphs(0).X}")
Console.WriteLine($"Y: {paragraphs(0).Y}")
Console.WriteLine($"Width: {paragraphs(0).Width}")
Console.WriteLine($"Height: {paragraphs(0).Height}")
Console.WriteLine($"Text direction: {paragraphs(0).TextDirection}")
For each part of the text, like paragraphs, lines, words, and individual characters, we provide the following information:
- Text: The actual text as a string.
- X: The position from the left edge of the page in pixels.
- Y: The position from the top edge of the page in pixels.
- Width: The width in pixels.
- Height: The height in pixels.
- Text Direction: The direction in which the text was read, like 'Left to Right' or 'Top to Bottom.'
- Location: A rectangle showing where this text is on the page in pixels.
Paragraph, Line, Word, and Character Comparison
Below is the comparison of the detected paragraphs, lines, words, and characters.
Paragraph | Line |
Word | Character |
Barcode and QR Code
That’s correct! IronOcr can read barcodes and QR codes. While the feature may not be as robust as IronBarcode, IronOcr does provide support for common barcode types. To enable barcode detection, set the Configuration.ReadBarCodes property to true.
Additionally, valuable information can be extracted from the detected barcode, including its format, value, coordinates (x, y), height, width, and location as IronSoftware.Drawing.Rectangle object. This Rectangle class in IronDrawing allows for precise positioning on the document.
:path=/static-assets/ocr/content-code-examples/how-to/read-results-barcodes.cs
using IronOcr;
using System;
using static IronOcr.OcrResult;
// Instantiate IronTesseract
IronTesseract ocrTesseract = new IronTesseract();
// Enable barcodes detection
ocrTesseract.Configuration.ReadBarCodes = true;
// Add image
using OcrInput ocrInput = new OcrInput();
ocrInput.LoadPdf("sample.pdf");
// Perform OCR
OcrResult ocrResult = ocrTesseract.Read(ocrInput);
// Output information to console
foreach(var barcode in ocrResult.Barcodes)
{
Console.WriteLine("Format = " + barcode.Format);
Console.WriteLine("Value = " + barcode.Value);
Console.WriteLine("X = " + barcode.X);
Console.WriteLine("Y = " + barcode.Y);
}
Console.WriteLine(ocrResult.Text);
Imports IronOcr
Imports System
Imports IronOcr.OcrResult
' Instantiate IronTesseract
Private ocrTesseract As New IronTesseract()
' Enable barcodes detection
ocrTesseract.Configuration.ReadBarCodes = True
' Add image
Using ocrInput As New OcrInput()
ocrInput.LoadPdf("sample.pdf")
' Perform OCR
Dim ocrResult As OcrResult = ocrTesseract.Read(ocrInput)
' Output information to console
For Each barcode In ocrResult.Barcodes
Console.WriteLine("Format = " & barcode.Format)
Console.WriteLine("Value = " & barcode.Value)
Console.WriteLine("X = " & barcode.X)
Console.WriteLine("Y = " & barcode.Y)
Next barcode
Console.WriteLine(ocrResult.Text)
End Using