OcrResult Class

IronOCR returns an advanced result object for each page it scans using Tesseract 5. This object contains location data, images, text, statistical confidence, alternative symbol choices, font names, font sizes, font decoration, font weights, and position for each of the following:

  • Page
  • Paragraph
  • Line of Text
  • Word
  • Individual Character
  • Barcode

Here is an example of how you might retrieve and work with these data points using C# with IronOCR:

// Import System (for Console) and the IronOCR library
using System;
using IronOcr;

class OCRExample
{
    static void Main()
    {
        // Create a new instance of the IronTesseract engine
        var ocrEngine = new IronTesseract();

        // Specify the file path of the scanned document
        var input = new OcrInput(@"path_to_your_image_file.jpg");

        // Perform OCR on the input image
        OcrResult result = ocrEngine.Read(input);

        // Check the number of pages detected
        Console.WriteLine($"Detected {result.Pages.Count} page(s)");

        // Iterate through each page
        foreach (var page in result.Pages)
        {
            // Output the page text
            Console.WriteLine($"Page Text: {page.Text}");

            // Iterate through each paragraph in the page
            foreach (var paragraph in page.Paragraphs)
            {
                Console.WriteLine($"Paragraph Text: {paragraph.Text}");

                // Iterate through each line in the paragraph
                foreach (var line in paragraph.Lines)
                {
                    Console.WriteLine($"Line Text: {line.Text}");

                    // Iterate through each word in the line
                    foreach (var word in line.Words)
                    {
                        Console.WriteLine($"Word Text: {word.Text}");

                        // Iterate through each character in the word
                        foreach (var character in word.Characters)
                        {
                            Console.WriteLine($"Character Text: {character.Text}");
                        }
                    }
                }
            }

            // Output the value of each barcode found on the page
            // (barcode reading may need to be enabled in the engine configuration first)
            foreach (var barcode in page.Barcodes)
            {
                Console.WriteLine($"Barcode Value: {barcode.Value}");
            }
        }
    }
}

Explanation:

  • IronTesseract Engine: This is used to initiate the OCR process on the provided image input.

  • OcrInput: Represents the image file that will be processed. You need to specify the path to your image file.

  • Read Method: This processes the image and returns an OcrResult with all extracted data.

  • Iterating Structure: The example uses nested loops to walk from pages down through paragraphs, lines, and words to individual characters, and a separate loop reads the barcodes found on each page. Each level exposes its own text and properties.

  • Console Output: The program writes each text element to the console. Replace these calls with whatever processing your application needs.
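As one illustration of replacing the console output with your own processing, the sketch below collects words whose confidence falls below a threshold so they can be reviewed by hand. It is a minimal sketch, not part of IronOCR: `WordReading` and `ConfidenceFilter` are hypothetical names introduced here, and the threshold must use the same scale as the confidence values you feed in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical holder for the two values this sketch needs from each OCR word.
// It is not an IronOCR type.
public record WordReading(string Text, double Confidence);

public static class ConfidenceFilter
{
    // Returns the words whose confidence is below the threshold,
    // ordered from least to most confident.
    public static List<WordReading> BelowThreshold(
        IEnumerable<WordReading> words, double threshold)
    {
        return words
            .Where(w => w.Confidence < threshold)
            .OrderBy(w => w.Confidence)
            .ToList();
    }
}
```

Inside the word loop of the example above, each word could be added to a list with something like `new WordReading(word.Text, word.Confidence)`, assuming the word object exposes a numeric `Confidence` property alongside `Text`. Passing that list to `ConfidenceFilter.BelowThreshold` then yields the words most likely to need manual checking.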

Because the result object mirrors the structure of the scanned document, you can work with the data IronOCR returns at whichever level of detail you need, from whole pages down to individual characters.