Highlight Texts for Debugging

using IronOcr;

IronTesseract ocrTesseract = new IronTesseract();

using var ocrInput = new OcrInput();
ocrInput.LoadPdf("document.pdf");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_page_", ResultHighlightType.Paragraph);

Imports IronOcr

Dim ocrTesseract As New IronTesseract()

Using ocrInput As New OcrInput()
    ocrInput.LoadPdf("document.pdf")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_page_", ResultHighlightType.Paragraph)
End Using

Install-Package IronOcr

Highlight Text on a Page for Debugging

This guide explains how to draw red boxes around characters, words, lines, or paragraphs detected on a page and save the highlighted content as a .png for debugging purposes.

Sample Code Implementation

Below is an example Python script using OpenCV that simulates text highlighting. It loads a sample image, draws red rectangles around predefined regions that stand in for detected text, and saves the result as a new image.

import cv2

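# Load an image from file
image_path = "sample_image.png"
image = cv2.imread(image_path)
if image is None:  # cv2.imread returns None instead of raising when loading fails
    raise FileNotFoundError(f"Could not read image: {image_path}")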

# Define the coordinates for rectangles (example values)
# Coordinates format: (x, y, width, height)
rectangles = [
    (50, 50, 100, 20),  # Drawing box for text at (50, 50) with width=100, height=20
    (200, 100, 150, 25), # Drawing box for text at (200, 100) with width=150, height=25
    # Add more rectangles as needed
]

# Draw rectangles on the image
for (x, y, w, h) in rectangles:
    start_point = (x, y)
    end_point = (x + w, y + h)
    color = (0, 0, 255)  # Red color in BGR
    thickness = 2        # Specifies the thickness of the rectangle boundary
    cv2.rectangle(image, start_point, end_point, color, thickness)

# Save the modified image to a file
output_image_path = "highlighted_text.png"
cv2.imwrite(output_image_path, image)

# Display the image with highlighted text (optional)
cv2.imshow("Highlighted Text", image)
cv2.waitKey(0)  # Wait for any key press to close the display window
cv2.destroyAllWindows()

Explanation

Load Image: The script starts by loading an image from a specified path using cv2.imread(). The image path is stored in the image_path variable.
Define Rectangles: The rectangles list holds tuples that define the (x, y, width, height) of the areas you want to highlight.
Draw Rectangles: The loop iterates over each rectangle tuple. It uses cv2.rectangle() to draw a rectangle on the image. The start and end points are calculated from the rectangle dimensions, and the rectangle is drawn in red with a specified line thickness.
Save the Image: After all rectangles are drawn, cv2.imwrite() saves the modified image to the file named by output_image_path.
Display the Image: Optionally, the image with highlighted areas can be displayed using cv2.imshow(). This will open a window showing the image, and cv2.waitKey(0) keeps the window open until any key is pressed. Finally, cv2.destroyAllWindows() closes any open image display windows.
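The corner calculation described in the Draw Rectangles step can be isolated into a small helper. The following is a minimal sketch, not part of OpenCV or IronOCR: the name to_corners and the clamping behaviour are assumptions added for illustration. It converts an (x, y, width, height) tuple into the two corner points that cv2.rectangle() expects and keeps them inside the image, so that a box reaching past the border still shows all four edges.

```python
def to_corners(rect, image_width, image_height):
    """Convert (x, y, width, height) into ((x1, y1), (x2, y2)) corner points.

    Hypothetical helper: each coordinate is clamped to the valid pixel
    range [0, size - 1] so the box stays fully inside the image.
    """
    x, y, w, h = rect
    x1 = max(0, min(x, image_width - 1))
    y1 = max(0, min(y, image_height - 1))
    x2 = max(0, min(x + w, image_width - 1))
    y2 = max(0, min(y + h, image_height - 1))
    return (x1, y1), (x2, y2)


# Example with an assumed 640x480 image; the second box runs past the border.
rectangles = [(50, 50, 100, 20), (600, 470, 150, 25)]
corners = [to_corners(r, 640, 480) for r in rectangles]
```

Each resulting pair can be passed directly as the start and end points of cv2.rectangle(). With a loaded image, the width and height would come from image.shape[1] and image.shape[0].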

Note: Before running this script, ensure you have OpenCV installed in your Python environment. You can install it using pip:

pip install opencv-python

This script can be used as a debugging tool to visually verify the areas of interest in an image, which can aid in debugging tasks like optical character recognition (OCR) or document layout analysis.
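When checking OCR output, it can help to highlight at a coarser level, for example one box per line rather than one per word. The sketch below assumes the word boxes are already available in the same (x, y, width, height) format used above; the helper name merge_boxes and the sample coordinates are hypothetical and do not come from any OCR library.

```python
def merge_boxes(boxes):
    """Return the smallest (x, y, width, height) box that encloses all boxes.

    Hypothetical helper for illustration; expects a non-empty list of
    (x, y, width, height) tuples.
    """
    if not boxes:
        raise ValueError("boxes must not be empty")
    left = min(x for x, _, _, _ in boxes)
    top = min(y for _, y, _, _ in boxes)
    right = max(x + w for x, _, w, _ in boxes)
    bottom = max(y + h for _, y, _, h in boxes)
    return (left, top, right - left, bottom - top)


# Made-up word boxes that sit on one line of text
word_boxes = [(50, 50, 40, 20), (95, 52, 60, 18), (160, 49, 30, 21)]
line_box = merge_boxes(word_boxes)
```

The merged box can be added to the rectangles list in the script above and drawn the same way as any other region.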