Class AdvancedOcrElement
Base class for advanced OCR elements (words and characters) that share spatial coordinate data.
Provides common properties for text content, bounding box coordinates, page number, region index, and confidence score.
Namespace: IronOcr
Assembly: IronOcr.dll
Syntax
public abstract class AdvancedOcrElement : Object
AdvancedOcrElement is the shared base for every positioned element that an advanced OCR read returns: it holds the data common to a recognized word or character. It pairs the recognized text with the coordinates that place it on the page image, so any element from an advanced read can be highlighted, cropped, or mapped back onto its source the same way. Its two concrete forms are AdvancedWord and AdvancedCharacter, which differ only in granularity (word versus character) and add nothing structural.
Because the class is abstract, you do not create or work with AdvancedOcrElement directly. You receive its derived types from an AdvancedOcrResultBase: the Words array yields AdvancedWord elements and the Characters array yields AdvancedCharacter elements. Both collections come from the advanced and handwriting reads on IronTesseract, which is where every element in this hierarchy originates.
The members split into two groups. Content lives in Text, the recognized string for the element. Geometry is the rest: BoundingBox is the element's pixel rectangle, while X, Y, Width, and Height expose that same box as individual values for code that prefers them separately. PageNumber is the 1-based page the element was found on, which matters for multi-page documents. Two region members add context: RegionIndex is the 0-based index of the text region the element belongs to, and RegionConfidence is the OCR confidence for that region, which is the value to threshold on when filtering unreliable output before display or storage.

    foreach (AdvancedWord word in result.Words)
        if (word.RegionConfidence > 0.8)
            Console.WriteLine($"{word.Text} @ {word.BoundingBox}");

The read document advanced how-to walks through an advanced read, the read results how-to covers traversing the recognized elements, and the read handwritten image how-to uses the same element shape for handwriting.
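Because every element carries PageNumber and RegionIndex, recognized words can be regrouped into their regions after filtering. The following is a minimal sketch, not part of IronOcr: `ElementSketch`, `RegionGrouping`, and `LinesByRegion` are hypothetical names, and the record only mirrors the properties documented on this page so the example compiles without the library. With IronOcr, the same LINQ query would presumably run over the Words array of a result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in mirroring the documented AdvancedOcrElement properties.
// It is an assumption for illustration only, not an IronOcr type.
public sealed record ElementSketch(
    string Text, int X, int Y, int Width, int Height,
    int PageNumber, int RegionIndex, float RegionConfidence);

public static class RegionGrouping
{
    // Drops elements below minConfidence, then joins the remaining words of each
    // (page, region) pair into one line, ordered by page and then by region index.
    public static List<string> LinesByRegion(
        IEnumerable<ElementSketch> words, float minConfidence)
    {
        return words
            .Where(w => w.RegionConfidence >= minConfidence)
            .GroupBy(w => (w.PageNumber, w.RegionIndex))
            .OrderBy(g => g.Key.PageNumber)
            .ThenBy(g => g.Key.RegionIndex)
            .Select(g =>
                $"p{g.Key.PageNumber} r{g.Key.RegionIndex}: " +
                string.Join(" ", g.Select(w => w.Text)))
            .ToList();
    }
}
```

The grouping key includes PageNumber because RegionIndex is counted per page, so region 0 on page 1 and region 0 on page 2 are different regions.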
Properties
BoundingBox
The bounding rectangle of this element in pixels, relative to the page image.
Declaration
public Rectangle BoundingBox { get; }
Property Value
| Type | Description |
|---|---|
| IronSoftware.Drawing.Rectangle | |
Height
The height of the bounding box in pixels.
Declaration
public int Height { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
PageNumber
The 1-based page number where this element was found.
Declaration
public int PageNumber { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
RegionConfidence
OCR confidence score for the text region containing this element.
Value ranges from 0.0 (no confidence) to 1.0 (full confidence).
This is the region-level confidence; all elements within the same region share the same confidence value.
Declaration
public float RegionConfidence { get; }
Property Value
| Type | Description |
|---|---|
| System.Single | |
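Since every element in a region reports the same RegionConfidence, a low score can be handled once per region rather than once per element. The sketch below illustrates that idea under stated assumptions: `RegionTag`, `RegionReview`, and `LowConfidenceRegions` are hypothetical names, and the record carries only the three documented properties the query needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with the three documented properties used here.
// It is an assumption for illustration only, not an IronOcr type.
public sealed record RegionTag(int PageNumber, int RegionIndex, float RegionConfidence);

public static class RegionReview
{
    // Returns each (page, region) pair whose confidence is below the threshold,
    // listed once even when many elements belong to that region.
    public static List<(int PageNumber, int RegionIndex)> LowConfidenceRegions(
        IEnumerable<RegionTag> elements, float threshold)
    {
        return elements
            .Where(e => e.RegionConfidence < threshold)
            .Select(e => (e.PageNumber, e.RegionIndex))
            .Distinct()
            .OrderBy(r => r.PageNumber)
            .ThenBy(r => r.RegionIndex)
            .ToList();
    }
}
```

A list like this could drive a manual review queue or a second OCR pass over only the flagged regions.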
RegionIndex
The 0-based index of the text region this element belongs to on its page.
Declaration
public int RegionIndex { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
Text
The recognized text content.
Declaration
public string Text { get; }
Property Value
| Type | Description |
|---|---|
| System.String | |
Width
The width of the bounding box in pixels.
Declaration
public int Width { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
X
The X coordinate (left edge) of the bounding box in pixels, relative to the page image.
Declaration
public int X { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
Y
The Y coordinate (top edge) of the bounding box in pixels, relative to the page image.
Declaration
public int Y { get; }
Property Value
| Type | Description |
|---|---|
| System.Int32 | |
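The X, Y, Width, and Height values are convenient when a crop or highlight needs a margin around the element. The following sketch shows one way to pad a box and clamp it to the page image. `CropMath` and `PaddedCrop` are hypothetical helper names, and the image dimensions are assumed to be known by the caller; nothing here is part of IronOcr.

```csharp
using System;

public static class CropMath
{
    // Grows the element box by `padding` pixels on every side, then clamps it
    // to the page image so the result never extends past the image edges.
    // Inputs correspond to an element's X, Y, Width, and Height values.
    public static (int X, int Y, int Width, int Height) PaddedCrop(
        int x, int y, int width, int height,
        int padding, int imageWidth, int imageHeight)
    {
        int left = Math.Max(0, x - padding);
        int top = Math.Max(0, y - padding);
        int right = Math.Min(imageWidth, x + width + padding);
        int bottom = Math.Min(imageHeight, y + height + padding);

        // A box lying entirely outside the image collapses to zero size.
        return (left, top, Math.Max(0, right - left), Math.Max(0, bottom - top));
    }
}
```

For example, a 40 by 12 box at (5, 8) padded by 10 pixels on a 100 by 50 image is clamped at the left and top edges and becomes a 55 by 30 box at (0, 0).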