Class Page
Inheritance
System.Object
DisposableBase
Page
Implements
System.IDisposable
Assembly: IronOcr.dll
Syntax
public sealed class Page : DisposableBase
Page represents one recognized image in IronOCR's low-level Tesseract layer and exposes the text, geometry, and confidence the engine extracted from it. A page is the unit of recognition at this level: feed the engine an image, and the resulting Page is where the recognized output is read back.
A Page records the name of the image it was built from in the ImageName property, the PageSegmentMode that controlled how the engine split the image into regions, and a RegionOfInterest (a Rect) that can restrict recognition to part of the page. Because it derives from DisposableBase and holds a native recognition handle, a Page is disposable and should be wrapped in a using block or using declaration so the handle is released when reading finishes.
The text output comes in several shapes. GetText returns the plain recognized text, GetHOCRText returns positioned hOCR markup, GetBoxText returns per-character box coordinates, and GetUNLVText returns the UNLV format used in accuracy testing. GetMeanConfidence reports the overall recognition confidence as a percentage, which is useful for flagging a low-quality scan before trusting its text. For structural work, GetIterator returns a ResultIterator that walks the recognized text element by element, AnalyseLayout returns a PageIterator over layout without running full recognition, and GetSegmentedRegions returns a List<Rectangle> of regions at a chosen PageIteratorLevel. DetectBestOrientation reports the page rotation as an Orientation so the image can be turned upright before a final read.
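// engine and image are created elsewhere; the using declaration disposes the Page at the end of the scope.
using DynamicTesseract.Page page = engine.Process(image);
Console.WriteLine(page.GetText());            // plain recognized text
Console.WriteLine(page.GetMeanConfidence());  // overall recognition confidence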
The read results how-to covers working with recognized output, the hOCR export how-to covers positioned markup, and the detect page rotation how-to covers orientation.
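The confidence check described above can be sketched as a small helper that only returns text when the page clears a threshold. This is a minimal sketch, not the library's recommended pattern: `PageReadSketch`, `ReadIfConfident`, and `minConfidence` are hypothetical names, and the threshold is assumed to be on the same scale that GetMeanConfidence returns.

```csharp
using System;

// Assumption: DynamicTesseract.Page resolves as in the snippet above;
// add whichever using directive your IronOcr version requires.
static class PageReadSketch
{
    // minConfidence is a hypothetical threshold, expressed on the same
    // scale as the value returned by GetMeanConfidence.
    public static string ReadIfConfident(DynamicTesseract.Page page, float minConfidence)
    {
        float confidence = page.GetMeanConfidence();
        if (confidence < minConfidence)
        {
            // Flag the scan instead of trusting its text.
            Console.WriteLine($"Low confidence ({confidence}); text not returned.");
            return string.Empty;
        }

        return page.GetText();
    }
}
```

The caller still owns the Page and remains responsible for disposing it.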
Properties
ImageName
Declaration
public string ImageName { get; }
Property Value
| Type | Description |
| --- | --- |
| System.String | |
PageSegmentMode
Declaration
public PageSegMode PageSegmentMode { get; }
Property Value
| Type | Description |
| --- | --- |
| PageSegMode | |
RegionOfInterest
Declaration
public Rect RegionOfInterest { get; set; }
Property Value
| Type | Description |
| --- | --- |
| Rect | |
Methods
AnalyseLayout()
Declaration
public PageIterator AnalyseLayout()
Returns
| Type | Description |
| --- | --- |
| PageIterator | |
DetectBestOrientation(out Orientation, out Single)
Declaration
public void DetectBestOrientation(out Orientation orientation, out float confidence)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| Orientation | orientation | |
| System.Single | confidence | |
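Because DetectBestOrientation has two overloads that differ only in the type of the first out parameter, the call needs an explicitly typed out variable; `out var` would be ambiguous. The following is a minimal sketch under that observation. `OrientationSketch` and `ReportOrientation` are hypothetical names, and the unqualified `Orientation` type is assumed to resolve the same way it does in the declaration above.

```csharp
using System;

// Assumption: DynamicTesseract.Page and Orientation resolve via the
// using directives your IronOcr version requires.
static class OrientationSketch
{
    public static void ReportOrientation(DynamicTesseract.Page page)
    {
        // Declaring the out variable as Orientation selects the enum overload
        // rather than the int overload.
        page.DetectBestOrientation(out Orientation orientation, out float confidence);

        Console.WriteLine($"Orientation: {orientation}, confidence: {confidence}");
    }
}
```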
DetectBestOrientation(out Int32, out Single)
Declaration
public void DetectBestOrientation(out int orientation, out float confidence)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | orientation | |
| System.Single | confidence | |
DetectBestOrientationAndScript(out Int32, out Single, out String, out Single)
Declaration
public void DetectBestOrientationAndScript(out int orientation, out float confidence, out string scriptName, out float scriptConfidence)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | orientation | |
| System.Single | confidence | |
| System.String | scriptName | |
| System.Single | scriptConfidence | |
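A call to DetectBestOrientationAndScript might look like the sketch below, which simply prints the four out values. `ScriptDetectionSketch` and `ReportOrientationAndScript` are hypothetical names; the units of `orientation` and the scale of the two confidence values are not stated on this page, so the sketch prints them without interpreting them.

```csharp
using System;

// Assumption: DynamicTesseract.Page resolves via the using directives
// your IronOcr version requires.
static class ScriptDetectionSketch
{
    public static void ReportOrientationAndScript(DynamicTesseract.Page page)
    {
        page.DetectBestOrientationAndScript(
            out int orientation,
            out float confidence,
            out string scriptName,
            out float scriptConfidence);

        // Values are printed as returned; their units are not documented here.
        Console.WriteLine($"Orientation: {orientation} (confidence {confidence})");
        Console.WriteLine($"Script: {scriptName} (confidence {scriptConfidence})");
    }
}
```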
Dispose(Boolean)
Declaration
public override void Dispose(bool disposing)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Boolean | disposing | |
Overrides
Finalize()
Declaration
protected override void Finalize()
Overrides
GetBoxText(Int32)
Declaration
public string GetBoxText(int pageNum)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | pageNum | |
Returns
| Type | Description |
| --- | --- |
| System.String | |
GetHOCRText(Int32, Boolean)
Declaration
public string GetHOCRText(int pageNum, bool useXHtml = false)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| System.Int32 | pageNum | |
| System.Boolean | useXHtml | |
Returns
| Type | Description |
| --- | --- |
| System.String | |
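As an illustration of GetHOCRText, the sketch below writes the markup for a page to a file. `HocrExportSketch`, `SaveHocr`, and `outputPath` are hypothetical names. Passing 0 for `pageNum` is an assumption that the argument is a zero-based page number; this page does not document its meaning.

```csharp
using System.IO;

// Assumption: DynamicTesseract.Page resolves via the using directives
// your IronOcr version requires.
static class HocrExportSketch
{
    public static void SaveHocr(DynamicTesseract.Page page, string outputPath)
    {
        // Assumption: 0 selects the first (and only) page.
        // useXHtml defaults to false; true is passed here to request XHTML output.
        string hocr = page.GetHOCRText(0, useXHtml: true);

        File.WriteAllText(outputPath, hocr);
    }
}
```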
GetIterator()
Declaration
public ResultIterator GetIterator()
Returns
| Type | Description |
| --- | --- |
| ResultIterator | |
GetMeanConfidence()
Declaration
public float GetMeanConfidence()
Returns
| Type | Description |
| --- | --- |
| System.Single | |
GetSegmentedRegions(PageIteratorLevel)
Declaration
public List<Rectangle> GetSegmentedRegions(PageIteratorLevel pageIteratorLevel)
Parameters
| Type | Name | Description |
| --- | --- | --- |
| PageIteratorLevel | pageIteratorLevel | |
Returns
| Type | Description |
| --- | --- |
| System.Collections.Generic.List<IronSoftware.Drawing.Rectangle> | |
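A short sketch of enumerating the regions returned by GetSegmentedRegions follows. `SegmentedRegionsSketch` and `PrintRegions` are hypothetical names. The level is taken as a parameter because the members of PageIteratorLevel are not listed on this page, and each rectangle is printed through its default string conversion rather than through specific properties.

```csharp
using System;
using System.Collections.Generic;
using IronSoftware.Drawing;

// Assumption: DynamicTesseract.Page and PageIteratorLevel resolve via the
// using directives your IronOcr version requires.
static class SegmentedRegionsSketch
{
    // Prints every region found at the requested level and returns the count.
    public static int PrintRegions(DynamicTesseract.Page page, PageIteratorLevel level)
    {
        List<Rectangle> regions = page.GetSegmentedRegions(level);

        for (int i = 0; i < regions.Count; i++)
        {
            Console.WriteLine($"Region {i}: {regions[i]}");
        }

        return regions.Count;
    }
}
```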
GetText()
Declaration
public string GetText()
Returns
| Type | Description |
| --- | --- |
| System.String | |
GetUNLVText()
Declaration
public string GetUNLVText()
Returns
| Type | Description |
| --- | --- |
| System.String | |