Where does OcrInput.Page live in the IronOCR API?

OcrInput.Page is a class in the IronOcr namespace, shipped in IronOcr.dll, deriving from OcrInputPage. You obtain instances by enumerating the Pages collection of an OcrInput.

How do you read just one page of a multi-page OCR input in C#?

Load content into an OcrInput, enumerate its Pages, and pick the page you want. To limit the read to part of that page, set its ContentArea, then call IronTesseract.Read on the parent input.

Class OcrInput.Page

Represents a page within an OcrInput object. This can relate to one of many images appended, pages of a PDF or TIFF, or frames of a GIF.

Inheritance

System.Object

OcrInputPage

OcrInput.Page

Inherited Members

OcrInputPage.ToBitmap()

OcrInputPage.DrawRectanglesOnPage(Rectangle[], Color)

OcrInputPage.GetCropRectangleImage()

OcrInputPage.GetTextRegions()

OcrInputPage.FindTextRegion()

OcrInputPage.FindMultipleTextRegions()

OcrInputPage.SaveAsImage(String)

OcrInputPage.SaveAsImage(String, AnyBitmap.ImageFormat)

OcrInputPage.Width

OcrInputPage.HorizontalDPI

OcrInputPage.VerticalDPI

OcrInputPage.Height

OcrInputPage.Index

OcrInputPage.ContentArea

Namespace: IronOcr

Assembly: IronOcr.dll

Syntax

public class Page : OcrInputPage

OcrInput.Page is a single page inside an OcrInput, the unit you work with when an input holds more than one image. A page can correspond to an appended image, a page of a PDF or TIFF, or a frame of a GIF, so it lets you address an individual page or frame of a multi-page or multi-frame document before or after reading.

You obtain pages by enumerating the Pages collection of an OcrInput after loading content into it. Each page carries the members it inherits from OcrInputPage: Index for its position in the input, Width and Height for its pixel size, and HorizontalDPI and VerticalDPI for its resolution. ContentArea sets or reports the rectangle the engine should read, so it is the member to use when you only need part of a page.
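As a rough sketch of the enumeration described above, the following C# prints the inherited members of each page. It assumes that Pages is exposed as an enumerable member with exactly that name, as the description suggests, and that LoadPdf is the loading method in your IronOCR release (some older releases used AddPdf instead, so check the OcrInput reference for your version). The file name multi-page.pdf is a placeholder.

```csharp
using System;
using IronOcr;

// Assumption: LoadPdf is the loader name in your IronOCR release.
using var input = new OcrInput();
input.LoadPdf("multi-page.pdf"); // placeholder path

// Assumption: Pages is an enumerable collection on OcrInput,
// as described in the text above.
foreach (var page in input.Pages)
{
    // Index, Width, Height, HorizontalDPI and VerticalDPI are
    // inherited from OcrInputPage.
    Console.WriteLine(
        $"Page {page.Index}: {page.Width} x {page.Height} px, " +
        $"{page.HorizontalDPI} x {page.VerticalDPI} DPI");
}
```

Printing these values before a read is a cheap way to spot low-resolution pages that may need preprocessing.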

For preparing or inspecting a page, ToBitmap renders it to an AnyBitmap, SaveAsImage writes it out (optionally with an AnyBitmap.ImageFormat), and GetTextRegions, FindTextRegion, and FindMultipleTextRegions locate text areas on the page. Set the page-level options before calling IronTesseract.Read on the parent input.
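The inspect-then-read flow could look roughly like the sketch below. It saves a preview of one page with SaveAsImage(String) and then reads the parent input. As assumptions, Pages is taken to be an enumerable member of OcrInput, LoadPdf is taken to be the loader name in your release, and the file names scan.pdf and page-preview.png are placeholders.

```csharp
using System;
using System.Linq;
using IronOcr;

// Assumption: LoadPdf is the loader name in your IronOCR release.
using var input = new OcrInput();
input.LoadPdf("scan.pdf"); // placeholder path

// Assumption: Pages is enumerable, so LINQ's First() returns
// the first page of the input.
var firstPage = input.Pages.First();

// SaveAsImage(String) is inherited from OcrInputPage; it writes the
// page out so you can check what the engine will see.
firstPage.SaveAsImage("page-preview.png"); // placeholder path

// Read the parent input once the page-level options are in place.
var ocr = new IronTesseract();
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

Setting ContentArea on the page before the Read call would follow the same pattern. It is left out here because the rectangle type it expects is not stated in this section.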

The "Input PDFs" how-to covers reading multi-page documents, and the "OCR a region" how-to shows how to use the page content area.