    Class OcrInput

    OcrInput provides a robust class for preparing one or more image files, PDFs, IronSoftware.Drawing.AnyBitmap, SixLabors.ImageSharp.Image and System.Drawing.Bitmap objects, objects from other popular image libraries, Streams and binary image data for OCR. Instances of OcrInput can be read by the IronTesseract class.

    We recognise that much of the quality of OCR results depends on preparing images to be read. This class allows developers to enhance their scanned documents to provide faster, more accurate OCR results using filters such as: EnhanceResolution(Int32), DeNoise(Boolean), ToGrayScale(), Deskew(OcrLanguage, Int32, OrientationConfidence), Rotate(Double) and Sharpen().

    Supports multi-page OCR input.
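
    The workflow described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: "scan.png" is a hypothetical file path, and IronTesseract.Read(OcrInput) returning a result with a Text property is assumed, since this page only states that IronTesseract can read OcrInput instances.

```csharp
using System;
using IronOcr;

public static class OcrInputOverviewExample
{
    public static void Main()
    {
        // "scan.png" is a hypothetical path used only for illustration.
        using (var input = new OcrInput("scan.png"))
        {
            // Two of the parameterless filters named in the class description.
            input.ToGrayScale();
            input.Sharpen();

            // Assumption: IronTesseract.Read(OcrInput) returns a result exposing Text.
            var ocr = new IronTesseract();
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```

    The 'using' block disposes the OcrInput once the read completes, which matches the IDisposable guidance given for each constructor.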

    Inheritance
    System.Object
    OcrInput
    Implements
    System.IDisposable
    Namespace: IronOcr
    Assembly: IronOcr.dll
    Syntax
    public class OcrInput : Object, IDisposable

    Constructors

    OcrInput(AnyBitmap)

    Create a new OcrInput object populated with an IronSoftware.Drawing.AnyBitmap.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(AnyBitmap Bitmap)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap Bitmap

    An IronSoftware.Drawing.AnyBitmap.

    OcrInput(AnyBitmap, CropRectangle)

    Create a new OcrInput object populated with a specified region of an IronSoftware.Drawing.AnyBitmap.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(AnyBitmap Bitmap, CropRectangle ContentArea)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap Bitmap

    An IronSoftware.Drawing.AnyBitmap.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(CropRectangle, Object[])

    Create a new OcrInput object populated with one or more images sharing a common crop area.

    This class is IDisposable and is best initiated with a 'using' statement.

    This constructor accepts any number of images as File Paths, Streams, Byte Arrays, IronSoftware.Drawing.AnyBitmap, SixLabors.ImageSharp.Image, System.Drawing.Bitmap, or System.Drawing.Image. Each will become an OcrInput.Page.

    Declaration
    public OcrInput(CropRectangle ContentArea, params object[] Inputs)
    Parameters
    Type Name Description
    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Object[] Inputs

    Any number of images as File Paths, Streams, Byte Arrays, SixLabors.ImageSharp.Image, System.Drawing.Bitmap, or System.Drawing.Image.

    OcrInput(Image)

    Create a new OcrInput object populated with a SixLabors.ImageSharp.Image.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(Image Image)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image Image

    SixLabors.ImageSharp.Image

    OcrInput(Image, CropRectangle)

    Create a new OcrInput object populated with a specified region of a SixLabors.ImageSharp.Image.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(Image Image, CropRectangle ContentArea)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image Image

    SixLabors.ImageSharp.Image

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(Byte[])

    Create a new OcrInput object populated with an Image file as binary data.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(byte[] Bytes)
    Parameters
    Type Name Description
    System.Byte[] Bytes

    Bytes of an Image or PDF file.

    OcrInput(Byte[], CropRectangle)

    Create a new OcrInput object populated with an Image file as binary data.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(byte[] Bytes, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Byte[] Bytes

    Bytes of an Image or PDF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of each image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(IEnumerable<AnyBitmap>)

    Create a new OcrInput object populated with multiple IronSoftware.Drawing.AnyBitmap.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<AnyBitmap> Bitmaps)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<IronSoftware.Drawing.AnyBitmap> Bitmaps

    An IEnumerable of IronSoftware.Drawing.AnyBitmap.

    OcrInput(IEnumerable<AnyBitmap>, CropRectangle)

    Create a new OcrInput object populated with multiple IronSoftware.Drawing.AnyBitmap sharing a common ContentArea.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<AnyBitmap> Bitmaps, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<IronSoftware.Drawing.AnyBitmap> Bitmaps

    An IEnumerable of IronSoftware.Drawing.AnyBitmap.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(IEnumerable<Image>)

    Create a new OcrInput object populated with any number of SixLabors.ImageSharp.Image.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<Image> Images)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<SixLabors.ImageSharp.Image> Images

    Any Number of SixLabors.ImageSharp.Image

    OcrInput(IEnumerable<Image>, CropRectangle)

    Create a new OcrInput object populated with any number of SixLabors.ImageSharp.Image objects sharing a common ContentArea.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<Image> Images, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<SixLabors.ImageSharp.Image> Images

    Any Number of SixLabors.ImageSharp.Image

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(IEnumerable<Byte[]>)

    Create a new OcrInput object populated with the binary data of multiple Images.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<byte[]> Bytes)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.Byte[]> Bytes

    An IEnumerable of byte arrays containing Image or PDF files.

    OcrInput(IEnumerable<Byte[]>, CropRectangle)

    Create a new OcrInput object populated with the binary data of multiple Images with a common ContentArea.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<byte[]> Bytes, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.Byte[]> Bytes

    An IEnumerable of byte arrays containing Image or PDF files.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of each image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(IEnumerable<Stream>)

    Create a new OcrInput object populated with multiple images as Streams.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<Stream> Streams)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.IO.Stream> Streams

    An IEnumerable of Streams, each containing an Image or PDF file.

    OcrInput(IEnumerable<Stream>, CropRectangle)

    Create a new OcrInput object populated with multiple images as Streams sharing a common ContentArea.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<Stream> Streams, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.IO.Stream> Streams

    An IEnumerable of Streams, each containing an Image or PDF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(IEnumerable<String>)

    Create a new OcrInput object populated with multiple Image files.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<string> FilePaths)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.String> FilePaths

    An IEnumerable of paths to Image or PDF files.

    OcrInput(IEnumerable<String>, CropRectangle)

    Create a new OcrInput object populated with multiple Image files with a common ContentArea.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(IEnumerable<string> FilePaths, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.String> FilePaths

    An IEnumerable of paths to Image or PDF files.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of each image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(Stream)

    Create a new OcrInput object populated with image data as a Stream.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(Stream Stream)
    Parameters
    Type Name Description
    System.IO.Stream Stream

    Stream containing an Image or PDF file.

    OcrInput(Stream, CropRectangle)

    Create a new OcrInput object populated with image data as a Stream.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(Stream Stream, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.IO.Stream Stream

    Stream containing an Image or PDF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    OcrInput(Object[])

    Create a new OcrInput object to which images and PDF pages may be added.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(params object[] Inputs)
    Parameters
    Type Name Description
    System.Object[] Inputs

    Any number of images as File Paths, Streams, Byte Arrays, IronSoftware.Drawing.AnyBitmap and SixLabors.ImageSharp.Image.

    OcrInput(String)

    Create a new OcrInput object populated with an Image file or PDF document.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(string FilePath)
    Parameters
    Type Name Description
    System.String FilePath

    Path to an Image or PDF file.

    OcrInput(String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with an Image file.

    This class is IDisposable and is best initiated with a 'using' statement.

    Declaration
    public OcrInput(string FilePath, CropRectangle ContentArea, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String FilePath

    Path to an Image or PDF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Optional target DPI for the input content
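
    As a rough sketch of how a ContentArea might be passed to this constructor: the file name and pixel values below are hypothetical, and the CropRectangle(x, y, width, height) constructor and IronTesseract.Read(OcrInput) are assumptions that are not documented on this page.

```csharp
using System;
using IronOcr;
using IronSoftware.Drawing;

public static class ContentAreaExample
{
    public static void Main()
    {
        // Assumption: CropRectangle accepts x, y, width and height in pixels.
        var area = new CropRectangle(x: 215, y: 1250, width: 1335, height: 280);

        // "invoice.png" is a hypothetical path; DPI is left at its default (null).
        using (var input = new OcrInput("invoice.png", area))
        {
            // Assumption: IronTesseract.Read(OcrInput) returns a result exposing Text.
            var result = new IronTesseract().Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
```

    Restricting OCR to a known region, such as a totals box on a fixed-layout form, is the case where the documented speed benefit is most likely to matter.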

    Fields

    OriginaPdfPageDimensions

    Declaration
    public List<PdfPage> OriginaPdfPageDimensions
    Field Value
    Type Description
    System.Collections.Generic.List<IronSoftware.Pdfium.PdfPage>

    TargetDPI

    The resolution that low resolution images will be enhanced to. To disable upscaling, set this to 0 (will affect read quality).

    TargetDPI also determines the resolution at which PDF documents will be sampled.

    Declaration
    public int TargetDPI
    Field Value
    Type Description
    System.Int32
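
    A small sketch of how TargetDPI might interact with PDF sampling, assuming "report.pdf" is a hypothetical file. It relies on the AddPdf(String, Int32, String) overload documented on this page, whose DPI parameter falls back to TargetDPI when zero.

```csharp
using System;
using IronOcr;

public static class TargetDpiExample
{
    public static void Main()
    {
        // The params constructor accepts zero inputs, giving an empty OcrInput.
        using (var input = new OcrInput())
        {
            // Set the field before adding content so it applies when the PDF is sampled.
            input.TargetDPI = 300;

            // A DPI of 0 defers to TargetDPI, per the parameter description.
            input.AddPdf("report.pdf", 0);

            Console.WriteLine(input.Pages.Count);
        }
    }
}
```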

    Properties

    Pages

    Access to every OcrInput.Page within this OcrInput

    Declaration
    public List<OcrInput.Page> Pages { get; }
    Property Value
    Type Description
    System.Collections.Generic.List<OcrInput.Page>

    Title

    A title for the OcrInput document. This is relevant as it becomes metadata when exporting searchable PDFs and HOCR files from IronTesseract results.

    See SaveAsSearchablePdf(String) and SaveAsHocrFile(String)

    Declaration
    public string Title { get; set; }
    Property Value
    Type Description
    System.String
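
    The following sketch shows one way Title could be set before exporting. The file names are hypothetical, and it assumes that IronTesseract.Read(OcrInput) returns a result object on which SaveAsSearchablePdf(String) is available.

```csharp
using IronOcr;

public static class TitleExample
{
    public static void Main()
    {
        // "contract.png" and "contract-searchable.pdf" are hypothetical paths.
        using (var input = new OcrInput("contract.png"))
        {
            // The title is intended to become metadata in the exported document.
            input.Title = "Scanned Contract";

            // Assumption: the read result exposes SaveAsSearchablePdf(String).
            var result = new IronTesseract().Read(input);
            result.SaveAsSearchablePdf("contract-searchable.pdf");
        }
    }
}
```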

    Methods

    AdaptiveThreshold(Nullable<Single>)

    Applies Bradley Adaptive Threshold to the image.

    Adaptive thresholding calculates the threshold value separately for smaller regions of the image, so different regions can have different threshold values.

    Declaration
    public OcrInput AdaptiveThreshold(Nullable<float> thresholdLimit = null)
    Parameters
    Type Name Description
    System.Nullable<System.Single> thresholdLimit

    Threshold limit (0.0-1.0) to use for binarization.

    0.0 treats the threshold as completely white.

    1.0 treats the threshold as completely black.

    Returns
    Type Description
    OcrInput

    This OcrInput object, allowing LINQ-style fluent notation.
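
    A minimal sketch of calling AdaptiveThreshold and using its return value. The path "faded-receipt.png" and the limit 0.5f are hypothetical choices for illustration, not recommended settings.

```csharp
using System;
using IronOcr;

public static class AdaptiveThresholdExample
{
    public static void Main()
    {
        // "faded-receipt.png" is a hypothetical path.
        using (var input = new OcrInput("faded-receipt.png"))
        {
            // 0.5f is an arbitrary mid-range limit; omit the argument to use the default (null).
            OcrInput filtered = input.AdaptiveThreshold(0.5f);

            // The documented return value is the OcrInput itself, so pages are read from it directly.
            Console.WriteLine(filtered.Pages.Count);
        }
    }
}
```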

    Add(OcrInput, CropRectangle)

    Adds all pages of an OcrInput to this OcrInput.

    Declaration
    public void Add(OcrInput imageAsOcrInput, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    OcrInput imageAsOcrInput

    OcrInput object to be added to this OcrInput.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area to use of each page of the OcrInput object.

    Add(OcrInput.Page, CropRectangle)

    Adds an OcrInput.Page to this OcrInput.

    Declaration
    public void Add(OcrInput.Page imageAsOcrInputPage, CropRectangle ContentArea)
    Parameters
    Type Name Description
    OcrInput.Page imageAsOcrInputPage

    Page to be added.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the page to be added.

    Add(AnyBitmap, CropRectangle)

    Adds an IronSoftware.Drawing.AnyBitmap to this OcrInput.

    Declaration
    public void Add(AnyBitmap imageAsBitmap, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap imageAsBitmap

    A managed IronSoftware.Drawing.AnyBitmap object.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the image to use with IronOCR.

    Add(Image, CropRectangle)

    Adds a SixLabors.ImageSharp.Image to this OcrInput.

    Declaration
    public void Add(Image image, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image image

    A managed Image object.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the image to use with IronOCR.

    Add(Byte[], CropRectangle)

    Adds a byte array containing the binary data of an image to this OcrInput.

    Declaration
    public void Add(byte[] imageAsByteArray, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Byte[] imageAsByteArray

    A byte[] containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the image to use with IronOCR.

    Add(IEnumerable<OcrInput.Page>, CropRectangle)

    Adds an IEnumerable of OcrInput.Page objects to this OcrInput.

    Declaration
    public void Add(IEnumerable<OcrInput.Page> imagesAsOcrInputPages, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<OcrInput.Page> imagesAsOcrInputPages

    Pages to be added.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every page to be added.

    Add(IEnumerable<AnyBitmap>, CropRectangle)

    Adds an IEnumerable of IronSoftware.Drawing.AnyBitmap to this OcrInput.

    Declaration
    public void Add(IEnumerable<AnyBitmap> imageAsBitmaps, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<IronSoftware.Drawing.AnyBitmap> imageAsBitmaps

    An IEnumerable of managed IronSoftware.Drawing.AnyBitmap objects.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every image to use with IronOCR.

    Add(IEnumerable<Image>, CropRectangle)

    Adds an IEnumerable of SixLabors.ImageSharp.Images to this OcrInput.

    Declaration
    public void Add(IEnumerable<Image> images, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<SixLabors.ImageSharp.Image> images

    IEnumerable of managed Image objects.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every image to use with IronOCR.

    Add(IEnumerable<Byte[]>, CropRectangle)

    Adds an IEnumerable of byte arrays containing the binary data of images to this OcrInput.

    Declaration
    public void Add(IEnumerable<byte[]> imagesAsByteArrays, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.Byte[]> imagesAsByteArrays

    An IEnumerable of byte[] containing image data. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every image to use with IronOCR.

    Add(IEnumerable<Stream>, CropRectangle)

    Adds an IEnumerable of System.IO.Stream containing the raw data of images to this OcrInput.

    Declaration
    public void Add(IEnumerable<Stream> sourceStreams, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.IO.Stream> sourceStreams

    An IEnumerable of Streams containing raw data of images. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every image to use with IronOCR.

    Add(IEnumerable<String>, CropRectangle)

    Adds images to this OcrInput.

    Declaration
    public void Add(IEnumerable<string> imageFilePaths, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.String> imageFilePaths

    IEnumerable of image file paths. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of every image to use with IronOCR.

    Add(Stream, CropRectangle)

    Adds a System.IO.Stream containing the raw data of an image to this OcrInput.

    Declaration
    public void Add(Stream sourceStream, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.IO.Stream sourceStream

    A Stream containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the image to use with IronOCR.

    Add(String, CropRectangle)

    Adds an image to this OcrInput.

    Declaration
    public void Add(string imageFilePath, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.String imageFilePath

    File path to an image file. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Area of the image to use with IronOCR.

    AddFrameFromTiff(Byte[], Int32, CropRectangle)

    Adds a single frame (a page) from a Multi-frame TIFF file to the OcrInput document. The Tiff may be input as a file, byte array or stream.

    Each Frame will become a page of this OcrInput

    Declaration
    public void AddFrameFromTiff(byte[] TiffBytes, int FrameIndex, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.Byte[] TiffBytes

    A byte[] containing a TIFF file.

    System.Int32 FrameIndex

    Zero based frame number.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddFrameFromTiff(Stream, Int32, CropRectangle)

    Adds a single frame (a page) from a Multi-frame TIFF file to the OcrInput document. The Tiff may be input as a file, byte array or stream.

    Each Frame will become a page of this OcrInput

    Declaration
    public void AddFrameFromTiff(Stream TiffStream, int FrameIndex, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.IO.Stream TiffStream

    A Stream containing a TIFF file.

    System.Int32 FrameIndex

    Zero based frame number.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddFrameFromTiff(String, Int32, CropRectangle)

    Adds a single frame (a page) from a Multi-frame TIFF file to the OcrInput document. The Tiff may be input as a file, byte array or stream.

    Each Frame will become a page of this OcrInput

    Declaration
    public void AddFrameFromTiff(string TiffPath, int FrameIndex, CropRectangle ContentArea = null)
    Parameters
    Type Name Description
    System.String TiffPath

    A file path to a TIFF image.

    System.Int32 FrameIndex

    Zero based frame number.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.
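
    To illustrate the zero-based FrameIndex, the sketch below adds the first and third frames of a hypothetical file "batch.tiff" as separate pages. It assumes the TIFF contains at least three frames.

```csharp
using System;
using IronOcr;

public static class TiffFrameExample
{
    public static void Main()
    {
        using (var input = new OcrInput())
        {
            // "batch.tiff" is a hypothetical multi-frame TIFF with three or more frames.
            input.AddFrameFromTiff("batch.tiff", 0); // first frame
            input.AddFrameFromTiff("batch.tiff", 2); // third frame

            // Each added frame is documented to become one page.
            Console.WriteLine(input.Pages.Count);
        }
    }
}
```

    When every frame is wanted, AddMultiFrameTiff avoids the need to know the frame count in advance.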

    AddImage(AnyBitmap)

    Adds an IronSoftware.Drawing.AnyBitmap to this OcrInput.

    Declaration
    public void AddImage(AnyBitmap Bitmap)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap Bitmap

    A managed IronSoftware.Drawing.AnyBitmap object.

    AddImage(AnyBitmap, CropRectangle)

    Adds an IronSoftware.Drawing.AnyBitmap to this OcrInput.

    Declaration
    public void AddImage(AnyBitmap Bitmap, CropRectangle ContentArea)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap Bitmap

    A managed IronSoftware.Drawing.AnyBitmap object.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the bitmap to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddImage(AnyBitmap, CropRectangle[])

    Adds an IronSoftware.Drawing.AnyBitmap to this OcrInput with multiple content area regions. If an empty array is supplied, the whole image is used instead.

    Note: When multiple crop rectangles are used, the PDF output by SaveAsSearchablePdf will contain one page per rectangle.

    Declaration
    public void AddImage(AnyBitmap Bitmap, CropRectangle[] Rectangles)
    Parameters
    Type Name Description
    IronSoftware.Drawing.AnyBitmap Bitmap

    A managed IronSoftware.Drawing.AnyBitmap object.

    IronSoftware.Drawing.CropRectangle[] Rectangles

    Array of crop rectangles of various content regions.
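
    A hedged sketch of reading several regions from one bitmap. The file name and coordinates are hypothetical, and both AnyBitmap.FromFile(String) and the CropRectangle(x, y, width, height) constructor are assumptions about IronSoftware.Drawing that this page does not document.

```csharp
using System;
using IronOcr;
using IronSoftware.Drawing;

public static class MultiRegionExample
{
    public static void Main()
    {
        // Assumption: AnyBitmap.FromFile loads a bitmap from a path ("form.png" is hypothetical).
        AnyBitmap bitmap = AnyBitmap.FromFile("form.png");

        // Hypothetical regions: a header strip and a footer strip, in pixels.
        var regions = new[]
        {
            new CropRectangle(x: 0, y: 0, width: 800, height: 120),
            new CropRectangle(x: 0, y: 980, width: 800, height: 120)
        };

        using (var input = new OcrInput())
        {
            input.AddImage(bitmap, regions);
            Console.WriteLine(input.Pages.Count);
        }
    }
}
```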

    AddImage(Image)

    Adds a SixLabors.ImageSharp.Image to this OcrInput.

    Declaration
    public void AddImage(Image Image)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image Image

    A managed Image object.

    AddImage(Image, CropRectangle)

    Adds a SixLabors.ImageSharp.Image to this OcrInput.

    Declaration
    public void AddImage(Image Image, CropRectangle ContentArea)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image Image

    A managed Image object.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddImage(Image, CropRectangle[])

    Adds a SixLabors.ImageSharp.Image to this OcrInput with multiple content area regions. If an empty array is supplied, the whole image is used instead.

    Note: When multiple crop rectangles are used, the PDF output by SaveAsSearchablePdf will contain one page per rectangle.

    Declaration
    public void AddImage(Image Image, CropRectangle[] Rectangles)
    Parameters
    Type Name Description
    SixLabors.ImageSharp.Image Image

    A managed Image object.

    IronSoftware.Drawing.CropRectangle[] Rectangles

    Array of crop rectangles of various content regions.

    AddImage(Byte[])

    Adds a byte array containing the binary data of an image to this OcrInput.

    Declaration
    public void AddImage(byte[] ImageBytes)
    Parameters
    Type Name Description
    System.Byte[] ImageBytes

    A byte[] containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    AddImage(Byte[], CropRectangle)

    Adds a byte array containing the binary data of an image to this OcrInput.

    Declaration
    public void AddImage(byte[] ImageBytes, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Byte[] ImageBytes

    A byte[] containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddImage(Stream)

    Adds a System.IO.Stream containing the raw data of an image to this OcrInput.

    Declaration
    public void AddImage(Stream ImageStream)
    Parameters
    Type Name Description
    System.IO.Stream ImageStream

    A Stream containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    AddImage(Stream, CropRectangle)

    Adds a System.IO.Stream containing the raw data of an image to this OcrInput.

    Declaration
    public void AddImage(Stream ImageStream, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.IO.Stream ImageStream

    A Stream containing an image. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddImage(String)

    Adds an image file to this OcrInput.

    Declaration
    public void AddImage(string ImagePath)
    Parameters
    Type Name Description
    System.String ImagePath

    File path to an image file. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    AddImage(String, CropRectangle)

    Adds an image file to this OcrInput.

    Declaration
    public void AddImage(string ImagePath, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.String ImagePath

    File path to an image file. Supported formats include JPEG, TIFF, GIF, PNG, PDF, BMP.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddMultiFrameTiff(Byte[])

    Adds a byte[] containing the binary data of a TIFF image with multiple pages to this OcrInput.

    Declaration
    public void AddMultiFrameTiff(byte[] TiffBytes)
    Parameters
    Type Name Description
    System.Byte[] TiffBytes

    A byte[] containing a TIFF file.

    AddMultiFrameTiff(Byte[], CropRectangle)

    Adds a byte[] containing the binary data of a TIFF image with multiple pages to this OcrInput.

    Declaration
    public void AddMultiFrameTiff(byte[] TiffBytes, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.Byte[] TiffBytes

    A byte[] containing a TIFF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddMultiFrameTiff(Stream)

    Adds a Stream containing the binary data of a TIFF image with multiple pages to this OcrInput.

    Declaration
    public void AddMultiFrameTiff(Stream TiffStream)
    Parameters
    Type Name Description
    System.IO.Stream TiffStream

    A System.IO.Stream containing a TIFF file.

    AddMultiFrameTiff(Stream, CropRectangle)

    Adds a Stream containing the binary data of a TIFF image with multiple pages to this OcrInput.

    Declaration
    public void AddMultiFrameTiff(Stream TiffStream, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.IO.Stream TiffStream

    A System.IO.Stream containing a TIFF file.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddMultiFrameTiff(String)

    Adds a Multi-frame TIFF file to the OcrInput document.

    Each Frame will become a page of this OcrInput

    Declaration
    public void AddMultiFrameTiff(string ImagePath)
    Parameters
    Type Name Description
    System.String ImagePath

    A file path to a TIFF image.

    AddMultiFrameTiff(String, CropRectangle)

    Adds a Multi-frame TIFF file to the OcrInput document.

    Each Frame will become a page of this OcrInput

    Declaration
    public void AddMultiFrameTiff(string ImagePath, CropRectangle ContentArea)
    Parameters
    Type Name Description
    System.String ImagePath

    A file path to a TIFF image.

    IronSoftware.Drawing.CropRectangle ContentArea

    Optionally specifies a region of the image to extract text from as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    AddPdf(Byte[], String, CropRectangle, Nullable<Int32>)

    Adds all pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdf(byte[] PdfBytes, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    Binary data of a PDF file

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdf(Stream, String, CropRectangle, Nullable<Int32>)

    Adds all pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdf(Stream PdfStream, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    System.IO.Stream containing a PDF

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdf(String, Int32, String)

    Adds all pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdf(string PdfPath, int DPI, string Password = null)
    Parameters
    Type Name Description
    System.String PdfPath

    String file path to the PDF

    System.Int32 DPI

    Resolution at which to sample the PDF. If zero, TargetDPI is used.

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    AddPdf(String, String, CropRectangle, Nullable<Int32>)

    Adds all pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdf(string PdfPath, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    String file path to the PDF

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPage(Byte[], Int32, String, CropRectangle, Nullable<Int32>)

    Adds one page of a PDF document to this OcrInput.

    Declaration
    public void AddPdfPage(byte[] PdfBytes, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    Binary data of a PDF file

    System.Int32 Page

    The page number within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPage(Stream, Int32, String, CropRectangle, Nullable<Int32>)

    Adds one page of a PDF document to this OcrInput.

    Declaration
    public void AddPdfPage(Stream PdfStream, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    System.IO.Stream containing a PDF

    System.Int32 Page

    The page number within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPage(String, Int32, String, CropRectangle, Nullable<Int32>)

    Adds one page of a PDF document to this OcrInput.

    Declaration
    public void AddPdfPage(string PdfPath, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    String file path to the PDF

    System.Int32 Page

    The page number within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPages(Byte[], IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Adds selected pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdfPages(byte[] PdfBytes, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    Binary data of a PDF file

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The page numbers within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPages(Stream, IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Adds selected pages of a PDF document to this OcrInput.

    Declaration
    public void AddPdfPages(Stream PdfStream, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    System.IO.Stream containing a PDF

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The page numbers within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    AddPdfPages(String, IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Adds selected pages from a PDF document into this OcrInput.

    Declaration
    public void AddPdfPages(string PdfPath, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    String file path to the PDF

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The page numbers within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.
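
    The Page and Pages parameters of AddPdfPage and AddPdfPages are zero based, while page numbers printed on a document usually start at 1. The following is a minimal sketch of a conversion helper. ToZeroBased is a hypothetical name introduced for illustration and is not part of IronOcr.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of IronOcr): converts 1-based page numbers,
// as a person would read them, into the zero-based indexes that the
// Pages parameter expects.
static int[] ToZeroBased(IEnumerable<int> oneBasedPages) =>
    oneBasedPages
        .Select(p => p >= 1
            ? p - 1
            : throw new ArgumentOutOfRangeException(nameof(oneBasedPages), "Page numbers start at 1."))
        .ToArray();

int[] pages = ToZeroBased(new[] { 1, 3, 5 });
Console.WriteLine(string.Join(", ", pages)); // prints 0, 2, 4
```

    The resulting array can be passed as the Pages argument of any AddPdfPages overload, since int[] implements IEnumerable<int>.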

    AddRange(OcrInput)

    Combines two instances of OcrInput, appending the pages of the supplied OcrInput to the end of this OcrInput document.

    Declaration
    public void AddRange(OcrInput Range)
    Parameters
    Type Name Description
    OcrInput Range

    An OcrInput to be appended to this instance.
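
    As a rough illustration of AddRange, the sketch below loads two PDFs with the FromPdf factory documented on this page and appends one to the other. The file names are hypothetical, and the IronOcr package is assumed to be referenced by the project.

```csharp
using System;
using IronOcr;

// "part1.pdf" and "part2.pdf" are hypothetical file paths used for illustration.
using (OcrInput first = OcrInput.FromPdf("part1.pdf"))
using (OcrInput second = OcrInput.FromPdf("part2.pdf"))
{
    int before = first.PageCount();

    // Appends the pages of 'second' to the end of 'first'.
    first.AddRange(second);

    Console.WriteLine($"{before} pages before AddRange, {first.PageCount()} pages after.");
}
```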

    ApplyMultipleFilters(OcrFilters, Double, Int32, Int32, Int32, Boolean, Nullable<Int32>)

    Applies multiple imaging filters using the specified parameters.

    Declaration
    public void ApplyMultipleFilters(OcrFilters filters, double Rotation = 0, int MaxDeskewAngle = 45, int MaxWidth = 0, int MaxHeight = 0, bool Use3x3 = false, Nullable<int> thresholdLimit = null)
    Parameters
    Type Name Description
    OcrFilters filters

    Filters to apply

    System.Double Rotation

    Rotation amount. Required when using the Rotation filter. See Rotate.

    System.Int32 MaxDeskewAngle

    Optional MaxDeskewAngle amount when using the Deskew filter. Defaults to 45 degrees. See Deskew.

    System.Int32 MaxWidth

    Maximum width. Required when using the Scale filter. See Scale.

    System.Int32 MaxHeight

    Maximum height. Required when using the Scale filter. See Scale.

    System.Boolean Use3x3

    Optional morphology when using the Despeckle (DeNoise), Dilate, or Erode filter. See DeNoise, Dilate, or Erode.

    System.Nullable<System.Int32> thresholdLimit

    Optional Threshold limit (0.0-1.0) to consider for binarization when using the Bradley Adaptive Threshold.

    Remarks

    This method serves as an alternative way to apply multiple filters. Filters are applied in what is typically the optimal order.

    Exceptions
    Type Condition
    System.ArgumentOutOfRangeException

    Binarize()

    This image filter turns every pixel black or white with no middle ground. May improve OCR performance in cases of very low contrast between text and background.

    Declaration
    public OcrInput Binarize()
    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.
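
    To make "black or white with no middle ground" concrete, here is a minimal sketch of binarization with a fixed global threshold. It is only an illustration of the idea: the threshold of 128 and the helper name ToBlackOrWhite are assumptions, and this page does not state which thresholding method Binarize() uses internally.

```csharp
using System;

// Hypothetical helper: maps an 8-bit gray value to pure black (0) or pure white (255).
// The default threshold of 128 is an assumption made for this sketch.
static byte ToBlackOrWhite(byte gray, byte threshold = 128) =>
    gray >= threshold ? (byte)255 : (byte)0;

// A low-contrast scan: dark-gray text (110) on a light-gray background (140).
Console.WriteLine(ToBlackOrWhite(110)); // prints 0
Console.WriteLine(ToBlackOrWhite(140)); // prints 255
```

    After thresholding, the 30-level difference between text and background becomes the full 0 to 255 range.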

    Close(Boolean)

    Advanced Morphology.

    Closing is the reverse of Opening: Dilation followed by Erosion. It is useful for closing small holes inside foreground objects.

    Declaration
    public OcrInput Close(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Contrast(Single)

    Increases contrast automatically. This filter often improves OCR speed and accuracy in low contrast scans. Flattens Alpha channels to white.

    Declaration
    public OcrInput Contrast(float amount = 1.1F)
    Parameters
    Type Name Description
    System.Single amount

    Amount used to adjust contrast. A value of 0 will create an image that is completely gray. A value of 1 leaves the input unchanged.

    Amount values greater than 1 increase contrast, making light areas lighter and dark areas darker.

    Amount values less than 1 decrease contrast, reducing the difference between light and dark areas.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.
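
    The behaviour described for the amount parameter (0 gives a completely gray image, 1 leaves the input unchanged) matches a common linear contrast formula that scales each value around mid-gray. The sketch below shows that formula on a single 8-bit channel. It is an assumption for illustration, not a statement of how IronOcr implements Contrast, and AdjustContrast is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: scales the distance of a channel value from mid-gray (128)
// by 'amount', then clamps the result to the valid 0..255 range.
static byte AdjustContrast(byte value, float amount)
{
    float scaled = (value - 128f) * amount + 128f;
    return (byte)Math.Clamp((int)MathF.Round(scaled), 0, 255);
}

Console.WriteLine(AdjustContrast(100, 1.5f)); // prints 86
```

    With this formula an amount of 0 maps every value to 128, an amount of 1 returns the input, and larger amounts push values away from mid-gray until they clip at 0 or 255.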

    DeNoise(Boolean)

    Removes digital noise. This filter should only be used where noise is expected. Flattens Alpha channels to white.

    Declaration
    public OcrInput DeNoise(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    See Also
    Despeckle(Boolean)

    Deskew(Int32)

    Rotates an image so it is the right way up and orthogonal. This is very useful for OCR because Tesseract tolerance for skewed scans can be as low as 5 degrees.

    This also helps when producing searchable PDF documents from IronTesseract because the pages will likely all be the right way up.

    This version uses only a Hough transform to make minor corrections. Example: pages that were put in a scanner at a slight angle.

    Declaration
    public bool Deskew(int MaxDeskewAngle = 45)
    Parameters
    Type Name Description
    System.Int32 MaxDeskewAngle

    Maximum angle of skew to correct for. Higher values can lead to more opportunity for correction, but may be slower and more prone to error including upside down pages.

    Returns
    Type Description
    System.Boolean

    Returns a boolean result of whether or not IronOCR was able to detect image orientation. True = Deskew was applied. False = Failed to detect image orientation and image remains unchanged.

    See Also
    Rotation
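
    Unlike most filters on this page, Deskew(Int32) returns a bool rather than the OcrInput, so it cannot sit in the middle of a fluent chain. A minimal sketch of combining it with chained filters follows. The file name is hypothetical, the angle and contrast values are arbitrary, and the IronOcr package is assumed to be referenced.

```csharp
using System;
using IronOcr;

// "scan.pdf" is a hypothetical file path used for illustration.
using (OcrInput input = OcrInput.FromPdf("scan.pdf"))
{
    // Deskew reports whether orientation was detected and a correction applied.
    bool straightened = input.Deskew(MaxDeskewAngle: 10);
    if (!straightened)
    {
        Console.WriteLine("Orientation not detected; pages were left unchanged.");
    }

    // Filters that return OcrInput can be chained in fluent style.
    input.DeNoise().Contrast(1.2F);
}
```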

    Despeckle(Boolean)

    Despeckle is an alias of DeNoise(Boolean), provided to make this method easy to find in IntelliSense.

    Declaration
    public OcrInput Despeckle(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Dilate(Boolean)

    Advanced Morphology. Dilation is the opposite of Erosion: instead of shrinking the foreground object, it expands it.

    Opposite of Erode(Boolean).

    Declaration
    public OcrInput Dilate(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Dispose()

    OcrInput is IDisposable. For best practice and to avoid memory leaks, remember to dispose, or initialize instances with a "using" statement.

    Declaration
    public void Dispose()

    Dispose(Boolean)

    OcrInput is IDisposable. For best practice and to avoid memory leaks, remember to dispose, or initialize instances with a "using" statement.

    Declaration
    public void Dispose(bool disposing = true)
    Parameters
    Type Name Description
    System.Boolean disposing

    EnhanceResolution(Int32)

    Enhances the resolution of low quality images. This filter is not often needed because TargetDPI will automatically catch and resolve low resolution inputs.

    May not work for all images if their metadata is corrupted.

    Declaration
    public OcrInput EnhanceResolution(int TargetDPI = 225)
    Parameters
    Type Name Description
    System.Int32 TargetDPI

    The target DPI to resample to.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Erode(Boolean)

    Advanced Morphology. Erosion is the morphological operation used to diminish the size of the foreground object.

    Opposite of Dilate(Boolean).

    Declaration
    public OcrInput Erode(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Finalize()

    OcrInput has a safe finaliser that cleans up undisposed native images in memory.

    Declaration
    protected override void Finalize()

    FindMultipleTextRegions(Double, Int32, Boolean, Boolean)

    Use computer vision to detect areas which contain text elements and divide the input into separate images based on text regions.

    Declaration
    public void FindMultipleTextRegions(double Scale = 0, int DilationAmount = -1, bool Binarize = true, bool Invert = false)
    Parameters
    Type Name Description
    System.Double Scale

    (Only used during text region detection) Resolution scale factor. Image width and height will be multiplied by this value.

    System.Int32 DilationAmount

    (Only used during text region detection) Dilation amount, in pixels. Text areas width and height will be increased by this value.

    System.Boolean Binarize

    (Only used during text region detection) True to convert the image to black and white, False otherwise

    System.Boolean Invert

    (Only used during text region detection) True to invert image colors during binarization, False otherwise

    Remarks

    Useful for generating several OCR results from a single image/page

    FindTextRegion(Double, Int32, Boolean, Boolean)

    Use computer vision to detect regions which contain text elements on each page

    Declaration
    public void FindTextRegion(double Scale = 0, int DilationAmount = -1, bool Binarize = true, bool Invert = false)
    Parameters
    Type Name Description
    System.Double Scale

    (Only used during text region detection) Resolution scale factor. Image width and height will be multiplied by this value.

    System.Int32 DilationAmount

    (Only used during text region detection) Dilation amount, in pixels. Text areas width and height will be increased by this value.

    System.Boolean Binarize

    (Only used during text region detection) True to convert the image to black and white, False otherwise

    System.Boolean Invert

    (Only used during text region detection) True to invert image colors when binarizing, False otherwise

    FromPdf(Byte[], String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a PDF as binary data.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdf(byte[] PdfBytes, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    The PDF document as binary data in memory.

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdf(Stream, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a PDF as a Stream.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdf(Stream PdfStream, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    The PDF document as a System.IO.Stream.

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdf(String, Int32, String)

    Create a new OcrInput object populated with a PDF.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdf(string PdfPath, int DPI, string Password = null)
    Parameters
    Type Name Description
    System.String PdfPath

    File path to the PDF

    System.Int32 DPI

    Resolution at which to sample the PDF. If zero, TargetDPI is used.

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    Returns
    Type Description
    OcrInput

    FromPdf(String, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a PDF.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdf(string PdfPath, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    File path to the PDF

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPage(Byte[], Int32, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a single page from a PDF as binary data.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. The selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPage(byte[] PdfBytes, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    The PDF document as binary data in memory.

    System.Int32 Page

    The page number within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPage(Stream, Int32, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a single page from a PDF as a Stream.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. The selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPage(Stream PdfStream, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    The PDF document as a System.IO.Stream.

    System.Int32 Page

    The page number within the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPage(String, Int32, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with a single page of a PDF.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. The selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPage(string PdfPath, int Page, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    File path to the PDF

    System.Int32 Page

    Which page of the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPages(Byte[], IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with multiple pages from a PDF as binary data.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPages(byte[] PdfBytes, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.Byte[] PdfBytes

    The PDF document as binary data in memory.

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The pages of the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPages(Stream, IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with multiple selected pages from a PDF as a Stream.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPages(Stream PdfStream, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.IO.Stream PdfStream

    The PDF document as a System.IO.Stream.

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The pages of the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput

    FromPdfPages(String, IEnumerable<Int32>, String, CropRectangle, Nullable<Int32>)

    Create a new OcrInput object populated with multiple pages from a PDF.

    This class is IDisposable and is best initiated with a 'using' statement.

    This method accepts PDFs as file paths, Streams, or byte arrays. Each selected page will become an OcrInput.Page.

    Declaration
    public static OcrInput FromPdfPages(string PdfPath, IEnumerable<int> Pages, string Password = null, CropRectangle ContentArea = null, Nullable<int> DPI = null)
    Parameters
    Type Name Description
    System.String PdfPath

    File path to the PDF

    System.Collections.Generic.IEnumerable<System.Int32> Pages

    The pages of the PDF to read. Zero based (first page is number 0)

    System.String Password

    Optional Password to unlock an encrypted or protected PDF

    IronSoftware.Drawing.CropRectangle ContentArea

    Specifies a region of the image to extract text from, as an IronSoftware.Drawing.CropRectangle with X, Y, Width and Height in pixels. Setting a ContentArea can improve OCR speed.

    System.Nullable<System.Int32> DPI

    Resolution at which to sample the PDF. If null or zero, TargetDPI is used.

    Returns
    Type Description
    OcrInput
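
    A minimal sketch of the FromPdfPages factory in use is shown below: it loads the first and third pages of a PDF at 300 DPI and passes the result to IronTesseract. The file name is hypothetical, and IronTesseract.Read with the OcrResult.Text property is assumed from the IronTesseract class, which is not documented on this page.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// "report.pdf" is a hypothetical file path. Page indexes are zero based,
// so { 0, 2 } selects the first and third pages.
using (OcrInput input = OcrInput.FromPdfPages("report.pdf", new[] { 0, 2 }, DPI: 300))
{
    Console.WriteLine($"Pages loaded: {input.PageCount()}");

    // Assumption: IronTesseract.Read(OcrInput) returns an OcrResult exposing Text.
    OcrResult result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

    Loading only the pages that are needed keeps the OcrInput small, and the 'using' block disposes it as recommended above.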

    HighlightTextAndSaveAsImages(IronTesseract, String, ResultHighlightType)

    Based on the ResultHighlightType, will draw red boxes around characters/words/lines/paragraphs detected, and save to a PNG image.

    For best results, perform all filters before calling.

    Declaration
    public void HighlightTextAndSaveAsImages(IronTesseract tesseract, string filename, ResultHighlightType type)
    Parameters
    Type Name Description
    IronTesseract tesseract

    IronTesseract instance used to scan the OcrInput.

    System.String filename

    Files will be saved as 'filename_page_0.png'. You may use an absolute or relative path.

    ResultHighlightType type

    Choose whether each box represents a character, word, line, or paragraph.

    HoughTransformStraighten(Int32)

    Uses a Hough Transform to rotate an image to the nearest 90 degrees of straightness. This is very useful for OCR because Tesseract tolerance for skewed scans can be as low as 5 degrees.

    A synonym of Deskew(Int32).

    Declaration
    public OcrInput HoughTransformStraighten(int MaxDeskewAngle = 45)
    Parameters
    Type Name Description
    System.Int32 MaxDeskewAngle

    Maximum angle of skew to correct for. Higher values can lead to more opportunity for correction, but may be slower and more prone to error including upside down pages.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    See Also
    Rotation

    Invert(Boolean)

    Inverts every color. E.g. white becomes black and black becomes white.

    Declaration
    public OcrInput Invert(bool GrayScale = true)
    Parameters
    Type Name Description
    System.Boolean GrayScale

    Optionally remove all color channels and return a GrayScale image.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Open(Boolean)

    Advanced Morphology.

    Opening is another name for erosion followed by dilation. It is useful for removing noise.

    Declaration
    public OcrInput Open(bool use3x3 = false)
    Parameters
    Type Name Description
    System.Boolean use3x3

    2x2 is default morphology

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.
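
    The relationship between Erode, Dilate, Open and Close can be sketched on a small binary image, where true marks a foreground pixel. This is a textbook illustration using a 3x3 neighbourhood (comparable to use3x3 = true). It is not IronOcr's implementation, and every function name in it is hypothetical.

```csharp
using System;

// Applies a 3x3 neighbourhood test to every pixel of a binary image.
// erode = true:  a pixel stays set only if all in-bounds neighbours are set.
// erode = false: a pixel becomes set if any in-bounds neighbour is set (dilation).
static bool[,] Apply(bool[,] src, bool erode)
{
    int height = src.GetLength(0), width = src.GetLength(1);
    var dst = new bool[height, width];
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            bool result = erode; // erosion accumulates with AND, dilation with OR
            for (int dy = -1; dy <= 1; dy++)
            {
                for (int dx = -1; dx <= 1; dx++)
                {
                    int ny = y + dy, nx = x + dx;
                    if (ny < 0 || ny >= height || nx < 0 || nx >= width) continue; // skip pixels outside the image
                    result = erode ? result && src[ny, nx] : result || src[ny, nx];
                }
            }
            dst[y, x] = result;
        }
    }
    return dst;
}

static bool[,] Erode(bool[,] img) => Apply(img, erode: true);
static bool[,] Dilate(bool[,] img) => Apply(img, erode: false);
static bool[,] Open(bool[,] img) => Dilate(Erode(img));   // erosion, then dilation
static bool[,] Close(bool[,] img) => Erode(Dilate(img));  // dilation, then erosion

// A single isolated foreground pixel (noise) is removed by opening.
var speck = new bool[5, 5];
speck[2, 2] = true;
Console.WriteLine(Open(speck)[2, 2]); // prints False
```

    Applying Close to an image that is fully set except for one pixel fills that hole, which is the behaviour described for Close(Boolean).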

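    OrientPagesWithOSD(IronTesseract, OrientationConfidence)

    Uses Tesseract "Orientation & Script Detection" to turn OcrInput pages the right way up in multiples of 90 degrees.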

    Declaration
    public OcrInput OrientPagesWithOSD(IronTesseract TesseractInstance, OrientationConfidence Confidence = null)
    Parameters
    Type Name Description
    IronTesseract TesseractInstance

    Reads OcrLanguage settings from your IronTesseract instance to help detect letters and numbers to straighten pages. If you wish to use multiple languages please use the IronOcr.OcrInput.Deskew(IronOcr.IronTesseract,System.Int32,IronOcr.OrientationConfidence) overload.

    OrientationConfidence Confidence

    Optional OrientationConfidence class used to control and measure OSD by way of confidence thresholds.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    See Also
    Rotation

    OrientPagesWithOSD(OcrLanguage, OrientationConfidence)

    Uses Tesseract "Orientation & Script Detection" to turn OcrInput pages the right way up in multiples of 90 degrees.

    Declaration
    public OcrInput OrientPagesWithOSD(OcrLanguage CharacterDetectionLanguage, OrientationConfidence Confidence = null)
    Parameters
    Type Name Description
    OcrLanguage CharacterDetectionLanguage

    An OcrLanguage used to help detect letters and numbers to straighten pages. If you wish to use multiple languages please use the IronOcr.OcrInput.Deskew(IronOcr.IronTesseract,System.Int32,IronOcr.OrientationConfidence) overload.

    OrientationConfidence Confidence

    Optional OrientationConfidence class used to control and measure OSD by way of confidence thresholds.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    See Also
    Rotation

    PageCount()

    The number of OcrInput.Pages currently present in this OcrInput.

    Declaration
    public int PageCount()
    Returns
    Type Description
    System.Int32

    ReplaceColor(Color, Color, Int32)

    Replaces the current color with a new color in the image.

    Declaration
    public OcrInput ReplaceColor(Color currentColor, Color newColor, int tolerance = 10)
    Parameters
    Type Name Description
    IronSoftware.Drawing.Color currentColor

    The IronSoftware.Drawing.Color to be replaced.

    IronSoftware.Drawing.Color newColor

    The replacement IronSoftware.Drawing.Color.

    System.Int32 tolerance

    Tolerance Value

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Rotate(Double)

    Rotates images by a number of degrees clockwise. For anti-clockwise, use negative numbers. Also see Deskew(Int32)

    Declaration
    public OcrInput Rotate(double Degrees)
    Parameters
    Type Name Description
    System.Double Degrees

    A number of clockwise degrees.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    SaveAsImages(String, OcrInput.ImageType)

    Exports an OcrInput object as Images

    Declaration
    public string[] SaveAsImages(string Prefix = "export_of_page", OcrInput.ImageType Extension)
    Parameters
    Type Name Description
    System.String Prefix

    Will save images at {Prefix}_(page_number).{Extension}. May include a fully qualified file path.

    OcrInput.ImageType Extension

    Output file extension in lower-case.

    Returns
    Type Description
    System.String[]

    Array of saved image file names. Can be multiple if OcrInput used has multiple pages.

    Exceptions
    Type Condition
    System.Exception

    Throws an exception if there are no pages. See OcrInput.Page

    Scale(Int32, Boolean)

    Scales OcrInput pages proportionally.

    Declaration
    public OcrInput Scale(int Percentage, bool ScaleCropArea = true)
    Parameters
    Type Name Description
    System.Int32 Percentage

    The percentage scale. 100 = no effect.

    System.Boolean ScaleCropArea

    Whether associated crop areas should also be scaled proportionally (true is recommended).

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.
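
    As a quick illustration of the percentage parameter, the sketch below computes the page size that results from proportional scaling. The ScaleSketch class and ScaleSize helper are hypothetical, and the rounding behaviour is an assumption; IronOcr's own rounding is not stated in this reference.

```csharp
using System;

public static class ScaleSketch
{
    // Returns the proportionally scaled size for a page of width x height pixels.
    // A percentage of 100 leaves the size unchanged.
    public static (int Width, int Height) ScaleSize(int width, int height, int percentage)
    {
        // Assumption: round to the nearest pixel (Math.Round sends midpoints to the even value).
        int newWidth = (int)Math.Round(width * percentage / 100.0);
        int newHeight = (int)Math.Round(height * percentage / 100.0);
        return (newWidth, newHeight);
    }
}
```

    For example, ScaleSketch.ScaleSize(800, 600, 150) gives (1200, 900), and a percentage of 100 returns (800, 600) unchanged.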

    Scale(Int32, Int32, Boolean)

    Scales the OcrInput pages up in size.

    Declaration
    public OcrInput Scale(int MaxWidth, int MaxHeight, bool ScaleCropArea = true)
    Parameters
    Type Name Description
    System.Int32 MaxWidth

    Maximum width in pixels.

    System.Int32 MaxHeight

    Maximum height in pixels.

    System.Boolean ScaleCropArea

    Whether associated crop areas should also be scaled proportionally (true is recommended).

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    SelectTextColor(Color, Int32)

    Binarize an image to read pixels of a color (with threshold) as text and ignore other colors as background.

    This is useful if your image has many colors and a normal binarization will not work. It will turn all text of the color specified into black and the rest of the image to white.

    Declaration
    public OcrInput SelectTextColor(Color selectColor, int tolerance = 10)
    Parameters
    Type Name Description
    IronSoftware.Drawing.Color selectColor

    IronSoftware.Drawing.Color of text to isolate from background.

    System.Int32 tolerance

    [0,255]; Acceptable range of the difference between PixelColor and selectColor for each R, G, and B value

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    SelectTextColors(IEnumerable<Color>, Int32)

    Binarize an image to read pixels of only selected colors (with thresholds) as text and ignore other colors as background.

    This is useful if your image has many colors and a normal binarization will not work. It will turn all text of the colors specified into black and the rest of the image to white.

    Declaration
    public OcrInput SelectTextColors(IEnumerable<Color> selectColors, int tolerance = 10)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<IronSoftware.Drawing.Color> selectColors

    The IronSoftware.Drawing.Color values of the text to isolate from the background.

    System.Int32 tolerance

    [0,255]; Acceptable range of the difference between PixelColor and selectColor for each R, G, and B value

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.
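
    The tolerance rule described for SelectTextColor and SelectTextColors (a per-channel difference between the pixel color and the selected color) can be sketched as follows. This is an illustration only: the TextColorSketch class and its helpers are hypothetical, colors are modelled as plain (R, G, B) integer tuples rather than IronSoftware.Drawing.Color, and treating the tolerance as an inclusive bound is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TextColorSketch
{
    // True when every channel of the pixel is within the tolerance of the selected color.
    // Assumption: the bound is inclusive (a difference equal to the tolerance still matches).
    public static bool IsWithinTolerance((int R, int G, int B) pixel, (int R, int G, int B) select, int tolerance = 10)
        => Math.Abs(pixel.R - select.R) <= tolerance
        && Math.Abs(pixel.G - select.G) <= tolerance
        && Math.Abs(pixel.B - select.B) <= tolerance;

    // True when the pixel matches any of the selected colors. A binarizer would paint
    // such pixels black (text) and all remaining pixels white (background).
    public static bool IsText((int R, int G, int B) pixel, IEnumerable<(int R, int G, int B)> selectColors, int tolerance = 10)
        => selectColors.Any(c => IsWithinTolerance(pixel, c, tolerance));
}
```

    With the default tolerance of 10, a pixel of (205, 5, 0) matches a selected color of (200, 0, 0), while (215, 0, 0) does not, because its red channel differs by 15.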

    Sharpen()

    Sharpens blurred OCR documents. Applies a Gaussian sharpening filter to the image.

    Declaration
    public OcrInput Sharpen()
    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    StampCropRectangleAndSaveAs(CropRectangle, Color, String, OcrInput.ImageType)

    Saves a copy of the image with a rectangle drawn on it, to visualize and debug where an IronSoftware.Drawing.CropRectangle will be applied to your image.

    Declaration
    public string[] StampCropRectangleAndSaveAs(CropRectangle cropRectangle, Color rectangleColor, string Prefix = "rectangle_on_page", OcrInput.ImageType extension)
    Parameters
    Type Name Description
    IronSoftware.Drawing.CropRectangle cropRectangle

    Use a IronSoftware.Drawing.CropRectangle to debug the area that will be scanned on an image.

    IronSoftware.Drawing.Color rectangleColor

    Color of rectangle drawn. Red is recommended for easy contrast.

    System.String Prefix

    Will save images at {Prefix}_(page_number).{Extension}. May include a fully qualified file path.

    OcrInput.ImageType extension

    Output file extension in lower-case.

    Returns
    Type Description
    System.String[]

    Array of saved image file names. Can be multiple if OcrInput used has multiple pages.

    Exceptions
    Type Condition
    System.Exception

    Throws an exception if there are no pages. See OcrInput.Page

    ToGrayScale()

    This image filter turns every pixel into a shade of grayscale. Unlikely to improve OCR accuracy but may improve speed.

    Declaration
    public OcrInput ToGrayScale()
    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    WithTitle(String)

    Adds a title to the OcrInput document. This title will be used when calling SaveAsHocrFile(String) and SaveAsSearchablePdf(String).

    Declaration
    public OcrInput WithTitle(string Title)
    Parameters
    Type Name Description
    System.String Title

    The document title as a string.

    Returns
    Type Description
    OcrInput

    This OcrInput object allowing for LINQ style fluent notation.

    Implements

    System.IDisposable