    Class OcrResult

    A full document object model (DOM) for results when IronTesseract reads an image or OcrInput.

    Gives access to Text, Pages, Paragraphs, Lines, Words, Characters, Images, Barcodes, Coordinates, and Font information in granular detail.

    Inheritance
    System.Object
    OcrResult
    Implements
    IronSoftware.Abstractions.Ocr.IOcrResult
    IronSoftware.Abstractions.Absolute.IDocumentPageContainer<OcrResultPagesCollection>
    IronSoftware.Abstractions.Pdf.IDocumentWithExtractableText
    Namespace: IronOcr
    Assembly: IronOcr.dll
    Syntax
    public class OcrResult : Object, IOcrResult, IDocumentPageContainer<OcrResultPagesCollection>, IDocumentWithExtractableText
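
    As a rough sketch of how the members documented below fit together, the following reads an image and inspects the result. It assumes the IronOcr NuGet package is installed and that IronTesseract.Read accepts a file path; "invoice.png" is a hypothetical input file, not something shipped with the library.

```csharp
using System;
using IronOcr;

class OcrResultOverview
{
    static void Main()
    {
        // "invoice.png" is a placeholder path used only for illustration.
        var ocr = new IronTesseract();
        OcrResult result = ocr.Read("invoice.png");

        // Whole-document members documented on this page.
        Console.WriteLine($"Pages: {result.PageCount}");
        Console.WriteLine($"Confidence: {result.Confidence:P1}"); // Confidence is 0..1, where 1 = 100%
        Console.WriteLine(result.Text);

        // Granular members are plain arrays, so Length and indexing work directly.
        Console.WriteLine($"Lines: {result.Lines.Length}, Words: {result.Words.Length}");
    }
}
```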

    Properties

    Barcodes

    Represents every barcode discovered in this OCR document. Developers must set ReadBarCodes = true for this feature to be active.

    Declaration
    public OcrResult.Barcode[] Barcodes { get; }
    Property Value
    Type Description
    OcrResult.Barcode[]

    Blocks

    Represents every block of text discovered in this OCR document in order of appearance. A Block is a collection of 1 or more paragraphs located closely together.

    Declaration
    public OcrResult.Block[] Blocks { get; }
    Property Value
    Type Description
    OcrResult.Block[]

    Cancelled

    Indicates that the OCR reading was cancelled by the user or after a timeout.

    Declaration
    public bool Cancelled { get; }
    Property Value
    Type Description
    System.Boolean

    Characters

    Represents every symbol (char) discovered in this OCR document in order of appearance.

    Declaration
    public OcrResult.Character[] Characters { get; }
    Property Value
    Type Description
    OcrResult.Character[]

    Confidence

    OCR statistical accuracy confidence as an average of every character.

    1 = 100%, 0 = 0%

    Declaration
    public double Confidence { get; }
    Property Value
    Type Description
    System.Double

    EngineModeUsed

    The TesseractEngineMode used to generate this OcrResult.

    Declaration
    public TesseractEngineMode EngineModeUsed { get; }
    Property Value
    Type Description
    TesseractEngineMode
    See Also
    TesseractEngineMode

    Lines

    Represents every line of text discovered in this OCR document in order of appearance.

    Declaration
    public OcrResult.Line[] Lines { get; }
    Property Value
    Type Description
    OcrResult.Line[]

    PageCount

    Declaration
    public int PageCount { get; }
    Property Value
    Type Description
    System.Int32

    Pages

    Represents every page within this OcrResult object.

    Declaration
    public OcrResultPagesCollection Pages { get; }
    Property Value
    Type Description
    OcrResultPagesCollection

    Paragraphs

    Represents every paragraph of text discovered in this OCR document in order of appearance.

    Declaration
    public OcrResult.Paragraph[] Paragraphs { get; }
    Property Value
    Type Description
    OcrResult.Paragraph[]

    Tables

    Represents every table that can be clearly identified in this OCR document. To enable table reading, set IronTesseract's Configuration.ReadDataTables to true.

     var Ocr = new IronTesseract();
     Ocr.Configuration.ReadDataTables = true;

    Declaration
    public OcrResult.Table[] Tables { get; }
    Property Value
    Type Description
    OcrResult.Table[]

    TesseractVersion

    The TesseractVersion used to generate this OcrResult.

    Declaration
    public string TesseractVersion { get; }
    Property Value
    Type Description
    System.String
    See Also
    TesseractVersion

    Text

    Returns the entire text content of this OCR document. Pages are separated by four System.Environment.NewLine sequences. The text is truncated when the product is unlicensed.

    Declaration
    public string Text { get; set; }
    Property Value
    Type Description
    System.String
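
    Because pages are separated by four Environment.NewLine sequences, the document text can be split back into per-page strings without any OCR-specific API. The helper below is a hypothetical illustration (OcrTextPages and SplitPages are not part of IronOcr). It assumes no page itself contains four consecutive newlines.

```csharp
using System;

static class OcrTextPages
{
    // Splits document text into per-page strings using the page separator
    // described above: four consecutive Environment.NewLine sequences.
    public static string[] SplitPages(string documentText)
    {
        string separator = string.Concat(
            Environment.NewLine, Environment.NewLine,
            Environment.NewLine, Environment.NewLine);

        // StringSplitOptions.None keeps empty pages, so indices still match page order.
        return documentText.Split(new[] { separator }, StringSplitOptions.None);
    }
}
```

    A call such as OcrTextPages.SplitPages(result.Text) would then yield one string per page. The ExtractTextFromPage(Int32) method documented below is the library's own way to get the text of a single page.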

    Words

    Represents every word discovered in this OCR document in order of appearance.

    Declaration
    public OcrResult.Word[] Words { get; }
    Property Value
    Type Description
    OcrResult.Word[]

    Methods

    ExtractTextFromPage(Int32)

    Declaration
    public string ExtractTextFromPage(int PageIndex)
    Parameters
    Type Name Description
    System.Int32 PageIndex
    Returns
    Type Description
    System.String

    ExtractTextFromPages(IEnumerable<Int32>)

    Declaration
    public string ExtractTextFromPages(IEnumerable<int> PageIndices)
    Parameters
    Type Name Description
    System.Collections.Generic.IEnumerable<System.Int32> PageIndices
    Returns
    Type Description
    System.String

    FromJson(String)

    Deserializes the JSON to the OcrResult object.

    Declaration
    public static OcrResult FromJson(string json)
    Parameters
    Type Name Description
    System.String json

    A JSON string representation of the OcrResult.

    Returns
    Type Description
    OcrResult

    The deserialized OcrResult object from the JSON string.

    Exceptions
    Type Condition
    System.ArgumentNullException

    json is null

    FromJsonFile(String)

    Deserializes the JSON file to the OcrResult object.

    Declaration
    public OcrResult FromJsonFile(string Path)
    Parameters
    Type Name Description
    System.String Path
    Returns
    Type Description
    OcrResult
    Remarks

    Use the method FromJson(String) instead if you want to deserialize from a JSON string.

    SaveAsHocrFile(String)

    Exports an hOCR version of the Tesseract results object document. This is an XHTML file which can be read as XML or HTML.

    https://en.wikipedia.org/wiki/HOCR

    Declaration
    public void SaveAsHocrFile(string Path)
    Parameters
    Type Name Description
    System.String Path

    The file path the xhtml file will be saved to.

    SaveAsHocrString()

    Exports an hOCR version of the Tesseract results object document as a string. This is an XHTML file which can be read as XML or HTML.

    https://en.wikipedia.org/wiki/HOCR

    Declaration
    public string SaveAsHocrString()
    Returns
    Type Description
    System.String

    SaveAsHtmlDocument(String, String, Int32, Boolean)

    Converts the OcrResult into an HTML document, e.g. example.html.

    Declaration
    public void SaveAsHtmlDocument(string path, string title, int pdfPageMargin = 10, bool fullContentWidth = false)
    Parameters
    Type Name Description
    System.String path

    File path to save to

    System.String title

    Title for the HTML document

    System.Int32 pdfPageMargin

    Margin to use for PDF page

    System.Boolean fullContentWidth

    Optionally use full content width in the HTML

    Remarks

    IronTesseract's Configuration.RenderSearchablePdf flag must be set to true.
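
    A minimal sketch of the HTML export, following the remark above by enabling RenderSearchablePdf before reading. The file names and title are placeholders, and IronTesseract.Read taking a file path is assumed.

```csharp
using IronOcr;

class OcrResultHtmlExport
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Per the remarks for SaveAsHtmlDocument, this flag must be true.
        ocr.Configuration.RenderSearchablePdf = true;

        // "scan.png" and "scan.html" are hypothetical paths.
        OcrResult result = ocr.Read("scan.png");

        // pdfPageMargin and fullContentWidth keep their declared defaults (10 and false).
        result.SaveAsHtmlDocument("scan.html", "Scanned document");
    }
}
```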

    SaveAsHtmlString(String, Int32, Boolean)

    Converts the OcrResult into an HTML string

    Declaration
    public string SaveAsHtmlString(string title, int pdfPageMargin, bool fullContentWidth)
    Parameters
    Type Name Description
    System.String title

    Title for the HTML document

    System.Int32 pdfPageMargin

    Margin to use for PDF page

    System.Boolean fullContentWidth

    Optionally use full content width in the HTML

    Returns
    Type Description
    System.String
    Remarks

    IronTesseract's Configuration.RenderSearchablePdf flag must be set to true.

    SaveAsSearchablePdf(String, Boolean, String, String)

    Exports a searchable PDF version of the OCR input document to the given file path and returns it as a byte array. Works for all input formats including PDFs & Images.

    Declaration
    public byte[] SaveAsSearchablePdf(string Path = null, bool ApplyFilters = false, string CustomFontFile = null, string CustomFontName = null)
    Parameters
    Type Name Description
    System.String Path

    The file path the PDF will be saved to.

    System.Boolean ApplyFilters

    Determines whether OcrFilters are applied to the output searchable PDF; default is false.

    System.String CustomFontFile

    Optional path to a custom font file (.ttf, .otf) for rendering text in the PDF. Required for proper UTF-8 character support in languages like Polish, Arabic, Chinese, etc.

    System.String CustomFontName

    Optional custom font name. If not provided, will be extracted from the font file.

    Returns
    Type Description
    System.Byte[]

    SaveAsSearchablePdfBytes(Boolean, String, String)

    Exports a searchable PDF version of the OCR input document as a byte array. Works for all input formats including PDFs & Images.

    Declaration
    public byte[] SaveAsSearchablePdfBytes(bool ApplyFilters = false, string CustomFontFile = null, string CustomFontName = null)
    Parameters
    Type Name Description
    System.Boolean ApplyFilters

    Determines whether OcrFilters are applied to the output searchable PDF; default is false.

    System.String CustomFontFile

    Optional path to a custom font file (.ttf, .otf) for rendering text in the PDF. Required for proper UTF-8 character support in languages like Polish, Arabic, Chinese, etc.

    System.String CustomFontName

    Optional custom font name. If not provided, will be extracted from the font file.

    Returns
    Type Description
    System.Byte[]
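
    The searchable PDF overloads could be used roughly as follows. This is a sketch under assumptions: the paths are placeholders, IronTesseract.Read taking a file path is assumed, and setting Configuration.RenderSearchablePdf to true is a precaution not stated on this page for the PDF methods.

```csharp
using System.IO;
using IronOcr;

class OcrResultSearchablePdf
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Assumption: enabled here as a precaution; this page only documents
        // the flag as required for the HTML export methods.
        ocr.Configuration.RenderSearchablePdf = true;

        // "scan.png" is a hypothetical input file.
        OcrResult result = ocr.Read("scan.png");

        // Save straight to disk, leaving ApplyFilters at its default (false).
        result.SaveAsSearchablePdf("scan-searchable.pdf");

        // Or keep the PDF in memory, e.g. to store it somewhere other than the file system.
        byte[] pdfBytes = result.SaveAsSearchablePdfBytes();
        File.WriteAllBytes("scan-searchable-copy.pdf", pdfBytes);
    }
}
```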

    SaveAsSearchablePdfStream(Boolean, String, String)

    Exports a searchable PDF version of the OCR input document as a Stream. Works for all input formats including PDFs & Images.

    Declaration
    public Stream SaveAsSearchablePdfStream(bool ApplyFilters = false, string CustomFontFile = null, string CustomFontName = null)
    Parameters
    Type Name Description
    System.Boolean ApplyFilters

    Determines whether OcrFilters are applied to the output searchable PDF; default is false.

    System.String CustomFontFile

    Optional path to a custom font file (.ttf, .otf) for rendering text in the PDF.

    System.String CustomFontName

    Optional custom font name. If not provided, will be extracted from the font file.

    Returns
    Type Description
    System.IO.Stream

    SaveAsTextFile(String)

    Exports a .txt version of the Tesseract results document. This is a plain text file.

    Pages are separated by four Environment.NewLine sequences and paragraphs by two.

    Declaration
    public void SaveAsTextFile(string Path)
    Parameters
    Type Name Description
    System.String Path

    The file path the text file will be saved to.

    SaveJsonAs(String)

    Serializes the OcrResult object to a JSON file.

    Declaration
    public void SaveJsonAs(string Path)
    Parameters
    Type Name Description
    System.String Path
    Remarks

    Use the method ToJson() instead if you want to get the string instead of saving to disk.

    ToJson()

    Serializes the OcrResult object to a JSON string.

    Declaration
    public string ToJson()
    Returns
    Type Description
    System.String

    The JSON string representation of the OcrResult.

    Remarks

    Use the method FromJson(String) for deserializing the JSON to the OcrResult object.
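
    The JSON members can be combined into a simple round trip. The sketch below uses only ToJson(), the static FromJson(String) and SaveJsonAs(String) as declared on this page. The file names are placeholders, and IronTesseract.Read taking a file path is assumed.

```csharp
using System;
using IronOcr;

class OcrResultJsonRoundTrip
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // "scan.png" is a hypothetical input file.
        OcrResult original = ocr.Read("scan.png");

        // Serialize to a string, then rebuild an OcrResult from it.
        string json = original.ToJson();
        OcrResult restored = OcrResult.FromJson(json); // throws ArgumentNullException if json is null

        // Compare the text of both objects as a basic sanity check.
        Console.WriteLine(restored.Text == original.Text);

        // Alternatively, write the JSON directly to disk.
        original.SaveJsonAs("scan-result.json");
    }
}
```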
