IronOCR Features

IronOCR is a .NET library allowing developers to scan and read text or barcodes from images.

Compatibility

.NET Languages

  • C# (Tutorial | API Reference)

    • Scan and read text or barcodes from images (jpg, png, gif, tiff, bmp) in C#.
  • VB.NET

    • Scan and read text or barcodes from images (jpg, png, gif, tiff, bmp) in VB.NET.
  • F#
    • Scan and read text or barcodes from images (jpg, png, gif, tiff, bmp) in F#.

Platforms

  • .NET 5, 6, 7, and 8 (Tutorial)
    • IronOCR is available for .NET 5, 6, 7, and 8.
  • .NET Core 2 and 3 (Tutorial)

    • IronOCR is available for .NET Core 2 and 3.
    • The current release supports .NET Core on Linux, Unix, and macOS client operating systems as well as Mono.
    • A future release will support MAUI and Xamarin.
  • .NET Standard 2+ (API Reference)

    • IronOCR is compatible with .NET Standard 2 and upwards.
  • .NET Framework 4.6.2+ (Code Example | API Reference)
    • Scan and read text or barcodes from images with .NET Framework 4.6.2 and above.

App Types

  • Console, Web and Desktop (Tutorial | API Reference)
    • Create apps for Web, Desktop and Console using IronOCR.

Operating Systems

IDEs

  • Microsoft Visual Studio (Tutorial)
  • JetBrains ReSharper and Rider

Certification

  • Microsoft Authenticode
    • DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1

OCR Engine

Underlying OCR Engine

  • Tesseract 5 (Tutorial | Code Example)
    • Tesseract is an open-source text recognition (OCR) engine, available under the Apache 2.0 license. It is one of the most accurate and fastest OCR libraries available for C# and .NET. Currently, Tesseract 5 is the most stable version.
  • Detailed Configuration (Code Example)
    • The IronTesseract.Configuration object gives advanced users access to the underlying Tesseract API in C# / .NET so they can fine-tune the setup.
    • Tuning this setup can help improve the OCR output.
    • To improve OCR speed, see the Fast OCR Configuration code example.

International Languages

Tutorial

Text and Barcode Reading

Specialist Documents

  • Receipts
  • Checks (Cheques)
  • Invoices

Concurrency

  • Single and Multithreading (How-To | Code Example)
  • Async Support (How-To | API Reference)

  • Abort Token (Code Example)
    • Lets users cancel an OCR read that is still running, for example when a large input file causes the program or application to hang.
  • Timeout (Code Example)
    • Provides an optional timeout in milliseconds, after which the OCR read is cancelled.
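
The difference between an abort token and a timeout can be sketched in a few lines. This is a language-neutral illustration written in Python, not IronOCR's API: `read_pages`, its parameters, and the `page.upper()` stand-in for OCR work are all hypothetical. It assumes cooperative cancellation, where the worker checks a shared flag and a deadline between units of work.

```python
import threading
import time


def read_pages(pages, abort_token, timeout_ms=None):
    """Process pages one by one, stopping early on abort or timeout.

    abort_token is a threading.Event that a caller may set at any time.
    timeout_ms is an optional budget in milliseconds for the whole read.
    Returns the results gathered so far and a status string.
    """
    deadline = None
    if timeout_ms is not None:
        deadline = time.monotonic() + timeout_ms / 1000.0

    results = []
    for page in pages:
        # Cancellation requested by the caller (the "abort token").
        if abort_token.is_set():
            return results, "aborted"
        # Time budget used up (the "timeout").
        if deadline is not None and time.monotonic() >= deadline:
            return results, "timed out"
        results.append(page.upper())  # hypothetical stand-in for OCR work
    return results, "completed"


if __name__ == "__main__":
    token = threading.Event()
    print(read_pages(["page one", "page two"], token, timeout_ms=5000))
```

An abort token is driven by the caller, for instance from a "Cancel" button on another thread, while a timeout is fixed before the read begins. Both leave the application responsive when a large input takes longer than expected.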

Computer Vision

How-To | API Reference

  • Use Computer Vision to find text with our advanced trained models. IronOCR uses OpenCV to detect the areas of an image that contain text. This is useful for images with a lot of noise, images with text in many different places, and images where the text is warped. IronOCR first determines where the text regions are, then uses Tesseract to attempt to read those regions.

OCR Input

Code Example

Read from Many Formats

Filters

  • Filter Wizard (Code Example | API Reference)

    • When it is unclear which filters to apply to an image, Filter Wizard provides a list of filters suitable for the OcrInput. It uses a brute-force approach and returns the combination with the maximum confidence.
  • OCR Image Filters (How-To | Tutorial | Code Example)

    • Sharpen (API Reference)
      • Sharpens blurred OCR Documents. Flattens Alpha channels to white.
    • EnhanceResolution (API Reference)
      • Enhances the resolution of low quality images.
    • Denoise (API Reference)
      • Removes digital noise. This filter should only be used where noise is expected. Flattens Alpha channels to white.
    • Dilate (API Reference)
      • Advanced Morphology. Dilation adds pixels to the boundaries of objects in an image. Opposite of Erode.
    • Erode (API Reference)
      • Advanced Morphology. Erosion removes pixels on object boundaries. Opposite of Dilate.
  • Fix Image Orientation (How-To | Tutorial | Code Example)

    • Rotate (Tutorial | API Reference)
      • Rotates images by a number of degrees clockwise. For anti-clockwise, use negative numbers.
    • Deskew (Tutorial | API Reference)
      • Rotates an image so it is the right way up and orthogonal. This is very useful for OCR because Tesseract's tolerance for skewed scans can be as low as 5 degrees.
    • Scale (Tutorial | API Reference)
      • Scales OcrInput pages proportionally.
  • Fix Image Colors (How-To | Tutorial | Code Example)

    • Binarize (Tutorial | API Reference)
      • This image filter turns every pixel black or white with no middle ground. May improve OCR performance in cases of very low contrast between text and background.
    • ToGrayscale (API Reference)
      • This image filter turns every pixel into a shade of grayscale. Unlikely to improve OCR accuracy but may improve speed.
    • Invert (Tutorial | API Reference)
      • Inverts every color. For example, white becomes black and black becomes white.
    • ReplaceColor (API Reference)
      • Replaces a color within an image with another color with a certain threshold.
    • SelectTextColor (API Reference)
      • Replaces a color within an image with another color with a certain threshold.
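
To make the pixel-level filters above more concrete, here is a minimal sketch of what Binarize, Invert, Dilate, and Erode do conceptually. It is written in plain Python on small nested lists and is not IronOCR's implementation or API. The function names, the fixed threshold of 128, and the 3x3 neighbourhood are assumptions chosen for illustration.

```python
def binarize(gray, threshold=128):
    """Turn each grayscale pixel (0-255) into pure black (0) or white (255)."""
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


def invert(gray):
    """Flip every pixel: white becomes black and black becomes white."""
    return [[255 - p for p in row] for row in gray]


def _morph(mask, pick):
    """Apply pick (max or min) over each pixel's 3x3 neighbourhood.

    mask holds 1 for object pixels and 0 for background. Neighbourhoods
    are clipped at the image border, so pixels outside the image are
    simply ignored.
    """
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(mask):
    """Grow objects: a pixel is set if any neighbour is set."""
    return _morph(mask, max)


def erode(mask):
    """Shrink objects: a pixel stays set only if all neighbours are set."""
    return _morph(mask, min)


if __name__ == "__main__":
    dot = [[0] * 5 for _ in range(5)]
    dot[2][2] = 1  # a single object pixel in the centre
    for row in dilate(dot):
        print(row)
```

Dilating the single centre pixel grows it into a 3x3 block, and eroding that block shrinks it back to one pixel, which is why the two filters are described as opposites.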

Apply a Crop Region

OCR Result

Simple Data Output

How-To | Code Example

  • .NET Text Strings
  • Barcode and QR Data
  • Images

Structured Data Output

How-To | Code Example

  • Pages
  • Blocks
  • Paragraphs
  • Lines
  • Words
  • Characters

Export Documents

Highlight Text on a Page for Debugging

  • Draws red boxes around the detected characters, words, lines, or paragraphs as highlights and saves the result as a .png for debugging.

Status and Analytics