C# + VB.Net: Automatic Image to Text
using System;
using IronOcr;
//..
var Ocr = new AutoOcr();
var Result = Ocr.Read(@"C:\path\to\image.png");
Console.WriteLine(Result.Text);

Imports System
Imports IronOcr
'..


Dim Ocr = New AutoOcr()
Dim Result = Ocr.Read("C:\path\to\image.png")
Console.WriteLine(Result.Text)

IronOCR can automatically detect and read text from imperfectly scanned images and PDF documents. The AutoOcr class provides the simplest (though not always the fastest) way to extract text from images and documents: it automatically corrects and sharpens low-resolution scans, removes background noise, skew, distortion and perspective, and enhances resolution and contrast.

Also see the AdvancedOcr class for more granular developer control.

C# + VB.Net: Advanced Ocr
using System;
using System.Linq; // needed for Results.Barcodes.Select(...)
using IronOcr;
//..
var Ocr = new AdvancedOcr()
{
    CleanBackgroundNoise = true,
    EnhanceContrast = true,
    EnhanceResolution = true,
    Language =  IronOcr.Languages.English.OcrLanguagePack,
    Strategy = IronOcr.AdvancedOcr.OcrStrategy.Advanced,
    ColorSpace = AdvancedOcr.OcrColorSpace.Color,
    DetectWhiteTextOnDarkBackgrounds = true,
    InputImageType = AdvancedOcr.InputTypes.AutoDetect,
    RotateAndStraighten = true,
    ReadBarCodes = true,
    ColorDepth = 4
};

var testImage = @"C:\path\to\scan.tiff";

var Results = Ocr.Read(testImage);

var Barcodes = Results.Barcodes.Select(b => b.Value);

Console.WriteLine(Results.Text);
Console.WriteLine("Barcodes:" + String.Join(",", Barcodes));

Imports System
Imports System.Linq
Imports IronOcr
'..


Dim Ocr = New AdvancedOcr() With {
	.CleanBackgroundNoise = True,
	.EnhanceContrast = True,
	.EnhanceResolution = True,
	.Language = IronOcr.Languages.English.OcrLanguagePack,
	.Strategy = IronOcr.AdvancedOcr.OcrStrategy.Advanced,
	.ColorSpace = AdvancedOcr.OcrColorSpace.Color,
	.DetectWhiteTextOnDarkBackgrounds = True,
	.InputImageType = AdvancedOcr.InputTypes.AutoDetect,
	.RotateAndStraighten = True,
	.ReadBarCodes = True,
	.ColorDepth = 4
}

Dim testImage = "C:\path\to\scan.tiff"

Dim Results = Ocr.Read(testImage)

Dim Barcodes = Results.Barcodes.Select(Function(b) b.Value)

Console.WriteLine(Results.Text)
Console.WriteLine("Barcodes:" & String.Join(",", Barcodes))

The AdvancedOcr class gives C# and .NET developers granular control when adding OCR (image and PDF to text) functionality to their applications, and lets them fine-tune performance for their own specific use case.

By adjusting its settings while working with real-world examples, a good balance between speed and accuracy can be found. Settings include: CleanBackgroundNoise, EnhanceContrast, EnhanceResolution, Language, Strategy, RotateAndStraighten, ColorSpace, DetectWhiteTextOnDarkBackgrounds, InputImageType.

There is also the option to automatically read barcodes and QR codes in scanned documents.
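The speed and accuracy trade-off described above can be explored by timing the same scan under different settings. The following is a minimal sketch rather than a definitive benchmark: it uses only the AdvancedOcr members shown in the sample above, leaves every other setting at whatever default the library applies (an assumption), and the file path and class name are placeholders.

```csharp
using System;
using System.Diagnostics;
using IronOcr;

class StrategyTimingSketch
{
    static void Main()
    {
        // Placeholder path: substitute a scan that is typical of your own workload.
        var testImage = @"C:\path\to\scan.tiff";

        // The two strategy values that appear in the samples on this page.
        var strategies = new[]
        {
            AdvancedOcr.OcrStrategy.Fast,
            AdvancedOcr.OcrStrategy.Advanced
        };

        foreach (var strategy in strategies)
        {
            var Ocr = new AdvancedOcr()
            {
                Language = IronOcr.Languages.English.OcrLanguagePack,
                Strategy = strategy
            };

            // Time a single read with this strategy.
            var timer = Stopwatch.StartNew();
            var Results = Ocr.Read(testImage);
            timer.Stop();

            Console.WriteLine("{0}: {1} ms, {2} characters read",
                strategy, timer.ElapsedMilliseconds, Results.Text.Length);
        }
    }
}
```

A single run is only indicative, since the first read may include one-off start-up cost. Repeating the loop over several representative documents, and comparing the extracted text by eye, gives a fairer picture.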

C# + VB.Net: PDF to Text
using System;

var Ocr = new IronOcr.AutoOcr();
var Results = Ocr.ReadPdf(@"C:\Users\Me\Desktop\Invoice.pdf");


var Barcodes = Results.Barcodes;
var Text = Results.Text;
Console.WriteLine(Text);

Imports System

Dim Ocr = New IronOcr.AutoOcr()
Dim Results = Ocr.ReadPdf("C:\Users\Me\Desktop\Invoice.pdf")


Dim Barcodes = Results.Barcodes
Dim Text = Results.Text
Console.WriteLine(Text)

IronOCR can read many image formats, as well as PDF documents, using either the AutoOcr or the AdvancedOcr class.

AutoOcr can automatically detect the characteristics of a PDF and apply a best-guess set of OCR settings to each document.

Developers may choose to read an entire PDF, a selection of pages or a single crop area.

C# + VB.Net: Advanced PDF OCR
using IronOcr;

var Ocr = new AdvancedOcr()
{
    CleanBackgroundNoise = false,
    ColorDepth = 4,
    ColorSpace = AdvancedOcr.OcrColorSpace.Color,
    EnhanceContrast = false,
    DetectWhiteTextOnDarkBackgrounds = false,
    RotateAndStraighten = false,
    Language = IronOcr.Languages.English.OcrLanguagePack,
    EnhanceResolution = false,
    InputImageType = AdvancedOcr.InputTypes.Document,
    ReadBarCodes = true,
    Strategy = AdvancedOcr.OcrStrategy.Fast
};

var PagesToRead = new []{1,2,3};
var Results = Ocr.ReadPdf(@"C:\Users\Me\Desktop\Invoice.pdf", PagesToRead);
var Pages = Results.Pages;
var Barcodes = Results.Barcodes;
var FullPdfText = Results.Text;

Imports IronOcr



Dim Ocr = New AdvancedOcr() With {
	.CleanBackgroundNoise = False,
	.ColorDepth = 4,
	.ColorSpace = AdvancedOcr.OcrColorSpace.Color,
	.EnhanceContrast = False,
	.DetectWhiteTextOnDarkBackgrounds = False,
	.RotateAndStraighten = False,
	.Language = IronOcr.Languages.English.OcrLanguagePack,
	.EnhanceResolution = False,
	.InputImageType = AdvancedOcr.InputTypes.Document,
	.ReadBarCodes = True,
	.Strategy = AdvancedOcr.OcrStrategy.Fast
}

Dim PagesToRead = {1,2,3}
Dim Results = Ocr.ReadPdf("C:\Users\Me\Desktop\Invoice.pdf", PagesToRead)
Dim Pages = Results.Pages
Dim Barcodes = Results.Barcodes
Dim FullPdfText = Results.Text

IronOCR can read many image formats, as well as PDF documents, using either the AutoOcr or the AdvancedOcr class.

Using the AdvancedOcr class to read a PDF gives granular control over PDF-to-text conversion and allows the developer to strike a balance between accuracy and speed.

Developers may choose to read an entire PDF, a selection of pages or a single crop area.

C# + VB.Net: Crop Regions
using IronOcr;
using System;
using System.Drawing; //Add Assembly Reference


// How to read just a rectangular portion of an image or PDF

var Ocr = new AutoOcr();

var X = 100; //px
var Y = 225;
var Width = 300;
var Height = 125;

var CropArea = new Rectangle(X,Y,Width,Height);

var Result = Ocr.Read(@"C:\path\to\image.png", CropArea );

// This approach works equally well with IronOcr.AdvancedOcr and PDF documents

Console.WriteLine(Result.Text);

Imports IronOcr
Imports System
Imports System.Drawing 'Add Assembly Reference


' How to read just a rectangular portion of an image or PDF



Dim Ocr = New AutoOcr()

Dim X = 100 'px
Dim Y = 225
Dim Width = 300
Dim Height = 125

Dim CropArea = New Rectangle(X,Y,Width,Height)

Dim Result = Ocr.Read("C:\path\to\image.png", CropArea)

' This approach works equally well with IronOcr.AdvancedOcr and PDF documents

Console.WriteLine(Result.Text)

With IronOCR it is possible to read a specific region of a document, which is useful when dealing with regular invoices, receipts, checks and other form-like documents.
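For form-like documents, the same crop technique can be repeated once per field. The sketch below assumes a fixed invoice layout; the field names, pixel coordinates, file path and class name are all hypothetical, and only the Read(path, Rectangle) call from the sample above is used.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing; // Add Assembly Reference
using IronOcr;

class InvoiceFieldsSketch
{
    static void Main()
    {
        var Ocr = new AutoOcr();

        // Hypothetical field layout in pixels: X, Y, Width, Height.
        // Measure these once against a sample scan of your own form.
        var fields = new Dictionary<string, Rectangle>
        {
            { "InvoiceNumber", new Rectangle(100, 225, 300, 125) },
            { "Total",         new Rectangle(100, 900, 300, 125) }
        };

        foreach (var field in fields)
        {
            // Read only the rectangle belonging to this field.
            var Result = Ocr.Read(@"C:\path\to\image.png", field.Value);
            Console.WriteLine(field.Key + ": " + Result.Text.Trim());
        }
    }
}
```

This reads the image once per field, which keeps the sketch simple. Fixed rectangles only work when every scan shares the same layout and resolution.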

C# + VB.Net: International Languages
using IronOcr;
using System;

// 19 Languages are currently supported by IronOCR
// To use them, please install the language packs (below) as required
var Ocr = new AdvancedOcr()
{
    Language = IronOcr.Languages.Arabic.OcrLanguagePack,
    ColorSpace = AdvancedOcr.OcrColorSpace.GrayScale,
    EnhanceResolution = true,
    EnhanceContrast = true,
    CleanBackgroundNoise = true,
    ColorDepth = 4,
    RotateAndStraighten = false,
    DetectWhiteTextOnDarkBackgrounds = false,
    ReadBarCodes = false,
    Strategy = AdvancedOcr.OcrStrategy.Fast,
    InputImageType = AdvancedOcr.InputTypes.Document
};

var results = Ocr.Read(@"path\to\arabic\document.png");

Console.WriteLine(results.Text);

Imports IronOcr
Imports System

' 19 Languages are currently supported by IronOCR
' To use them, please install the language packs (below) as required


Dim Ocr = New AdvancedOcr() With {
	.Language = IronOcr.Languages.Arabic.OcrLanguagePack,
	.ColorSpace = AdvancedOcr.OcrColorSpace.GrayScale,
	.EnhanceResolution = True,
	.EnhanceContrast = True,
	.CleanBackgroundNoise = True,
	.ColorDepth = 4,
	.RotateAndStraighten = False,
	.DetectWhiteTextOnDarkBackgrounds = False,
	.ReadBarCodes = False,
	.Strategy = AdvancedOcr.OcrStrategy.Fast,
	.InputImageType = AdvancedOcr.InputTypes.Document
}

Dim results = Ocr.Read("path\to\arabic\document.png")

Console.WriteLine(results.Text)

IronOCR supports 22 international languages.

English is installed by default. Other language packs may be added to your .NET project via NuGet or as DLLs.

Both the AdvancedOcr and AutoOcr classes have a Language property, which can be set as follows: Language = IronOcr.Languages.Arabic.OcrLanguagePack
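The sample above sets the language on AdvancedOcr. As a minimal sketch of the same setting on AutoOcr, relying on the statement that both classes expose a Language property, the snippet below assumes the Arabic language pack is already installed; the file path and class name are placeholders.

```csharp
using System;
using IronOcr;

class AutoOcrLanguageSketch
{
    static void Main()
    {
        // Language property on AutoOcr, as described in the text above.
        // Assumes the Arabic language pack has been added to the project.
        var Ocr = new AutoOcr()
        {
            Language = IronOcr.Languages.Arabic.OcrLanguagePack
        };

        // Placeholder path to an Arabic-language scan.
        var Result = Ocr.Read(@"path\to\arabic\document.png");

        Console.WriteLine(Result.Text);
    }
}
```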

C# + VB.Net: OCR Results Objects
using IronOcr;
using System;
using System.Collections.Generic;
using System.Drawing; //Add Assembly Reference

// We can delve deep into OCR results as an object model of
// Pages, Barcodes, Paragraphs, Lines, Words and Characters
var Ocr = new AdvancedOcr()
{
    Language = IronOcr.Languages.English.OcrLanguagePack,
    ColorSpace = AdvancedOcr.OcrColorSpace.GrayScale,
    EnhanceResolution = true,
    EnhanceContrast = true,
    CleanBackgroundNoise = true,
    ColorDepth = 4,
    RotateAndStraighten = false,
    DetectWhiteTextOnDarkBackgrounds = false,
    ReadBarCodes = true,
    Strategy = AdvancedOcr.OcrStrategy.Fast,
    InputImageType = AdvancedOcr.InputTypes.Document
};

var results = Ocr.Read(@"path\to\document.png");

foreach (var page in results.Pages)
{
    // page object
    int page_number = page.PageNumber;
    String page_text = page.Text;
    int page_wordcount = page.WordCount;
    List<OcrResult.OcrBarcode> barcodes = page.Barcodes;

    System.Drawing.Image page_image = page.Image;

    int page_width_px = page.Width;
    int page_height_px = page.Height;

    foreach (var paragraph in page.Paragraphs)
    {
        // pages -> paragraphs

        int paragraph_number = paragraph.ParagraphNumber;
        String paragraph_text = paragraph.Text;
        System.Drawing.Image paragraph_image = paragraph.Image;
        int paragraph_x_location = paragraph.X;
        int paragraph_y_location = paragraph.Y;
        int paragraph_width = paragraph.Width;
        int paragraph_height = paragraph.Height;
        double paragraph_ocr_accuracy = paragraph.Confidence;
        string paragraph_font_name = paragraph.FontName;
        double paragraph_font_size = paragraph.FontSize;
        OcrResult.TextFlow paragraph_text_direction = paragraph.TextDirection;
        double paragraph_rotation_degrees = paragraph.TextOrientation;

        foreach (var line in paragraph.Lines)
        {
            // pages -> paragraphs -> lines
            int line_number = line.LineNumber;
            String line_text = line.Text;
            System.Drawing.Image line_image = line.Image;
            int line_x_location = line.X;
            int line_y_location = line.Y;
            int line_width = line.Width;
            int line_height = line.Height;
            double line_ocr_accuracy = line.Confidence;
            double line_skew = line.BaselineAngle;
            double line_offset = line.BaselineOffset;

            foreach (var word in line.Words)
            {
                // pages -> paragraphs -> lines -> words
                int word_number = word.WordNumber;
                String word_text = word.Text;
                System.Drawing.Image word_image = word.Image;
                int word_x_location = word.X;
                int word_y_location = word.Y;
                int word_width = word.Width;
                int word_height = word.Height;
                double word_ocr_accuracy = word.Confidence;
                String word_font_name = word.FontName;
                double word_font_size = word.FontSize;
                bool word_is_bold = word.FontIsBold;
                bool word_is_fixed_width_font = word.FontIsFixedWidth;
                bool word_is_italic = word.FontIsItalic;
                bool word_is_serif_font = word.FontIsSerif;
                bool word_is_underlined = word.FontIsUnderlined;

                foreach (var character in word.Characters)
                {
                    // pages -> paragraphs -> lines -> words -> characters
                    int character_number = character.CharacterNumber;
                    String character_text = character.Text;
                    System.Drawing.Image character_image = character.Image;
                    int character_x_location = character.X;
                    int character_y_location = character.Y;
                    int character_width = character.Width;
                    int character_height = character.Height;
                    double character_ocr_accuracy = character.Confidence;
                }
            }
        }
    }
}

Imports IronOcr
Imports System
Imports System.Collections.Generic
Imports System.Drawing 'Add Assembly Reference

' We can delve deep into OCR results as an object model of
' Pages, Barcodes, Paragraphs, Lines, Words and Characters


Dim Ocr = New AdvancedOcr() With {
	.Language = IronOcr.Languages.English.OcrLanguagePack,
	.ColorSpace = AdvancedOcr.OcrColorSpace.GrayScale,
	.EnhanceResolution = True,
	.EnhanceContrast = True,
	.CleanBackgroundNoise = True,
	.ColorDepth = 4,
	.RotateAndStraighten = False,
	.DetectWhiteTextOnDarkBackgrounds = False,
	.ReadBarCodes = True,
	.Strategy = AdvancedOcr.OcrStrategy.Fast,
	.InputImageType = AdvancedOcr.InputTypes.Document
}

Dim results = Ocr.Read("path\to\document.png")

For Each page In results.Pages
	' page object
	Dim page_number As Integer = page.PageNumber
	Dim page_text As String = page.Text
	Dim page_wordcount As Integer = page.WordCount
	Dim barcodes As List(Of OcrResult.OcrBarcode) = page.Barcodes

	Dim page_image As System.Drawing.Image = page.Image

	Dim page_width_px As Integer = page.Width
	Dim page_height_px As Integer = page.Height

	For Each paragraph In page.Paragraphs
		' pages -> paragraphs

		Dim paragraph_number As Integer = paragraph.ParagraphNumber
		Dim paragraph_text As String = paragraph.Text
		Dim paragraph_image As System.Drawing.Image = paragraph.Image
		Dim paragraph_x_location As Integer = paragraph.X
		Dim paragraph_y_location As Integer = paragraph.Y
		Dim paragraph_width As Integer = paragraph.Width
		Dim paragraph_height As Integer = paragraph.Height
		Dim paragraph_ocr_accuracy As Double = paragraph.Confidence
		Dim paragraph_font_name As String = paragraph.FontName
		Dim paragraph_font_size As Double = paragraph.FontSize
		Dim paragraph_text_direction As OcrResult.TextFlow = paragraph.TextDirection
		Dim paragraph_rotation_degrees As Double = paragraph.TextOrientation

		For Each line In paragraph.Lines
			' pages -> paragraphs -> lines
			Dim line_number As Integer = line.LineNumber
			Dim line_text As String = line.Text
			Dim line_image As System.Drawing.Image = line.Image
			Dim line_x_location As Integer = line.X
			Dim line_y_location As Integer = line.Y
			Dim line_width As Integer = line.Width
			Dim line_height As Integer = line.Height
			Dim line_ocr_accuracy As Double = line.Confidence
			Dim line_skew As Double = line.BaselineAngle
			Dim line_offset As Double = line.BaselineOffset

			For Each word In line.Words
				' pages -> paragraphs -> lines -> words
				Dim word_number As Integer = word.WordNumber
				Dim word_text As String = word.Text
				Dim word_image As System.Drawing.Image = word.Image
				Dim word_x_location As Integer = word.X
				Dim word_y_location As Integer = word.Y
				Dim word_width As Integer = word.Width
				Dim word_height As Integer = word.Height
				Dim word_ocr_accuracy As Double = word.Confidence
				Dim word_font_name As String = word.FontName
				Dim word_font_size As Double = word.FontSize
				Dim word_is_bold As Boolean = word.FontIsBold
				Dim word_is_fixed_width_font As Boolean = word.FontIsFixedWidth
				Dim word_is_italic As Boolean = word.FontIsItalic
				Dim word_is_serif_font As Boolean = word.FontIsSerif
				Dim word_is_underlined As Boolean = word.FontIsUnderlined

				For Each character In word.Characters
					' pages -> paragraphs -> lines -> words -> characters
					Dim character_number As Integer = character.CharacterNumber
					Dim character_text As String = character.Text
					Dim character_image As System.Drawing.Image = character.Image
					Dim character_x_location As Integer = character.X
					Dim character_y_location As Integer = character.Y
					Dim character_width As Integer = character.Width
					Dim character_height As Integer = character.Height
					Dim character_ocr_accuracy As Double = character.Confidence
				Next character
			Next word
		Next line
	Next paragraph
Next page

For each page it scans, IronOCR returns a detailed result object that exposes location, data, text, statistical confidence, font names, font sizes, decoration and weight, rotation and position for each of the following:


  • Page
  • Paragraph
  • Line of Text
  • Word
  • Individual Character
  • Barcode
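One practical use of this object model is flagging text that may need manual review. The sketch below walks the hierarchy shown in the sample above and prints the ten words with the lowest Confidence values. It ranks words rather than applying a fixed threshold, because the scale of Confidence is not stated here; the file path and class name are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using IronOcr;

class LowConfidenceWordsSketch
{
    static void Main()
    {
        var Ocr = new AdvancedOcr()
        {
            Language = IronOcr.Languages.English.OcrLanguagePack,
            Strategy = AdvancedOcr.OcrStrategy.Fast
        };

        // Placeholder path to a scanned document.
        var results = Ocr.Read(@"path\to\document.png");

        // Collect (confidence, text, page number) for every word.
        var words = new List<Tuple<double, string, int>>();

        foreach (var page in results.Pages)
            foreach (var paragraph in page.Paragraphs)
                foreach (var line in paragraph.Lines)
                    foreach (var word in line.Words)
                        words.Add(new Tuple<double, string, int>(
                            word.Confidence, word.Text, page.PageNumber));

        // Lowest-confidence words first; these are candidates for review.
        foreach (var entry in words.OrderBy(x => x.Item1).Take(10))
        {
            Console.WriteLine("page {0}: \"{1}\" (confidence {2})",
                entry.Item3, entry.Item2, entry.Item1);
        }
    }
}
```

The same pattern applies at paragraph, line or character level, since each of those objects exposes its own Confidence value.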

Human Support Directly From Our Development Team

Whether it's a product, integration or licensing query, the Iron product development team is on hand to answer your questions. Get in touch and start a dialog with Iron to make the most of our library in your project.

Ask a Question

Image to Text OCR for Visual Studio

Pass IronOCR single- or multi-page scanned images to receive all text, barcode and QR content in return. The OCR library provides a set of classes for adding OCR functionality to web, desktop or console .NET applications. Input formats include PDF, JPG, PNG, GIF, BMP and TIFF.

Read the How-To Tutorials

High Accuracy at High Speeds

The OCR (Optical Character Recognition) engine accurately reads pages formatted with many popular fonts, weights, italics and underlines. Cropping classes further help OCR to perform at speed and with pinpoint accuracy. Iron’s multithreaded engine accelerates OCR for multi-page documents on multi-core servers.

Get Started

Advanced Image Pre-processing for Consistent Results Every Time

What really makes IronOCR special is its ability to read badly scanned documents. Its pre-processing library reduces background noise, rotation, distortion and skewed alignment, simplifies colours, and enhances resolution and contrast. Iron’s AutoOcr and AdvancedOcr settings give developers the tools to achieve the best possible results, every time.

Learn More

Support for Languages from across the globe

Language packs available for: Arabic, Simplified Chinese, Traditional Chinese, Danish, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Swedish. Other languages can be supported by request.

View Language Packs

Output content into your project as Structured Data

IronOCR outputs content as plain text and barcode data. An alternative structured data object model allows developers to receive all content in the format of structured Headings, Paragraphs, Lines, Words and Characters for input directly into .Net applications.

See the Object Reference

Supports:
  • .NET Framework 4.0 and above, with support for C#, VB and F#
  • Microsoft Visual Studio .NET development IDE
  • NuGet installer support for Visual Studio
  • JetBrains ReSharper C# language assistant compatible
  • Microsoft Azure C# .NET hosting platform compatible

Licensing & Pricing

Free community development licenses. Commercial licenses from $399.

  • Project
  • Developer
  • Organization
  • Agency
  • SaaS
  • OEM

View Full License Options  

OCR Tutorials From Our .Net Community

How to Read Text from an Image in .Net | Tutorial

C# OCR ASP.NET

Gemma Beckford - Microsoft Solutions Engineer

How to Read Text from an Image in C# .Net

Learn how Gemma's team use IronOCR to read text from images for their archiving software. Gemma shares her own code samples.

View Gemma's Image to Text Tutorial

Tesseract Alternative for C# | IronOCR

C# Tesseract OCR

Jim Baker is a development engineer at Iron developing for the OCR product

IronOCR and Tesseract Comparison for .Net

Jim has been a leading figure in development of IronOCR. Jim designs and builds image processing algorithms and reading methods for OCR.

See Jim's Tesseract Comparison

.Net coders use IronOcr for...

Accounting and Finance Systems

  • # Receipts
  • # Reporting
  • # Invoice Printing
Add OCR Support to .Net Business Systems

Business Digitization

  • # Documentation
  • # Ordering & Labelling
  • # Paper Replacement
C# Digitization of content with OCR

Enterprise Content Management

  • # Content Production
  • # Document Management
  • # Content Distribution
.Net OCR Support

Data and Reporting Applications

  • # Performance Tracking
  • # Trend Mapping
  • # Reports
C# OCR archives and reporting
Join Them Today
Iron Software Enterprise .Net Component Developers

Thousands of corporations, governments, SMEs and developers alike trust Iron Software products.

Iron's team have over 10 years' experience in the .NET software component market.

Marval
ANZ
Foley
GE
Nexudus
Vireq
Equinor
Medcode