VB.NET Code to Get You Started

C# + VB.NET: AutoOcr
using IronOcr;

string imageText = new IronTesseract().Read(@"images\image.png").Text;
Imports IronOcr

Private imageText As String = (New IronTesseract()).Read("images\image.png").Text

IronOCR is unique in its ability to automatically detect and read text from imperfectly scanned images and PDF documents. The IronTesseract class provides the simplest API.

Try other code samples to gain fine-grained control of your C# OCR operations.

IronOCR provides the most advanced build of Tesseract known anywhere, on any platform, with increased speed, accuracy, and a native DLL and API.

Supports Tesseract 3, Tesseract 4, and Tesseract 5 for .NET Framework, Standard, Core, Xamarin, and Mono.

Imports IronOcr

' Instantiate the IronTesseract class to create a new OCR reader
Dim Ocr As New IronTesseract()

' Load an image or PDF document from which to extract the text
Dim Result = Ocr.Read("example_document.png")

' Output the extracted text from the OCR result onto the console
Console.WriteLine(Result.Text)

Detailed Explanation:

  1. Instantiate IronTesseract: This step initializes an IronTesseract object which allows us to perform OCR operations. It provides the methods needed to read and extract text from images or documents.

  2. Load and Read the Document: The Read method performs OCR on the provided file. In this example, it processes "example_document.png". This operation extracts text content from the image.

  3. Output the Extracted Text: The result of the Read method contains a property Text, which holds the extracted text. This text is printed to the console using Console.WriteLine.

By following these steps, you can easily perform OCR in VB.NET using the IronOCR library, allowing the automation of text extraction for various applications.

C# + VB.NET: Intl. Languages
using IronOcr;
using System;

var ocrTesseract = new IronTesseract();

ocrTesseract.Language = OcrLanguage.Arabic;

using (var ocrInput = new OcrInput())
{
    ocrInput.LoadImage(@"images\arabic.gif");
    var ocrResult = ocrTesseract.Read(ocrInput);
    Console.WriteLine(ocrResult.Text);
}

// Example with a Custom Trained Font Being used:

var ocrTesseractCustomerLang = new IronTesseract();
ocrTesseractCustomerLang.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata");
ocrTesseractCustomerLang.AddSecondaryLanguage(OcrLanguage.EnglishBest);

using (var ocrInput = new OcrInput())
{
    ocrInput.LoadPdf(@"images\mixed-lang.pdf");
    var ocrResult = ocrTesseractCustomerLang.Read(ocrInput);
    Console.WriteLine(ocrResult.Text);
}
Imports IronOcr
Imports System

Dim ocrTesseract = New IronTesseract()

ocrTesseract.Language = OcrLanguage.Arabic

Using ocrInput As New OcrInput()
	ocrInput.LoadImage("images\arabic.gif")
	Dim ocrResult = ocrTesseract.Read(ocrInput)
	Console.WriteLine(ocrResult.Text)
End Using

' Example with a Custom Trained Font Being used:

Dim ocrTesseractCustomerLang = New IronTesseract()
ocrTesseractCustomerLang.UseCustomTesseractLanguageFile("custom_tesseract_files/custom.traineddata")
ocrTesseractCustomerLang.AddSecondaryLanguage(OcrLanguage.EnglishBest)

Using ocrInput As New OcrInput()
	ocrInput.LoadPdf("images\mixed-lang.pdf")
	Dim ocrResult = ocrTesseractCustomerLang.Read(ocrInput)
	Console.WriteLine(ocrResult.Text)
End Using

IronOCR Language Support

IronOCR supports 125 international languages. English is installed by default; additional language packs can be added to your .NET project via NuGet or downloaded from our Languages Page.

Most languages are available in Fast, Standard (recommended), and Best quality. The Best quality option may give more accurate results, but it is slower to process.

Sample Code for Installing Language Packs

Below is a sample code snippet demonstrating how to install and use additional language packs in your .NET project using IronOCR. Install the IronOcr.Languages.{LanguageName} package using NuGet. Here is how to do it in a C# application.

// First, ensure you have the following NuGet packages:
// - IronOcr
// - IronOcr.Languages.{LanguageName} // Replace {LanguageName} with your language choice, e.g., French.

using IronOcr;

class Program
{
    static void Main(string[] args)
    {
        // Define a file path for the OCR image
        var inputFilePath = "path\\to\\your\\sample-image.png";

        // Choose the OCR language. The below example demonstrates using French.
        var OcrLanguage = IronOcr.Languages.French.OcrLanguagePack.Best();

        // Initialize IronTesseract engine with the specified language
        var OcrEngine = new IronTesseract
        {
            Language = OcrLanguage
        };

        // Perform OCR to convert image text into a string
        using (var Input = new OcrInput(inputFilePath))
        {
            // Recognize text from the image
            var result = OcrEngine.Read(Input);
            System.Console.WriteLine(result.Text);
        }

        // Output: The text content extracted from the image is displayed on the console.
    }
}
' First, ensure you have the following NuGet packages:
' - IronOcr
' - IronOcr.Languages.{LanguageName} (replace {LanguageName} with your language choice, e.g., French)

Imports IronOcr

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		' Define a file path for the OCR image
		Dim inputFilePath = "path\to\your\sample-image.png"

		' Choose the OCR language. The below example demonstrates using French.
		Dim OcrLanguage = IronOcr.Languages.French.OcrLanguagePack.Best()

		' Initialize IronTesseract engine with the specified language
		Dim OcrEngine = New IronTesseract With {.Language = OcrLanguage}

		' Perform OCR to convert image text into a string
		Using Input = New OcrInput(inputFilePath)
			' Recognize text from the image
			Dim result = OcrEngine.Read(Input)
			System.Console.WriteLine(result.Text)
		End Using

		' Output: The text content extracted from the image is displayed on the console.
	End Sub
End Class

Explanation

  • NuGet Packages: Install the necessary IronOCR NuGet packages. The primary package is IronOcr and you need to add the language-specific package like IronOcr.Languages.French for French.
  • Define the File Path: Set the path of the image file you want to process with OCR.
  • Select the Language Pack: Use the OcrLanguagePack for your desired language (best, standard, or fast quality).
  • Initialize the OCR Engine: Create an instance of IronTesseract and set its language to the selected language pack.
  • Read the Image: Using OcrInput, process the image file and extract text using Read, which returns the OCR result as text.
  • Output the Result: The recognized text is printed to the console.

This setup allows you to process images and recognize text in the desired language, including non-default languages, with varying levels of quality and speed.

C# + VB.NET: Results Objects
using IronOcr;
using IronSoftware.Drawing;

// We can delve deep into OCR results as an object model of
// Pages, Barcodes, Paragraphs, Lines, Words and Characters
// This allows us to explore, export and draw OCR content using other APIs.
var ocrTesseract = new IronTesseract();

ocrTesseract.Configuration.ReadBarCodes = true;

using var ocrInput = new OcrInput();
var pages = new int[] { 1, 2 };
ocrInput.LoadImageFrames("example.tiff", pages);

OcrResult ocrResult = ocrTesseract.Read(ocrInput);
foreach (var page in ocrResult.Pages)
{
    // Page object
    int PageNumber = page.PageNumber;
    string PageText = page.Text;
    int PageWordCount = page.WordCount;
    // null if we don't set ocrTesseract.Configuration.ReadBarCodes = true;
    OcrResult.Barcode[] Barcodes = page.Barcodes;
    AnyBitmap PageImage = page.ToBitmap(ocrInput);
    double PageWidth = page.Width;
    double PageHeight = page.Height;
    double PageRotation = page.Rotation; // angular correction in degrees from OcrInput.Deskew()

    foreach (var paragraph in page.Paragraphs)
    {
        // Pages -> Paragraphs
        int ParagraphNumber = paragraph.ParagraphNumber;
        string ParagraphText = paragraph.Text;
        AnyBitmap ParagraphImage = paragraph.ToBitmap(ocrInput);
        int ParagraphX_location = paragraph.X;
        int ParagraphY_location = paragraph.Y;
        int ParagraphWidth = paragraph.Width;
        int ParagraphHeight = paragraph.Height;
        double ParagraphOcrAccuracy = paragraph.Confidence;
        OcrResult.TextFlow paragraphText_direction = paragraph.TextDirection;
        foreach (var line in paragraph.Lines)
        {
            // Pages -> Paragraphs -> Lines
            int LineNumber = line.LineNumber;
            string LineText = line.Text;
            AnyBitmap LineImage = line.ToBitmap(ocrInput);
            int LineX_location = line.X;
            int LineY_location = line.Y;
            int LineWidth = line.Width;
            int LineHeight = line.Height;
            double LineOcrAccuracy = line.Confidence;
            double LineSkew = line.BaselineAngle;
            double LineOffset = line.BaselineOffset;
            foreach (var word in line.Words)
            {
                // Pages -> Paragraphs -> Lines -> Words
                int WordNumber = word.WordNumber;
                string WordText = word.Text;
                AnyBitmap WordImage = word.ToBitmap(ocrInput);
                int WordX_location = word.X;
                int WordY_location = word.Y;
                int WordWidth = word.Width;
                int WordHeight = word.Height;
                double WordOcrAccuracy = word.Confidence;
                foreach (var character in word.Characters)
                {
                    // Pages -> Paragraphs -> Lines -> Words -> Characters
                    int CharacterNumber = character.CharacterNumber;
                    string CharacterText = character.Text;
                    AnyBitmap CharacterImage = character.ToBitmap(ocrInput);
                    int CharacterX_location = character.X;
                    int CharacterY_location = character.Y;
                    int CharacterWidth = character.Width;
                    int CharacterHeight = character.Height;
                    double CharacterOcrAccuracy = character.Confidence;
                    // Output alternative symbols choices and their probability.
                    // Very useful for spellchecking
                    OcrResult.Choice[] Choices = character.Choices;
                }
            }
        }
    }
}
Imports IronOcr
Imports IronSoftware.Drawing

' We can delve deep into OCR results as an object model of
' Pages, Barcodes, Paragraphs, Lines, Words and Characters
' This allows us to explore, export and draw OCR content using other APIs.
Dim ocrTesseract = New IronTesseract()

ocrTesseract.Configuration.ReadBarCodes = True

Dim ocrInput As New OcrInput()
Dim pages = New Integer() { 1, 2 }
ocrInput.LoadImageFrames("example.tiff", pages)

Dim ocrResult As OcrResult = ocrTesseract.Read(ocrInput)
For Each page In ocrResult.Pages
	' Page object
	Dim PageNumber As Integer = page.PageNumber
	Dim PageText As String = page.Text
	Dim PageWordCount As Integer = page.WordCount
	' Nothing if we don't set ocrTesseract.Configuration.ReadBarCodes = True
	Dim Barcodes() As OcrResult.Barcode = page.Barcodes
	Dim PageImage As AnyBitmap = page.ToBitmap(ocrInput)
	Dim PageWidth As Double = page.Width
	Dim PageHeight As Double = page.Height
	Dim PageRotation As Double = page.Rotation ' angular correction in degrees from OcrInput.Deskew()

	For Each paragraph In page.Paragraphs
		' Pages -> Paragraphs
		Dim ParagraphNumber As Integer = paragraph.ParagraphNumber
		Dim ParagraphText As String = paragraph.Text
		Dim ParagraphImage As AnyBitmap = paragraph.ToBitmap(ocrInput)
		Dim ParagraphX_location As Integer = paragraph.X
		Dim ParagraphY_location As Integer = paragraph.Y
		Dim ParagraphWidth As Integer = paragraph.Width
		Dim ParagraphHeight As Integer = paragraph.Height
		Dim ParagraphOcrAccuracy As Double = paragraph.Confidence
		Dim paragraphText_direction As OcrResult.TextFlow = paragraph.TextDirection
		For Each line In paragraph.Lines
			' Pages -> Paragraphs -> Lines
			Dim LineNumber As Integer = line.LineNumber
			Dim LineText As String = line.Text
			Dim LineImage As AnyBitmap = line.ToBitmap(ocrInput)
			Dim LineX_location As Integer = line.X
			Dim LineY_location As Integer = line.Y
			Dim LineWidth As Integer = line.Width
			Dim LineHeight As Integer = line.Height
			Dim LineOcrAccuracy As Double = line.Confidence
			Dim LineSkew As Double = line.BaselineAngle
			Dim LineOffset As Double = line.BaselineOffset
			For Each word In line.Words
				' Pages -> Paragraphs -> Lines -> Words
				Dim WordNumber As Integer = word.WordNumber
				Dim WordText As String = word.Text
				Dim WordImage As AnyBitmap = word.ToBitmap(ocrInput)
				Dim WordX_location As Integer = word.X
				Dim WordY_location As Integer = word.Y
				Dim WordWidth As Integer = word.Width
				Dim WordHeight As Integer = word.Height
				Dim WordOcrAccuracy As Double = word.Confidence
				For Each character In word.Characters
					' Pages -> Paragraphs -> Lines -> Words -> Characters
					Dim CharacterNumber As Integer = character.CharacterNumber
					Dim CharacterText As String = character.Text
					Dim CharacterImage As AnyBitmap = character.ToBitmap(ocrInput)
					Dim CharacterX_location As Integer = character.X
					Dim CharacterY_location As Integer = character.Y
					Dim CharacterWidth As Integer = character.Width
					Dim CharacterHeight As Integer = character.Height
					Dim CharacterOcrAccuracy As Double = character.Confidence
					' Output alternative symbols choices and their probability.
					' Very useful for spellchecking
					Dim Choices() As OcrResult.Choice = character.Choices
				Next character
			Next word
		Next line
	Next paragraph
Next page

IronOCR returns an advanced result object for each page it scans using Tesseract 5. This contains location data, images, text, statistical confidence, alternative symbol choices, font names, font sizes, decoration, font weights, and position for each:

  • Page
  • Paragraph
  • Line of Text
  • Word
  • Individual Character
  • Barcode

Here is an example of how you might retrieve and work with these data points using C# with IronOCR:

// Import the IronOCR library, plus System for Console and System.Linq for Count()
using IronOcr;
using System;
using System.Linq;

class OCRExample
{
    static void Main()
    {
        // Create a new instance of the IronTesseract engine
        var OcrEngine = new IronTesseract();

        // Enable barcode reading so that page.Barcodes is populated
        OcrEngine.Configuration.ReadBarCodes = true;

        // Specify the file path of the scanned document
        using var Input = new OcrInput(@"path_to_your_image_file.jpg");

        // Perform OCR on the input image
        OcrResult result = OcrEngine.Read(Input);

        // Check the number of pages detected
        Console.WriteLine($"Detected {result.Pages.Count()} page(s)");

        // Iterate through each page
        foreach (var page in result.Pages)
        {
            // Output the page text
            Console.WriteLine($"Page Text: {page.Text}");

            // Iterate through each paragraph in the page
            foreach (var paragraph in page.Paragraphs)
            {
                Console.WriteLine($"Paragraph Text: {paragraph.Text}");

                // Iterate through each line in the paragraph
                foreach (var line in paragraph.Lines)
                {
                    Console.WriteLine($"Line Text: {line.Text}");

                    // Iterate through each word in the line
                    foreach (var word in line.Words)
                    {
                        Console.WriteLine($"Word Text: {word.Text}");

                        // Iterate through each character in the word
                        foreach (var character in word.Characters)
                        {
                            Console.WriteLine($"Character Text: {character.Text}");
                        }
                    }
                }
            }

            //Detect barcodes within the page and output their values
            foreach (var barcode in page.Barcodes)
            {
                Console.WriteLine($"Barcode Value: {barcode.Value}");
            }
        }
    }
}

Explanation:

  • IronTesseract Engine: This is used to initiate the OCR process on the provided image input.

  • OcrInput: Represents the image file that will be processed. You need to specify the path to your image file.

  • Read Method: This processes the image and returns an OcrResult with all extracted data.

  • Iterating Structure: The example provided utilizes nested loops to dive deep from pages to characters and barcodes, allowing access to every element's text and properties.

  • Console Output: The program writes each text element to the console. Replace these actions with any function you need to perform using these elements.

This structured approach enables a detailed exploration and utilization of the various data points retrieved by IronOCR.
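As one illustration of replacing the console output with your own logic, the sketch below flags words whose confidence falls below a threshold so they can be reviewed by hand. It is deliberately independent of IronOCR: the helper name LowConfidenceWords, the sample values, and the 0.80 threshold are all hypothetical, and the scale of the Confidence value (0 to 1 versus 0 to 100) is an assumption you should check against your own results before choosing a threshold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OcrReview
{
    // Returns the text of every word whose confidence is below the threshold,
    // preserving the original reading order.
    public static List<string> LowConfidenceWords(
        IEnumerable<(string Text, double Confidence)> words, double threshold)
    {
        return words
            .Where(w => w.Confidence < threshold)
            .Select(w => w.Text)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample standing in for word.Text / word.Confidence pairs
        var sample = new List<(string Text, double Confidence)>
        {
            ("Invoice", 0.98),
            ("T0tal", 0.41),
            ("Due", 0.93),
        };

        foreach (var text in LowConfidenceWords(sample, 0.80))
        {
            Console.WriteLine($"Review: {text}");
        }
    }
}
```

To use it with real results, collect a (word.Text, word.Confidence) pair for each word inside the innermost word loop and pass the list to the helper.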


Support From Our Team

For product or licensing queries, the Iron team is ready to support you. Send us your questions and we'll ensure the right person at Iron answers them.

Get in Touch

OCR Images to Text in VB.NET Applications

One or more pages can be sent to IronOCR. You'll receive all text, barcode, and QR content as a result. Add OCR functionality to .NET Console, Web, or Desktop Apps. Images can be submitted as PDF, JPG, PNG, GIF, BMP and TIFF.

Made for VB.NET, .NET, C#

See the Tutorials

OCR with Fast & Accurate Results

The Optical Character Recognition software reads content in multiple font styles for accurate text OCR. Use rectangular read regions to improve speed and accuracy. Multi-core multithreading improves OCR reading speeds.

API Reference Documentation

Image Processing for Imperfect Scan Recognition

What really makes IronOCR special is its ability to read badly scanned documents. Its unique pre-processing library reduces background noise, rotation, distortion and skewed alignment as well as simplifying colours and enhancing resolution & contrast. Iron’s AutoOCR and Advanced OCR settings provide developers with the tools to achieve the best possible results, every time.

Learn More
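As a minimal sketch of the pre-processing idea, the example below straightens a skewed scan before reading it. It uses only calls that appear elsewhere on this page (LoadImage, Read, and the OcrInput.Deskew() method mentioned in the Results Objects sample); the file path is hypothetical, and which additional filters suit a given scan is something to determine by experiment.

```csharp
using IronOcr;
using System;

var ocrTesseract = new IronTesseract();

using (var ocrInput = new OcrInput())
{
    // Hypothetical path: substitute a skewed scan of your own
    ocrInput.LoadImage(@"images\skewed-scan.png");

    // Straighten the image before recognition
    ocrInput.Deskew();

    var ocrResult = ocrTesseract.Read(ocrInput);
    Console.WriteLine(ocrResult.Text);
}
```

The angular correction applied by Deskew() can then be inspected per page through the page.Rotation property shown in the Results Objects sample.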

Multi-lingual OCR

Language packs available for: Arabic, Simplified Chinese, Traditional Chinese, Danish, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Swedish. Other languages can be supported by request.

Learn More

Data Exported Directly to Your VB.NET Application

IronOCR outputs content as plain text and barcode data. An alternative structured data object model allows developers to receive all content in the format of structured Headings, Paragraphs, Lines, Words and Characters for input directly into .NET applications.

Learn More
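One way to hand the structured results to another part of an application is to flatten the words and their positions into CSV. The sketch below is independent of IronOCR and makes several assumptions: the OcrCsv class, its method names, the column layout, and the sample rows are all hypothetical, and the fields mirror the word.Text, X, Y, Width and Height properties from the Results Objects sample.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class OcrCsv
{
    // Quote a field when it contains a comma, a double quote or a line break,
    // doubling any embedded quotes (the usual CSV convention).
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
        {
            return field;
        }
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Build one CSV row per word, preceded by a header row.
    public static string ToCsv(
        IEnumerable<(string Text, int X, int Y, int Width, int Height)> words)
    {
        var csv = new StringBuilder("text,x,y,width,height\n");
        foreach (var w in words)
        {
            csv.Append(Escape(w.Text)).Append(',')
               .Append(w.X).Append(',')
               .Append(w.Y).Append(',')
               .Append(w.Width).Append(',')
               .Append(w.Height).Append('\n');
        }
        return csv.ToString();
    }

    public static void Main()
    {
        // Hypothetical sample standing in for values read from OCR word objects
        var sample = new List<(string Text, int X, int Y, int Width, int Height)>
        {
            ("Total,", 40, 120, 88, 24),
            ("42.00", 140, 120, 70, 24),
        };
        Console.Write(ToCsv(sample));
    }
}
```

With real results, add one tuple per word inside the word loop and write the returned string to a file or pass it to whatever consumes the data.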
Supports:
  • .NET Framework 4.0 and above, with support for C#, VB, and F#
  • Microsoft Visual Studio .NET development IDE
  • NuGet installer support for Visual Studio
  • JetBrains ReSharper C# language assistant compatible
  • Microsoft Azure C# .NET hosting platform compatible

Licensing & Pricing

Free community development licenses. Commercial licenses from $749.

  • Project
  • Developer
  • Organization
  • Agency
  • SaaS
  • OEM

View Full License Options  

VB.NET Optical Character Recognition Tutorials

Tesseract Tutorial for C# | IronOCR

C# Tesseract OCR

Jim Baker is a development engineer at Iron, developing the OCR product.

IronOCR and Tesseract Comparison for .NET

Jim has been a leading figure in development of IronOCR. Jim designs and builds image processing algorithms and reading methods for OCR.

See Jim's Tesseract Comparison
How to Read Text from an Image in .NET | Tutorial

C# OCR ASP.NET

Gemma Beckford - Microsoft Solutions Engineer

How to Read Text from an Image in C# .NET

Learn how Gemma's team use IronOCR to read text from images for their archiving software. Gemma shares her own code samples.

View Gemma's Image to Text Tutorial
VB Coders use IronOcr for...

Accounting and Finance Systems

  • Receipts
  • Reporting
  • Invoice Printing
Add PDF Support to ASP.NET Accounting and Finance Systems

Business Digitization

  • Documentation
  • Ordering & Labelling
  • Paper Replacement
C# Business Digitization Use Cases

Enterprise Content Management

  • Content Production
  • Document Management
  • Content Distribution
.NET CMS PDF Support

Data and Reporting Applications

  • Performance Tracking
  • Trend Mapping
  • Reports
C# PDF Reports
Iron .NET Customers

Thousands of corporations, governments, SMEs and developers alike trust Iron software products.

Iron's team has over 10 years' experience in the .NET software component market.

Equinor
Medcode
GE
Foley
ANZ
Vireq
Nexudus
Marval