How to Extract Text from Images in C#

C# OCR Image to Text Tutorial: Convert Images to Text Without Tesseract

Looking to convert images to text in C# without the hassle of complex Tesseract configurations? This comprehensive IronOCR C# tutorial shows you how to implement powerful optical character recognition in your .NET applications with just a few lines of code.

Quickstart: Extract Text from an Image in One Line

This example shows how little code IronOCR needs: one line of C# turns an image into text. It initializes the OCR engine, reads the image, and retrieves the text in a single expression, with no further setup.

  1. Install IronOCR with NuGet Package Manager

    PM > Install-Package IronOcr
  2. Copy and run this code snippet.

    string text = new IronTesseract().Read("image.png").Text;
  3. Deploy to test on your live environment


How Do I Read Text from Images in .NET Applications?

To achieve C# OCR image to text functionality in your .NET applications, you'll need a reliable OCR library. IronOCR provides a managed solution using the IronOcr.IronTesseract class that maximizes both accuracy and speed without requiring external dependencies.

First, install IronOCR into your Visual Studio project. You can download the IronOCR DLL directly or use NuGet Package Manager.

Install-Package IronOcr

Why Choose IronOCR for C# OCR Without Tesseract?

When you need to convert images to text in C#, IronOCR offers significant advantages over traditional Tesseract implementations:

  • Works immediately in pure .NET environments
  • No Tesseract installation or configuration required
  • Runs the latest engines: Tesseract 5 (plus Tesseract 4 & 3)
  • Compatible with .NET Framework 4.6.2+, .NET Standard 2+, .NET Core 2 and 3, and .NET 5, 6, 7, 8, 9, and 10
  • Improves accuracy and speed compared to vanilla Tesseract
  • Supports Xamarin, Mono, Azure, and Docker deployments
  • Manages complex Tesseract dictionaries through NuGet packages
  • Handles PDFs, MultiFrame TIFFs, and all major image formats automatically
  • Corrects low-quality and skewed scans for optimal results

How to Use IronOCR for Basic OCR in C#?

This Iron Tesseract C# example demonstrates the simplest way to read text from an image using IronOCR. The IronOcr.IronTesseract class extracts text and returns it as a string.

using IronOcr;

IronTesseract ocr = new IronTesseract();
using OcrInput input = new OcrInput();
var pageindices = new int[] { 1, 2 };
input.LoadImageFrames(@"img\Potter.LowQuality.tiff", pageindices);
input.Deskew(); // removes rotation and perspective
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    Dim pageindices = New Integer() { 1, 2 }
    input.LoadImageFrames("img\Potter.LowQuality.tiff", pageindices)
    input.Deskew() ' removes rotation and perspective
    Dim result As OcrResult = ocr.Read(input)
    Console.WriteLine(result.Text)
End Using

This code achieves 100% accuracy on clear images, extracting text exactly as it appears:

IronOCR Simple Example

In this simple example we test the accuracy of our C# OCR library to read text from a PNG Image. This is a very basic test, but things will get more complicated as the tutorial continues.

The quick brown fox jumps over the lazy dog

The IronTesseract class handles complex OCR operations internally. It automatically scans for alignment, optimizes resolution, and uses AI to read text from the image with human-level accuracy.

Despite the sophisticated processing happening behind the scenes - including image analysis, engine optimization, and intelligent text recognition - the OCR process matches human reading speed while maintaining exceptional accuracy levels.

IronOCR Simple Example showing C# OCR image to text conversion with 100% accuracy

Screenshot demonstrating IronOCR's ability to extract text from a PNG image with perfect accuracy

How to Implement Advanced C# OCR Without Tesseract Configuration?

For production applications requiring optimal performance when you convert images to text in C#, use the OcrInput and IronTesseract classes together. This approach provides fine-grained control over the OCR process.

OcrInput Class Features

  • Processes multiple image formats: JPEG, TIFF, GIF, BMP, PNG
  • Imports complete PDFs or specific pages
  • Enhances contrast, resolution, and image quality automatically
  • Corrects rotation, scan noise, skew, and negative images

IronTesseract Class Features

  • Access to 127+ prepackaged languages
  • Tesseract 5, 4, and 3 engines included
  • Document type specification (screenshot, snippet, or full document)
  • Integrated barcode reading capabilities
  • Multiple output formats: Searchable PDFs, HOCR HTML, DOM objects, and strings

How to Get Started with OcrInput and IronTesseract?

Here's a recommended configuration for this IronOCR C# tutorial that works well with most document types:

using IronOcr;
using IronSoftware.Drawing;

IronTesseract ocr = new IronTesseract();
using OcrInput input = new OcrInput();
// a 41% improvement on speed
Rectangle contentArea = new Rectangle(x: 215, y: 1250, height: 280, width: 1335);
input.LoadImage("img/ComSci.png", contentArea);
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);

Imports IronOcr
Imports IronSoftware.Drawing

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    ' a 41% improvement on speed
    Dim contentArea As New Rectangle(x:= 215, y:= 1250, height:= 280, width:= 1335)
    input.LoadImage("img/ComSci.png", contentArea)
    Dim result As OcrResult = ocr.Read(input)
    Console.WriteLine(result.Text)
End Using

This configuration consistently achieves near-perfect accuracy on medium-quality scans. The LoadImageFrames method efficiently handles multi-page documents, making it ideal for batch processing scenarios.


Multi-page TIFF document showing Harry Potter text ready for C# OCR processing

Sample TIFF document demonstrating IronOCR's multi-page text extraction capabilities

The ability to read text from images and barcodes in scanned documents like TIFFs showcases how IronOCR simplifies complex OCR tasks. The library excels with real-world documents, seamlessly handling multi-page TIFFs and PDF text extraction.
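
As a minimal sketch of the multi-page handling described above, the snippet below loads two frames of a TIFF and prints the text of each page separately. It relies only on calls that appear elsewhere in this tutorial (LoadImageFrames, result.Pages, PageNumber, WordCount, Text). The file path is the sample path used in this tutorial and is assumed to exist on disk.

```csharp
using System;
using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
// Frame indices follow the convention used in this tutorial's other snippets.
int[] frames = { 1, 2 };
input.LoadImageFrames(@"img\Potter.tiff", frames); // assumed sample file

OcrResult result = ocr.Read(input);

// Pages holds the per-page results of the read operation.
foreach (var page in result.Pages)
{
    Console.WriteLine($"--- Page {page.PageNumber} ({page.WordCount} words) ---");
    Console.WriteLine(page.Text);
}
```

Printing page by page makes it easier to spot which frame of a batch produced poor output before tuning filters.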

How Does IronOCR Handle Low-Quality Scans?


Low-quality scan with digital noise demonstrating IronOCR's image enhancement capabilities

Low-resolution document with noise that IronOCR can process accurately using image filters

When working with imperfect scans containing distortion and digital noise, IronOCR outperforms other C# OCR libraries. It's specifically designed for real-world scenarios rather than pristine test images.

// PM> Install-Package IronOcr.Languages.Arabic
using IronOcr;

IronTesseract ocr = new IronTesseract();
ocr.Language = OcrLanguage.Arabic;

using OcrInput input = new OcrInput();
input.LoadImageFrame("img/arabic.gif", 1);
// add image filters if needed
// In this case, even though the input is very low quality,
// IronTesseract can read what conventional Tesseract cannot.

OcrResult result = ocr.Read(input);

// Console can't print Arabic on Windows easily.
// Let's save to disk instead.
result.SaveAsTextFile("arabic.txt");

' PM> Install-Package IronOcr.Languages.Arabic
Imports IronOcr

Dim ocr As New IronTesseract()
ocr.Language = OcrLanguage.Arabic

Using input As New OcrInput()
    input.LoadImageFrame("img/arabic.gif", 1)
    ' add image filters if needed
    ' In this case, even though the input is very low quality,
    ' IronTesseract can read what conventional Tesseract cannot.

    Dim result As OcrResult = ocr.Read(input)

    ' Console can't print Arabic on Windows easily.
    ' Let's save to disk instead.
    result.SaveAsTextFile("arabic.txt")
End Using

With input.Deskew() applied, accuracy improves to 99.8% on low-quality scans, nearly matching high-quality results. This demonstrates why IronOCR is the preferred choice for C# OCR without Tesseract complications.

Image filters add some preprocessing time, but they can significantly reduce the overall OCR duration because the engine receives cleaner input. Finding the right balance depends on your document quality.

For most scenarios, input.Deskew() and input.DeNoise() provide reliable improvements to OCR performance. Learn more about image preprocessing techniques.
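
The two filters named above can be applied as in the following minimal sketch. It uses only the Deskew() and DeNoise() calls listed in this tutorial. The image file name is hypothetical and stands in for any noisy, slightly rotated scan.

```csharp
using System;
using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.LoadImage("img/low-quality-scan.png"); // hypothetical file name

// Apply the two general-purpose filters before reading.
input.Deskew();   // corrects alignment of a skewed page
input.DeNoise();  // removes digital artifacts

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

If the output is already clean without filters, leaving them out avoids the extra preprocessing time.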

How to Optimize OCR Performance and Speed?

The most significant factor affecting OCR speed when you convert images to text in C# is input quality. Higher DPI (~200 dpi) with minimal noise produces the fastest and most accurate results.

While IronOCR excels at correcting imperfect documents, this enhancement requires additional processing time.

Choose image formats with minimal compression artifacts. TIFF and PNG typically yield faster results than JPEG due to lower digital noise.

Which Image Filters Improve OCR Speed?

The following filters can dramatically enhance performance in your C# OCR image to text workflow:

  • OcrInput.Rotate(double degrees): Rotates images clockwise (negative for counterclockwise)
  • OcrInput.Binarize(): Converts to black/white, improving performance in low-contrast scenarios
  • OcrInput.ToGrayScale(): Converts to grayscale for potential speed improvements
  • OcrInput.Contrast(): Auto-adjusts contrast for better accuracy
  • OcrInput.DeNoise(): Removes digital artifacts when noise is expected
  • OcrInput.Invert(): Inverts colors for white-on-black text
  • OcrInput.Dilate(): Expands text boundaries
  • OcrInput.Erode(): Reduces text boundaries
  • OcrInput.Deskew(): Corrects alignment - essential for skewed documents
  • OcrInput.DeepCleanBackgroundNoise(): Aggressive noise removal
  • OcrInput.EnhanceResolution(): Improves low-resolution image quality
  • OcrInput.DetectPageOrientation(): Detects and corrects page rotation. Pass an OrientationDetectionMode to control the accuracy/speed trade-off: Fast, Balanced, Detailed, or ExtremeDetailed (added v2025.8.6)

Warning: Scale() and EnhanceResolution() are incompatible with SaveAsSearchablePdf() due to a known issue in v2025.12.3. All other filters work correctly with searchable PDF output.
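
Filters from the list above can be combined on a single input. The sketch below is one possible combination for light text on a dark background, using the parameterless forms shown in the list. The file names are hypothetical, and the right set of filters will depend on the document.

```csharp
using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.LoadImage("img/dark-theme-screenshot.png"); // hypothetical file name

input.Invert();    // white-on-black text becomes black-on-white
input.Contrast();  // auto-adjust contrast
input.Binarize();  // convert to black/white for low-contrast input

OcrResult result = ocr.Read(input);
result.SaveAsTextFile("screenshot.txt"); // hypothetical output name
```

Adding filters one at a time and comparing the output is a simple way to see which of them actually help a given document type.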

How to Configure IronOCR for Maximum Speed?

Use these settings to optimize speed when processing high-quality scans:

using IronOcr;

IronTesseract ocr = new IronTesseract();
ocr.Language = OcrLanguage.ChineseSimplified;

// We can add any number of languages.
ocr.AddSecondaryLanguage(OcrLanguage.English);
// Optionally add custom tesseract .traineddata files by specifying a file path

using OcrInput input = new OcrInput();
input.LoadImage("img/MultiLanguage.jpeg");
OcrResult result = ocr.Read(input);
result.SaveAsTextFile("MultiLanguage.txt");

Imports IronOcr

Dim ocr As New IronTesseract()
ocr.Language = OcrLanguage.ChineseSimplified

' We can add any number of languages.
ocr.AddSecondaryLanguage(OcrLanguage.English)
' Optionally add custom tesseract .traineddata files by specifying a file path

Using input As New OcrInput()
    input.LoadImage("img/MultiLanguage.jpeg")
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsTextFile("MultiLanguage.txt")
End Using

This optimized setup maintains 99.8% accuracy while achieving a 35% speed improvement compared to default settings.
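
A speed-oriented configuration could look like the following sketch. The three settings used here (Configuration.BlackListCharacters, Configuration.PageSegmentationMode with TesseractPageSegmentationMode.Auto, and OcrLanguage.EnglishFast) do not appear elsewhere in this tutorial. They are assumptions about the IronOCR API and should be checked against the installed version. The file name is hypothetical, and no particular speed-up is claimed for this exact code.

```csharp
using System;
using IronOcr;

IronTesseract ocr = new IronTesseract();

// Assumed settings (verify the names against your IronOCR version):
// skip characters that are not expected in the document,
ocr.Configuration.BlackListCharacters = "~`$#^*_}{][|\\";
// let the engine choose the page layout analysis automatically,
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;
// and use the faster English language model.
ocr.Language = OcrLanguage.EnglishFast;

using OcrInput input = new OcrInput();
input.LoadImage("img/high-quality-scan.png"); // hypothetical file name

// No image filters are applied: clean input does not need them.
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

The idea is to trade generality for speed: fewer candidate characters and a lighter language model mean less work per page, which suits clean, high-resolution scans.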

How to Read Specific Areas of Images Using C# OCR?

The Iron Tesseract C# example below shows how to target specific regions using System.Drawing.Rectangle. This technique is invaluable for processing standardized forms where text appears in predictable locations.

Can IronOCR Process Cropped Regions for Faster Results?

Using pixel-based coordinates, you can limit OCR to specific areas, dramatically improving speed and preventing unwanted text extraction:

using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.LoadImage("image1.jpeg");
input.LoadImage("image2.png");
var pageindices = new int[] { 1, 2 };
input.LoadImageFrames("image3.gif", pageindices);

OcrResult result = ocr.Read(input);

Console.WriteLine($"{result.Pages.Length} Pages"); // 3 Pages

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    input.LoadImage("image1.jpeg")
    input.LoadImage("image2.png")
    Dim pageindices = New Integer() { 1, 2 }
    input.LoadImageFrames("image3.gif", pageindices)

    Dim result As OcrResult = ocr.Read(input)

    Console.WriteLine($"{result.Pages.Length} Pages") ' 3 Pages
End Using

This targeted approach provides a 41% speed improvement while extracting only relevant text. It's ideal for structured documents like invoices, checks, and forms. The same cropping technique works seamlessly with PDF OCR operations.

Computer Science document showing targeted OCR region extraction in C#

Document demonstrating precise region-based text extraction using IronOCR's rectangle selection
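
For standardized forms, the region technique described above can be repeated per field. The sketch below reads two named regions of one image, using the LoadImage(path, Rectangle) overload and the named Rectangle arguments that appear in this tutorial. The field names, pixel coordinates, and file name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using IronOcr;
using IronSoftware.Drawing;

// Hypothetical field layout for a standardized form, in pixels.
var fields = new Dictionary<string, Rectangle>
{
    ["InvoiceNumber"] = new Rectangle(x: 900, y: 120, width: 400, height: 60),
    ["Total"] = new Rectangle(x: 900, y: 1480, width: 400, height: 60),
};

IronTesseract ocr = new IronTesseract();

foreach (var field in fields)
{
    // Load only this field's rectangle so OCR skips the rest of the page.
    using OcrInput input = new OcrInput();
    input.LoadImage("img/invoice.png", field.Value); // hypothetical file name

    OcrResult result = ocr.Read(input);
    Console.WriteLine($"{field.Key}: {result.Text.Trim()}");
}
```

Keeping the layout in a dictionary separates the form definition from the OCR loop, so a new form type only needs a new set of rectangles.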

How Many Languages Does IronOCR Support?

IronOCR provides 127 international languages through convenient language packs. Download them as DLLs from our website or via NuGet Package Manager.

Install language packs through the NuGet interface (search "IronOcr.Languages") or visit the complete language pack listing.

Supported languages include Arabic, Chinese (Simplified/Traditional), Japanese, Korean, Hindi, Russian, German, French, Spanish, and 115+ others, each optimized for accurate text recognition.

How to Implement OCR in Multiple Languages?

This IronOCR C# tutorial example demonstrates Arabic text recognition:

Install-Package IronOcr.Languages.Arabic

Arabic text being processed by IronOCR demonstrating multi-language OCR support

IronOCR accurately extracting Arabic text from a GIF image

using IronOcr;

IronTesseract ocr = new IronTesseract();
using OcrInput input = new OcrInput();
input.LoadPdf("example.pdf", Password: "password");
// We can also select specific PDF page numbers to OCR

OcrResult result = ocr.Read(input);

Console.WriteLine(result.Text);
Console.WriteLine($"{result.Pages.Length} Pages");
// 1 page for every page of the PDF

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    input.LoadPdf("example.pdf", Password:= "password")
    ' We can also select specific PDF page numbers to OCR

    Dim result As OcrResult = ocr.Read(input)

    Console.WriteLine(result.Text)
    Console.WriteLine($"{result.Pages.Length} Pages")
    ' 1 page for every page of the PDF
End Using

Can IronOCR Handle Documents with Multiple Languages?

When documents contain mixed languages, configure IronOCR for multi-language support:

Install-Package IronOcr.Languages.ChineseSimplified

using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.Title = "Pdf Metadata Name";
input.LoadPdf("example.pdf", Password: "password");
OcrResult result = ocr.Read(input);
result.SaveAsSearchablePdf("searchable.pdf");

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    input.Title = "Pdf Metadata Name"
    input.LoadPdf("example.pdf", Password:= "password")
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsSearchablePdf("searchable.pdf")
End Using

How to Process Multi-Page Documents with C# OCR?

IronOCR seamlessly combines multiple pages or images into a single OcrResult. This feature enables powerful capabilities like creating searchable PDFs and extracting text from entire document sets.

Load selected frames of a TIFF file and save the OCR result as a searchable PDF:

using IronOcr;

IronTesseract ocr = new IronTesseract();
using OcrInput input = new OcrInput();
input.Title = "Pdf Title";
var pageindices = new int[] { 1, 2 };
input.LoadImageFrames("example.tiff", pageindices);
OcrResult result = ocr.Read(input);
result.SaveAsSearchablePdf("searchable.pdf");

Imports IronOcr

Dim ocr As New IronTesseract()
Using input As New OcrInput()
    input.Title = "Pdf Title"
    Dim pageindices = New Integer() { 1, 2 }
    input.LoadImageFrames("example.tiff", pageindices)
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsSearchablePdf("searchable.pdf")
End Using

Mix and match various sources (images, PDF pages, and TIFF frames) in a single OCR operation, here exported as HOCR HTML:

using IronOcr;

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.Title = "Html Title";

// Add more content as required...
input.LoadImage("image2.jpeg");
input.LoadPdf("example.pdf", Password: "password");
var pageindices = new int[] { 1, 2 };
input.LoadImageFrames("example.tiff", pageindices);

OcrResult result = ocr.Read(input);
result.SaveAsHocrFile("hocr.html");

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    input.Title = "Html Title"

    ' Add more content as required...
    input.LoadImage("image2.jpeg")
    input.LoadPdf("example.pdf", Password:= "password")
    Dim pageindices = New Integer() { 1, 2 }
    input.LoadImageFrames("example.tiff", pageindices)

    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsHocrFile("hocr.html")
End Using

Barcodes on the loaded pages can be read in the same pass by enabling ReadBarCodes:

using IronOcr;

IronTesseract ocr = new IronTesseract();

ocr.Configuration.ReadBarCodes = true;

using OcrInput input = new OcrInput();
input.LoadImage("img/Barcode.png");

OcrResult result = ocr.Read(input);

foreach (var barcode in result.Barcodes)
{
    Console.WriteLine(barcode.Value);
    // type and location properties also exposed
}

Imports IronOcr

Dim ocr As New IronTesseract()

ocr.Configuration.ReadBarCodes = True

Using input As New OcrInput()
    input.LoadImage("img/Barcode.png")

    Dim result As OcrResult = ocr.Read(input)

    For Each barcode In result.Barcodes
        Console.WriteLine(barcode.Value)
        ' type and location properties also exposed
    Next barcode
End Using

Walk the full result hierarchy of a multi-page TIFF, from pages down to individual characters:

using IronOcr;
using IronSoftware.Drawing;

// We can delve deep into OCR results as an object model of Pages, Barcodes, Paragraphs, Lines, Words and Characters
// This allows us to explore, export and draw OCR content using other APIs

IronTesseract ocr = new IronTesseract();
ocr.Configuration.ReadBarCodes = true;

using OcrInput input = new OcrInput();
var pageindices = new int[] { 1, 2 };
input.LoadImageFrames(@"img\Potter.tiff", pageindices);

OcrResult result = ocr.Read(input);

foreach (var page in result.Pages)
{
    // Page object
    int pageNumber = page.PageNumber;
    string pageText = page.Text;
    int pageWordCount = page.WordCount;

    // null if we don't set Ocr.Configuration.ReadBarCodes = true;
    OcrResult.Barcode[] barcodes = page.Barcodes;

    AnyBitmap pageImage = page.ToBitmap(input);
    System.Drawing.Bitmap pageImageLegacy = page.ToBitmap(input);
    double pageWidth = page.Width;
    double pageHeight = page.Height;

    foreach (var paragraph in page.Paragraphs)
    {
        // Pages -> Paragraphs
        int paragraphNumber = paragraph.ParagraphNumber;
        String paragraphText = paragraph.Text;
        System.Drawing.Bitmap paragraphImage = paragraph.ToBitmap(input);
        int paragraphXLocation = paragraph.X;
        int paragraphYLocation = paragraph.Y;
        int paragraphWidth = paragraph.Width;
        int paragraphHeight = paragraph.Height;
        double paragraphOcrAccuracy = paragraph.Confidence;
        var paragraphTextDirection = paragraph.TextDirection;

        foreach (var line in paragraph.Lines)
        {
            // Pages -> Paragraphs -> Lines
            int lineNumber = line.LineNumber;
            String lineText = line.Text;
            AnyBitmap lineImage = line.ToBitmap(input);
            System.Drawing.Bitmap lineImageLegacy = line.ToBitmap(input);
            int lineXLocation = line.X;
            int lineYLocation = line.Y;
            int lineWidth = line.Width;
            int lineHeight = line.Height;
            double lineOcrAccuracy = line.Confidence;
            double lineSkew = line.BaselineAngle;
            double lineOffset = line.BaselineOffset;

            foreach (var word in line.Words)
            {
                // Pages -> Paragraphs -> Lines -> Words
                int wordNumber = word.WordNumber;
                String wordText = word.Text;
                AnyBitmap wordImage = word.ToBitmap(input);
                System.Drawing.Image wordImageLegacy = word.ToBitmap(input);
                int wordXLocation = word.X;
                int wordYLocation = word.Y;
                int wordWidth = word.Width;
                int wordHeight = word.Height;
                double wordOcrAccuracy = word.Confidence;

                if (word.Font != null)
                {
                    // Word.Font is only set when using Tesseract Engine Modes rather than LSTM
                    String fontName = word.Font.FontName;
                    double fontSize = word.Font.FontSize;
                    bool isBold = word.Font.IsBold;
                    bool isFixedWidth = word.Font.IsFixedWidth;
                    bool isItalic = word.Font.IsItalic;
                    bool isSerif = word.Font.IsSerif;
                    bool isUnderlined = word.Font.IsUnderlined;
                    bool fontIsCaligraphic = word.Font.IsCaligraphic;
                }

                foreach (var character in word.Characters)
                {
                    // Pages -> Paragraphs -> Lines -> Words -> Characters
                    int characterNumber = character.CharacterNumber;
                    String characterText = character.Text;
                    AnyBitmap characterImage = character.ToBitmap(input);
                    System.Drawing.Bitmap characterImageLegacy = character.ToBitmap(input);
                    int characterXLocation = character.X;
                    int characterYLocation = character.Y;
                    int characterWidth = character.Width;
                    int characterHeight = character.Height;
                    double characterOcrAccuracy = character.Confidence;

                    // Output alternative symbols choices and their probability.
                    // Very useful for spell checking
                    OcrResult.Choice[] characterChoices = character.Choices;
                }
            }
        }
    }
}

Imports IronOcr
Imports IronSoftware.Drawing

' We can delve deep into OCR results as an object model of Pages, Barcodes, Paragraphs, Lines, Words and Characters
' This allows us to explore, export and draw OCR content using other APIs

Dim ocr As New IronTesseract()
ocr.Configuration.ReadBarCodes = True

Using input As New OcrInput()
	Dim pageindices = New Integer() { 1, 2 }
	input.LoadImageFrames("img\Potter.tiff", pageindices)
	
	Dim result As OcrResult = ocr.Read(input)
	
	For Each page In result.Pages
		' Page object
		Dim pageNumber As Integer = page.PageNumber
		Dim pageText As String = page.Text
		Dim pageWordCount As Integer = page.WordCount
	
		' null if we don't set Ocr.Configuration.ReadBarCodes = true;
		Dim barcodes() As OcrResult.Barcode = page.Barcodes
	
		Dim pageImage As AnyBitmap = page.ToBitmap(input)
		Dim pageImageLegacy As System.Drawing.Bitmap = page.ToBitmap(input)
		Dim pageWidth As Double = page.Width
		Dim pageHeight As Double = page.Height
	
		For Each paragraph In page.Paragraphs
			' Pages -> Paragraphs
			Dim paragraphNumber As Integer = paragraph.ParagraphNumber
			Dim paragraphText As String = paragraph.Text
			Dim paragraphImage As System.Drawing.Bitmap = paragraph.ToBitmap(input)
			Dim paragraphXLocation As Integer = paragraph.X
			Dim paragraphYLocation As Integer = paragraph.Y
			Dim paragraphWidth As Integer = paragraph.Width
			Dim paragraphHeight As Integer = paragraph.Height
			Dim paragraphOcrAccuracy As Double = paragraph.Confidence
			Dim paragraphTextDirection = paragraph.TextDirection
	
			For Each line In paragraph.Lines
				' Pages -> Paragraphs -> Lines
				Dim lineNumber As Integer = line.LineNumber
				Dim lineText As String = line.Text
				Dim lineImage As AnyBitmap = line.ToBitmap(input)
				Dim lineImageLegacy As System.Drawing.Bitmap = line.ToBitmap(input)
				Dim lineXLocation As Integer = line.X
				Dim lineYLocation As Integer = line.Y
				Dim lineWidth As Integer = line.Width
				Dim lineHeight As Integer = line.Height
				Dim lineOcrAccuracy As Double = line.Confidence
				Dim lineSkew As Double = line.BaselineAngle
				Dim lineOffset As Double = line.BaselineOffset
	
				For Each word In line.Words
					' Pages -> Paragraphs -> Lines -> Words
					Dim wordNumber As Integer = word.WordNumber
					Dim wordText As String = word.Text
					Dim wordImage As AnyBitmap = word.ToBitmap(input)
					Dim wordImageLegacy As System.Drawing.Image = word.ToBitmap(input)
					Dim wordXLocation As Integer = word.X
					Dim wordYLocation As Integer = word.Y
					Dim wordWidth As Integer = word.Width
					Dim wordHeight As Integer = word.Height
					Dim wordOcrAccuracy As Double = word.Confidence
	
					If word.Font IsNot Nothing Then
						' Word.Font is only set when using Tesseract Engine Modes rather than LSTM
						Dim fontName As String = word.Font.FontName
						Dim fontSize As Double = word.Font.FontSize
						Dim isBold As Boolean = word.Font.IsBold
						Dim isFixedWidth As Boolean = word.Font.IsFixedWidth
						Dim isItalic As Boolean = word.Font.IsItalic
						Dim isSerif As Boolean = word.Font.IsSerif
						Dim isUnderlined As Boolean = word.Font.IsUnderlined
						Dim fontIsCaligraphic As Boolean = word.Font.IsCaligraphic
					End If
	
					For Each character In word.Characters
						' Pages -> Paragraphs -> Lines -> Words -> Characters
						Dim characterNumber As Integer = character.CharacterNumber
						Dim characterText As String = character.Text
						Dim characterImage As AnyBitmap = character.ToBitmap(input)
						Dim characterImageLegacy As System.Drawing.Bitmap = character.ToBitmap(input)
						Dim characterXLocation As Integer = character.X
						Dim characterYLocation As Integer = character.Y
						Dim characterWidth As Integer = character.Width
						Dim characterHeight As Integer = character.Height
						Dim characterOcrAccuracy As Double = character.Confidence
	
						' Output alternative symbols choices and their probability.
						' Very useful for spell checking
						Dim characterChoices() As OcrResult.Choice = character.Choices
					Next character
				Next word
			Next line
		Next paragraph
	Next page
End Using
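
The confidence values exposed in the hierarchy above can be used to flag words for manual review. The sketch below works on plain (text, confidence) pairs so that it runs without any input files. In practice the pairs would be collected from word.Text and word.Confidence while walking line.Words. The sample values, the helper name LowConfidence, and the threshold of 80 are hypothetical, and the threshold assumes confidence is reported as a percentage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical (text, confidence) pairs standing in for OCR word results.
var words = new List<(string Text, double Confidence)>
{
    ("The", 98.5), ("qu1ck", 61.0), ("brown", 97.2), ("f0x", 55.4),
};

// Assumed threshold, in the same units the library reports confidence in.
double threshold = 80.0;

List<string> needsReview = LowConfidence(words, threshold);
Console.WriteLine(string.Join(", ", needsReview)); // prints "qu1ck, f0x"

// Returns the text of every item whose confidence is below minConfidence.
static List<string> LowConfidence(
    IEnumerable<(string Text, double Confidence)> items, double minConfidence) =>
    items.Where(w => w.Confidence < minConfidence)
         .Select(w => w.Text)
         .ToList();
```

Filtering at word level keeps the review list short; the same check could be run on paragraph or line confidence for a coarser pass.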

Convert selected frames of a TIFF to a searchable PDF:

using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Configure document properties
    input.Title = "Scanned Archive Document";

    // Select pages to process
    var pageIndices = new int[] { 1, 2 };
    input.LoadImageFrames("example.tiff", pageIndices);

    // Create searchable PDF from TIFF
    OcrResult result = ocr.Read(input);
    result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Configure document properties
    input.Title = "Scanned Archive Document"

    ' Select pages to process
    Dim pageIndices As Integer() = {1, 2}
    input.LoadImageFrames("example.tiff", pageIndices)

    ' Create searchable PDF from TIFF
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsSearchablePdf("searchable.pdf")
End Using

How to Export OCR Results as HOCR HTML?

IronOCR supports HOCR HTML export, enabling structured PDF to HTML and TIFF to HTML conversions while preserving layout information:

using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Set HTML title
    input.Title = "Document Archive";

    // Process multiple document types
    input.LoadImage("image2.jpeg");
    input.LoadPdf("example.pdf", Password: "password");

    // Add TIFF pages
    var pageIndices = new int[] { 1, 2 };
    input.LoadImageFrames("example.tiff", pageIndices);

    // Export as HOCR with position data
    OcrResult result = ocr.Read(input);
    result.SaveAsHocrFile("hocr.html");
}

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Set HTML title
    input.Title = "Document Archive"

    ' Process multiple document types
    input.LoadImage("image2.jpeg")
    input.LoadPdf("example.pdf", Password:= "password")

    ' Add TIFF pages
    Dim pageIndices As Integer() = {1, 2}
    input.LoadImageFrames("example.tiff", pageIndices)

    ' Export as HOCR with position data
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsHocrFile("hocr.html")
End Using

Can IronOCR Read Barcodes Along with Text?

IronOCR uniquely combines text recognition with barcode reading capabilities, eliminating the need for separate libraries:

// Enable combined text and barcode recognition
using IronOcr;

var ocr = new IronTesseract();

// Enable barcode detection
ocr.Configuration.ReadBarCodes = true;

using (var input = new OcrInput())
{
    // Load image containing both text and barcodes
    input.LoadImage("img/Barcode.png");

    // Process both text and barcodes
    var result = ocr.Read(input);

    // Extract barcode data
    foreach (var barcode in result.Barcodes)
    {
        Console.WriteLine($"Barcode Value: {barcode.Value}");
        Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}");
    }
}

Imports IronOcr

Dim ocr As New IronTesseract()

' Enable barcode detection
ocr.Configuration.ReadBarCodes = True

Using input As New OcrInput()
    ' Load image containing both text and barcodes
    input.LoadImage("img/Barcode.png")

    ' Process both text and barcodes
    Dim result = ocr.Read(input)

    ' Extract barcode data
    For Each barcode In result.Barcodes
        Console.WriteLine($"Barcode Value: {barcode.Value}")
        Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}")
    Next
End Using

How to Access Detailed OCR Results and Metadata?

The IronOCR results object provides comprehensive data that advanced developers can leverage for sophisticated applications.

Each OcrResult contains hierarchical collections: pages, paragraphs, lines, words, and characters. Elements carry metadata such as location and confidence scores, and words also expose font information when it is available.

Individual elements (paragraphs, words, barcodes) can be exported as images or bitmaps for further processing:

using System;
using IronOcr;
using IronSoftware.Drawing;

// Configure with barcode support
IronTesseract ocr = new IronTesseract
{
    Configuration = { ReadBarCodes = true }
};

using OcrInput input = new OcrInput();

// Process multi-page document
int[] pageIndices = { 1, 2 };
input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

OcrResult result = ocr.Read(input);

// Navigate the complete results hierarchy
foreach (var page in result.Pages)
{
    // Page-level data
    int pageNumber = page.PageNumber;
    string pageText = page.Text;
    int pageWordCount = page.WordCount;

    // Extract page elements
    OcrResult.Barcode[] barcodes = page.Barcodes;
    AnyBitmap pageImage = page.ToBitmap();
    double pageWidth = page.Width;
    double pageHeight = page.Height;

    foreach (var paragraph in page.Paragraphs)
    {
        // Paragraph properties
        int paragraphNumber = paragraph.ParagraphNumber;
        string paragraphText = paragraph.Text;
        double paragraphConfidence = paragraph.Confidence;
        var textDirection = paragraph.TextDirection;

        foreach (var line in paragraph.Lines)
        {
            // Line details including baseline information
            string lineText = line.Text;
            double lineConfidence = line.Confidence;
            double baselineAngle = line.BaselineAngle;
            double baselineOffset = line.BaselineOffset;

            foreach (var word in line.Words)
            {
                // Word-level data
                string wordText = word.Text;
                double wordConfidence = word.Confidence;

                // Font information (when available)
                if (word.Font != null)
                {
                    string fontName = word.Font.FontName;
                    double fontSize = word.Font.FontSize;
                    bool isBold = word.Font.IsBold;
                    bool isItalic = word.Font.IsItalic;
                }

                foreach (var character in word.Characters)
                {
                    // Character-level analysis
                    string charText = character.Text;
                    double charConfidence = character.Confidence;

                    // Alternative character choices for spell-checking
                    OcrResult.Choice[] alternatives = character.Choices;
                }
            }
        }
    }
}

' VB.NET equivalent of the C# example above
Imports System
Imports IronOcr
Imports IronSoftware.Drawing

' Configure with barcode support
Dim ocr As New IronTesseract With {
    .Configuration = New TesseractConfiguration With {
        .ReadBarCodes = True
    }
}

Using input As New OcrInput()

    ' Process multi-page document
    Dim pageIndices As Integer() = {1, 2}
    input.LoadImageFrames("img\Potter.tiff", pageIndices)

    Dim result As OcrResult = ocr.Read(input)

    ' Navigate the complete results hierarchy
    For Each page In result.Pages
        ' Page-level data
        Dim pageNumber As Integer = page.PageNumber
        Dim pageText As String = page.Text
        Dim pageWordCount As Integer = page.WordCount

        ' Extract page elements
        Dim barcodes As OcrResult.Barcode() = page.Barcodes
        Dim pageImage As AnyBitmap = page.ToBitmap()
        Dim pageWidth As Double = page.Width
        Dim pageHeight As Double = page.Height

        For Each paragraph In page.Paragraphs
            ' Paragraph properties
            Dim paragraphNumber As Integer = paragraph.ParagraphNumber
            Dim paragraphText As String = paragraph.Text
            Dim paragraphConfidence As Double = paragraph.Confidence
            Dim textDirection = paragraph.TextDirection

            For Each line In paragraph.Lines
                ' Line details including baseline information
                Dim lineText As String = line.Text
                Dim lineConfidence As Double = line.Confidence
                Dim baselineAngle As Double = line.BaselineAngle
                Dim baselineOffset As Double = line.BaselineOffset

                For Each word In line.Words
                    ' Word-level data
                    Dim wordText As String = word.Text
                    Dim wordConfidence As Double = word.Confidence

                    ' Font information (when available)
                    If word.Font IsNot Nothing Then
                        Dim fontName As String = word.Font.FontName
                        Dim fontSize As Double = word.Font.FontSize
                        Dim isBold As Boolean = word.Font.IsBold
                        Dim isItalic As Boolean = word.Font.IsItalic
                    End If

                    For Each character In word.Characters
                        ' Character-level analysis
                        Dim charText As String = character.Text
                        Dim charConfidence As Double = character.Confidence

                        ' Alternative character choices for spell-checking
                        Dim alternatives As OcrResult.Choice() = character.Choices
                    Next
                Next
            Next
        Next
    Next
End Using
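One practical use of this hierarchy is finding the words the engine was least sure about, for example to flag them for manual review. The sketch below flattens pages, paragraphs, lines, and words using only the properties shown above (Text and Confidence), then lists the ten least confident words. The LowestConfidence helper and the file name are illustrative assumptions, not part of IronOCR. Sorting avoids having to assume a particular confidence scale.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using IronOcr;

// Hypothetical helper (not part of IronOCR): returns up to `count` entries
// with the lowest confidence, least confident first.
static List<(string Text, double Confidence)> LowestConfidence(
    IEnumerable<(string Text, double Confidence)> words, int count) =>
    words.OrderBy(w => w.Confidence).Take(count).ToList();

var ocr = new IronTesseract();

using var input = new OcrInput();
input.LoadImage("image.png"); // illustrative file name

OcrResult result = ocr.Read(input);

// Flatten the hierarchy: pages -> paragraphs -> lines -> words
var words = new List<(string Text, double Confidence)>();
foreach (var page in result.Pages)
{
    foreach (var paragraph in page.Paragraphs)
    {
        foreach (var line in paragraph.Lines)
        {
            foreach (var word in line.Words)
            {
                words.Add((word.Text, word.Confidence));
            }
        }
    }
}

// Print the ten words most likely to need a manual check
foreach (var (text, confidence) in LowestConfidence(words, 10))
{
    Console.WriteLine($"{confidence:F1}  {text}");
}
```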

Summary

IronOCR gives C# developers a managed Tesseract 5 implementation that runs on Windows, Linux, and macOS. It reads text from images accurately, even from imperfect documents such as skewed or noisy scans, which sets it apart from basic OCR solutions.

The library also includes integrated barcode reading, which standard Tesseract does not provide, and it can export results as searchable PDFs or HOCR HTML directly from .NET code.

Source Code Download

Ready to implement C# OCR image to text conversion in your applications? Download IronOCR and start your free trial today.

Frequently Asked Questions

How can I convert images to text in C# without using Tesseract?

IronOCR converts images to text in C# without a separate Tesseract installation or configuration. Install the IronOcr NuGet package, then call new IronTesseract().Read("image.png").Text to get the recognized text.

How do I improve OCR accuracy on low-quality images?

IronOCR provides image filters on OcrInput, such as Deskew() and DeNoise(), which straighten rotated scans and reduce noise before recognition. Applying them to low-quality images typically improves OCR accuracy.
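A minimal sketch of that workflow, assuming a scanned image at a hypothetical path: the filters are applied to the OcrInput after loading and before Read() is called.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    input.LoadImage("img/low-quality-scan.png"); // hypothetical file name

    // Apply the filters before reading: straighten the page, then reduce noise
    input.Deskew();
    input.DeNoise();

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Filters add processing time, so it is usually worth applying only the ones a given set of scans actually needs.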

What are the steps to extract text from a multi-page document using OCR in C#?

Load the document into an OcrInput, using LoadPdf() for PDFs or LoadImageFrames() for multi-frame TIFF files, then call Read(). The result's Pages collection holds the recognized text of each page.
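For instance, the following sketch reads a PDF and prints each page's text separately. The file name is a placeholder; PageNumber, WordCount, and Text are the page-level properties used earlier in this tutorial.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    input.LoadPdf("example.pdf"); // placeholder file name

    var result = ocr.Read(input);

    // Each entry in Pages corresponds to one page of the source document
    foreach (var page in result.Pages)
    {
        Console.WriteLine($"--- Page {page.PageNumber} ({page.WordCount} words) ---");
        Console.WriteLine(page.Text);
    }
}
```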

Is it possible to read barcodes and text simultaneously from an image?

Yes. Set ocr.Configuration.ReadBarCodes = true before calling Read(), and IronOCR extracts both the text and the barcode data from the same image. The decoded barcodes are available through result.Barcodes.

How can I set up OCR for processing documents in multiple languages?

IronOCR supports over 125 languages and allows you to set a primary language using ocr.Language and add additional languages with ocr.AddSecondaryLanguage() for multilingual document processing.
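As a rough illustration, the sketch below reads a document containing English and French text. It assumes the French language pack (distributed as a separate NuGet package, IronOcr.Languages.French) is installed. The file name is hypothetical.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Primary language, plus one additional language for mixed-language pages.
// Assumes the IronOcr.Languages.French package has been installed.
ocr.Language = OcrLanguage.English;
ocr.AddSecondaryLanguage(OcrLanguage.French);

using (var input = new OcrInput())
{
    input.LoadImage("img/bilingual-letter.png"); // hypothetical file name

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```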

What methods are available to export OCR results in different formats?

IronOCR offers several methods to export OCR results, such as SaveAsSearchablePdf() for PDFs, SaveAsTextFile() for plain text, and SaveAsHocrFile() for HOCR HTML format.
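A short sketch of all three exports from a single read is shown here. The input file and the output folder are illustrative, and the folder is created first so the save calls have somewhere to write.

```csharp
using System.IO;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    input.LoadImage("image.png"); // illustrative file name

    var result = ocr.Read(input);

    // Make sure the (hypothetical) output folder exists before saving
    Directory.CreateDirectory("output");

    result.SaveAsSearchablePdf(Path.Combine("output", "result.pdf"));
    result.SaveAsTextFile(Path.Combine("output", "result.txt"));
    result.SaveAsHocrFile(Path.Combine("output", "result.html"));
}
```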

How can I optimize OCR processing speed for large image files?

To optimize OCR processing speed, use the OcrLanguage.EnglishFast language option, which trades some accuracy for speed, and restrict OCR to specific regions using System.Drawing.Rectangle so that less of the image has to be processed.

How do I handle OCR processing for protected PDF files?

When dealing with protected PDFs, use the LoadPdf() method along with the correct password. IronOCR handles image-based PDFs by converting pages to images automatically for OCR processing.
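A hedged sketch of the protected-PDF case follows. It assumes LoadPdf() accepts the password through a named Password argument; confirm the exact parameter name against the API reference for your IronOCR version. The file name and password are placeholders.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Assumption: the password is supplied via a named `Password` argument.
    // Both values here are placeholders.
    input.LoadPdf("protected.pdf", Password: "password");

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```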

What should I do if the OCR results are not accurate?

If OCR results are inaccurate, apply the OcrInput filters Deskew() and DeNoise() before reading, and make sure the language pack that matches the document's language is installed and selected.

Can I customize the OCR process to exclude certain characters?

Yes. Set the BlackListCharacters property on ocr.Configuration to the characters you want excluded. Restricting recognition to the characters you expect can improve both accuracy and processing speed.
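For example, this sketch excludes a handful of symbols that rarely appear in ordinary prose. The character set and file name are illustrative, so adjust them to your documents.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Characters listed here are excluded from recognition (illustrative set)
ocr.Configuration.BlackListCharacters = "~`$#^*_{}[]|\\";

using (var input = new OcrInput())
{
    input.LoadImage("image.png"); // illustrative file name

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```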

Jacob Mellor, Chief Technology Officer @ Team Iron

Jacob Mellor is Chief Technology Officer at Iron Software and a visionary engineer pioneering C# PDF technology. As the original developer behind Iron Software's core codebase, he has shaped the company's product architecture since its inception, transforming it alongside CEO Cameron Rimington into a 50+ person company serving NASA, Tesla, ...

Reviewed by Jeffrey T. Fritz, Principal Program Manager - .NET Community Team

Jeff is also a Principal Program Manager for the .NET and Visual Studio teams. He is the executive producer of the .NET Conf virtual conference series and hosts 'Fritz and Friends', a live stream for developers that airs twice weekly, where he talks tech and writes code together with viewers. Jeff writes workshops and presentations and plans content for the largest Microsoft developer events, including Microsoft Build, Microsoft Ignite, .NET Conf, and the Microsoft MVP Summit.