C# OCR Image to Text Tutorial: Convert Images to Text Without Tesseract

By Jacob Mellor
August 28, 2018. Updated July 13, 2025.

Looking to convert images to text in C# without the hassle of complex Tesseract configurations? This IronOCR C# tutorial shows you how to implement optical character recognition in your .NET applications with just a few lines of code.

View the IronOCR YouTube Playlist

How to Convert Images to Text in C# Using IronOCR

- Download IronOCR, the C# OCR library for image to text conversion
- Use the IronTesseract class to read text from images
- Apply image filters to enhance OCR accuracy on low-quality scans
- Process multiple languages with downloadable language packs
- Export results as searchable PDFs or extract text strings

How Do I Read Text from Images in .NET Applications?

To add C# OCR image to text functionality to your .NET applications, you need a reliable OCR library. IronOCR provides a managed solution built around the IronOcr.IronTesseract class, designed for both accuracy and speed without requiring external dependencies.

First, install IronOCR into your Visual Studio project. You can download the IronOCR DLL directly or use NuGet Package Manager:

    Install-Package IronOcr

Why Choose IronOCR for C# OCR Without Tesseract?
When you need to convert images to text in C#, IronOCR offers significant advantages over traditional Tesseract implementations:

- Works immediately in pure .NET environments
- No Tesseract installation or configuration required
- Runs the latest engines: Tesseract 5 (plus Tesseract 4 & 3)
- Compatible with .NET Framework 4.5+, .NET Standard 2+, .NET Core 2 and 3, and .NET 5 through 10
- Improves accuracy and speed compared to vanilla Tesseract
- Supports Xamarin, Mono, Azure, and Docker deployments
- Manages complex Tesseract dictionaries through NuGet packages
- Handles PDFs, MultiFrame TIFFs, and all major image formats automatically
- Corrects low-quality and skewed scans for optimal results

Start using IronOCR in your project today with a free trial.

How to Use IronOCR for Basic OCR in C#?

This IronTesseract C# example demonstrates the simplest way to read text from an image using IronOCR. The IronOcr.IronTesseract class extracts text and returns it as a string.

C#:

    // Basic C# OCR image to text conversion using IronOCR
    // This example shows how to extract text from images without complex setup
    using IronOcr;
    using System;

    try
    {
        // Initialize IronTesseract for OCR operations
        var ocrEngine = new IronTesseract();

        // Path to your image file - supports PNG, JPG, TIFF, BMP, and more
        var imagePath = @"img\Screenshot.png";

        // Create input and perform OCR to convert image to text
        using (var input = new OcrInput(imagePath))
        {
            // Read text from image and get results
            OcrResult result = ocrEngine.Read(input);

            // Display extracted text
            Console.WriteLine(result.Text);
        }
    }
    catch (OcrException ex)
    {
        // Handle OCR-specific errors
        Console.WriteLine($"OCR Error: {ex.Message}");
    }
    catch (Exception ex)
    {
        // Handle general errors
        Console.WriteLine($"Error: {ex.Message}");
    }

VB.NET:

    ' Basic OCR image to text conversion using IronOCR
    ' This example shows how to extract text from images without complex setup
    Imports IronOcr
    Imports System

    Try
        ' Initialize IronTesseract for OCR operations
        Dim ocrEngine = New IronTesseract()

        ' Path to your image file - supports PNG, JPG, TIFF, BMP, and more
        Dim imagePath = "img\Screenshot.png"

        ' Create input and perform OCR to convert image to text
        Using input = New OcrInput(imagePath)
            ' Read text from image and get results
            Dim result As OcrResult = ocrEngine.Read(input)

            ' Display extracted text
            Console.WriteLine(result.Text)
        End Using
    Catch ex As OcrException
        ' Handle OCR-specific errors
        Console.WriteLine($"OCR Error: {ex.Message}")
    Catch ex As Exception
        ' Handle general errors
        Console.WriteLine($"Error: {ex.Message}")
    End Try

In this test on a clear image, the code reads the text with 100% accuracy, extracting it exactly as it appears:

    IronOCR Simple Example

    In this simple example we test the accuracy of our C# OCR library to read text from a PNG Image. This is a very basic test, but things will get more complicated as the tutorial continues.

    The quick brown fox jumps over the lazy dog

The IronTesseract class handles complex OCR operations internally. It automatically scans for alignment, optimizes resolution, and uses AI to read text from the image with human-level accuracy.
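Accuracy percentages are easier to interpret with a concrete measure in hand. One common choice is character accuracy: one minus the edit distance between the OCR output and a known-correct reference, divided by the reference length. The sketch below is an illustration only. This tutorial does not say how its own percentages were calculated, and the OcrAccuracy class and its method names are hypothetical helpers, not part of IronOCR.

```csharp
using System;

// Hypothetical helper (not part of IronOCR) for scoring OCR output
// against a reference text that is known to be correct.
public static class OcrAccuracy
{
    // Levenshtein edit distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn 'a' into 'b'.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty prefix of 'a' into the first j characters of 'b'
        // takes j insertions.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // i deletions reach an empty prefix of 'b'
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }
            // The row just computed becomes the previous row for the next pass.
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Character accuracy relative to the reference text, floored at 0
    // so that very noisy output cannot produce a negative score.
    public static double CharacterAccuracy(string reference, string recognized)
    {
        if (reference.Length == 0) return recognized.Length == 0 ? 1.0 : 0.0;
        double errors = EditDistance(reference, recognized);
        return Math.Max(0.0, 1.0 - errors / reference.Length);
    }
}
```

To try it, pass the text you expect as the reference and the Text property of the OcrResult from the basic example as the recognized string. Trimming both strings first avoids penalizing trailing whitespace.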
Despite the sophisticated processing happening behind the scenes (image analysis, engine optimization, and intelligent text recognition), the OCR process matches human reading speed while maintaining exceptional accuracy levels.

Screenshot demonstrating IronOCR's ability to extract text from a PNG image with perfect accuracy

How to Implement Advanced C# OCR Without Tesseract Configuration?

For production applications that need optimal performance when converting images to text in C#, use the OcrInput and IronTesseract classes together. This approach provides fine-grained control over the OCR process.

OcrInput Class Features

- Processes multiple image formats: JPEG, TIFF, GIF, BMP, PNG
- Imports complete PDFs or specific pages
- Enhances contrast, resolution, and image quality automatically
- Corrects rotation, scan noise, skew, and negative images

IronTesseract Class Features

- Access to 125+ prepackaged languages
- Tesseract 5, 4, and 3 engines included
- Document type specification (screenshot, snippet, or full document)
- Integrated barcode reading capabilities
- Multiple output formats: Searchable PDFs, HOCR HTML, DOM objects, and strings

How to Get Started with OcrInput and IronTesseract?
Here's a recommended configuration that works well with most document types:

C#:

    using IronOcr;

    // Initialize IronTesseract for advanced OCR operations
    IronTesseract ocr = new IronTesseract();

    // Create input container for processing multiple images
    using (OcrInput input = new OcrInput())
    {
        // Process specific pages from multi-page TIFF files
        int[] pageIndices = new int[] { 1, 2 };

        // Load TIFF frames - perfect for scanned documents
        input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

        // Execute OCR to read text from the image
        OcrResult result = ocr.Read(input);

        // Output the extracted text
        Console.WriteLine(result.Text);
    }

VB.NET:

    Imports IronOcr

    ' Initialize IronTesseract for advanced OCR operations
    Dim ocr As New IronTesseract()

    ' Create input container for processing multiple images
    Using input As New OcrInput()
        ' Process specific pages from multi-page TIFF files
        Dim pageIndices() As Integer = { 1, 2 }

        ' Load TIFF frames - perfect for scanned documents
        input.LoadImageFrames("img\Potter.tiff", pageIndices)

        ' Execute OCR to read text from the image
        Dim result As OcrResult = ocr.Read(input)

        ' Output the extracted text
        Console.WriteLine(result.Text)
    End Using

This configuration consistently achieves near-perfect accuracy on medium-quality scans.
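Frame indices are easy to get off by one. A comment in the TIFF example later in this tutorial describes the indices passed to LoadImageFrames as 0-based, so under that assumption page 1 of a scanned document is frame 0. The helper below is a hypothetical convenience (FrameIndexHelper is not an IronOCR type) that converts page numbers as a person would count them into such indices:

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of IronOCR). Assumes frame indices are
// 0-based, as the comment in this tutorial's later TIFF example states.
public static class FrameIndexHelper
{
    // Converts 1-based page numbers into 0-based frame indices,
    // rejecting any page number below 1.
    public static int[] FromPageNumbers(params int[] pageNumbers)
    {
        if (pageNumbers.Any(p => p < 1))
            throw new ArgumentOutOfRangeException(
                nameof(pageNumbers), "Page numbers start at 1.");

        return pageNumbers.Select(p => p - 1).ToArray();
    }
}
```

Calling FrameIndexHelper.FromPageNumbers(1, 2) yields an array that can be passed as the second argument of LoadImageFrames in place of a hand-written index array.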
The LoadImageFrames method efficiently handles multi-page documents, making it ideal for batch processing scenarios.

Sample TIFF document demonstrating IronOCR's multi-page text extraction capabilities

The ability to read text from images and barcodes in scanned documents like TIFFs showcases how IronOCR simplifies complex OCR tasks. The library excels with real-world documents, handling multi-page TIFFs and PDF text extraction.

How Does IronOCR Handle Low-Quality Scans?

Low-resolution document with noise that IronOCR can process accurately using image filters

When working with imperfect scans containing distortion and digital noise, IronOCR outperforms other C# OCR libraries. It's specifically designed for real-world scenarios rather than pristine test images.

C#:

    // Advanced IronTesseract C# example for low-quality images
    using IronOcr;
    using System;

    var ocr = new IronTesseract();

    try
    {
        using (var input = new OcrInput())
        {
            // Load specific pages from poor-quality TIFF
            var pageIndices = new int[] { 0, 1 };
            input.LoadImageFrames(@"img\Potter.LowQuality.tiff", pageIndices);

            // Apply deskew filter to correct rotation and perspective
            input.Deskew(); // Critical for improving accuracy on skewed scans

            // Perform OCR with enhanced preprocessing
            OcrResult result = ocr.Read(input);

            // Display results
            Console.WriteLine("Recognized Text:");
            Console.WriteLine(result.Text);
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error during OCR: {ex.Message}");
    }

VB.NET:

    ' Advanced IronTesseract example for low-quality images
    Imports IronOcr
    Imports System

    Dim ocr = New IronTesseract()

    Try
        Using input = New OcrInput()
            ' Load specific pages from poor-quality TIFF
            Dim pageIndices = New Integer() { 0, 1 }
            input.LoadImageFrames("img\Potter.LowQuality.tiff", pageIndices)

            ' Apply deskew filter to correct rotation and perspective
            input.Deskew() ' Critical for improving accuracy on skewed scans

            ' Perform OCR with enhanced preprocessing
            Dim result As OcrResult = ocr.Read(input)

            ' Display results
            Console.WriteLine("Recognized Text:")
            Console.WriteLine(result.Text)
        End Using
    Catch ex As Exception
        Console.WriteLine($"Error during OCR: {ex.Message}")
    End Try

Using input.Deskew(), accuracy improves to 99.8% on low-quality scans, nearly matching high-quality results. This demonstrates why IronOCR is the preferred choice for C# OCR without Tesseract complications.

Image filters add some preprocessing time, but cleaner input often reduces overall OCR duration. Finding the right balance depends on your document quality. For most scenarios, input.Deskew() and input.DeNoise() provide reliable improvements to OCR performance. Learn more about image preprocessing techniques.

How to Optimize OCR Performance and Speed?

The most significant factor affecting OCR speed when you convert images to text in C# is input quality. Higher-resolution input (around 200 DPI) with minimal noise produces the fastest and most accurate results. While IronOCR excels at correcting imperfect documents, this enhancement requires additional processing time.

Choose image formats with minimal compression artifacts. TIFF and PNG typically yield faster results than JPEG due to lower digital noise.

Which Image Filters Improve OCR Speed?
The following filters can dramatically enhance performance in your C# OCR image to text workflow:

- OcrInput.Rotate(double degrees): Rotates images clockwise (negative for counterclockwise)
- OcrInput.Binarize(): Converts to black/white, improving performance in low-contrast scenarios
- OcrInput.ToGrayScale(): Converts to grayscale for potential speed improvements
- OcrInput.Contrast(): Auto-adjusts contrast for better accuracy
- OcrInput.DeNoise(): Removes digital artifacts when noise is expected
- OcrInput.Invert(): Inverts colors for white-on-black text
- OcrInput.Dilate(): Expands text boundaries
- OcrInput.Erode(): Reduces text boundaries
- OcrInput.Deskew(): Corrects alignment - essential for skewed documents
- OcrInput.DeepCleanBackgroundNoise(): Aggressive noise removal
- OcrInput.EnhanceResolution(): Improves low-resolution image quality

How to Configure IronOCR for Maximum Speed?

Use these settings to optimize speed when processing high-quality scans:

C#:

    using IronOcr;

    // Configure for speed - ideal for clean documents
    IronTesseract ocr = new IronTesseract();

    // Exclude problematic characters to speed up recognition
    ocr.Configuration.BlackListCharacters = "~`$#^*_{[]} \\";

    // Use automatic page segmentation
    ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;

    // Select fast English language pack
    ocr.Language = OcrLanguage.EnglishFast;

    using (OcrInput input = new OcrInput())
    {
        // Load specific pages from document
        int[] pageIndices = new int[] { 1, 2 };
        input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

        // Read with optimized settings
        OcrResult result = ocr.Read(input);
        Console.WriteLine(result.Text);
    }

VB.NET:

    Imports IronOcr

    ' Configure for speed - ideal for clean documents
    Dim ocr As New IronTesseract()

    ' Exclude problematic characters to speed up recognition
    ocr.Configuration.BlackListCharacters = "~`$#^*_{[]} \"

    ' Use automatic page segmentation
    ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto

    ' Select fast English language pack
    ocr.Language = OcrLanguage.EnglishFast

    Using input As New OcrInput()
        ' Load specific pages from document
        Dim pageIndices() As Integer = { 1, 2 }
        input.LoadImageFrames("img\Potter.tiff", pageIndices)

        ' Read with optimized settings
        Dim result As OcrResult = ocr.Read(input)
        Console.WriteLine(result.Text)
    End Using

This optimized setup maintains 99.8% accuracy while achieving a 35% speed improvement compared to default settings.

How to Read Specific Areas of Images Using C# OCR?

The IronTesseract C# example below shows how to target specific regions using System.Drawing.Rectangle. This technique is invaluable for processing standardized forms where text appears in predictable locations.

Can IronOCR Process Cropped Regions for Faster Results?
Using pixel-based coordinates, you can limit OCR to specific areas, which improves speed and prevents unwanted text extraction:

C#:

    using IronOcr;
    using IronSoftware.Drawing;

    // Initialize OCR engine for targeted region processing
    var ocr = new IronTesseract();

    using (var input = new OcrInput())
    {
        // Define exact region for OCR - coordinates in pixels
        var contentArea = new System.Drawing.Rectangle(
            x: 215,
            y: 1250,
            width: 1335,
            height: 280
        );

        // Load image with specific area - perfect for forms and invoices
        input.AddImage("img/ComSci.png", contentArea);

        // Process only the defined region
        OcrResult result = ocr.Read(input);
        Console.WriteLine(result.Text);
    }

VB.NET:

    Imports IronOcr
    Imports IronSoftware.Drawing

    ' Initialize OCR engine for targeted region processing
    Dim ocr = New IronTesseract()

    Using input = New OcrInput()
        ' Define exact region for OCR - coordinates in pixels
        Dim contentArea = New System.Drawing.Rectangle(x:=215, y:=1250, width:=1335, height:=280)

        ' Load image with specific area - perfect for forms and invoices
        input.AddImage("img/ComSci.png", contentArea)

        ' Process only the defined region
        Dim result As OcrResult = ocr.Read(input)
        Console.WriteLine(result.Text)
    End Using

This targeted approach provides a 41% speed improvement while extracting only relevant text. It's ideal for structured documents like invoices, checks, and forms.
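Hard-coded pixel coordinates only fit scans of one size. A minimal sketch, assuming your form layout is known as fractions of the page: the hypothetical RegionHelper below (not part of IronOCR) converts those fractions into pixel values for a given image size and keeps the region inside the image bounds.

```csharp
using System;

// Hypothetical helper (not part of IronOCR) for deriving crop regions
// that adapt to the pixel dimensions of each scan.
public static class RegionHelper
{
    // Converts a region given as fractions of the page (0.0 to 1.0) into
    // pixel coordinates, clamped so the rectangle stays inside the image.
    public static (int X, int Y, int Width, int Height) ToPixels(
        double left, double top, double width, double height,
        int imageWidth, int imageHeight)
    {
        if (imageWidth <= 0)
            throw new ArgumentOutOfRangeException(nameof(imageWidth), "Must be positive.");
        if (imageHeight <= 0)
            throw new ArgumentOutOfRangeException(nameof(imageHeight), "Must be positive.");

        // The top-left corner must fall on a valid pixel.
        int x = Clamp((int)Math.Round(left * imageWidth), 0, imageWidth - 1);
        int y = Clamp((int)Math.Round(top * imageHeight), 0, imageHeight - 1);

        // The size is at least one pixel and never extends past the image edge.
        int w = Clamp((int)Math.Round(width * imageWidth), 1, imageWidth - x);
        int h = Clamp((int)Math.Round(height * imageHeight), 1, imageHeight - y);

        return (x, y, w, h);
    }

    private static int Clamp(int value, int min, int max) =>
        Math.Max(min, Math.Min(max, value));
}
```

The four resulting values can be supplied as the x, y, width, and height arguments of the rectangle in the example above, so one layout definition can serve scans captured at different resolutions.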
The same cropping technique works with PDF OCR operations.

Document demonstrating precise region-based text extraction using IronOCR's rectangle selection

How Many Languages Does IronOCR Support?

IronOCR provides 125 international languages through convenient language packs. Download them as DLLs from our website or via NuGet Package Manager.

Install language packs through the NuGet interface (search "IronOcr.Languages") or visit the complete language pack listing.

Supported languages include Arabic, Chinese (Simplified/Traditional), Japanese, Korean, Hindi, Russian, German, French, Spanish, and 115+ others, each optimized for accurate text recognition.

How to Implement OCR in Multiple Languages?

This example demonstrates Arabic text recognition:

    Install-Package IronOcr.Languages.Arabic

IronOCR accurately extracting Arabic text from a GIF image

C#:

    // Install-Package IronOcr.Languages.Arabic
    using IronOcr;

    // Configure for Arabic language OCR
    var ocr = new IronTesseract();
    ocr.Language = OcrLanguage.Arabic;

    using (var input = new OcrInput())
    {
        // Load Arabic text image
        input.AddImage("img/arabic.gif");

        // IronOCR handles low-quality Arabic text that standard Tesseract cannot
        var result = ocr.Read(input);

        // Save to file (console may not display Arabic correctly)
        result.SaveAsTextFile("arabic.txt");
    }

VB.NET:

    ' Install-Package IronOcr.Languages.Arabic
    Imports IronOcr

    ' Configure for Arabic language OCR
    Dim ocr = New IronTesseract()
    ocr.Language = OcrLanguage.Arabic

    Using input = New OcrInput()
        ' Load Arabic text image
        input.AddImage("img/arabic.gif")

        ' IronOCR handles low-quality Arabic text that standard Tesseract cannot
        Dim result = ocr.Read(input)

        ' Save to file (console may not display Arabic correctly)
        result.SaveAsTextFile("arabic.txt")
    End Using

Can IronOCR Handle Documents with Multiple Languages?

When documents contain mixed languages, configure IronOCR for multi-language support:

    Install-Package IronOcr.Languages.ChineseSimplified

C#:

    // Multi-language OCR configuration
    using IronOcr;

    var ocr = new IronTesseract();

    // Set primary language
    ocr.Language = OcrLanguage.ChineseSimplified;

    // Add secondary languages as needed
    ocr.AddSecondaryLanguage(OcrLanguage.English);

    // Custom .traineddata files can be added for specialized recognition
    // ocr.AddSecondaryLanguage("path/to/custom.traineddata");

    using (var input = new OcrInput())
    {
        // Process multi-language document
        input.AddImage("img/MultiLanguage.jpeg");

        var result = ocr.Read(input);
        result.SaveAsTextFile("MultiLanguage.txt");
    }

VB.NET:

    ' Multi-language OCR configuration
    Imports IronOcr

    Dim ocr = New IronTesseract()

    ' Set primary language
    ocr.Language = OcrLanguage.ChineseSimplified

    ' Add secondary languages as needed
    ocr.AddSecondaryLanguage(OcrLanguage.English)

    ' Custom .traineddata files can be added for specialized recognition
    ' ocr.AddSecondaryLanguage("path/to/custom.traineddata")

    Using input = New OcrInput()
        ' Process multi-language document
        input.AddImage("img/MultiLanguage.jpeg")

        Dim result = ocr.Read(input)
        result.SaveAsTextFile("MultiLanguage.txt")
    End Using

How to Process Multi-Page Documents with C# OCR?

IronOCR combines multiple pages or images into a single OcrResult. This enables capabilities like creating searchable PDFs and extracting text from entire document sets.

Mix and match various sources (images, TIFF frames, and PDF pages) in a single OCR operation:

C#:

    // Multi-source document processing
    using IronOcr;

    IronTesseract ocr = new IronTesseract();

    using (OcrInput input = new OcrInput())
    {
        // Add various image formats
        input.AddImage("image1.jpeg");
        input.AddImage("image2.png");

        // Process specific frames from multi-frame images
        int[] frameNumbers = { 1, 2 };
        input.AddImageFrames("image3.gif", frameNumbers);

        // Process all sources together
        OcrResult result = ocr.Read(input);

        // Verify page count
        Console.WriteLine($"{result.Pages.Count} Pages processed.");
    }

VB.NET:

    ' Multi-source document processing
    Imports IronOcr

    Dim ocr As New IronTesseract()

    Using input As New OcrInput()
        ' Add various image formats
        input.AddImage("image1.jpeg")
        input.AddImage("image2.png")

        ' Process specific frames from multi-frame images
        Dim frameNumbers() As Integer = { 1, 2 }
        input.AddImageFrames("image3.gif", frameNumbers)

        ' Process all sources together
        Dim result As OcrResult = ocr.Read(input)

        ' Verify page count
        Console.WriteLine($"{result.Pages.Count} Pages processed.")
    End Using

Process selected frames of a multi-frame TIFF file:

C#:

    using IronOcr;

    IronTesseract ocr = new IronTesseract();

    using (OcrInput input = new OcrInput())
    {
        // Define pages to process (0-based indexing)
        int[] pageIndices = new int[] { 0, 1 };

        // Load specific TIFF frames
        input.LoadImageFrames("MultiFrame.Tiff", pageIndices);

        // Extract text from the loaded frames
        OcrResult result = ocr.Read(input);

        Console.WriteLine(result.Text);
        Console.WriteLine($"{result.Pages.Count} Pages processed");
    }

VB.NET:

    Imports IronOcr

    Dim ocr As New IronTesseract()

    Using input As New OcrInput()
        ' Define pages to process (0-based indexing)
        Dim pageIndices() As Integer = { 0, 1 }

        ' Load specific TIFF frames
        input.LoadImageFrames("MultiFrame.Tiff", pageIndices)

        ' Extract text from the loaded frames
        Dim result As OcrResult = ocr.Read(input)

        Console.WriteLine(result.Text)
        Console.WriteLine($"{result.Pages.Count} Pages processed")
    End Using

Extract text from a PDF, supplying a password if the file is protected:

C#:

    using System;
    using IronOcr;

    IronTesseract ocr = new IronTesseract();

    using (OcrInput input = new OcrInput())
    {
        try
        {
            // Load password-protected PDF if needed
            input.LoadPdf("example.pdf", "password");

            // Process entire document
            OcrResult result = ocr.Read(input);

            Console.WriteLine(result.Text);
            Console.WriteLine($"{result.Pages.Count} Pages recognized");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error processing PDF: {ex.Message}");
        }
    }

VB.NET:

    Imports System
    Imports IronOcr

    Dim ocr As New IronTesseract()

    Using input As New OcrInput()
        Try
            ' Load password-protected PDF if needed
            input.LoadPdf("example.pdf", "password")

            ' Process entire document
            Dim result As OcrResult = ocr.Read(input)

            Console.WriteLine(result.Text)
            Console.WriteLine($"{result.Pages.Count} Pages recognized")
        Catch ex As Exception
            Console.WriteLine($"Error processing PDF: {ex.Message}")
        End Try
    End Using

How to Create Searchable PDFs from Images?

IronOCR excels at creating searchable PDFs, a critical feature for database systems, SEO, and document accessibility.
C#:

    using IronOcr;

    IronTesseract ocr = new IronTesseract();

    using (OcrInput input = new OcrInput())
    {
        // Set document metadata
        input.Title = "Quarterly Report";

        // Combine multiple sources
        input.AddImage("image1.jpeg");
        input.AddImage("image2.png");

        // Add specific frames from animated images
        int[] gifFrames = new int[] { 1, 2 };
        input.AddImageFrames("image3.gif", gifFrames);

        // Create searchable PDF
        OcrResult result = ocr.Read(input);
        result.SaveAsSearchablePdf("searchable.pdf");
    }

VB.NET:

    Imports IronOcr

    Dim ocr As New IronTesseract()

    Using input As New OcrInput()
        ' Set document metadata
        input.Title = "Quarterly Report"

        ' Combine multiple sources
        input.AddImage("image1.jpeg")
        input.AddImage("image2.png")

        ' Add specific frames from animated images
        Dim gifFrames() As Integer = { 1, 2 }
        input.AddImageFrames("image3.gif", gifFrames)

        ' Create searchable PDF
        Dim result As OcrResult = ocr.Read(input)
        result.SaveAsSearchablePdf("searchable.pdf")
    End Using

Convert existing PDFs to searchable versions:

C#:

    using IronOcr;

    var ocr = new IronTesseract();

    using (var input = new OcrInput())
    {
        // Set PDF metadata
        input.Title = "Annual Report 2024";

        // Process existing PDF
        input.LoadPdf("example.pdf", "password");

        // Generate searchable version
        var result = ocr.Read(input);
        result.SaveAsSearchablePdf("searchable.pdf");
    }

VB.NET:

    Imports IronOcr

    Dim ocr = New IronTesseract()

    Using input = New OcrInput()
        ' Set PDF metadata
        input.Title = "Annual Report 2024"

        ' Process existing PDF
        input.LoadPdf("example.pdf", "password")

        ' Generate searchable version
        Dim result = ocr.Read(input)
        result.SaveAsSearchablePdf("searchable.pdf")
    End Using

Apply the same technique to TIFF conversions:

C#:

    using IronOcr;

    var ocr = new IronTesseract();

    using (var input = new OcrInput())
    {
        // Configure document properties
        input.Title = "Scanned Archive Document";

        // Select pages to process
        var pageIndices = new int[] { 1, 2 };
        input.LoadImageFrames("example.tiff", pageIndices);

        // Create searchable PDF from TIFF
        OcrResult result = ocr.Read(input);
        result.SaveAsSearchablePdf("searchable.pdf");
    }

VB.NET:

    Imports IronOcr

    Dim ocr = New IronTesseract()

    Using input = New OcrInput()
        ' Configure document properties
        input.Title = "Scanned Archive Document"

        ' Select pages to process
        Dim pageIndices = New Integer() { 1, 2 }
        input.LoadImageFrames("example.tiff", pageIndices)

        ' Create searchable PDF from TIFF
        Dim result As OcrResult = ocr.Read(input)
        result.SaveAsSearchablePdf("searchable.pdf")
    End Using

How to Export OCR Results as HOCR HTML?
IronOCR supports HOCR HTML export, enabling structured PDF to HTML and TIFF to HTML conversions while preserving layout information:

C#:

    using IronOcr;

    var ocr = new IronTesseract();

    using (var input = new OcrInput())
    {
        // Set HTML title
        input.Title = "Document Archive";

        // Process multiple document types
        input.AddImage("image2.jpeg");
        input.AddPdf("example.pdf", "password");

        // Add TIFF pages
        var pageIndices = new int[] { 1, 2 };
        input.AddTiff("example.tiff", pageIndices);

        // Export as HOCR with position data
        OcrResult result = ocr.Read(input);
        result.SaveAsHocrFile("hocr.html");
    }

VB.NET:

    Imports IronOcr

    Dim ocr = New IronTesseract()

    Using input = New OcrInput()
        ' Set HTML title
        input.Title = "Document Archive"

        ' Process multiple document types
        input.AddImage("image2.jpeg")
        input.AddPdf("example.pdf", "password")

        ' Add TIFF pages
        Dim pageIndices = New Integer() { 1, 2 }
        input.AddTiff("example.tiff", pageIndices)

        ' Export as HOCR with position data
        Dim result As OcrResult = ocr.Read(input)
        result.SaveAsHocrFile("hocr.html")
    End Using

Can IronOCR Read Barcodes Along with Text?
IronOCR combines text recognition with barcode reading, so a separate barcode library is not needed:

    // Enable combined text and barcode recognition
    using System;
    using IronOcr;

    var ocr = new IronTesseract();

    // Enable barcode detection
    ocr.Configuration.ReadBarCodes = true;

    using (var input = new OcrInput())
    {
        // Load image containing both text and barcodes
        input.AddImage("img/Barcode.png");

        // Process both text and barcodes
        var result = ocr.Read(input);

        // Extract barcode data
        foreach (var barcode in result.Barcodes)
        {
            Console.WriteLine($"Barcode Value: {barcode.Value}");
            Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}");
        }
    }

VB.NET:

    ' Enable combined text and barcode recognition
    Imports System
    Imports IronOcr

    Dim ocr = New IronTesseract()

    ' Enable barcode detection
    ocr.Configuration.ReadBarCodes = True

    Using input = New OcrInput()
        ' Load image containing both text and barcodes
        input.AddImage("img/Barcode.png")

        ' Process both text and barcodes
        Dim result = ocr.Read(input)

        ' Extract barcode data
        For Each barcode In result.Barcodes
            Console.WriteLine($"Barcode Value: {barcode.Value}")
            Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}")
        Next barcode
    End Using

How to Access Detailed OCR Results and Metadata?
The IronOCR results object provides comprehensive data that advanced developers can leverage for sophisticated applications.
Each OcrResult contains hierarchical collections: pages, paragraphs, lines, words, and characters. All elements include detailed metadata like location, font information, and confidence scores. Individual elements (paragraphs, words, barcodes) can be exported as images or bitmaps for further processing:

    using System;
    using IronOcr;
    using IronSoftware.Drawing;

    // Configure with barcode support
    IronTesseract ocr = new IronTesseract
    {
        Configuration = { ReadBarCodes = true }
    };

    using OcrInput input = new OcrInput();

    // Process multi-page document
    int[] pageIndices = { 1, 2 };
    input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

    OcrResult result = ocr.Read(input);

    // Navigate the complete results hierarchy
    foreach (var page in result.Pages)
    {
        // Page-level data
        int pageNumber = page.PageNumber;
        string pageText = page.Text;
        int pageWordCount = page.WordCount;

        // Extract page elements
        OcrResult.Barcode[] barcodes = page.Barcodes;
        AnyBitmap pageImage = page.ToBitmap();
        double pageWidth = page.Width;
        double pageHeight = page.Height;

        foreach (var paragraph in page.Paragraphs)
        {
            // Paragraph properties
            int paragraphNumber = paragraph.ParagraphNumber;
            string paragraphText = paragraph.Text;
            double paragraphConfidence = paragraph.Confidence;
            var textDirection = paragraph.TextDirection;

            foreach (var line in paragraph.Lines)
            {
                // Line details including baseline information
                string lineText = line.Text;
                double lineConfidence = line.Confidence;
                double baselineAngle = line.BaselineAngle;
                double baselineOffset = line.BaselineOffset;

                foreach (var word in line.Words)
                {
                    // Word-level data
                    string wordText = word.Text;
                    double wordConfidence = word.Confidence;

                    // Font information (when available)
                    if (word.Font != null)
                    {
                        string fontName = word.Font.FontName;
                        double fontSize = word.Font.FontSize;
                        bool isBold = word.Font.IsBold;
                        bool isItalic = word.Font.IsItalic;
                    }

                    foreach (var character in word.Characters)
                    {
                        // Character-level analysis
                        string charText = character.Text;
                        double charConfidence = character.Confidence;

                        // Alternative character choices for spell-checking
                        OcrResult.Choice[] alternatives = character.Choices;
                    }
                }
            }
        }
    }

VB.NET:

    Imports System
    Imports IronOcr
    Imports IronSoftware.Drawing

    ' Configure with barcode support
    Dim ocr As New IronTesseract()
    ocr.Configuration.ReadBarCodes = True

    Using input As New OcrInput()
        ' Process multi-page document
        Dim pageIndices() As Integer = { 1, 2 }
        input.LoadImageFrames("img\Potter.tiff", pageIndices)

        Dim result As OcrResult = ocr.Read(input)

        ' Navigate the complete results hierarchy
        For Each page In result.Pages
            ' Page-level data
            Dim pageNumber As Integer = page.PageNumber
            Dim pageText As String = page.Text
            Dim pageWordCount As Integer = page.WordCount

            ' Extract page elements
            Dim barcodes() As OcrResult.Barcode = page.Barcodes
            Dim pageImage As AnyBitmap = page.ToBitmap()
            Dim pageWidth As Double = page.Width
            Dim pageHeight As Double = page.Height

            For Each paragraph In page.Paragraphs
                ' Paragraph properties
                Dim paragraphNumber As Integer = paragraph.ParagraphNumber
                Dim paragraphText As String = paragraph.Text
                Dim paragraphConfidence As Double = paragraph.Confidence
                Dim textDirection = paragraph.TextDirection

                For Each line In paragraph.Lines
                    ' Line details including baseline information
                    Dim lineText As String = line.Text
                    Dim lineConfidence As Double = line.Confidence
                    Dim baselineAngle As Double = line.BaselineAngle
                    Dim baselineOffset As Double = line.BaselineOffset

                    For Each word In line.Words
                        ' Word-level data
                        Dim wordText As String = word.Text
                        Dim wordConfidence As Double = word.Confidence

                        ' Font information (when available)
                        If word.Font IsNot Nothing Then
                            Dim fontName As String = word.Font.FontName
                            Dim fontSize As Double = word.Font.FontSize
                            Dim isBold As Boolean = word.Font.IsBold
                            Dim isItalic As Boolean = word.Font.IsItalic
                        End If

                        For Each character In word.Characters
                            ' Character-level analysis
                            Dim charText As String = character.Text
                            Dim charConfidence As Double = character.Confidence

                            ' Alternative character choices for spell-checking
                            Dim alternatives() As OcrResult.Choice = character.Choices
                        Next character
                    Next word
                Next line
            Next paragraph
        Next page
    End Using

Summary
IronOCR gives C# developers a managed Tesseract API implementation that runs on Windows, Linux, and Mac. Its ability to read text accurately from images, even from imperfect documents, sets it apart from basic OCR solutions. The library also integrates barcode reading, which standard Tesseract does not offer, and exports results as searchable PDFs or HOCR HTML.

Moving Forward
To continue mastering IronOCR:
- Explore our comprehensive getting started guide
- Browse practical C# code examples
- Reference the detailed API documentation

Source Code Download
- GitHub Repository
- Download Complete Source

Ready to implement C# OCR image to text conversion in your applications? Download IronOCR and start your free trial today.

Frequently Asked Questions

How do I fix low OCR accuracy on scanned documents?
Use IronOCR's image filters like Input.Deskew() and Input.DeNoise() to improve accuracy on low-quality scans. These filters correct rotation, remove noise, and enhance contrast, often improving accuracy from 95% to 99.8%.

What image formats are supported for OCR processing?
IronOCR supports all major image formats including PNG, JPEG, TIFF, BMP, and GIF. It also processes PDFs and multi-frame TIFF files, allowing you to mix different formats in a single OCR operation.

How can I speed up OCR processing for large documents?
Configure IronOCR with OcrLanguage.EnglishFast, use BlackListCharacters to exclude unnecessary symbols, and process specific regions using System.Drawing.Rectangle. These optimizations can improve speed by 35-41%.

Can I extract text from specific areas of an image?
Yes, use OcrInput.AddImage() with a System.Drawing.Rectangle parameter to define exact pixel coordinates for OCR processing. This is ideal for forms, invoices, and structured documents.

How do I handle documents with multiple languages?
Set the primary language using ocr.Language and add secondary languages with ocr.AddSecondaryLanguage(). IronOCR supports 125 languages through downloadable NuGet packages.

What's the difference between Read() and ReadAsync() methods?
The Read() method performs synchronous OCR processing, while ReadAsync() enables asynchronous operations for better application responsiveness when processing large documents or multiple files.

How can I export OCR results to different formats?
IronOCR provides multiple export methods: SaveAsSearchablePdf() for searchable PDFs, SaveAsHocrFile() for HOCR HTML with layout data, and SaveAsTextFile() for plain text output.

Why is my OCR failing on PDF files?
Ensure you're using the LoadPdf() method with the correct password if the PDF is protected. For image-based PDFs, IronOCR automatically converts pages to images for processing. Check that the PDF isn't corrupted.

How do I access confidence scores for OCR results?
Each element in the OcrResult hierarchy (pages, paragraphs, lines, words, characters) has a Confidence property ranging from 0 to 1, indicating the recognition certainty.

Can I read both text and barcodes from the same document?
Yes, enable barcode reading with ocr.Configuration.ReadBarCodes = true. The OcrResult will contain both text content and a Barcodes collection with detected barcode values and types.

Jacob Mellor, Chief Technology Officer
Jacob Mellor is Chief Technology Officer at Iron Software and a visionary engineer pioneering C# PDF technology. As the original developer behind Iron Software's core codebase, he has shaped the company's product architecture since its inception, transforming it alongside CEO Cameron Rimington into a 50+ person company serving NASA, Tesla, and global government agencies. Jacob holds a First-Class Honours Bachelor of Engineering (BEng) in Civil Engineering from the University of Manchester (1998–2001).
After opening his first software business in London in 1999 and creating his first .NET components in 2005, he specialized in solving complex problems across the Microsoft ecosystem. His flagship IronPDF & IronSuite .NET libraries have achieved over 30 million NuGet installations globally, with his foundational code continuing to power developer tools used worldwide. With 25 years of commercial experience and 41 years of coding expertise, Jacob remains focused on driving innovation in enterprise-grade C#, Java, and Python PDF technologies while mentoring the next generation of technical leaders.