In this hands-on tutorial, you’ll learn how to extract text from PDF files in C# using IronOCR, a powerful .NET OCR library. The walkthrough begins with setting up IronOCR and initializing the OCR engine using your license key. You’ll see how to extract text from an entire PDF document, then refine the process to read only specific pages using indexed page ranges.
For more precision, the tutorial demonstrates region-based text extraction using Rectangle objects, which suits content from forms, tables, or designated areas on each page. IronOCR can parse scanned or image-based PDFs, which makes it useful for automating document processing, data extraction, and PDF analysis in C#. With clear code examples and console output, this video helps developers get started quickly with practical OCR implementations. Try it for yourself by downloading the IronOCR trial and integrating PDF OCR into your own C# applications.
The following C# snippet reads the text of an entire PDF document:
using IronOcr;
using System;

class Program
{
    static void Main()
    {
        // Create the OCR engine (the license key itself is not set in this snippet)
        var ocr = new IronTesseract();

        // Load the PDF file
        using (var input = new OcrInput(@"path\to\sample.pdf"))
        {
            // Perform OCR on the entire PDF
            OcrResult result = ocr.Read(input);

            // Display the extracted text
            Console.WriteLine(result.Text);
        }
    }
}
The same example in VB.NET:

Imports IronOcr
Imports System

Friend Class Program
    Shared Sub Main()
        ' Create the OCR engine (the license key itself is not set in this snippet)
        Dim ocr = New IronTesseract()

        ' Load the PDF file
        Using input = New OcrInput("path\to\sample.pdf")
            ' Perform OCR on the entire PDF
            Dim result As OcrResult = ocr.Read(input)

            ' Display the extracted text
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Class
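The snippets here print the extracted text as-is. As a rough sketch of what processing that text for data extraction could look like, the following pulls "Label: value" pairs out of a plain string. It relies only on the .NET base library; the `FieldExtractor` class, the sample text, and the "Label: value" layout are assumptions made for illustration and are not part of IronOCR.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of IronOCR: parses "Label: value" lines
// out of text such as the string held in result.Text.
static class FieldExtractor
{
    // One field per line: a label, the first colon on the line, then the value.
    private static readonly Regex FieldPattern = new Regex(
        @"^[ \t]*(?<label>[^:\r\n]+?)[ \t]*:[ \t]*(?<value>[^\r\n]+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    public static Dictionary<string, string> Extract(string text)
    {
        // Labels are compared case-insensitively; a repeated label overwrites the earlier value.
        var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match match in FieldPattern.Matches(text))
        {
            fields[match.Groups["label"].Value] = match.Groups["value"].Value;
        }
        return fields;
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for result.Text; real OCR output will vary.
        string text = "Invoice No: 12345\nDate: 2024-01-31\nThank you for your business";

        foreach (var field in FieldExtractor.Extract(text))
        {
            Console.WriteLine($"{field.Key} = {field.Value}");
        }
    }
}
```

Lines without a colon, such as the closing sentence in the sample, are skipped. OCR output often contains misread characters, so a real pipeline would likely need looser matching than this pattern provides.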
To read only specific pages, add them to the OcrInput by page index:

using IronOcr;
using System;

class Program
{
    static void Main(string[] args)
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Add only the pages at indices 1 and 2 to the OcrInput
            input.AddPdfPages(@"path\to\sample.pdf", new int[] { 1, 2 });

            OcrResult result = ocr.Read(input);

            // Output the text of the specified pages
            Console.WriteLine(result.Text);
        }
    }
}
The same example in VB.NET:

Imports IronOcr
Imports System

Friend Class Program
    Shared Sub Main(ByVal args() As String)
        Dim ocr = New IronTesseract()

        Using input = New OcrInput()
            ' Add only the pages at indices 1 and 2 to the OcrInput
            input.AddPdfPages("path\to\sample.pdf", New Integer() { 1, 2 })

            Dim result As OcrResult = ocr.Read(input)

            ' Output the text of the specified pages
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Class
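When the pages to read come from user input rather than a hard-coded array, the index array has to be built first. The sketch below expands a specification such as "1-3,5" into individual values using only the .NET base library. The `PageSelection` class and the specification syntax are hypothetical, and whether the library treats the resulting numbers as zero-based or one-based should be checked against the IronOCR documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of IronOCR: expands a specification such as
// "1-3,5" into the sorted, de-duplicated values { 1, 2, 3, 5 }.
static class PageSelection
{
    public static int[] Parse(string spec)
    {
        var pages = new SortedSet<int>();

        foreach (string part in spec.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Each part is either a single number ("5") or an inclusive range ("1-3").
            string[] bounds = part.Split('-');
            if (bounds.Length > 2)
            {
                throw new FormatException($"Invalid page range: '{part}'");
            }

            int first = int.Parse(bounds[0].Trim());
            int last = bounds.Length == 2 ? int.Parse(bounds[1].Trim()) : first;
            if (first > last)
            {
                throw new FormatException($"Range start exceeds range end: '{part}'");
            }

            pages.UnionWith(Enumerable.Range(first, last - first + 1));
        }

        return pages.ToArray();
    }
}

class Program
{
    static void Main()
    {
        int[] pages = PageSelection.Parse("1-3,5");
        Console.WriteLine(string.Join(", ", pages)); // prints: 1, 2, 3, 5
    }
}
```

The resulting array has the same type as the `new int[] { 1, 2 }` argument passed to the page-selection call in the example, so it could be supplied in its place.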
To read only part of each page, define the region of interest as a Rectangle:

using IronOcr;
using System;
using System.Drawing;

class Program
{
    static void Main(string[] args)
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput(@"path\to\sample.pdf"))
        {
            // Define a region of interest: x = 100, y = 50, width = 200, height = 100
            var region = new Rectangle(100, 50, 200, 100);

            // Restrict OCR to the defined region
            input.SelectRegions(region);

            OcrResult result = ocr.Read(input);

            // Display the text extracted from the region
            Console.WriteLine(result.Text);
        }
    }
}
The same example in VB.NET:

Imports IronOcr
Imports System
Imports System.Drawing

Friend Class Program
    Shared Sub Main(ByVal args() As String)
        Dim ocr = New IronTesseract()

        Using input = New OcrInput("path\to\sample.pdf")
            ' Define a region of interest: x = 100, y = 50, width = 200, height = 100
            Dim region = New Rectangle(100, 50, 200, 100)

            ' Restrict OCR to the defined region
            input.SelectRegions(region)

            Dim result As OcrResult = ocr.Read(input)

            ' Display the text extracted from the region
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Class
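A region that extends past the edge of the page is an easy mistake to make when coordinates are hard-coded. As a minimal sketch, the helper below clips a requested region to the page bounds with `Rectangle.Intersect` before it is used. The `RegionHelper` class and the page size are hypothetical, and the sketch assumes the region and the page size are expressed in the same units, which should be confirmed for your PDF and library version.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of IronOCR: keeps a region inside the page.
static class RegionHelper
{
    // Clips the requested region to the page; returns false if nothing usable remains.
    public static bool TryClip(Rectangle region, Size pageSize, out Rectangle clipped)
    {
        var page = new Rectangle(Point.Empty, pageSize);
        clipped = Rectangle.Intersect(page, region);

        // Intersect can yield a zero-width or zero-height rectangle when the
        // edges only touch, so check the dimensions rather than IsEmpty.
        return clipped.Width > 0 && clipped.Height > 0;
    }
}

class Program
{
    static void Main()
    {
        // Assumed page size for illustration only.
        var pageSize = new Size(250, 400);
        var requested = new Rectangle(100, 50, 200, 100);

        if (RegionHelper.TryClip(requested, pageSize, out Rectangle region))
        {
            Console.WriteLine($"Using region at ({region.X}, {region.Y}), size {region.Width} x {region.Height}");
        }
        else
        {
            Console.WriteLine("Requested region lies outside the page.");
        }
    }
}
```

Clipping silently changes what is read, so logging the adjusted rectangle, as the usage above does, makes it easier to spot a misplaced region.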