How to OCR a PDF Tutorial (Free Online Tools)

Updated: June 22, 2025

OCR (Optical Character Recognition) is the process of converting text in scanned pages or images into machine-readable digital text. PDF OCR is a popular application of this technology that can improve business processes. One benefit is accessibility: a scanned PDF is only a picture of a page, so its text cannot be searched, copied, or read aloud by a screen reader. Running OCR produces a copy of the document whose text everyone can use.

Another use of PDF OCR is document tracking. When a document is filed, scanned, or transcribed, it can be difficult to tell which version of the document is associated with which file. Once OCR has made the text searchable, the contents of different versions can be compared, which makes it easier to identify changes and to match each version to its file. This is useful for managing document archives and preventing the loss of important information.

In this article, you'll learn how to run OCR on a PDF file using Adobe Acrobat Pro. The article also introduces IronOCR, a .NET OCR library that is one of the most efficient and feature-rich libraries available. Let's begin with Adobe Acrobat Pro.

OCR a PDF using Adobe Acrobat Pro DC

Adobe Acrobat Pro DC is the Pro version of Adobe Acrobat Reader DC. It is one of the most popular and powerful tools for PDF manipulation. With this software, you can create, edit, sign, and review any PDF document. It also lets you convert PDFs to PowerPoint presentations, Word documents, or Excel files, and edit scanned documents.

The new version of Acrobat DC is also a document scanner that can quickly turn scanned documents into digital files using OCR technology. It features Optical Character Recognition as well as intelligent business card scanning that automatically detects and saves contact information from cards in seconds.

Along with being able to extract text from PDF files, Acrobat Pro DC has many features that make it a valuable tool for PDF transcription.

Let's see how to run OCR on a scanned document using Adobe Acrobat Pro.

Open the desired PDF document, in our example a scanned PDF file, in Adobe Acrobat.
Select "Edit PDF" from the right pane of the document.

This will open the interface of the Adobe Acrobat OCR PDF tool.
Click on the "Edit" button on the top ribbon.
This will convert scanned PDF documents to fully editable PDF documents. You'll be able to edit text and image files on the PDF file itself.

You can also change the text block location, text font, etc.

After making any changes, save the file and you'll see these changes reflected in the document.

IronOCR: A .NET OCR Library

IronOCR is a .NET OCR library that reads text from documents and images and converts it into a machine-readable format.

This Optical Character Recognition library was developed with the following considerations in mind:

The need for a robust and accurate OCR engine that can be used with different languages without needing any external software.
The need for an easy-to-use API that works across different platforms such as Windows, Linux, and macOS.
The need for an OCR engine that can be easily integrated into various .NET applications and supports both WPF and console apps.

IronOCR makes it easier for developers to create software that supports scanning documents, extracting text and metadata, indexing scanned image files, converting images to searchable PDFs, and converting scanned documents into readable text. IronOCR offers a lot of options when it comes to encoding, image format conversion, and text recognition and extraction. IronOCR supports 125 languages.

IronOCR provides an intuitive, robust, and accurate OCR process to recognize text from scanned documents, photographs, and screenshots while reducing time-consuming tasks like page segmentation and layout analysis. The library is developed in C# and its API design is straightforward with good readability.

Let's explore some code examples using IronOCR:

Code Examples

using System;
using IronOcr;

var Ocr = new IronTesseract();

// Initialize OCR input
using (var Input = new OcrInput())
{
    // OCR the entire document (the second argument is the PDF password)
    Input.AddPdf("example.pdf", "password");

    // Alternatively, OCR only selected pages. Use this call instead of AddPdf;
    // calling both adds the selected pages to the input a second time.
    // Input.AddPdfPages("example.pdf", new[] { 1, 2, 3 }, "password");

    // Read the PDF and output the recognized text
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

' VB.NET equivalent of the C# example above
Imports IronOcr

Dim Ocr = New IronTesseract()

' Initialize OCR input
Using Input = New OcrInput()
	' OCR the entire document (the second argument is the PDF password)
	Input.AddPdf("example.pdf", "password")

	' Alternatively, OCR only selected pages. Use this call instead of AddPdf;
	' calling both adds the selected pages to the input a second time.
	' Input.AddPdfPages("example.pdf", { 1, 2, 3 }, "password")

	' Read the PDF and output the recognized text
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using

This example demonstrates how to use IronOCR to process either an entire PDF document or specific pages from the document.

PDF File (input)

Output in the Console
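When a PDF has several pages, it can help to see the recognized text page by page rather than as one block. The following is a minimal sketch, not a definitive implementation. It assumes the result object exposes a Confidence value and a Pages collection whose items have PageNumber and Text properties, which may differ between IronOCR versions. The file name and password are placeholders.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Placeholder file name and password, as in the example above
    input.AddPdf("example.pdf", "password");

    var result = ocr.Read(input);

    // Assumption: Confidence is the engine's overall confidence for this read
    Console.WriteLine($"Confidence: {result.Confidence}");

    // Assumption: Pages holds one entry per recognized page
    foreach (var page in result.Pages)
    {
        Console.WriteLine($"--- Page {page.PageNumber} ---");
        Console.WriteLine(page.Text);
    }
}
```

Printing a separator per page makes it easier to check the console output against the source PDF.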

You can also use IronOCR to convert a scanned PDF into a searchable PDF, in which the text can be selected and searched. The code snippet below shows the conversion:

using IronOcr;

var Ocr = new IronTesseract();

// Initialize OCR input
using (var Input = new OcrInput())
{
    // Add PDF for processing
    Input.AddPdf("scan.pdf", "password");

    // Straighten skewed pages to improve OCR results
    Input.Deskew();

    // Run OCR and save as a searchable PDF
    var Result = Ocr.Read(Input);
    Result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim Ocr = New IronTesseract()

' Initialize OCR input
Using Input = New OcrInput()
	' Add PDF for processing
	Input.AddPdf("scan.pdf", "password")

	' Straighten skewed pages to improve OCR results
	Input.Deskew()

	' Run OCR and save as a searchable PDF
	Dim Result = Ocr.Read(Input)
	Result.SaveAsSearchablePdf("searchable.pdf")
End Using

IronOCR offers many other tools and features. You can explore IronOCR features by visiting the following link.
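One of those features is language selection, which matters for the multi-language support mentioned earlier. The sketch below is illustrative only. It assumes the engine exposes a Language property that accepts an OcrLanguage value, and that the matching language pack is already installed in the project. The file name and password are placeholders.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: the German language pack is installed alongside IronOCR;
// without it, the read is expected to fail or fall back to another language
ocr.Language = OcrLanguage.German;

using (var input = new OcrInput())
{
    // Placeholder file name and password
    input.AddPdf("scan.pdf", "password");

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Setting the language to match the document usually matters more for accuracy than any other single option, so it is worth doing before tuning image filters.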

Conclusion

The IronOCR library has several advantages over other libraries available on the market. You can modify and extend its functionality by adding your own modules with just a few lines of code. IronOCR can currently read text in 125 languages. It has been designed to produce higher-quality, more reliable results while using less time and memory than other libraries.

IronOCR is free for development. IronOCR also offers a free trial for testing in production. For more details about pricing and a free trial of IronOCR, follow the link.

Kannapat Udonpant

Software Engineer

Before becoming a Software Engineer, Kannapat completed an Environmental Resources PhD at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he leveraged his C# skills to join Iron Software's engineering ...