A Comparison between IronOCR and PDFTRON OCR

OCR stands for "Optical Character Recognition". This is the process of converting paper documents or images into readable text. Various methods exist to do this such as scanning or manual input through a keyboard. This is done to convert any scanned files, PDFs, or handwritten documents to their original text format. This process has proven invaluable in criminal cases where documents are too damaged for manual transcription but can be scanned and interpreted by OCR software.

With the advancement of technology and the ubiquitous adoption of digital tools, OCR has also been implemented in other fields such as document conversion on apps like Google Docs, as well as in academia and the world of business. There are two main types of OCR, "static" and "dynamic". The most common type is static OCR in which the entire document is scanned at once. Dynamic OCR, on the other hand, scans one line at a time and can process more sophisticated layouts such as tabular data.

This article will discuss the comparison between two of the most prevalent applications and document libraries for OCR and PDF. These are:

  • PDFTron OCR
  • IronOCR

1.0 Introduction

1.1 PDFTron OCR Introduction and Features

To use OCR with PDFTron SDK, we have to install a separate OCR module add-on utility. This helps the SDK to detect text from documents. It can make text selectable and searchable. The PDFTron SDK supports up to 100 languages. The PDFTron OCR engine is supported by an open-source LSTM neural network from Tesseract. PDFTron OCR supports multiple image types of formats for text detection. PDF files with only raster images are also accepted for OCR with the output quality dependent upon the input image. The best images for OCR are grayscale images with 300 DPI resolution.

The Features of PDFTron OCR

  • Making images searchable in printed documents.
  • Converting simple PDF to a searchable PDF.
  • Detecting important information from business documents.
  • Making book scanning easier.
  • Detection of vehicle numbers from images of vehicles.
  • Helping visually impaired users.
  • It becomes easy to perform data entry in multiple documents from files using OCR.
  • Transfer information from business cards to contact data lists easily.

1.2 IronOCR — Introduction and Features

Iron Software provides software engineers with IronOCR for .NET to read text content from photos and PDFs in .NET apps and websites. The software helps to scan photos for text and barcodes, supports numerous worldwide languages, and outputs as plain text or structured data. Iron Softwares OCR Library can be utilized in Web, console, MVC, and numerous .NET desktop applications. In the case of commercial deployment, direct assistance from the development team is provided alongside purchased licenses.

  • IronOCR uses the latest Tesseract 5 engine which reads text, QR codes, and barcodes, from any PDF format or. Adding OCR to desktop, web applications, and console with this library ensures quick integration.
  • IronOCR supports 127 international languages. It also supports custom languages and word lists.
  • IronOCR can read more than 20 barcode and QR code formats.
  • Supports multi-page GIF and TIFF image formats.
  • Corrects low-quality scans and images.
  • Supports multithreading. It can execute one or more processes at a time.
  • IronOCR can provide structured data output to pages, paragraphs, lines, words, characters, etc.
  • IronOCR supports operating systems such as Windows, Linux, macOS, etc.

2. Creating a New Project in Visual Studio

Open Visual Studio and locate to the File menu. Select new project, then select Console Application. To generate PDF documents we will be using the Console Application in this article.

Enter the project name and select the file path in the appropriate text box. Then, click the Create button. Also, select the required .NET Framework.

The Visual Studio project will now generate the structure for the selected application.

The structure of the project will now be generated by Visual Studio. The program.cs file will open if you have selected Windows, console, and web application so you can input the code and build/run your application.

For the next step, we need to add the library to test the code.

3.0 Install

3.1 Install PDFtron OCR

The PDFTron OCR installation is completed manually and can be downloaded as a zip from the given link. Unzip and configure it with the project. The guide will aid in helping you execute PDFTron samples using the free integrated trial of the PDFTron SDK into a .NET Framework application using Windows. Support from solution engineers and unlimited trial usage are included in the free trial.

Prerequisites

Visual Studio: Make sure that the .NET Desktop Development and .NET Framework 4.5.1+ development tools workload is part of your installation. This guide will use Visual Studio 2017 and PDFTron's C# .NET PDF Library for Windows. Download the library using this link, .NET PDF SDK Download.

Initial setup

Extract the folder from the .zip file. PDFNET_BASE is used in this guide to select the path into the folder that you extracted.

PDFNET_BASE = path/to/extraction/folder/PDFNetDotNet4/
PDFNET_BASE = path/to/extraction/folder/PDFNetDotNet4/
'INSTANT VB TODO TASK: The following line uses invalid syntax:
'PDFNET_BASE = path/@to/extraction/folder/PDFNetDotNet4/
VB   C#

Run the samples

Navigate to the location of extracted contents. Find and enter the Samples folder (PDFNET_BASE/Samples). This folder holds numerous sample codes for features supported by the PDFTron SDK.

  1. Open Samples_20XX.sln in Visual Studio. Choose an appropriate version for your Visual Studio installation.
  2. Select the sample code and set it as the Startup Project for the solution.
  3. Run the project.

Integrate into your application

This is called the "PDFTron Hello World" application. It is easy to integrate the rest of the PDFTron SDK if you can open, save and close a PDF document.

  1. In Visual Studio, create a .NET Framework console application project in your preferred language. You can find them under the Visual C# or Visual Basic category.
  2. Navigate into your project's folder. By default, the path should be similar to: C:/Users/User_Name/source/repos/myApp
  3. From PDFNET_BASE to your project folder (this folder will have your .csproj or .vbproj file), copy the Lib folder.
  4. Find the Solution Explorer on the right-hand side. Select the Add Reference option by Right-clicking on References. This opens a Reference Manager dialog.
  5. Located at the bottom of the dialog, select Browse. Locate into the copied Lib folder to add PDFNetLoader.dll to the references
  6. Additionally, from the x86 folder add the suitable version of PDFNet.dll as another reference (path/to/your/project/folder/Lib/PDFNet/x86/PDFNet.dll). This will ensure that the application will run on both 32-bit and 64-bit OS.
  7. Click on PDFNet.dll. Make sure to set its Copy Local property to False.

3.2 Install IronOCR

The IronOCR Library can be installed in four ways.

These are:

  • Using Visual Studio.
  • Using the Visual Studio Command-Line.
  • Direct download from the NuGet website.
  • Direct download from the IronPDF website.

3.2.1 Using Visual Studio

The Visual Studio software provides the NuGet Package manager option to install the package directly to the solution. The screenshot demonstrates how to open the NuGet Package Manager.

This will provide a search box to display the list of the packages from the NuGet website. In the package manager, we need to search for the keyword "IronOCR", as in the below screenshot:

From the above image, we can see the list of the related search results. To install the package into the solution, we must select the required option.

3.2.2 Using the Visual Studio Command-Line

  • In Visual Studio: Go to Tools-> NuGet Package manager -> Package manager console
  • Input the subsequent code snippet into the package manager console tab.
Install-Package IronOcr

The package will now install directly into the current project which will then be ready to use.

3.2.3 Direct Download from the NuGet Website

For the third method, we can download the NuGet package straight from the website

  • Navigate to this Link.
  • From the right-hand side menu, ensure that you select the download package option.
  • Double-click the downloaded package. It will be installed automatically.
  • Next, reload the solution and begin using it in the project.

3.2.4 Direct Download from the IronOCR Website

You can directly download the latest package from the website by clicking this link. Follow the provided instructions to add the package to the project once the latest package is downloaded.

  • Right-click the project from the solution window.
  • Select Option Reference to browse the location of the downloaded reference.
  • Then, click OK to add the reference.

4.0 OCR Image

IronOCR and PDFtron OCR both have OCR technology that will convert images into text searching.

4.1 Using PDFTron

Convert PDF to DOCX, DOC, HTML, SVG, TIFF, PNG, JPEG, XPS, EPUB, TXT, and many other formats.

PDFDoc doc = new PDFDoc(filename);

// Convert PDF document to SVG
Convert.ToSvg(doc, output_filename + ".svg");

// Convert PDF document to XPS
Convert.ToXps(filename, output_filename + ".xps");

// Convert PDF document to multipage TIFF
Convert.TiffOutputOptions tiff_options = new Convert.TiffOutputOptions();
tiff_options.SetDPI(200);
tiff_options.SetDither(true);
tiff_options.SetMono(true);
Convert.ToTiff(filename, output_filename + ".tiff", tiff_options);

// Convert PDF to XOD
Convert.ToXod(filename, output_filename + ".xod");

// Convert PDF to HTML
Convert.ToHtml(filename, output_filename + ".html");
PDFDoc doc = new PDFDoc(filename);

// Convert PDF document to SVG
Convert.ToSvg(doc, output_filename + ".svg");

// Convert PDF document to XPS
Convert.ToXps(filename, output_filename + ".xps");

// Convert PDF document to multipage TIFF
Convert.TiffOutputOptions tiff_options = new Convert.TiffOutputOptions();
tiff_options.SetDPI(200);
tiff_options.SetDither(true);
tiff_options.SetMono(true);
Convert.ToTiff(filename, output_filename + ".tiff", tiff_options);

// Convert PDF to XOD
Convert.ToXod(filename, output_filename + ".xod");

// Convert PDF to HTML
Convert.ToHtml(filename, output_filename + ".html");
Dim doc As New PDFDoc(filename)

' Convert PDF document to SVG
Convert.ToSvg(doc, output_filename & ".svg")

' Convert PDF document to XPS
Convert.ToXps(filename, output_filename & ".xps")

' Convert PDF document to multipage TIFF
Dim tiff_options As New Convert.TiffOutputOptions()
tiff_options.SetDPI(200)
tiff_options.SetDither(True)
tiff_options.SetMono(True)
Convert.ToTiff(filename, output_filename & ".tiff", tiff_options)

' Convert PDF to XOD
Convert.ToXod(filename, output_filename & ".xod")

' Convert PDF to HTML
Convert.ToHtml(filename, output_filename & ".html")
VB   C#

4.2 Using IronOCR

var Ocr = new IronTesseract(); // nothing to configure
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage(@"3.png");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Console.ReadKey();
}
var Ocr = new IronTesseract(); // nothing to configure
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage(@"3.png");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Console.ReadKey();
}
Dim Ocr = New IronTesseract() ' nothing to configure
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
	Input.AddImage("3.png")
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
	Console.ReadKey()
End Using
VB   C#

Demonstrated above is the process of converting image files into text with the Tesseract 5 API. The above line of code was utilized to create an object for Iron Tesseract. Additionally, to ensure we can add one or more picture files, we produced an OcrInput object which will require the available picture path. In the Iron Tesseract object, the function "Read" can be used to obtain the images by parsing the picture file and extracting the produced outcome into the OCR result. It is able to extract text from photos and convert it to a string.

Tesseract can be utilized to add multi-frame images using the "AddMultiFrameTiff" method for this process. Each frame in the image is read and treated as a distinct page by the Tesseract Library. Each frame of the image is read before proceeding onto the subsequent frame until each frame has been successfully scanned. The TIFF image format is the only supported format in this method.

The successful conversion of the data into editable text is displayed in the above image, a result of the IronOCR accuracy.

5.0 OCR PDF File

IronOCR and PDFTron OCR convert PDF files into editable text. PDFTron OCR provides a list of options to the user such as saving the page, editing the image, recognizing the page, etc. Additionally it can provide save options such as document, text, HTML format, etc. IronOCR also allows us to save a converted OCR file into HTML, text, PDF, etc.

5.1 Using PDFTron OCR

The full sample code shows how to use PDFTron OCR for direct, high-quality conversion between PDF, XPS, EMF, SVG, TIFF, PNG, JPEG, and other image formats.

// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using pdftron;
using pdftron.Common;
using pdftron.Filters;
using pdftron.SDF;
using pdftron.PDF;

// The following code showcases the conversion of documents to formats // such as SVG, PDF, EMF, or XPS.

namespace ConvertTestCS
{
    class Testfile
    {
        public string inputFile, outputFile;
        public bool requiresWindowsPlatform;
        public Testfile(string inFile, string outFile, bool requiresWindowsPlatform_)
        {
            inputFile = inFile;
            outputFile = outFile;
            requiresWindowsPlatform = requiresWindowsPlatform_;
        }
    };

    class Class1
    {
        private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();
        static Class1() {}

        // Relative path to the folder containing test files.
        const string inputPath = "../../../../TestFiles/";
        const string outputPath = "../../../../TestFiles/Output/";

        static bool ConvertSpecificFormats()
        {
            bool err = false;
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    Console.WriteLine("Converting from XPS");
                    pdftron.PDF.Convert.FromXps(pdfdoc, inputPath + "simple-xps.xps");
                    pdfdoc.Save(outputPath + "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved xps2pdf v2.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////
    if (Environment.OSVersion.Platform == PlatformID.Win32NT) {
                try
                {
                    using (PDFDoc pdfdoc = new PDFDoc())
                    {
                        Console.WriteLine("Converting from EMF");
                        pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath + "simple-emf.emf");
                        pdfdoc.Save(outputPath + "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                        Console.WriteLine("Saved emf2pdf v2.pdf");
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

        ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    // Add a dictionary
                    ObjSet set = new ObjSet();
                    Obj options = set.CreateDict();

                    // Put options
                    options.PutNumber("FontSize", 15);
                    options.PutBool("UseSourceCodeFormatting", true);
                    options.PutNumber("PageWidth", 12);
                    options.PutNumber("PageHeight", 6);

                    // Convert from .txt file
                    Console.WriteLine("Converting from txt");
                    pdftron.PDF.Convert.FromText(pdfdoc, inputPath + "simple-text.txt", options);
                    pdfdoc.Save(outputPath + "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved simple-text.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc(inputPath + "newsletter.pdf"))
                {
                    // Convert PDF document to SVG
                    Console.WriteLine("Converting pdfdoc to SVG");
                    pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath + "pdf2svg v2.svg");
                    Console.WriteLine("Saved pdf2svg v2.svg");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////////
            try
            {
                // Convert PNG image to XPS
                Console.WriteLine("Converting PNG to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "butterfly.png", outputPath + "butterfly.xps");
                Console.WriteLine("Saved butterfly.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }
    ///////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    // Convert MSWord document to XPS
                    Console.WriteLine("Converting DOCX to XPS");
                    pdftron.PDF.Convert.ToXps(inputPath + "simple-word_2007.docx", outputPath + "simple-word_2007.xps");
                    Console.WriteLine("Saved simple-word_2007.xps");
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            ////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to XPS
                Console.WriteLine("Converting PDF to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "newsletter.pdf", outputPath + "newsletter.xps");
                Console.WriteLine("Saved newsletter.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to HTML
                Console.WriteLine("Converting PDF to HTML");
                pdftron.PDF.Convert.ToHtml(inputPath + "newsletter.pdf", outputPath + "newsletter");
                Console.WriteLine("Saved newsletter as HTML");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to EPUB
                Console.WriteLine("Converting PDF to EPUB");
                pdftron.PDF.Convert.ToEpub(inputPath + "newsletter.pdf", outputPath + "newsletter.epub");
                Console.WriteLine("Saved newsletter.epub");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to multipage TIFF
                Console.WriteLine("Converting PDF to multipage TIFF");
                pdftron.PDF.Convert.TiffOutputOptions tiff_options = new pdftron.PDF.Convert.TiffOutputOptions();
                tiff_options.SetDPI(200);
                tiff_options.SetDither(true);
                tiff_options.SetMono(true);
                pdftron.PDF.Convert.ToTiff(inputPath + "newsletter.pdf", outputPath + "newsletter.tiff", tiff_options);
                Console.WriteLine("Saved newsletter.tiff");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            return err;
        }

        static Boolean ConvertToPdfFromFile()
        {
            System.Collections.ArrayList testfiles = new System.Collections.ArrayList();
            testfiles.Add(new ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", false));
            // testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", true));

            bool err = false;
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    if (pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet"))
                    {
                        pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet");
                    }
                    else if (!pdftron.PDF.Convert.Printer.IsInstalled())
                    {
                        try
                        {
                            Console.WriteLine("Installing printer (requires Windows platform and administrator)");
                            pdftron.PDF.Convert.Printer.Install();
                            Console.WriteLine("Installed printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                            // the function ConvertToXpsFromFile may require the printer so leave it installed
                            // uninstallPrinterWhenDone = true;
                        }
                        catch (PDFNetException e)
                        {
                            Console.WriteLine("ERROR: Unable to install printer.");
                            Console.WriteLine(e.Message);
                            err = true;
                        }
                        catch
                        {
                            Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.");
                        }
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: Unable to install printer.");
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            foreach (Testfile file in testfiles)
            {
                if ( Environment.OSVersion.Platform != PlatformID.Win32NT)
                {
                    if (file.requiresWindowsPlatform)
                    {
                        continue;
                    }
                }
                try
                {
                    using (pdftron.PDF.PDFDoc pdfdoc = new PDFDoc())
                    {

                        if (pdftron.PDF.Convert.RequiresPrinter(inputPath + file.inputFile))
                        {
                            Console.WriteLine("Using PDFNet printer to convert file " + file.inputFile);
                        }
                        pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath + file.inputFile);
                        pdfdoc.Save(outputPath + file.outputFile, SDFDoc.SaveOptions.e_linearized);
                        Console.WriteLine("Converted file: " + file.inputFile);
                        Console.WriteLine("to: " + file.outputFile);
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: on input file " + file.inputFile);
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

            return err;
        }
        [STAThread]
        static void Main(string[] args)
        {
            PDFNet.Initialize(PDFTronLicense.Key);
            bool err = false;

            err = ConvertToPdfFromFile();
            if (err)
            {
                Console.WriteLine("ConvertFile failed");
            }
            else
            {
                Console.WriteLine("ConvertFile succeeded");
            }

            err = ConvertSpecificFormats();
            if (err)
            {
                Console.WriteLine("ConvertSpecificFormats failed");
            }
            else
            {
                Console.WriteLine("ConvertSpecificFormats succeeded");
            }

            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                if (pdftron.PDF.Convert.Printer.IsInstalled())
                {
                    try
                    {
                        Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)");
                        pdftron.PDF.Convert.Printer.Uninstall();
                        Console.WriteLine("Uninstalled Printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                    }
                    catch
                    {
                        Console.WriteLine("Unable to uninstall printer");
                    }
                }
            }
            PDFNet.Terminate();
            Console.WriteLine("Done.");
        }
    }
}
// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using pdftron;
using pdftron.Common;
using pdftron.Filters;
using pdftron.SDF;
using pdftron.PDF;

// The following code showcases the conversion of documents to formats // such as SVG, PDF, EMF, or XPS.

namespace ConvertTestCS
{
    class Testfile
    {
        public string inputFile, outputFile;
        public bool requiresWindowsPlatform;
        public Testfile(string inFile, string outFile, bool requiresWindowsPlatform_)
        {
            inputFile = inFile;
            outputFile = outFile;
            requiresWindowsPlatform = requiresWindowsPlatform_;
        }
    };

    class Class1
    {
        private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();
        static Class1() {}

        // Relative path to the folder containing test files.
        const string inputPath = "../../../../TestFiles/";
        const string outputPath = "../../../../TestFiles/Output/";

        static bool ConvertSpecificFormats()
        {
            bool err = false;
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    Console.WriteLine("Converting from XPS");
                    pdftron.PDF.Convert.FromXps(pdfdoc, inputPath + "simple-xps.xps");
                    pdfdoc.Save(outputPath + "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved xps2pdf v2.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////
    if (Environment.OSVersion.Platform == PlatformID.Win32NT) {
                try
                {
                    using (PDFDoc pdfdoc = new PDFDoc())
                    {
                        Console.WriteLine("Converting from EMF");
                        pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath + "simple-emf.emf");
                        pdfdoc.Save(outputPath + "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                        Console.WriteLine("Saved emf2pdf v2.pdf");
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

        ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    // Add a dictionary
                    ObjSet set = new ObjSet();
                    Obj options = set.CreateDict();

                    // Put options
                    options.PutNumber("FontSize", 15);
                    options.PutBool("UseSourceCodeFormatting", true);
                    options.PutNumber("PageWidth", 12);
                    options.PutNumber("PageHeight", 6);

                    // Convert from .txt file
                    Console.WriteLine("Converting from txt");
                    pdftron.PDF.Convert.FromText(pdfdoc, inputPath + "simple-text.txt", options);
                    pdfdoc.Save(outputPath + "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved simple-text.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc(inputPath + "newsletter.pdf"))
                {
                    // Convert PDF document to SVG
                    Console.WriteLine("Converting pdfdoc to SVG");
                    pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath + "pdf2svg v2.svg");
                    Console.WriteLine("Saved pdf2svg v2.svg");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////////
            try
            {
                // Convert PNG image to XPS
                Console.WriteLine("Converting PNG to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "butterfly.png", outputPath + "butterfly.xps");
                Console.WriteLine("Saved butterfly.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }
    ///////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    // Convert MSWord document to XPS
                    Console.WriteLine("Converting DOCX to XPS");
                    pdftron.PDF.Convert.ToXps(inputPath + "simple-word_2007.docx", outputPath + "simple-word_2007.xps");
                    Console.WriteLine("Saved simple-word_2007.xps");
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            ////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to XPS
                Console.WriteLine("Converting PDF to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "newsletter.pdf", outputPath + "newsletter.xps");
                Console.WriteLine("Saved newsletter.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to HTML
                Console.WriteLine("Converting PDF to HTML");
                pdftron.PDF.Convert.ToHtml(inputPath + "newsletter.pdf", outputPath + "newsletter");
                Console.WriteLine("Saved newsletter as HTML");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to EPUB
                Console.WriteLine("Converting PDF to EPUB");
                pdftron.PDF.Convert.ToEpub(inputPath + "newsletter.pdf", outputPath + "newsletter.epub");
                Console.WriteLine("Saved newsletter.epub");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to multipage TIFF
                Console.WriteLine("Converting PDF to multipage TIFF");
                pdftron.PDF.Convert.TiffOutputOptions tiff_options = new pdftron.PDF.Convert.TiffOutputOptions();
                tiff_options.SetDPI(200);
                tiff_options.SetDither(true);
                tiff_options.SetMono(true);
                pdftron.PDF.Convert.ToTiff(inputPath + "newsletter.pdf", outputPath + "newsletter.tiff", tiff_options);
                Console.WriteLine("Saved newsletter.tiff");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            return err;
        }

        static Boolean ConvertToPdfFromFile()
        {
            System.Collections.ArrayList testfiles = new System.Collections.ArrayList();
            testfiles.Add(new ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", false));
            // testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", true));

            bool err = false;
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    if (pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet"))
                    {
                        pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet");
                    }
                    else if (!pdftron.PDF.Convert.Printer.IsInstalled())
                    {
                        try
                        {
                            Console.WriteLine("Installing printer (requires Windows platform and administrator)");
                            pdftron.PDF.Convert.Printer.Install();
                            Console.WriteLine("Installed printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                            // the function ConvertToXpsFromFile may require the printer so leave it installed
                            // uninstallPrinterWhenDone = true;
                        }
                        catch (PDFNetException e)
                        {
                            Console.WriteLine("ERROR: Unable to install printer.");
                            Console.WriteLine(e.Message);
                            err = true;
                        }
                        catch
                        {
                            Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.");
                        }
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: Unable to install printer.");
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            foreach (Testfile file in testfiles)
            {
                if ( Environment.OSVersion.Platform != PlatformID.Win32NT)
                {
                    if (file.requiresWindowsPlatform)
                    {
                        continue;
                    }
                }
                try
                {
                    using (pdftron.PDF.PDFDoc pdfdoc = new PDFDoc())
                    {

                        if (pdftron.PDF.Convert.RequiresPrinter(inputPath + file.inputFile))
                        {
                            Console.WriteLine("Using PDFNet printer to convert file " + file.inputFile);
                        }
                        pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath + file.inputFile);
                        pdfdoc.Save(outputPath + file.outputFile, SDFDoc.SaveOptions.e_linearized);
                        Console.WriteLine("Converted file: " + file.inputFile);
                        Console.WriteLine("to: " + file.outputFile);
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: on input file " + file.inputFile);
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

            return err;
        }
        [STAThread]
        static void Main(string[] args)
        {
            PDFNet.Initialize(PDFTronLicense.Key);
            bool err = false;

            err = ConvertToPdfFromFile();
            if (err)
            {
                Console.WriteLine("ConvertFile failed");
            }
            else
            {
                Console.WriteLine("ConvertFile succeeded");
            }

            err = ConvertSpecificFormats();
            if (err)
            {
                Console.WriteLine("ConvertSpecificFormats failed");
            }
            else
            {
                Console.WriteLine("ConvertSpecificFormats succeeded");
            }

            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                if (pdftron.PDF.Convert.Printer.IsInstalled())
                {
                    try
                    {
                        Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)");
                        pdftron.PDF.Convert.Printer.Uninstall();
                        Console.WriteLine("Uninstalled Printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                    }
                    catch
                    {
                        Console.WriteLine("Unable to uninstall printer");
                    }
                }
            }
            PDFNet.Terminate();
            Console.WriteLine("Done.");
        }
    }
}
' Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

Imports System
Imports System.Drawing
Imports System.Drawing.Drawing2D
Imports pdftron
Imports pdftron.Common
Imports pdftron.Filters
Imports pdftron.SDF
Imports pdftron.PDF

' The following code showcases the conversion of documents to formats // such as SVG, PDF, EMF, or XPS.

Namespace ConvertTestCS
	Friend Class Testfile
		Public inputFile, outputFile As String
		Public requiresWindowsPlatform As Boolean
		Public Sub New(ByVal inFile As String, ByVal outFile As String, ByVal requiresWindowsPlatform_ As Boolean)
			inputFile = inFile
			outputFile = outFile
			requiresWindowsPlatform = requiresWindowsPlatform_
		End Sub
	End Class

	Friend Class Class1
		Private Shared pdfNetLoader As pdftron.PDFNetLoader = pdftron.PDFNetLoader.Instance()
		Shared Sub New()
		End Sub

		' Relative path to the folder containing test files.
		Private Const inputPath As String = "../../../../TestFiles/"
		Private Const outputPath As String = "../../../../TestFiles/Output/"

		Private Shared Function ConvertSpecificFormats() As Boolean
			Dim err As Boolean = False
			Try
				Using pdfdoc As New PDFDoc()
					Console.WriteLine("Converting from XPS")
					pdftron.PDF.Convert.FromXps(pdfdoc, inputPath & "simple-xps.xps")
					pdfdoc.Save(outputPath & "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused)
					Console.WriteLine("Saved xps2pdf v2.pdf")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'///////////////////////////////////////////
	If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					Using pdfdoc As New PDFDoc()
						Console.WriteLine("Converting from EMF")
						pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath & "simple-emf.emf")
						pdfdoc.Save(outputPath & "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused)
						Console.WriteLine("Saved emf2pdf v2.pdf")
					End Using
				Catch e As PDFNetException
					Console.WriteLine(e.Message)
					err = True
				End Try
	End If

		'/////////////////////////////////////////
			Try
				Using pdfdoc As New PDFDoc()
					' Add a dictionary
					Dim [set] As New ObjSet()
					Dim options As Obj = [set].CreateDict()

					' Put options
					options.PutNumber("FontSize", 15)
					options.PutBool("UseSourceCodeFormatting", True)
					options.PutNumber("PageWidth", 12)
					options.PutNumber("PageHeight", 6)

					' Convert from .txt file
					Console.WriteLine("Converting from txt")
					pdftron.PDF.Convert.FromText(pdfdoc, inputPath & "simple-text.txt", options)
					pdfdoc.Save(outputPath & "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused)
					Console.WriteLine("Saved simple-text.pdf")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'/////////////////////////////////////////
			Try
				Using pdfdoc As New PDFDoc(inputPath & "newsletter.pdf")
					' Convert PDF document to SVG
					Console.WriteLine("Converting pdfdoc to SVG")
					pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath & "pdf2svg v2.svg")
					Console.WriteLine("Saved pdf2svg v2.svg")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'///////////////////////////////////////////////
			Try
				' Convert PNG image to XPS
				Console.WriteLine("Converting PNG to XPS")
				pdftron.PDF.Convert.ToXps(inputPath & "butterfly.png", outputPath & "butterfly.xps")
				Console.WriteLine("Saved butterfly.xps")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try
	'/////////////////////////////////////////
			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					' Convert MSWord document to XPS
					Console.WriteLine("Converting DOCX to XPS")
					pdftron.PDF.Convert.ToXps(inputPath & "simple-word_2007.docx", outputPath & "simple-word_2007.xps")
					Console.WriteLine("Saved simple-word_2007.xps")
				Catch e As PDFNetException
					Console.WriteLine(e.Message)
					err = True
				End Try
			End If
			'//////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to XPS
				Console.WriteLine("Converting PDF to XPS")
				pdftron.PDF.Convert.ToXps(inputPath & "newsletter.pdf", outputPath & "newsletter.xps")
				Console.WriteLine("Saved newsletter.xps")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'////////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to HTML
				Console.WriteLine("Converting PDF to HTML")
				pdftron.PDF.Convert.ToHtml(inputPath & "newsletter.pdf", outputPath & "newsletter")
				Console.WriteLine("Saved newsletter as HTML")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'////////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to EPUB
				Console.WriteLine("Converting PDF to EPUB")
				pdftron.PDF.Convert.ToEpub(inputPath & "newsletter.pdf", outputPath & "newsletter.epub")
				Console.WriteLine("Saved newsletter.epub")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'///////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to multipage TIFF
				Console.WriteLine("Converting PDF to multipage TIFF")
				Dim tiff_options As New pdftron.PDF.Convert.TiffOutputOptions()
				tiff_options.SetDPI(200)
				tiff_options.SetDither(True)
				tiff_options.SetMono(True)
				pdftron.PDF.Convert.ToTiff(inputPath & "newsletter.pdf", outputPath & "newsletter.tiff", tiff_options)
				Console.WriteLine("Saved newsletter.tiff")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			Return err
		End Function

		Private Shared Function ConvertToPdfFromFile() As Boolean
			Dim testfiles As New System.Collections.ArrayList()
			testfiles.Add(New ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", False))
			' testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
			testfiles.Add(New ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", True))

			Dim err As Boolean = False
			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					If pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet") Then
						pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet")
					ElseIf Not pdftron.PDF.Convert.Printer.IsInstalled() Then
						Try
							Console.WriteLine("Installing printer (requires Windows platform and administrator)")
							pdftron.PDF.Convert.Printer.Install()
							Console.WriteLine("Installed printer " & pdftron.PDF.Convert.Printer.GetPrinterName())
							' the function ConvertToXpsFromFile may require the printer so leave it installed
							' uninstallPrinterWhenDone = true;
						Catch e As PDFNetException
							Console.WriteLine("ERROR: Unable to install printer.")
							Console.WriteLine(e.Message)
							err = True
						Catch
							Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.")
						End Try
					End If
				Catch e As PDFNetException
					Console.WriteLine("ERROR: Unable to install printer.")
					Console.WriteLine(e.Message)
					err = True
				End Try
			End If
			For Each file As Testfile In testfiles
				If Environment.OSVersion.Platform <> PlatformID.Win32NT Then
					If file.requiresWindowsPlatform Then
						Continue For
					End If
				End If
				Try
					Using pdfdoc As pdftron.PDF.PDFDoc = New PDFDoc()

						If pdftron.PDF.Convert.RequiresPrinter(inputPath & file.inputFile) Then
							Console.WriteLine("Using PDFNet printer to convert file " & file.inputFile)
						End If
						pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath & file.inputFile)
						pdfdoc.Save(outputPath & file.outputFile, SDFDoc.SaveOptions.e_linearized)
						Console.WriteLine("Converted file: " & file.inputFile)
						Console.WriteLine("to: " & file.outputFile)
					End Using
				Catch e As PDFNetException
					Console.WriteLine("ERROR: on input file " & file.inputFile)
					Console.WriteLine(e.Message)
					err = True
				End Try
			Next file

			Return err
		End Function
		<STAThread>
		Shared Sub Main(ByVal args() As String)
			PDFNet.Initialize(PDFTronLicense.Key)
			Dim err As Boolean = False

			err = ConvertToPdfFromFile()
			If err Then
				Console.WriteLine("ConvertFile failed")
			Else
				Console.WriteLine("ConvertFile succeeded")
			End If

			err = ConvertSpecificFormats()
			If err Then
				Console.WriteLine("ConvertSpecificFormats failed")
			Else
				Console.WriteLine("ConvertSpecificFormats succeeded")
			End If

			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				If pdftron.PDF.Convert.Printer.IsInstalled() Then
					Try
						Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)")
						pdftron.PDF.Convert.Printer.Uninstall()
						Console.WriteLine("Uninstalled Printer " & pdftron.PDF.Convert.Printer.GetPrinterName())
					Catch
						Console.WriteLine("Unable to uninstall printer")
					End Try
				End If
			End If
			PDFNet.Terminate()
			Console.WriteLine("Done.")
		End Sub
	End Class
End Namespace
VB   C#

5.2 Using IronOCR

Management of PDF files can be completed using OCRInput function. Every page in a document will be read by the Iron Tesseract class. The text will then be extracted from the pages. A second function called "AddPDF" will allow us to open protected documents and ensures that we can add PDFs to our list of documents (password if it is protected). To open a password-protected PDF document, utilize the below code snippet:

var Ocr = new IronTesseract(); // nothing to configure
using (var Input = new OcrInput())
{
    Input.AddPdf("example.pdf", "password");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
var Ocr = new IronTesseract(); // nothing to configure
using (var Input = new OcrInput())
{
    Input.AddPdf("example.pdf", "password");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
Dim Ocr = New IronTesseract() ' nothing to configure
Using Input = New OcrInput()
	Input.AddPdf("example.pdf", "password")
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using
VB   C#

Reading and extracting contents from one page in a PDF file can be achieved by utilizing the "Addpdfpage" function. Only specify the exact page number from which we want to extract text. "AddPdfPage" will allow you to extract text from multiple pages that you specify. IEnumerablewill allow you to efficiently specify numerous pages. You must also include the location and the extension of the file. The code snippet below demonstrates this:

IEnumerable<int> numbers = new List<int> {2,8,10 };
 var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    //single page
    Input.AddPdfPage("example.pdf",10);
    //Multiple page
    Input.AddPdfPages("example.pdf", numbers);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Result.SaveAsTextFile("ocrtext.txt");
}
IEnumerable<int> numbers = new List<int> {2,8,10 };
 var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    //single page
    Input.AddPdfPage("example.pdf",10);
    //Multiple page
    Input.AddPdfPages("example.pdf", numbers);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Result.SaveAsTextFile("ocrtext.txt");
}
Dim numbers As IEnumerable(Of Integer) = New List(Of Integer) From {2, 8, 10}
 Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
	'single page
	Input.AddPdfPage("example.pdf",10)
	'Multiple page
	Input.AddPdfPages("example.pdf", numbers)
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
	Result.SaveAsTextFile("ocrtext.txt")
End Using
VB   C#

Use the SaveAsTextFile function to directly store the result in a text file format so you can directly download the file to the output directory path. To save the file into HTML format use SaveAsHocrFile.

6.1 Using PDFTron

We can use the PDFTron SDK to extract images from PDF files, along with their positioning information and DPI. Instead of converting PDF images to a Bitmap, you can also extract uncompressed/compressed image data directly using elements.GetImageData() (described in the PDF Data Extraction code sample). Learn more about our C# PDF Library and the PDF Parsing and Content Extraction Library.

6.2 Using IronOCR

IronOCR has an impressive number of features that will allow you to read QR codes and barcodes directly from scanned documents. The code snippet below demonstrates how you can scan the barcode from a given image or document.

var Ocr = new IronTesseract(); // nothing to configure
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.ReadBarCodes = true;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage("barcode.gif");
    var Result = Ocr.Read(Input);

    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
    }
}
var Ocr = new IronTesseract(); // nothing to configure
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.ReadBarCodes = true;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage("barcode.gif");
    var Result = Ocr.Read(Input);

    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
    }
}
Dim Ocr = New IronTesseract() ' nothing to configure
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.ReadBarCodes = True
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
	Input.AddImage("barcode.gif")
	Dim Result = Ocr.Read(Input)

	For Each Barcode In Result.Barcodes
		Console.WriteLine(Barcode.Value)
	Next Barcode
End Using
VB   C#

The above is the code that helps to read the barcode from a given image or PDF document. Numerous barcodes can be read at the same time in a single image or page. IronOCR has a distinctive method that will read the barcode, Ocr.Configuration.ReadBarCodes.

The data is stored in an object called OCRResult after scanning the input. The property in OCRResult is called Barcodes which will have a list of all the available barcode data. We can obtain each individual data related to the barcode details by utilizing a for-each loop. Two operations are completed in a single process- the scanning and reading of the value of the barcode.

Support for threading options is also available and multiple OCR processes can be completed at the same time. Additionally, IronOCR can recognize a precise area from a specified region.

var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    Input.Add("document.png", ContentArea);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    Input.Add("document.png", ContentArea);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
	Dim ContentArea = New System.Drawing.Rectangle() With {
		.X = 215,
		.Y = 1250,
		.Height = 280,
		.Width = 1335
	}
	Input.Add("document.png", ContentArea)
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using
VB   C#

The code snippet above demonstrates how to perform OCR on a distinct region. You are only required to specify the rectangular region in the PDF/image as IronOCRs Tesseract engine will aid in recognizing the text.

IronOCR and PDFtron OCR License Models and Pricing

IronOCR License Models and Pricing

A 30-Day Money-Back Guarantee: Once a license is purchased, you will get 30-days money back guarantee. Within the 30-days you wish to return the product, you will receive your money back.

Easy Integration: The integration of IronOCR with any project and environment is so effortless that it can be achieved in a single line of code simply by adding it as a NuGet Package. On the other hand, another way to integrate the environment is to directly download it from the web.

Perpetual Licensing: Each license purchased does not require renewal.

Free Support and Product Updates: Every license will have support directly from the group behind the product and will come with a year of free product updates. Purchasing extensions is available at any time.

Immediate Licenses: Once payment is received, registered license keys will be sent out immediately.

All licenses are perpetual and apply to development, staging, and production.

The Lite License

  • 1 developer
  • 1 location
  • 1 project
  • Perpetual license

This package allows a single software developer in an organization to utilize Iron Software in one location. IronSoftware can be utilized in a single intranet application, web application, or desktop software program. It is prohibited to share licenses outside of an organization or an agency/client relationship as they are non-transferable. This license type, like all other license types, expressly excludes all rights not expressly granted under the Agreement without OEM redistribution and utilizing the Iron Software as SaaS without purchasing additional coverage.

Pricing: Starts from $749 per year.

The Professional License

  • 10 developers
  • 10 locations
  • 10 projects
  • Perpetual license

This license allows a predetermined number of software developers in an organization to utilize Iron Software in numerous locations, with up to a maximum of ten. Iron Software can be used in as many websites, intranet applications, or desktop software applications as you like. Licenses are non-transferable, and they cannot be shared outside of an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not expressly granted under the Agreement, including OEM redistribution and utilizing the Iron Software as SaaS without purchasing additional coverage. This license can be integrated with a single project up to a maximum of 10.

Pricing: Starts from $999 per year.

The Unlimited License

  • Unlimited developers
  • Unlimited locations
  • Unlimited projects
  • Perpetual license

This allows an unlimited number of software developers in an organization to utilize IronSoftware in an unlimited number of locations. Iron Software can be used in as many intranet applications, desktop software applications, or websites as you like. Licenses are non-transferable, and they cannot be shared outside of an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not granted under the Agreement, including OEM redistribution and utilizing the Iron Software as a SaaS without purchasing additional coverage.

Pricing: Starts from $2,999 per year.

Royalty-Free Redistribution — This allows you to distribute the Iron Software as part of several differently packaged commercial products (without having to pay royalties) based on the number of projects covered by the base license. This will allow the deployment of Iron Software within SaaS software services, which is based on the number of projects covered by the base license.

Pricing: Starts from $1,599 per year.

PDFTron License Models and Pricing

PDFTron Packages (Custom licenses)

  • Prices for custom licenses vary- get a quote to suit your specific budget.
  • Deploy PDFTron's powerful document viewing and editing technology for rendering and document processing on web, mobile, and desktop platforms
  • Ready for integration, OEM redistribution, and organizations with large document volumes or unique requirements
  • Multi-domain pricing and favorable multi-year discounts
  • Offline and sandboxed operations supported
  • Custom, comprehensive contract terms
  • Consulting and training services

PDFTron custom licenses are tailored to match your application and business requirements. Pricing is dependent on your feature scope.g

The IronOCR Lite license is an undefined package that includes one developer with one year of support, and costs around $749. The IronOCR professional license including 10-developer packages and one year of support costs $999, while again, PDFTron packages are undefined. To buy a package, you must contact the support center to get a quote.

The IronOCR Lite and Professional packages include OEM or SaaS service with a 5-year support option. The Lite version includes a one-developer package with 5-year support and Saas and OEM service costs $2,897 with a customized support option. The IronOCR Professional version includes a 10-developer package with 5-year support, Saas, and OEM service and costs $3,397. PDFTron's 10-developer package with one year of support, Saas, and OEM service does not have a defined price.

7.0 Conclusion

IronOCR in the .NET Framework context supplies Tesseract that is straightforward to use with the support of photos and PDF documents achieved in numerous ways. It also provides several settings for improving the Tesseract OCRs library performance. A diverse number of languages are supported, with the ability to have numerous languages in a single operation. Visit their website to learn more regarding the Tesseract OCR.

PDFTron is a software application that uses different engines to recognize images and PDF documents. It also provides various settings to improve the performance of the OCR process and the choice to select multiple languages. PDFTron has limitations on the usage of page conversions. It also has various prices for different operating systems.

IronOCR is a competitive software product and can offer greater accuracy than competing brands. Similar products have at times failed to recognize low-quality images resulting in unknown characters. On the other hand, IronOCR not only provides accurate results but allows us to recognize barcode data and read the value of barcodes from images.

The IronOCR packages provide competitive licensing and support at a single price for all platforms. By comparison, PDFtrons OCR products are all exclusively custom-selected which tend to be more expensive. Pricing varies between both products with IronOCR starting at a price of $749, while due to custom selection, PDFTron's starting price is undefined. In conclusion, IronOCR provides a wider assortment of features for a lower price.

So, what are you waiting for? The free trial is open to all. Obtain the License here and start immediately!