A Comparison of IronOCR and PDFTron OCR

Kannapat Udonpant

September 5, 2022

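OCR stands for "optical character recognition". It is the process of converting paper documents or images into machine-readable text. The content can be captured in several ways, such as scanning it or typing it in manually on a keyboard. The goal is to turn scanned documents and PDFs back into their original text form. The process has proven valuable in criminal cases, particularly when a document is too badly damaged to transcribe by hand but can still be scanned and recognized by OCR software.

With advances in technology and the widespread adoption of digital tools, OCR is now used in other areas as well, such as document conversion in applications like Google Docs, and in academia and business. There are two main types of OCR, "static" and "dynamic". Static OCR, the more common type, scans the entire document at once. Dynamic OCR, by contrast, scans line by line and can handle more complex layouts such as tabular data.

This article compares two of the most commonly used OCR and PDF applications and libraries: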

PDFTron OCR
IronOCR

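1.0 Introduction

1.1 PDFTron OCR: Introduction and Features

To use OCR in the PDFTron SDK, we must install a separate OCR module add-on. The module lets the SDK detect text in documents and makes that text selectable and searchable. The PDFTron SDK supports up to 100 languages. PDFTron's OCR engine is backed by Tesseract's open-source LSTM neural network. PDFTron OCR supports multiple image formats for text detection. PDF files that contain only raster images can also be used for OCR, and the output quality depends on the input image. The best images for OCR are grayscale images with a resolution of 300 DPI.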
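As a rough illustration of how the add-on fits in, the sketch below checks that the OCR module can be found and then runs OCR on a single image, producing a PDF with selectable text. It is a minimal sketch rather than a definitive implementation: the license key placeholder, the `Lib/` search path, and the file names `scan.png` and `scan-searchable.pdf` are assumptions, and it presumes the SDK and OCR module are set up as described in section 3.1.

```cs
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

class PdfTronOcrSketch
{
    // Loads the PDFNet.dll build that matches the running process.
    private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();

    static void Main()
    {
        // Placeholder: replace with your own trial or commercial key.
        PDFNet.Initialize("YOUR_LICENSE_KEY");

        // Assumed location of the extracted OCR module files.
        PDFNet.AddResourceSearchPath("Lib/");

        if (!OCRModule.IsModuleAvailable())
        {
            Console.WriteLine("OCR module not found. Check the resource search path.");
        }
        else
        {
            using (PDFDoc doc = new PDFDoc())
            {
                // Tell the engine which language to expect.
                OCROptions options = new OCROptions();
                options.AddLang("eng");

                // Recognize the image and append it to the document as a searchable page.
                OCRModule.ImageToPDF(doc, "scan.png", options);
                doc.Save("scan-searchable.pdf", SDFDoc.SaveOptions.e_linearized);
                Console.WriteLine("Saved scan-searchable.pdf");
            }
        }

        PDFNet.Terminate();
    }
}
```

Following the guidance above, a 300 DPI grayscale input is likely to give the best results with this sketch.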

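PDFTron OCR Features

Makes images in printed documents searchable.
Converts simple PDFs into searchable PDFs.
Detects important information in business documents.
Makes book scanning simpler.
Detects license plate numbers in vehicle images.
Helps visually impaired users.
Simplifies data entry for multiple documents in an archive.
Transfers information from business cards into a contact list with ease.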

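1.2 IronOCR: Introduction and Features

Iron Software provides software engineers with IronOCR for .NET, which reads text content from photos and PDFs in .NET applications and websites. The software scans photos for text and barcodes, supports many of the world's languages, and outputs plain text or structured data. Iron Software's OCR library can be used in web, console, MVC, and many .NET desktop applications. For commercial deployments, purchased licenses come with direct assistance from the development team.

IronOCR uses the latest Tesseract 5 engine and can read text, QR codes, and barcodes from any PDF format. Using this library to add OCR to desktop, web, and console applications keeps integration quick.
IronOCR supports 127 international languages. It also supports custom languages and word lists.
IronOCR can read more than 20 barcode and QR code formats.
Supports multipage GIF and TIFF image formats.
Corrects low-quality scans and images.
Supports multithreading. It can run one or more processes at the same time.
IronOCR can output structured data as pages, paragraphs, lines, words, characters, and so on.
IronOCR supports operating systems such as Windows, Linux, and macOS.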
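To make the structured output concrete, here is a minimal sketch that walks an OCR result page by page and paragraph by paragraph. The file name `scan.png` is a hypothetical input, and the sketch assumes the IronOcr package is already installed (installation is covered in section 3.2).

```cs
using System;
using IronOcr;

class StructuredOutputSketch
{
    static void Main()
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Hypothetical input file; replace with a real image path.
            input.AddImage("scan.png");
            OcrResult result = ocr.Read(input);

            // The result is a hierarchy: pages -> paragraphs -> lines -> words.
            foreach (var page in result.Pages)
            {
                Console.WriteLine($"Page {page.PageNumber}");

                foreach (var paragraph in page.Paragraphs)
                {
                    // Confidence is the engine's own accuracy estimate for this paragraph.
                    Console.WriteLine($"  Paragraph (confidence {paragraph.Confidence}): {paragraph.Text}");

                    foreach (var line in paragraph.Lines)
                    {
                        foreach (var word in line.Words)
                        {
                            Console.WriteLine($"    Word: {word.Text}");
                        }
                    }
                }
            }
        }
    }
}
```

Working at the paragraph or word level is useful when only part of a page matters, for example a single field on a scanned form.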

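2. Creating a New Project in Visual Studio

Open Visual Studio and go to the File menu. Select New Project, then choose Console Application. In this article, we will use a console application to run the examples.

Enter the project name in the corresponding text box and choose the file path. Also select the required .NET Framework, then click the Create button.

Visual Studio will now generate the structure for the selected application. If you chose a Windows, console, or web application, it opens the Program.cs file, where you can enter code and build/run your application.

Next, we need to add the libraries to test the code.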

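3.0 Installation

3.1 Installing PDFTron OCR

PDFTron OCR must be installed manually. Download it as a zip file from this [link](https://www.pdftron.com/downloads/OCRModuleWindows.zip), extract it, and configure it in your project. This guide will help you run PDFTron samples in a .NET Framework application on Windows using the free integrated trial of the PDFTron SDK. The free trial includes support from solution engineers and unlimited trial usage.

Prerequisites

Visual Studio: make sure your installation includes the .NET desktop development workload and the .NET Framework 4.5.1+ development tools. This guide uses Visual Studio 2017 and PDFTron's C# .NET PDF library for Windows. Download the library here: [.NET PDF SDK download](https://www.pdftron.com/downloads/PDFNetDotNet4.zip).

Initial Setup

Extract the folder from the .zip file. This guide uses PDFNET_BASE to refer to the path of the folder you extracted.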

PDFNET_BASE = path/to/extraction/folder/PDFNetDotNet4/

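Running the Samples

Navigate to the location of the extracted contents. Find and enter the Samples folder (PDFNET_BASE/Samples). This folder contains sample code for many of the features the PDFTron SDK supports.

Open Samples_20XX.sln in Visual Studio, choosing the version that matches your Visual Studio installation.
Select a sample and [set it as the startup project for the solution](https://www.pdftron.com/documentation/dotnet/faq/set-sample-as-startup-project/).
Run the project.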

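Integrating into Your Application

This is known as the "PDFTron Hello World" application. If you can open, save, and close a PDF document, integrating the rest of the PDFTron SDK will be easy.

In Visual Studio, create a .NET Framework console application project in your preferred language. You can find these under the Visual C# or Visual Basic category.
Go to the project's folder. By default, the path should be similar to: C:/Users/User_Name/source/repos/myApp
Copy the Lib folder from PDFNET_BASE into your project folder (the folder that contains your .csproj or .vbproj file).
Find Solution Explorer on the right. Right-click References and select the Add Reference option. This opens the Reference Manager dialog.
At the bottom of the dialog, select Browse. Find PDFNetLoader.dll in the copied Lib folder and add it to the references.
Also add the appropriate version of PDFNet.dll from the x86 folder as another reference (path/to/your/project/folder/Lib/PDFNet/x86/PDFNet.dll). This ensures that the application can run on both 32-bit and 64-bit operating systems.
Click PDFNet.dll and make sure its Copy Local property is set to False.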
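With the references in place, a "Hello World" program along the following lines can confirm that the integration works. This is a sketch under stated assumptions: the namespace `myApp` matches the example project path above, the license key string is a placeholder, and `output.pdf` is an arbitrary output name. It creates an empty document, adds one blank page, saves it, and closes it.

```cs
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

namespace myApp
{
    class Program
    {
        // PDFNetLoader picks the PDFNet.dll build (x86 or x64) that matches the process.
        private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();

        static void Main(string[] args)
        {
            // Placeholder: replace with your own trial or commercial key.
            PDFNet.Initialize("YOUR_LICENSE_KEY");

            // The using block disposes of (closes) the document when done.
            using (PDFDoc doc = new PDFDoc())
            {
                Page page = doc.PageCreate();   // create a blank page
                doc.PagePushBack(page);         // append it to the document
                doc.Save("output.pdf", SDFDoc.SaveOptions.e_linearized);
            }

            PDFNet.Terminate();
            Console.WriteLine("Hello World!");
        }
    }
}
```

If the program runs and `output.pdf` opens in a PDF viewer, the SDK is referenced correctly.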

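3.2 Installing IronOCR

The IronOCR library can be installed in four ways.

These are:

Using Visual Studio.
Using the Visual Studio command line.
Downloading directly from the NuGet website.
Downloading directly from the IronOCR website.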

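3.2.1 Using Visual Studio

Visual Studio provides the NuGet Package Manager option for installing packages directly into a solution. The screenshot shows how to open the NuGet Package Manager.

This opens a search box that lists packages from the NuGet website. In the package manager, search for the keyword "IronOCR", as shown in the image below:

The image above shows the list of related search results. To install the package into the solution, select the required option.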

3.2.2 Using the Visual Studio Command Line

In Visual Studio, go to Tools -> NuGet Package Manager -> Package Manager Console.
Enter the following line in the Package Manager Console tab.

Install-Package IronOcr


The package will now be installed directly into the current project and is ready to use.

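### 3.2.3 Downloading Directly from the NuGet Website

For the third method, we can download the NuGet package directly from the website.
* Navigate to this [link](https://www.nuget.org/packages/Ironocr).
* In the menu on the right, select the Download package option.
* Double-click the downloaded package. It will install automatically.
* Next, reload the solution and start using the package in the project.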

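### 3.2.4 Downloading Directly from the IronOCR Website

You can download the latest package directly from the website via this [link](https://ironsoftware.com/csharp/ocr/packages/IronOcr.zip). After downloading it, follow these steps to add it to the project.
* Right-click the project in the solution window.
* Select the Reference option and browse to the location of the downloaded reference.
* Then click OK to add the reference.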

### 4.0 OCR Images

Both IronOCR and **PDFTron OCR** include OCR technology that converts images into searchable text.

### 4.1 Using **PDFTron**

Convert PDF to DOCX, DOC, HTML, SVG, TIFF, PNG, JPEG, XPS, EPUB, TXT, and many other formats.

``` cs
PDFDoc doc = new PDFDoc(filename);

// Convert the PDF document to SVG
Convert.ToSvg(doc, output_filename + ".svg");

// Convert the PDF document to XPS
Convert.ToXps(filename, output_filename + ".xps");

// Convert the PDF document to a multipage TIFF
Convert.TiffOutputOptions tiff_options = new Convert.TiffOutputOptions();
tiff_options.SetDPI(200);
tiff_options.SetDither(true);
tiff_options.SetMono(true);
Convert.ToTiff(filename, output_filename + ".tiff", tiff_options);

// Convert the PDF to XOD
Convert.ToXod(filename, output_filename + ".xod");

// Convert the PDF to HTML
Convert.ToHtml(filename, output_filename + ".html");
```

### 4.2 Using IronOCR

``` cs
using System;
using IronOcr;

var Ocr = new IronTesseract(); // create the OCR engine
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage(@"3.png");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Console.ReadKey();
}
```

``` vb
Imports System
Imports IronOcr

Dim Ocr = New IronTesseract() ' create the OCR engine
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
    Input.AddImage("3.png")
    Dim Result = Ocr.Read(Input)
    Console.WriteLine(Result.Text)
    Console.ReadKey()
End Using
```

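The code above converts an image file to text using the Tesseract 5 API. The first line creates an IronTesseract object. To be able to add one or more image files, we then create an OcrInput object, which needs the path of an available image. The IronTesseract object's Read function parses the image file and returns the extracted output as an OCR result. It extracts the text from the photo and converts it into a string.

Multi-frame images can be added with the AddMultiFrameTiff method. The Tesseract library reads each frame in the image and treats it as a separate page. Frames are read one after another until every frame has been scanned. This method supports only the TIFF image format.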
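The multi-frame case can be sketched in the same style as the single-image example. The file name `multi-frame.tiff` is a hypothetical input; the rest follows the calls already described in this section.

```cs
using System;
using IronOcr;

class MultiFrameTiffSketch
{
    static void Main()
    {
        var ocr = new IronTesseract();

        using (var input = new OcrInput())
        {
            // Hypothetical file name; every frame in the TIFF is loaded as its own page.
            input.AddMultiFrameTiff("multi-frame.tiff");
            OcrResult result = ocr.Read(input);

            // Print the recognized text frame by frame.
            foreach (var page in result.Pages)
            {
                Console.WriteLine($"--- Page {page.PageNumber} ---");
                Console.WriteLine(page.Text);
            }
        }
    }
}
```

Reading `result.Text` instead would return the text of all frames as a single string.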

The image above shows the data successfully converted into editable text, a result of IronOCR's accuracy.

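5.0 OCR PDF Files

Both IronOCR and PDFTron OCR convert PDF files into editable text. PDFTron OCR offers users a range of options, such as saving pages, editing images, and recognizing pages. It also provides save options such as document, text, and HTML formats. IronOCR likewise lets us save converted OCR files as HTML, text, PDF, and so on.

5.1 Using PDFTron OCR

The complete sample code below shows how to use PDFTron OCR for direct, high-quality conversion between PDF, XPS, EMF, SVG, TIFF, PNG, JPEG, and other image formats.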

// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using pdftron;
using pdftron.Common;
using pdftron.Filters;
using pdftron.SDF;
using pdftron.PDF;

// The following code showcases the conversion of documents to formats
// such as SVG, PDF, EMF, or XPS.

namespace ConvertTestCS
{
    class Testfile
    {
        public string inputFile, outputFile;
        public bool requiresWindowsPlatform;
        public Testfile(string inFile, string outFile, bool requiresWindowsPlatform_)
        {
            inputFile = inFile;
            outputFile = outFile;
            requiresWindowsPlatform = requiresWindowsPlatform_;
        }
    };

    class Class1
    {
        private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();
        static Class1() {}

        // Relative path to the folder containing test files.
        const string inputPath = "../../../../TestFiles/";
        const string outputPath = "../../../../TestFiles/Output/";

        static bool ConvertSpecificFormats()
        {
            bool err = false;
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    Console.WriteLine("Converting from XPS");
                    pdftron.PDF.Convert.FromXps(pdfdoc, inputPath + "simple-xps.xps");
                    pdfdoc.Save(outputPath + "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved xps2pdf v2.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    using (PDFDoc pdfdoc = new PDFDoc())
                    {
                        Console.WriteLine("Converting from EMF");
                        pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath + "simple-emf.emf");
                        pdfdoc.Save(outputPath + "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                        Console.WriteLine("Saved emf2pdf v2.pdf");
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

        ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    // Add a dictionary
                    ObjSet set = new ObjSet();
                    Obj options = set.CreateDict();

                    // Put options
                    options.PutNumber("FontSize", 15);
                    options.PutBool("UseSourceCodeFormatting", true);
                    options.PutNumber("PageWidth", 12);
                    options.PutNumber("PageHeight", 6);

                    // Convert from .txt file
                    Console.WriteLine("Converting from txt");
                    pdftron.PDF.Convert.FromText(pdfdoc, inputPath + "simple-text.txt", options);
                    pdfdoc.Save(outputPath + "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved simple-text.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc(inputPath + "newsletter.pdf"))
                {
                    // Convert PDF document to SVG
                    Console.WriteLine("Converting pdfdoc to SVG");
                    pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath + "pdf2svg v2.svg");
                    Console.WriteLine("Saved pdf2svg v2.svg");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////////
            try
            {
                // Convert PNG image to XPS
                Console.WriteLine("Converting PNG to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "butterfly.png", outputPath + "butterfly.xps");
                Console.WriteLine("Saved butterfly.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }
    ///////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    // Convert MSWord document to XPS
                    Console.WriteLine("Converting DOCX to XPS");
                    pdftron.PDF.Convert.ToXps(inputPath + "simple-word_2007.docx", outputPath + "simple-word_2007.xps");
                    Console.WriteLine("Saved simple-word_2007.xps");
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            ////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to XPS
                Console.WriteLine("Converting PDF to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "newsletter.pdf", outputPath + "newsletter.xps");
                Console.WriteLine("Saved newsletter.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to HTML
                Console.WriteLine("Converting PDF to HTML");
                pdftron.PDF.Convert.ToHtml(inputPath + "newsletter.pdf", outputPath + "newsletter");
                Console.WriteLine("Saved newsletter as HTML");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to EPUB
                Console.WriteLine("Converting PDF to EPUB");
                pdftron.PDF.Convert.ToEpub(inputPath + "newsletter.pdf", outputPath + "newsletter.epub");
                Console.WriteLine("Saved newsletter.epub");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to multipage TIFF
                Console.WriteLine("Converting PDF to multipage TIFF");
                pdftron.PDF.Convert.TiffOutputOptions tiff_options = new pdftron.PDF.Convert.TiffOutputOptions();
                tiff_options.SetDPI(200);
                tiff_options.SetDither(true);
                tiff_options.SetMono(true);
                pdftron.PDF.Convert.ToTiff(inputPath + "newsletter.pdf", outputPath + "newsletter.tiff", tiff_options);
                Console.WriteLine("Saved newsletter.tiff");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            return err;
        }

        static Boolean ConvertToPdfFromFile()
        {
            System.Collections.ArrayList testfiles = new System.Collections.ArrayList();
            testfiles.Add(new ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", false));
            // testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", true));

            bool err = false;
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    if (pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet"))
                    {
                        pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet");
                    }
                    else if (!pdftron.PDF.Convert.Printer.IsInstalled())
                    {
                        try
                        {
                            Console.WriteLine("Installing printer (requires Windows platform and administrator)");
                            pdftron.PDF.Convert.Printer.Install();
                            Console.WriteLine("Installed printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                            // the function ConvertToXpsFromFile may require the printer so leave it installed
                            // uninstallPrinterWhenDone = true;
                        }
                        catch (PDFNetException e)
                        {
                            Console.WriteLine("ERROR: Unable to install printer.");
                            Console.WriteLine(e.Message);
                            err = true;
                        }
                        catch
                        {
                            Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.");
                        }
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: Unable to install printer.");
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            foreach (Testfile file in testfiles)
            {
                if ( Environment.OSVersion.Platform != PlatformID.Win32NT)
                {
                    if (file.requiresWindowsPlatform)
                    {
                        continue;
                    }
                }
                try
                {
                    using (pdftron.PDF.PDFDoc pdfdoc = new PDFDoc())
                    {

                        if (pdftron.PDF.Convert.RequiresPrinter(inputPath + file.inputFile))
                        {
                            Console.WriteLine("Using PDFNet printer to convert file " + file.inputFile);
                        }
                        pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath + file.inputFile);
                        pdfdoc.Save(outputPath + file.outputFile, SDFDoc.SaveOptions.e_linearized);
                        Console.WriteLine("Converted file: " + file.inputFile);
                        Console.WriteLine("to: " + file.outputFile);
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: on input file " + file.inputFile);
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

            return err;
        }
        [STAThread]
        static void Main(string [] args)
        {
            PDFNet.Initialize(PDFTronLicense.Key);
            bool err = false;

            err = ConvertToPdfFromFile();
            if (err)
            {
                Console.WriteLine("ConvertFile failed");
            }
            else
            {
                Console.WriteLine("ConvertFile succeeded");
            }

            err = ConvertSpecificFormats();
            if (err)
            {
                Console.WriteLine("ConvertSpecificFormats failed");
            }
            else
            {
                Console.WriteLine("ConvertSpecificFormats succeeded");
            }

            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                if (pdftron.PDF.Convert.Printer.IsInstalled())
                {
                    try
                    {
                        Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)");
                        pdftron.PDF.Convert.Printer.Uninstall();
                        Console.WriteLine("Uninstalled Printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                    }
                    catch
                    {
                        Console.WriteLine("Unable to uninstall printer");
                    }
                }
            }
            PDFNet.Terminate();
            Console.WriteLine("Done.");
        }
    }
}

// Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using pdftron;
using pdftron.Common;
using pdftron.Filters;
using pdftron.SDF;
using pdftron.PDF;

// The following code showcases the conversion of documents to formats
// such as SVG, PDF, EMF, or XPS.

namespace ConvertTestCS
{
    class Testfile
    {
        public string inputFile, outputFile;
        public bool requiresWindowsPlatform;
        public Testfile(string inFile, string outFile, bool requiresWindowsPlatform_)
        {
            inputFile = inFile;
            outputFile = outFile;
            requiresWindowsPlatform = requiresWindowsPlatform_;
        }
    };

    class Class1
    {
        private static pdftron.PDFNetLoader pdfNetLoader = pdftron.PDFNetLoader.Instance();
        static Class1() {}

        // Relative path to the folder containing test files.
        const string inputPath = "../../../../TestFiles/";
        const string outputPath = "../../../../TestFiles/Output/";

        static bool ConvertSpecificFormats()
        {
            bool err = false;
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    Console.WriteLine("Converting from XPS");
                    pdftron.PDF.Convert.FromXps(pdfdoc, inputPath + "simple-xps.xps");
                    pdfdoc.Save(outputPath + "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved xps2pdf v2.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    using (PDFDoc pdfdoc = new PDFDoc())
                    {
                        Console.WriteLine("Converting from EMF");
                        pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath + "simple-emf.emf");
                        pdfdoc.Save(outputPath + "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused);
                        Console.WriteLine("Saved emf2pdf v2.pdf");
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

        ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc())
                {
                    // Add a dictionary
                    ObjSet set = new ObjSet();
                    Obj options = set.CreateDict();

                    // Put options
                    options.PutNumber("FontSize", 15);
                    options.PutBool("UseSourceCodeFormatting", true);
                    options.PutNumber("PageWidth", 12);
                    options.PutNumber("PageHeight", 6);

                    // Convert from .txt file
                    Console.WriteLine("Converting from txt");
                    pdftron.PDF.Convert.FromText(pdfdoc, inputPath + "simple-text.txt", options);
                    pdfdoc.Save(outputPath + "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused);
                    Console.WriteLine("Saved simple-text.pdf");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    ///////////////////////////////////////////
            try
            {
                using (PDFDoc pdfdoc = new PDFDoc(inputPath + "newsletter.pdf"))
                {
                    // Convert PDF document to SVG
                    Console.WriteLine("Converting pdfdoc to SVG");
                    pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath + "pdf2svg v2.svg");
                    Console.WriteLine("Saved pdf2svg v2.svg");
                }
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

    /////////////////////////////////////////////////
            try
            {
                // Convert PNG image to XPS
                Console.WriteLine("Converting PNG to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "butterfly.png", outputPath + "butterfly.xps");
                Console.WriteLine("Saved butterfly.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }
    ///////////////////////////////////////////
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    // Convert MSWord document to XPS
                    Console.WriteLine("Converting DOCX to XPS");
                    pdftron.PDF.Convert.ToXps(inputPath + "simple-word_2007.docx", outputPath + "simple-word_2007.xps");
                    Console.WriteLine("Saved simple-word_2007.xps");
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            ////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to XPS
                Console.WriteLine("Converting PDF to XPS");
                pdftron.PDF.Convert.ToXps(inputPath + "newsletter.pdf", outputPath + "newsletter.xps");
                Console.WriteLine("Saved newsletter.xps");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to HTML
                Console.WriteLine("Converting PDF to HTML");
                pdftron.PDF.Convert.ToHtml(inputPath + "newsletter.pdf", outputPath + "newsletter");
                Console.WriteLine("Saved newsletter as HTML");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            //////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to EPUB
                Console.WriteLine("Converting PDF to EPUB");
                pdftron.PDF.Convert.ToEpub(inputPath + "newsletter.pdf", outputPath + "newsletter.epub");
                Console.WriteLine("Saved newsletter.epub");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            /////////////////////////////////////////////////////////////////////
            try
            {
                // Convert PDF document to multipage TIFF
                Console.WriteLine("Converting PDF to multipage TIFF");
                pdftron.PDF.Convert.TiffOutputOptions tiff_options = new pdftron.PDF.Convert.TiffOutputOptions();
                tiff_options.SetDPI(200);
                tiff_options.SetDither(true);
                tiff_options.SetMono(true);
                pdftron.PDF.Convert.ToTiff(inputPath + "newsletter.pdf", outputPath + "newsletter.tiff", tiff_options);
                Console.WriteLine("Saved newsletter.tiff");
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.Message);
                err = true;
            }

            return err;
        }

        static Boolean ConvertToPdfFromFile()
        {
            System.Collections.ArrayList testfiles = new System.Collections.ArrayList();
            testfiles.Add(new ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", false));
            testfiles.Add(new ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", false));
            // testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
            testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", true));

            bool err = false;
            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                try
                {
                    if (pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet"))
                    {
                        pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet");
                    }
                    else if (!pdftron.PDF.Convert.Printer.IsInstalled())
                    {
                        try
                        {
                            Console.WriteLine("Installing printer (requires Windows platform and administrator)");
                            pdftron.PDF.Convert.Printer.Install();
                            Console.WriteLine("Installed printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                            // the function ConvertToXpsFromFile may require the printer so leave it installed
                            // uninstallPrinterWhenDone = true;
                        }
                        catch (PDFNetException e)
                        {
                            Console.WriteLine("ERROR: Unable to install printer.");
                            Console.WriteLine(e.Message);
                            err = true;
                        }
                        catch
                        {
                            Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.");
                        }
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: Unable to install printer.");
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }
            foreach (Testfile file in testfiles)
            {
                if ( Environment.OSVersion.Platform != PlatformID.Win32NT)
                {
                    if (file.requiresWindowsPlatform)
                    {
                        continue;
                    }
                }
                try
                {
                    using (pdftron.PDF.PDFDoc pdfdoc = new PDFDoc())
                    {

                        if (pdftron.PDF.Convert.RequiresPrinter(inputPath + file.inputFile))
                        {
                            Console.WriteLine("Using PDFNet printer to convert file " + file.inputFile);
                        }
                        pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath + file.inputFile);
                        pdfdoc.Save(outputPath + file.outputFile, SDFDoc.SaveOptions.e_linearized);
                        Console.WriteLine("Converted file: " + file.inputFile);
                        Console.WriteLine("to: " + file.outputFile);
                    }
                }
                catch (PDFNetException e)
                {
                    Console.WriteLine("ERROR: on input file " + file.inputFile);
                    Console.WriteLine(e.Message);
                    err = true;
                }
            }

            return err;
        }
        [STAThread]
        static void Main(string [] args)
        {
            PDFNet.Initialize(PDFTronLicense.Key);
            bool err = false;

            err = ConvertToPdfFromFile();
            if (err)
            {
                Console.WriteLine("ConvertFile failed");
            }
            else
            {
                Console.WriteLine("ConvertFile succeeded");
            }

            err = ConvertSpecificFormats();
            if (err)
            {
                Console.WriteLine("ConvertSpecificFormats failed");
            }
            else
            {
                Console.WriteLine("ConvertSpecificFormats succeeded");
            }

            if (Environment.OSVersion.Platform == PlatformID.Win32NT)
            {
                if (pdftron.PDF.Convert.Printer.IsInstalled())
                {
                    try
                    {
                        Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)");
                        pdftron.PDF.Convert.Printer.Uninstall();
                        Console.WriteLine("Uninstalled Printer " + pdftron.PDF.Convert.Printer.GetPrinterName());
                    }
                    catch
                    {
                        Console.WriteLine("Unable to uninstall printer");
                    }
                }
            }
            PDFNet.Terminate();
            Console.WriteLine("Done.");
        }
    }
}

' Copyright (c) 2001-2021 by PDFTron Systems Inc. All Rights Reserved.

Imports System
Imports System.Drawing
Imports System.Drawing.Drawing2D
Imports pdftron
Imports pdftron.Common
Imports pdftron.Filters
Imports pdftron.SDF
Imports pdftron.PDF

' The following code showcases the conversion of documents to formats
' such as SVG, PDF, EMF, or XPS.

Namespace ConvertTestCS
	Friend Class Testfile
		Public inputFile, outputFile As String
		Public requiresWindowsPlatform As Boolean
		Public Sub New(ByVal inFile As String, ByVal outFile As String, ByVal requiresWindowsPlatform_ As Boolean)
			inputFile = inFile
			outputFile = outFile
			requiresWindowsPlatform = requiresWindowsPlatform_
		End Sub
	End Class

	Friend Class Class1
		Private Shared pdfNetLoader As pdftron.PDFNetLoader = pdftron.PDFNetLoader.Instance()
		Shared Sub New()
		End Sub

		' Relative path to the folder containing test files.
		Private Const inputPath As String = "../../../../TestFiles/"
		Private Const outputPath As String = "../../../../TestFiles/Output/"

		Private Shared Function ConvertSpecificFormats() As Boolean
			Dim err As Boolean = False
			Try
				Using pdfdoc As New PDFDoc()
					Console.WriteLine("Converting from XPS")
					pdftron.PDF.Convert.FromXps(pdfdoc, inputPath & "simple-xps.xps")
					pdfdoc.Save(outputPath & "xps2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused)
					Console.WriteLine("Saved xps2pdf v2.pdf")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'///////////////////////////////////////////
	If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					Using pdfdoc As New PDFDoc()
						Console.WriteLine("Converting from EMF")
						pdftron.PDF.Convert.FromEmf(pdfdoc, inputPath & "simple-emf.emf")
						pdfdoc.Save(outputPath & "emf2pdf v2.pdf", SDFDoc.SaveOptions.e_remove_unused)
						Console.WriteLine("Saved emf2pdf v2.pdf")
					End Using
				Catch e As PDFNetException
					Console.WriteLine(e.Message)
					err = True
				End Try
	End If

		'/////////////////////////////////////////
			Try
				Using pdfdoc As New PDFDoc()
					' Add a dictionary
					Dim [set] As New ObjSet()
					Dim options As Obj = [set].CreateDict()

					' Put options
					options.PutNumber("FontSize", 15)
					options.PutBool("UseSourceCodeFormatting", True)
					options.PutNumber("PageWidth", 12)
					options.PutNumber("PageHeight", 6)

					' Convert from .txt file
					Console.WriteLine("Converting from txt")
					pdftron.PDF.Convert.FromText(pdfdoc, inputPath & "simple-text.txt", options)
					pdfdoc.Save(outputPath & "simple-text.pdf", SDFDoc.SaveOptions.e_remove_unused)
					Console.WriteLine("Saved simple-text.pdf")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'/////////////////////////////////////////
			Try
				Using pdfdoc As New PDFDoc(inputPath & "newsletter.pdf")
					' Convert PDF document to SVG
					Console.WriteLine("Converting pdfdoc to SVG")
					pdftron.PDF.Convert.ToSvg(pdfdoc, outputPath & "pdf2svg v2.svg")
					Console.WriteLine("Saved pdf2svg v2.svg")
				End Using
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

	'///////////////////////////////////////////////
			Try
				' Convert PNG image to XPS
				Console.WriteLine("Converting PNG to XPS")
				pdftron.PDF.Convert.ToXps(inputPath & "butterfly.png", outputPath & "butterfly.xps")
				Console.WriteLine("Saved butterfly.xps")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try
	'/////////////////////////////////////////
			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					' Convert MSWord document to XPS
					Console.WriteLine("Converting DOCX to XPS")
					pdftron.PDF.Convert.ToXps(inputPath & "simple-word_2007.docx", outputPath & "simple-word_2007.xps")
					Console.WriteLine("Saved simple-word_2007.xps")
				Catch e As PDFNetException
					Console.WriteLine(e.Message)
					err = True
				End Try
			End If
			'//////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to XPS
				Console.WriteLine("Converting PDF to XPS")
				pdftron.PDF.Convert.ToXps(inputPath & "newsletter.pdf", outputPath & "newsletter.xps")
				Console.WriteLine("Saved newsletter.xps")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'////////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to HTML
				Console.WriteLine("Converting PDF to HTML")
				pdftron.PDF.Convert.ToHtml(inputPath & "newsletter.pdf", outputPath & "newsletter")
				Console.WriteLine("Saved newsletter as HTML")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'////////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to EPUB
				Console.WriteLine("Converting PDF to EPUB")
				pdftron.PDF.Convert.ToEpub(inputPath & "newsletter.pdf", outputPath & "newsletter.epub")
				Console.WriteLine("Saved newsletter.epub")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			'///////////////////////////////////////////////////////////////////
			Try
				' Convert PDF document to multipage TIFF
				Console.WriteLine("Converting PDF to multipage TIFF")
				Dim tiff_options As New pdftron.PDF.Convert.TiffOutputOptions()
				tiff_options.SetDPI(200)
				tiff_options.SetDither(True)
				tiff_options.SetMono(True)
				pdftron.PDF.Convert.ToTiff(inputPath & "newsletter.pdf", outputPath & "newsletter.tiff", tiff_options)
				Console.WriteLine("Saved newsletter.tiff")
			Catch e As PDFNetException
				Console.WriteLine(e.Message)
				err = True
			End Try

			Return err
		End Function

		Private Shared Function ConvertToPdfFromFile() As Boolean
			Dim testfiles As New System.Collections.ArrayList()
			testfiles.Add(New ConvertTestCS.Testfile("simple-word_2007.docx", "docx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-powerpoint_2007.pptx", "pptx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-excel_2007.xlsx", "xlsx2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-publisher.pub", "pub2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("simple-text.txt", "txt2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-rtf.rtf", "rtf2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("butterfly.png", "png2pdf.pdf", False))
			testfiles.Add(New ConvertTestCS.Testfile("simple-emf.emf", "emf2pdf.pdf", True))
			testfiles.Add(New ConvertTestCS.Testfile("simple-xps.xps", "xps2pdf.pdf", False))
			' testfiles.Add(new ConvertTestCS.Testfile("simple-webpage.mht", "mht2pdf.pdf", true));
			testfiles.Add(New ConvertTestCS.Testfile("simple-webpage.html", "html2pdf.pdf", True))

			Dim err As Boolean = False
			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				Try
					If pdftron.PDF.Convert.Printer.IsInstalled("PDFTron PDFNet") Then
						pdftron.PDF.Convert.Printer.SetPrinterName("PDFTron PDFNet")
					ElseIf Not pdftron.PDF.Convert.Printer.IsInstalled() Then
						Try
							Console.WriteLine("Installing printer (requires Windows platform and administrator)")
							pdftron.PDF.Convert.Printer.Install()
							Console.WriteLine("Installed printer " & pdftron.PDF.Convert.Printer.GetPrinterName())
							' the function ConvertToXpsFromFile may require the printer so leave it installed
							' uninstallPrinterWhenDone = true;
						Catch e As PDFNetException
							Console.WriteLine("ERROR: Unable to install printer.")
							Console.WriteLine(e.Message)
							err = True
						Catch
							Console.WriteLine("ERROR: Unable to install printer. Make sure that the package's bitness matches your operating system's bitness and that you are running with administrator privileges.")
							err = True
						End Try
					End If
				Catch e As PDFNetException
					Console.WriteLine("ERROR: Unable to install printer.")
					Console.WriteLine(e.Message)
					err = True
				End Try
			End If
			For Each file As Testfile In testfiles
				If Environment.OSVersion.Platform <> PlatformID.Win32NT Then
					If file.requiresWindowsPlatform Then
						Continue For
					End If
				End If
				Try
					Using pdfdoc As pdftron.PDF.PDFDoc = New PDFDoc()

						If pdftron.PDF.Convert.RequiresPrinter(inputPath & file.inputFile) Then
							Console.WriteLine("Using PDFNet printer to convert file " & file.inputFile)
						End If
						pdftron.PDF.Convert.ToPdf(pdfdoc, inputPath & file.inputFile)
						pdfdoc.Save(outputPath & file.outputFile, SDFDoc.SaveOptions.e_linearized)
						Console.WriteLine("Converted file: " & file.inputFile)
						Console.WriteLine("to: " & file.outputFile)
					End Using
				Catch e As PDFNetException
					Console.WriteLine("ERROR: on input file " & file.inputFile)
					Console.WriteLine(e.Message)
					err = True
				End Try
			Next file

			Return err
		End Function
		<STAThread>
		Shared Sub Main(ByVal args() As String)
			PDFNet.Initialize(PDFTronLicense.Key)
			Dim err As Boolean = False

			err = ConvertToPdfFromFile()
			If err Then
				Console.WriteLine("ConvertFile failed")
			Else
				Console.WriteLine("ConvertFile succeeded")
			End If

			err = ConvertSpecificFormats()
			If err Then
				Console.WriteLine("ConvertSpecificFormats failed")
			Else
				Console.WriteLine("ConvertSpecificFormats succeeded")
			End If

			If Environment.OSVersion.Platform = PlatformID.Win32NT Then
				If pdftron.PDF.Convert.Printer.IsInstalled() Then
					Try
						Console.WriteLine("Uninstalling printer (requires Windows platform and administrator)")
						pdftron.PDF.Convert.Printer.Uninstall()
						Console.WriteLine("Uninstalled Printer " & pdftron.PDF.Convert.Printer.GetPrinterName())
					Catch
						Console.WriteLine("Unable to uninstall printer")
					End Try
				End If
			End If
			PDFNet.Terminate()
			Console.WriteLine("Done.")
		End Sub
	End Class
End Namespace

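5.2 Using IronOCR

PDF files are handled through the OcrInput class. Each page of the document is read by the IronTesseract class, and its text is extracted. The AddPdf method adds a PDF to the input list and accepts a password, so protected documents can be opened as well. To open a password-protected PDF file, use the following code snippet: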

var Ocr = new IronTesseract(); // nothing to configure
using (var Input = new OcrInput())
{
    Input.AddPdf("example.pdf", "password");
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Dim Ocr = New IronTesseract() ' nothing to configure
Using Input = New OcrInput()
	Input.AddPdf("example.pdf", "password")
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using

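To read and extract content from a single page of a PDF file, use the AddPdfPage method and specify only the number of the page you want to extract text from. AddPdfPages extracts text from several pages that you specify; passing the page numbers as an IEnumerable lets you list multiple pages efficiently. You must also include the file's location and extension. The following code snippet demonstrates this: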

using System;
using System.Collections.Generic;
using IronOcr;

// Page numbers to read when adding several pages at once
IEnumerable<int> numbers = new List<int> { 2, 8, 10 };
var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    // Single page
    Input.AddPdfPage("example.pdf", 10);
    // Multiple pages
    Input.AddPdfPages("example.pdf", numbers);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
    Result.SaveAsTextFile("ocrtext.txt");
}

Dim numbers As IEnumerable(Of Integer) = New List(Of Integer) From {2, 8, 10}
Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
	'single page
	Input.AddPdfPage("example.pdf",10)
	'Multiple page
	Input.AddPdfPages("example.pdf", numbers)
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
	Result.SaveAsTextFile("ocrtext.txt")
End Using

The SaveAsTextFile method saves the result directly as a text file at the output path you specify. To save the result in the HTML-based hOCR format, use SaveAsHocrFile.
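As a minimal sketch of the two save methods named above, the following program writes the same OCR result once as plain text and once as hOCR. It uses only calls that already appear in this article (IronTesseract, OcrInput, AddPdf, Read, SaveAsTextFile, SaveAsHocrFile). The file names and the password are placeholders, and other IronOCR releases may name the input-loading methods differently, so treat it as an illustration rather than a definitive implementation.

```csharp
using System;
using IronOcr;

class SaveOcrOutput
{
    static void Main()
    {
        var ocr = new IronTesseract();
        using (var input = new OcrInput())
        {
            // Placeholder file name and password, as in the snippets above.
            input.AddPdf("example.pdf", "password");
            var result = ocr.Read(input);

            // Plain text output.
            result.SaveAsTextFile("ocrtext.txt");
            // hOCR output: an HTML-based format for OCR results.
            result.SaveAsHocrFile("ocrtext.html");

            Console.WriteLine("Saved ocrtext.txt and ocrtext.html");
        }
    }
}
```

Both files are written relative to the working directory of the process; pass absolute paths if the output belongs elsewhere.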

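6.1 Using PDFTron

With the PDFTron SDK, we can extract images from PDF files together with their positioning information and DPI. Instead of converting PDF images to a Bitmap, you can also extract the uncompressed or compressed image data directly using elements.GetImageData() (described in PDFTron's PDF Data Extraction code sample). PDFTron's documentation for its C# PDF library and its PDF parsing and content extraction library covers this in more detail.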
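The sketch below illustrates the kind of image extraction described above. It is modeled from memory on PDFTron's ImageExtract sample, so the class and method names (ElementReader, Element.Type.e_image, GetImageWidth, GetImageHeight, GetBitsPerComponent, Image.Export) are assumptions to check against the SDK version you install. The input file name is a placeholder, PDFTronLicense.Key is the same helper used in the conversion sample earlier, and the sketch leaves out the positioning and DPI calculation.

```csharp
using System;
using pdftron;
using pdftron.PDF;

class ImageInfo
{
    // Walks the current content stream and reports every image it finds.
    static int ListImages(ElementReader reader, int count)
    {
        Element element;
        while ((element = reader.Next()) != null)
        {
            if (element.GetType() == Element.Type.e_image)
            {
                count++;
                Console.WriteLine("Image {0}: {1} x {2} px, {3} bits per component",
                    count, element.GetImageWidth(), element.GetImageHeight(),
                    element.GetBitsPerComponent());

                // Write the embedded image to disk under a numbered name.
                var image = new pdftron.PDF.Image(element.GetXObject());
                image.Export("image_" + count);
            }
            else if (element.GetType() == Element.Type.e_form)
            {
                // Form XObjects can contain nested images, so recurse into them.
                reader.FormBegin();
                count = ListImages(reader, count);
                reader.End();
            }
        }
        return count;
    }

    static void Main()
    {
        PDFNet.Initialize(PDFTronLicense.Key);
        using (var doc = new PDFDoc("newsletter.pdf"))
        using (var reader = new ElementReader())
        {
            doc.InitSecurityHandler();
            int count = 0;
            for (PageIterator itr = doc.GetPageIterator(); itr.HasNext(); itr.Next())
            {
                reader.Begin(itr.Current());
                count = ListImages(reader, count);
                reader.End();
            }
            Console.WriteLine("Found {0} image(s)", count);
        }
        PDFNet.Terminate();
    }
}
```

The recursive helper keeps a running count so that exported files get unique names across pages and nested forms.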

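6.2 Using IronOCR

IronOCR has a number of powerful features that let you read QR codes and barcodes directly from scanned documents. The following code snippet demonstrates how to scan barcodes from a given image or document.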

var Ocr = new IronTesseract(); // nothing to configure
Ocr.Language = OcrLanguage.EnglishBest;
Ocr.Configuration.ReadBarCodes = true;
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5;
using (var Input = new OcrInput())
{
    Input.AddImage("barcode.gif");
    var Result = Ocr.Read(Input);

    foreach (var Barcode in Result.Barcodes)
    {
        Console.WriteLine(Barcode.Value);
    }
}

Dim Ocr = New IronTesseract() ' nothing to configure
Ocr.Language = OcrLanguage.EnglishBest
Ocr.Configuration.ReadBarCodes = True
Ocr.Configuration.TesseractVersion = TesseractVersion.Tesseract5
Using Input = New OcrInput()
	Input.AddImage("barcode.gif")
	Dim Result = Ocr.Read(Input)

	For Each Barcode In Result.Barcodes
		Console.WriteLine(Barcode.Value)
	Next Barcode
End Using

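The code above reads barcodes from a given image or PDF document. Multiple barcodes can be read at once from a single image or page. IronOCR enables barcode reading through a dedicated setting, Ocr.Configuration.ReadBarCodes.

After the input is scanned, the data is stored in an object called OcrResult. Its Barcodes property contains a list of all the barcode data that was found. A for-each loop gives access to the details of each individual barcode. Two operations, scanning and reading the barcode values, are completed in a single pass.

Threading is supported, so multiple OCR operations can run at the same time. In addition, IronOCR can recognize text within a precisely specified region.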

var Ocr = new IronTesseract();
using (var Input = new OcrInput())
{
    var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
    Input.Add("document.png", ContentArea);
    var Result = Ocr.Read(Input);
    Console.WriteLine(Result.Text);
}

Dim Ocr = New IronTesseract()
Using Input = New OcrInput()
	Dim ContentArea = New System.Drawing.Rectangle() With {
		.X = 215,
		.Y = 1250,
		.Height = 280,
		.Width = 1335
	}
	Input.Add("document.png", ContentArea)
	Dim Result = Ocr.Read(Input)
	Console.WriteLine(Result.Text)
End Using

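The code snippet above shows how to perform OCR on a specific region. You only need to specify a rectangular area in the PDF or image, and IronOCR's Tesseract engine will recognize the text within it.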
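The threading support mentioned earlier can be sketched with the standard .NET Parallel.ForEach loop. This is one possible arrangement, not IronOCR's prescribed pattern: it conservatively creates a separate IronTesseract and OcrInput for each file so that no state is shared between threads. The file names are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using IronOcr;

class ParallelOcr
{
    static void Main()
    {
        // Hypothetical input files; replace with your own images.
        string[] files = { "page1.png", "page2.png", "page3.png" };
        var texts = new ConcurrentDictionary<string, string>();

        Parallel.ForEach(files, file =>
        {
            // One engine and one input per task, so nothing is shared across threads.
            var ocr = new IronTesseract();
            using (var input = new OcrInput())
            {
                input.AddImage(file);
                texts[file] = ocr.Read(input).Text;
            }
        });

        // Report in the original order, independent of completion order.
        foreach (var file in files)
        {
            Console.WriteLine(file + ": " + texts[file].Length + " characters");
        }
    }
}
```

Collecting results in a ConcurrentDictionary keyed by file name avoids locking and keeps each result tied to its source file.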

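IronOCR and PDFTron OCR Licensing Models and Pricing

IronOCR Licensing Model and Pricing

30-day money-back guarantee: if you return the product within 30 days of purchasing a license, you will receive a refund.

Easy integration: IronOCR integrates easily with any project and environment; adding it as a NuGet package takes a single line of code. Alternatively, it can be downloaded directly from the web.

Perpetual licensing: a purchased license does not need to be renewed.

Free support and product updates: every license comes with direct support from the team behind the product and one year of free product updates. Extensions can be purchased at any time.

Immediate licenses: the registered license key is sent as soon as payment is received.

All licenses are perpetual and apply to development, staging, and production environments.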

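Lite License

1 developer
1 location
1 project
Perpetual license
This package allows a single software developer in an organization to use Iron Software at a single location. Iron Software may be used in a single intranet application, web application, or desktop software program. Licenses are non-transferable, so sharing them outside an organization or an agency/client relationship is prohibited. This license type, like all other license types, expressly excludes all rights not expressly granted under the agreement, including OEM redistribution and use of Iron Software as SaaS without purchasing additional coverage.
Pricing: starts from $749 per year.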

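Professional License

10 developers
10 locations
10 projects
Perpetual license
This license allows a predetermined number of software developers in an organization to use Iron Software at multiple locations, up to a maximum of ten. Iron Software may be used in as many websites, intranet applications, or desktop software applications as you like. Licenses are non-transferable and may not be shared outside an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not expressly granted under the agreement, including OEM redistribution and use of Iron Software as SaaS without purchasing additional coverage. This license can be integrated with up to 10 individual projects.
Pricing: starts from $999 per year.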

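Unlimited License

Unlimited developers
Unlimited locations
Unlimited projects
Perpetual license
This license allows an unlimited number of software developers in an organization to use Iron Software at an unlimited number of locations. Iron Software may be used in as many intranet applications, desktop software applications, or websites as you like. Licenses are non-transferable and may not be shared outside an organization or an agency/client relationship. This license type, like all other license types, expressly excludes all rights not granted under the agreement, including OEM redistribution and use of Iron Software as SaaS without purchasing additional coverage.
Pricing: starts from $3,999 per year.
Royalty-free redistribution: this allows you to distribute Iron Software as part of a number of differently packaged commercial products without paying royalties, based on the number of projects covered by the base license. It also allows Iron Software to be deployed within SaaS software services, again based on the number of projects covered by the base license.
Pricing: starts from $1,599 per year.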

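PDFTron Licensing Model and Pricing

PDFTron Packages (Custom Licensing)

Custom license prices vary; request a quote that fits your specific budget.
Deploy PDFTron's document viewing and editing technology for rendering and document processing on web, mobile, and desktop platforms.
Suitable for integration, OEM redistribution, and organizations with large document volumes or special requirements.
Multi-domain pricing and favorable multi-year discounts
Support for offline and sandboxed operation
Custom, comprehensive contract terms
Consulting and training services
PDFTron's custom licenses are designed to fit your application and business requirements. Pricing depends on the scope of the features you need.
The IronOCR Lite license covers one developer with one year of support and costs about $749. The IronOCR Professional license covers 10 developers with one year of support and costs $999. PDFTron publishes no fixed package prices; to buy a package, you must contact its support center for a quote.

The IronOCR Lite and Professional packages can include OEM or SaaS coverage with a 5-year support option. The Lite edition, a one-developer package with 5 years of support plus SaaS and OEM coverage, costs $2,897 and offers a custom support option. The Professional edition, a 10-developer package with 5 years of support plus SaaS and OEM coverage, costs $3,397. PDFTron's 10-developer package includes one year of support plus SaaS and OEM coverage, but has no published price.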

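7.0 Conclusion

In the .NET Framework environment, IronOCR provides Tesseract in a form that is easy to use in several ways and supports both photos and PDF documents. It also offers a range of settings to improve the performance of the Tesseract OCR library. Many languages are supported, and multiple languages can be used in a single operation. Visit the IronOCR website to learn more about Tesseract OCR.

PDFTron is a software application that uses different engines to recognize images and PDF documents. It also offers various settings to improve OCR performance, along with the option of selecting multiple languages. PDFTron limits the use of page conversions. It also has different pricing for different operating systems.

IronOCR is a competitive software product that can deliver higher accuracy than competing brands. Similar products sometimes fail to recognize low-quality images, which results in unknown characters. IronOCR, by contrast, not only gives accurate results but also lets us recognize barcode data and read barcode values from images.

IronOCR's packages provide competitive licensing and support for all platforms at a single price. By comparison, PDFTron's OCR products are all custom-quoted, so they tend to be more expensive. The two products are priced differently: IronOCR starts at $749, whereas PDFTron has no fixed starting price because every package is custom. In summary, IronOCR offers a wider range of features at a lower price.

So what are you waiting for? The free trial is open to everyone. Get a license and start today!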

Kannapat Udonpant

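Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the code used in IronPDF. In addition to peer learning, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.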