How to Highlight Text as Images in C# with IronOCR

Updated: April 21, 2026

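IronOCR's HighlightTextAndSaveAsImages method visualizes OCR results by drawing bounding boxes around detected text (characters, words, lines, or paragraphs) and saving the pages as diagnostic images, so developers can verify OCR accuracy and troubleshoot recognition problems.

Visualizing OCR results means rendering bounding boxes around the specific text elements the engine detected in an image. The overlay marks the exact location of each character, word, line, or paragraph, giving a clear map of what was recognized.

This visual feedback is essential for debugging and validating OCR output: it shows what the software recognized and where it went wrong. When you work with complex documents or track down recognition issues, the highlighted images become an indispensable diagnostic tool.

This article demonstrates that diagnostic capability through the HighlightTextAndSaveAsImages method, which highlights specific text elements and saves the result as images for review. Whether you are building a document-processing system, implementing quality-control measures, or validating your OCR implementation, it gives immediate visual feedback on what the OCR engine detected.

Quickstart: Highlight Text in Your PDF

This snippet loads a PDF with IronOCR, highlights every word in the document, and saves the result as images. A single line of code is enough to get visual feedback on your OCR results.

Install IronOCR with the NuGet Package Manager: https://www.nuget.org/packages/IronOcr
PM > Install-Package IronOcr

Copy and run this code snippet.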

new IronOcr.OcrInput().LoadPdf("document.pdf").HighlightTextAndSaveAsImages(new IronOcr.IronTesseract(), "highlight_page_", IronOcr.ResultHighlightType.Word);

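Minimal workflow

Download the C# library for highlighting text as images
Instantiate the OCR engine
Load the PDF document with LoadPdf
Use HighlightTextAndSaveAsImages to highlight sections of text and save them as images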

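How do I highlight text and save it as images?

Highlighting text and saving it as images is straightforward with IronOCR. Load an existing PDF with LoadPdf, then call HighlightTextAndSaveAsImages to highlight sections of text and save them as images. The technique lets you verify OCR accuracy and troubleshoot text-recognition problems in your documents.

The method takes three parameters: the IronTesseract OCR engine, a prefix for the output file names, and a ResultHighlightType enum value that determines which kind of text element is highlighted. This example uses ResultHighlightType.Paragraph to highlight blocks of text as paragraphs.

Note that the method uses the output string as a prefix and appends a page identifier (for example "page_0", "page_1") to the output image file name for each page.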
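
The naming rule can be checked after a run by listing whatever files share the prefix. The sketch below uses only System.IO and makes no assumption about the image extension or the exact page identifier. `HighlightOutput.Find`, the current directory, and the `highlight_page_` prefix are illustrative choices, not part of IronOCR.

```csharp
using System;
using System.IO;
using System.Linq;

// Illustrative usage: list files written with the "highlight_page_" prefix
// in the current directory (directory and prefix are assumptions).
foreach (string file in HighlightOutput.Find(Directory.GetCurrentDirectory(), "highlight_page_"))
{
    Console.WriteLine(Path.GetFileName(file));
}

// Hypothetical helper, not part of IronOCR.
static class HighlightOutput
{
    // Returns every file in `directory` whose name starts with `prefix`.
    // The sort is ordinal by path, so "page_10" sorts before "page_2".
    public static string[] Find(string directory, string prefix)
    {
        return Directory.GetFiles(directory, prefix + "*")
                        .OrderBy(path => path, StringComparer.Ordinal)
                        .ToArray();
    }
}
```

Matching on the prefix alone keeps the check valid even if the extension or page numbering differs from what you expect.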

What does the input PDF look like?

This example uses a PDF document containing three paragraphs.

How do I implement the highlighting code?

The following example shows a basic implementation using the OcrInput class.

using IronOcr;

IronTesseract ocrTesseract = new IronTesseract();

using var ocrInput = new OcrInput();
ocrInput.LoadPdf("document.pdf");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_page_", ResultHighlightType.Paragraph);

Imports IronOcr

' Create the OCR engine
Dim ocrTesseract As New IronTesseract()

' Load the PDF, then highlight each paragraph and save one image per page
Using ocrInput As New OcrInput()
    ocrInput.LoadPdf("document.pdf")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_page_", ResultHighlightType.Paragraph)
End Using

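What does the output image show?

In the output image, all three paragraphs are outlined with light red boxes. This visualization helps developers quickly see how the OCR engine segments the document into readable blocks.

What are the different ResultHighlightType options?

The example above used ResultHighlightType.Paragraph to highlight blocks of text. IronOCR provides additional highlighting options through this enum. The complete list of available types follows, each serving a different diagnostic purpose.

Character: Draws a bounding box around every character the OCR engine detects. Useful for debugging character recognition or specialized fonts, especially when working with custom language files.

Word: Highlights each complete word the engine identifies. Ideal for verifying word boundaries and correct word identification, particularly when text recognition is combined with barcode and QR code reading.

Line: Highlights every detected line of text. Useful for documents with complex layouts that require line-identification verification, such as scanned documents.

Paragraph: Highlights entire blocks of text grouped as paragraphs. Well suited to understanding document layout and verifying text-block segmentation, and particularly useful when working with table extraction.

How do I compare different highlight types?

This complete example generates highlights for every type on the same document so you can compare the results: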

using IronOcr;
using System;

// Initialize the OCR engine with custom configuration
IronTesseract ocrTesseract = new IronTesseract();

// Configure for better accuracy if needed
ocrTesseract.Configuration.ReadBarCodes = false; // Disable if not needed for performance
ocrTesseract.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd;

// Load the PDF document
using var ocrInput = new OcrInput();
ocrInput.LoadPdf("document.pdf");

// Generate highlights for each type
Console.WriteLine("Generating character-level highlights...");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_character_", ResultHighlightType.Character);

Console.WriteLine("Generating word-level highlights...");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_word_", ResultHighlightType.Word);

Console.WriteLine("Generating line-level highlights...");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_line_", ResultHighlightType.Line);

Console.WriteLine("Generating paragraph-level highlights...");
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_paragraph_", ResultHighlightType.Paragraph);

Console.WriteLine("All highlight images have been generated successfully!");

Imports IronOcr
Imports System

' Initialize the OCR engine with custom configuration
Dim ocrTesseract As New IronTesseract()

' Configure for better accuracy if needed
ocrTesseract.Configuration.ReadBarCodes = False ' Disable if not needed for performance
ocrTesseract.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd

' Load the PDF document
Using ocrInput As New OcrInput()
    ocrInput.LoadPdf("document.pdf")

    ' Generate highlights for each type
    Console.WriteLine("Generating character-level highlights...")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_character_", ResultHighlightType.Character)

    Console.WriteLine("Generating word-level highlights...")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_word_", ResultHighlightType.Word)

    Console.WriteLine("Generating line-level highlights...")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_line_", ResultHighlightType.Line)

    Console.WriteLine("Generating paragraph-level highlights...")
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_paragraph_", ResultHighlightType.Paragraph)
End Using

Console.WriteLine("All highlight images have been generated successfully!")

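How do I handle multi-page documents?

When processing multi-page PDFs or multi-frame TIFF files, the highlighting feature automatically handles each page separately. This is particularly useful in PDF OCR text-extraction workflows: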

using IronOcr;
using System;
using System.IO;

IronTesseract ocrTesseract = new IronTesseract();

// Load a multi-page document
using var ocrInput = new OcrInput();
ocrInput.LoadPdf("multi-page-document.pdf");

// Create output directory if it doesn't exist
string outputDir = "highlighted_pages";
Directory.CreateDirectory(outputDir);

// Generate highlights for each page
// Files will be named: highlighted_pages/page_0.png, page_1.png, etc.
ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, 
    Path.Combine(outputDir, "page_"), 
    ResultHighlightType.Word);

// Count generated files for verification
int pageCount = Directory.GetFiles(outputDir, "page_*.png").Length;
Console.WriteLine($"Generated {pageCount} highlighted page images");

Imports IronOcr
Imports System.IO

Dim ocrTesseract As New IronTesseract()

' Load a multi-page document
Using ocrInput As New OcrInput()
    ocrInput.LoadPdf("multi-page-document.pdf")

    ' Create output directory if it doesn't exist
    Dim outputDir As String = "highlighted_pages"
    Directory.CreateDirectory(outputDir)

    ' Generate highlights for each page
    ' Files will be named: highlighted_pages/page_0.png, page_1.png, etc.
    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, 
                                          Path.Combine(outputDir, "page_"), 
                                          ResultHighlightType.Word)

    ' Count generated files for verification
    Dim pageCount As Integer = Directory.GetFiles(outputDir, "page_*.png").Length
    Console.WriteLine($"Generated {pageCount} highlighted page images")
End Using

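What are the performance best practices?

Keep the following best practices in mind when using the highlighting feature:

File size: Highlighted images can be large, especially for high-resolution documents. Consider the space available in your output directory when processing large batches. For optimization tips, see our fast OCR configuration guide.
Performance: Generating highlights adds processing overhead. In production systems that only need highlighting occasionally, implement it as a separate diagnostic process rather than as part of the main workflow. Consider multithreaded OCR for batch processing.
Error handling: Always implement proper error handling around file operations: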

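using IronOcr;
using System;

// The OCR engine used by the highlight call below
var ocrTesseract = new IronTesseract();

try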
{
    using var ocrInput = new OcrInput();
    ocrInput.LoadPdf("document.pdf");

    // Apply image filters if needed for better recognition
    ocrInput.Deskew(); // Correct slight rotations
    ocrInput.DeNoise(); // Remove background noise

    ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_", ResultHighlightType.Word);
}
catch (Exception ex)
{
    Console.WriteLine($"Error during highlighting: {ex.Message}");
    // Log error details for debugging
}

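Imports IronOcr
Imports System

' The OCR engine used by the highlight call below
Dim ocrTesseract As New IronTesseract()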

Try
    Using ocrInput As New OcrInput()
        ocrInput.LoadPdf("document.pdf")

        ' Apply image filters if needed for better recognition
        ocrInput.Deskew() ' Correct slight rotations
        ocrInput.DeNoise() ' Remove background noise

        ocrInput.HighlightTextAndSaveAsImages(ocrTesseract, "highlight_", ResultHighlightType.Word)
    End Using
Catch ex As Exception
    Console.WriteLine($"Error during highlighting: {ex.Message}")
    ' Log error details for debugging
End Try
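
For the file-size point above, one rough way to see how much disk space a batch of highlight images takes is to sum the sizes of the files that share the output prefix. This is a minimal sketch using only System.IO. `DiskUsage.TotalBytes`, the current directory, and the `highlight_` prefix are illustrative assumptions and not part of IronOCR.

```csharp
using System;
using System.IO;
using System.Linq;

// Illustrative usage: report the space used by files starting with "highlight_"
// in the current directory (both values are assumptions).
long bytes = DiskUsage.TotalBytes(Directory.GetCurrentDirectory(), "highlight_");
Console.WriteLine($"Highlight images use {bytes / (1024.0 * 1024.0):F2} MiB");

// Hypothetical helper, not part of IronOCR.
static class DiskUsage
{
    // Sums the sizes, in bytes, of all files in `directory` whose names start with `prefix`.
    public static long TotalBytes(string directory, string prefix)
    {
        return new DirectoryInfo(directory)
            .EnumerateFiles(prefix + "*")
            .Sum(file => file.Length);
    }
}
```

Running a check like this after a diagnostic pass makes it easier to decide whether the images should be cleaned up or archived.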

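How does highlighting integrate with OCR results?

The highlighting feature integrates with IronOCR's result objects, so you can correlate the visual highlights with the extracted text data. This is particularly useful when you need to track OCR progress or verify specific sections of recognized text. The OcrResult class provides detailed information about every detected element, which corresponds directly to the visual highlights this method produces.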
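
One way to use that correspondence is to list the elements worth inspecting in the highlighted images. The sketch below is plain C# and does not call IronOCR. It assumes you have already copied each detected word's text, bounding box, and confidence out of the OCR result into a hypothetical `WordBox` record, and that confidence is stored as a percentage. The sample values and the 60 percent threshold are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data standing in for values read from an OCR result (made up for illustration).
var words = new List<WordBox>
{
    new WordBox("Invoice", 40, 32, 120, 28, 97.5),
    new WordBox("T0tal", 40, 410, 82, 26, 41.0),
    new WordBox("2026", 300, 410, 64, 26, 88.2),
};

// Report the boxes to look for in the highlighted page image.
foreach (WordBox word in ReviewList.LowConfidence(words, 60.0))
{
    Console.WriteLine($"Check \"{word.Text}\" at ({word.X}, {word.Y}), size {word.Width}x{word.Height}");
}

// Hypothetical container for one detected word; not an IronOCR type.
record WordBox(string Text, int X, int Y, int Width, int Height, double Confidence);

static class ReviewList
{
    // Returns the words whose confidence is below `threshold`, lowest confidence first.
    public static IEnumerable<WordBox> LowConfidence(IEnumerable<WordBox> words, double threshold)
    {
        return words.Where(w => w.Confidence < threshold)
                    .OrderBy(w => w.Confidence);
    }
}
```

Pairing a short review list like this with the word-level highlight image narrows the visual check to the boxes most likely to be wrong.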

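What should I do if I run into problems?

If you run into problems with the highlighting feature, consult the general troubleshooting guide for common solutions. For issues specific to highlighting:

Blank output images: Make sure the input document contains readable text and that the OCR engine is configured correctly for your document type. You may need to apply image optimization filters or fix the image orientation to improve recognition.
Missing highlights: Some document types require specific preprocessing. Try applying image filters or fixing the image orientation to improve recognition.
Performance issues: For large documents, consider multithreading to improve processing speed. If you are working with poor-quality source files, see our guide on fixing low-quality scans.

How can I use this for production debugging?

The highlighting feature is an excellent tool for debugging in production. Combined with abort tokens for long-running operations and timeouts, it lets you build a robust diagnostic system. Consider implementing a debug mode in your application: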

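using IronOcr;
using System.IO;

// Wraps OCR processing and optionally writes highlight images for debugging
public class OcrDebugger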
{
    private readonly IronTesseract _tesseract;
    private readonly bool _debugMode;

    public OcrDebugger(bool enableDebugMode = false)
    {
        _tesseract = new IronTesseract();
        _debugMode = enableDebugMode;
    }

    public OcrResult ProcessDocument(string filePath)
    {
        using var input = new OcrInput();
        input.LoadPdf(filePath);

        // Apply preprocessing
        input.Deskew();
        input.DeNoise();

        // Generate debug highlights if in debug mode
        if (_debugMode)
        {
            string debugPath = $"debug_{Path.GetFileNameWithoutExtension(filePath)}_";
            input.HighlightTextAndSaveAsImages(_tesseract, debugPath, ResultHighlightType.Word);
        }

        // Perform actual OCR
        return _tesseract.Read(input);
    }
}

Imports IronOcr
Imports System.IO

Public Class OcrDebugger
    Private ReadOnly _tesseract As IronTesseract
    Private ReadOnly _debugMode As Boolean

    Public Sub New(Optional enableDebugMode As Boolean = False)
        _tesseract = New IronTesseract()
        _debugMode = enableDebugMode
    End Sub

    Public Function ProcessDocument(filePath As String) As OcrResult
        Using input As New OcrInput()
            input.LoadPdf(filePath)

            ' Apply preprocessing
            input.Deskew()
            input.DeNoise()

            ' Generate debug highlights if in debug mode
            If _debugMode Then
                Dim debugPath As String = $"debug_{Path.GetFileNameWithoutExtension(filePath)}_"
                input.HighlightTextAndSaveAsImages(_tesseract, debugPath, ResultHighlightType.Word)
            End If

            ' Perform actual OCR
            Return _tesseract.Read(input)
        End Using
    End Function
End Class

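Where should I go next?

Now that you know how to use the highlighting feature, explore further:

Create searchable PDFs from your OCR results
Read specific document types such as passports or licenses
Set up IronOCR in your development environment with our getting-started guide
Implement support for 125 international languages in global applications
Use the Filter Wizard to optimize image processing

For production use, remember to obtain a license to remove watermarks and access the full feature set.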

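Frequently Asked Questions

How can I visualize OCR results in my C# application?

IronOCR provides the HighlightTextAndSaveAsImages method, which visualizes OCR results by drawing bounding boxes around detected text elements (characters, words, lines, or paragraphs) and saving them as diagnostic images. This helps developers verify OCR accuracy and troubleshoot recognition problems.

What is the simplest way to highlight text in a PDF document?

With IronOCR you can highlight text in a PDF with a single line of code: new IronOcr.OcrInput().LoadPdf("document.pdf").HighlightTextAndSaveAsImages(new IronOcr.IronTesseract(), "highlight_page_", IronOcr.ResultHighlightType.Word). This loads the PDF and produces images with the text highlighted.

What parameters does the HighlightTextAndSaveAsImages method require?

The HighlightTextAndSaveAsImages method in IronOCR requires three parameters: an IronTesseract OCR engine instance, an output file name prefix string, and a ResultHighlightType enum value specifying which text elements to highlight (characters, words, lines, or paragraphs).

How are the output images named when using the text highlighting feature?

IronOCR names the output images automatically by combining the prefix you specify with a page identifier. For example, with the prefix "highlight_page_", the method produces files named "highlight_page_0", "highlight_page_1", and so on, one for each page in the document.

Why is visual highlighting important for OCR development?

IronOCR's visual highlighting provides key diagnostic feedback by showing exactly what text the OCR engine detected and where errors may have occurred. This visual map helps developers debug recognition problems, verify OCR accuracy, and troubleshoot complex documents.

Can I highlight text elements other than words?

Yes. IronOCR's ResultHighlightType enum lets you highlight several kinds of text elements: individual characters, words, lines, or entire paragraphs. Specify the type you want when calling HighlightTextAndSaveAsImages to visualize text detection at that level.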

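Can IronOCR be integrated into existing applications?

IronOCR is designed to integrate easily into existing C# applications, letting developers add OCR capabilities to their software with minimal effort.

What are the benefits of using IronOCR for document management?

Using IronOCR for document management converts scanned documents into searchable, editable text, which streamlines workflows, reduces manual data entry, and improves document accessibility.

How does IronOCR improve data accuracy?

IronOCR improves data accuracy through its recognition algorithms and image-correction features, which help keep text extraction reliable and precise.

Does IronOCR offer a free trial?

Yes. Iron Software offers a free trial of IronOCR so users can test its features and performance before deciding to purchase.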

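Curtis Chau

Technical Writer

Curtis Chau holds a bachelor's degree in computer science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. He is passionate about building intuitive, visually appealing user interfaces, enjoys working with modern frameworks, and creates well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores innovative ways to integrate hardware and software. In his spare time he enjoys gaming and building Discord bots, combining his love of technology with creativity.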
