How to Use Iron Tesseract in C#

Updated: January 10, 2026

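To use Iron Tesseract in C#, create an IronTesseract instance, set its language and OCR options, and then call Read() on an OcrInput object that contains your images or PDF files. The library uses an optimized Tesseract 5 engine to convert images of text into searchable PDF files.

IronOCR provides an intuitive API for working with Iron Tesseract, a customized and optimized build of Tesseract 5. With IronOCR and IronTesseract you can convert images of text and scanned documents into plain text and searchable PDF files. The library supports 125 international languages and includes advanced features such as barcode reading and computer vision.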

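Quickstart: Configure IronTesseract in C#

This example shows how to configure IronTesseract with specific settings and run OCR in a single line of code.

Install IronOCR with the NuGet Package Manager: https://www.nuget.org/packages/IronOcr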
PM > Install-Package IronOcr

Copy and run this code snippet.

var result = new IronOcr.IronTesseract { Language = IronOcr.OcrLanguage.English, Configuration = new IronOcr.TesseractConfiguration { ReadBarCodes = false, RenderSearchablePdf = true, WhiteListCharacters = "ABCabc123" } }.Read(new IronOcr.OcrInput("image.png"));

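Basic OCR Workflow

Install the OCR library with NuGet to read images
Run OCR with the customized `Tesseract 5` engine
Load the document to process, such as an image or a PDF file
Output the extracted text to the console or to a file
Save the result as a searchable PDF file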
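
The workflow above can be sketched end to end as follows. This is a minimal sketch rather than a definitive implementation: the file names are placeholders, and `SaveAsSearchablePdf` is not shown elsewhere in this article, so treat that call as an assumption to check against the IronOCR API reference.

```csharp
using System;
using IronOcr;

// Step 2: create the OCR engine (Iron Tesseract, built on Tesseract 5).
var ocr = new IronTesseract { Language = OcrLanguage.English };

// Keep the data needed to produce a searchable PDF later.
ocr.Configuration.RenderSearchablePdf = true;

// Step 3: load the document to process ("scan.png" is a placeholder path).
using var input = new OcrInput();
input.LoadImage("scan.png");

// Step 4: run OCR and print the extracted text.
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);

// Step 5: save the result as a searchable PDF.
// Assumption: OcrResult exposes SaveAsSearchablePdf(string path).
result.SaveAsSearchablePdf("scan-searchable.pdf");
```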

How Do I Create an IronTesseract Instance?

Use the following code to initialize the Tesseract object:

using IronOcr;

IronTesseract ocr = new IronTesseract();

Imports IronOcr

Dim ocr As New IronTesseract()

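You can customize how IronTesseract behaves by selecting a different language, enabling barcode reading, and whitelisting or blacklisting characters. IronOCR provides a comprehensive set of configuration options for fine-tuning the OCR process: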

IronTesseract ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        ReadBarCodes = false,
        RenderHocr = true,
        TesseractVariables = null,
        WhiteListCharacters = null,
        BlackListCharacters = "`ë|^",
    },
    MultiThreaded = false,
    Language = OcrLanguage.English,
    EnableTesseractConsoleMessages = true, // False as default
};

Dim ocr As New IronTesseract With {
	.Configuration = New TesseractConfiguration With {
		.ReadBarCodes = False,
		.RenderHocr = True,
		.TesseractVariables = Nothing,
		.WhiteListCharacters = Nothing,
		.BlackListCharacters = "`ë|^"
	},
	.MultiThreaded = False,
	.Language = OcrLanguage.English,
	.EnableTesseractConsoleMessages = True
}

Once configured, the Tesseract engine can read OcrInput objects. The OcrInput class provides flexible methods for loading a variety of input formats:

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.LoadImage("attachment.png");
OcrResult result = ocr.Read(input);
string text = result.Text;

Dim ocr As New IronTesseract()

Using input As New OcrInput()
	input.LoadImage("attachment.png")
	Dim result As OcrResult = ocr.Read(input)
	Dim text As String = result.Text
End Using

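For more complex scenarios, you can use the multithreading feature to process multiple documents at the same time, which can substantially improve the performance of batch jobs.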
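
As a rough illustration of batch processing, the sketch below enables the `MultiThreaded` property from the configuration example above and loads several pages into one `OcrInput`. The file names are placeholders, and the sketch assumes that each `LoadImage` call adds another page to the same input; whether the pages are actually processed in parallel depends on the library and the host machine.

```csharp
using System;
using IronOcr;

// MultiThreaded appears in the configuration example above;
// here it is switched on for a multi-page batch.
var ocr = new IronTesseract
{
    Language = OcrLanguage.English,
    MultiThreaded = true
};

using var input = new OcrInput();

// Placeholder file names. Assumption: each LoadImage call appends a page.
foreach (string path in new[] { "page1.png", "page2.png", "page3.png" })
{
    input.LoadImage(path);
}

// A single Read call processes every loaded page.
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```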

What Are Tesseract's Advanced Configuration Variables?

The IronOcr Tesseract interface gives you full control over Tesseract's configuration variables through the IronOcr.TesseractConfiguration class. These advanced settings let you tune OCR performance for specific use cases, such as correcting poorly scanned documents or reading particular document types.

How Do I Use Tesseract Configuration in Code?

using IronOcr;
using System;

IronTesseract Ocr = new IronTesseract();

Ocr.Language = OcrLanguage.English;
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd;

// Configure Tesseract Engine
Ocr.Configuration.TesseractVariables["tessedit_parallelize"] = false;

using var input = new OcrInput();
input.LoadImage("/path/file.png");

OcrResult Result = Ocr.Read(input);
Console.WriteLine(Result.Text);

Imports IronOcr
Imports System

Dim Ocr As New IronTesseract()

Ocr.Language = OcrLanguage.English
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd

' Configure Tesseract Engine
Ocr.Configuration.TesseractVariables("tessedit_parallelize") = False

Dim input = New OcrInput()
input.LoadImage("/path/file.png")

Dim Result As OcrResult = Ocr.Read(input)
Console.WriteLine(Result.Text)

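IronOCR also provides specialized configurations for different document types. For example, when reading passports or processing MICR cheques, you can apply specific preprocessing filters and region detection to improve accuracy.

Example configuration for financial documents:

using IronOcr;
using System.Collections.Generic;

// Example: Configure for financial documents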
IronTesseract ocr = new IronTesseract
{
    Language = OcrLanguage.English,
    Configuration = new TesseractConfiguration
    {
        PageSegmentationMode = TesseractPageSegmentationMode.SingleBlock,
        TesseractVariables = new Dictionary<string, object>
        {
            ["tessedit_char_whitelist"] = "0123456789.$,",
            ["textord_heavy_nr"] = false,
            ["edges_max_children_per_outline"] = 10
        }
    }
};

// Apply preprocessing filters for better accuracy
using OcrInput input = new OcrInput();
input.LoadPdf("financial-document.pdf");
input.Deskew();
input.EnhanceResolution(300);

OcrResult result = ocr.Read(input);

Imports IronOcr

' Example: Configure for financial documents
Dim ocr As New IronTesseract With {
    .Language = OcrLanguage.English,
    .Configuration = New TesseractConfiguration With {
        .PageSegmentationMode = TesseractPageSegmentationMode.SingleBlock,
        .TesseractVariables = New Dictionary(Of String, Object) From {
            {"tessedit_char_whitelist", "0123456789.$,"},
            {"textord_heavy_nr", False},
            {"edges_max_children_per_outline", 10}
        }
    }
}

' Apply preprocessing filters for better accuracy
Using input As New OcrInput()
    input.LoadPdf("financial-document.pdf")
    input.Deskew()
    input.EnhanceResolution(300)

    Dim result As OcrResult = ocr.Read(input)
End Using

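What Is the Complete List of Tesseract Configuration Variables?

These can be set through IronTesseract.Configuration.TesseractVariables["key"] = value;. Configuration variables let you fine-tune OCR behavior to get the best results for your specific documents. For detailed guidance on optimizing OCR performance, see our fast OCR configuration guide.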
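
A minimal sketch of applying several variables at once is shown below. The class and method names (`TesseractVariableOverrides`, `Build`, `CreateEngine`) are hypothetical helpers introduced for illustration only; the variable names are taken from the list in this section, and the value types (bool, int, string) follow the earlier examples in this article.

```csharp
using System;
using System.Collections.Generic;
using IronOcr;

public static class TesseractVariableOverrides
{
    // Hypothetical helper: the overrides to apply, keyed by variable name.
    public static Dictionary<string, object> Build() => new Dictionary<string, object>
    {
        ["preserve_interword_spaces"] = true, // keep runs of spaces between words
        ["user_defined_dpi"] = 300,           // state the input DPI explicitly
        ["tessedit_char_blacklist"] = "|~",   // characters never to recognize
    };

    // Copies each override into the engine through the documented indexer.
    public static IronTesseract CreateEngine()
    {
        var ocr = new IronTesseract { Language = OcrLanguage.English };
        foreach (KeyValuePair<string, object> pair in Build())
        {
            ocr.Configuration.TesseractVariables[pair.Key] = pair.Value;
        }
        return ocr;
    }
}
```

Calling `TesseractVariableOverrides.CreateEngine()` returns an engine that can be passed an `OcrInput` in the usual way. Keeping the overrides in one dictionary makes it easier to review which engine defaults a project has changed.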

0Use old baseline algorithm 0All doc is proportial text 0Debug on fixed pitch test 0Turn off dp fixed pitch algorithm 0Do even faster pitch algorithm 0Write full metric stuff 0Draw row-level cuts 0Draw page-level cuts 0Use correct answer for fixed/prop 0Attempt whole doc/block fixed pitch 0Display separate words 0Display separate words 0Display forced fixed pitch words 0Moan about prop blocks 0Moan about fixed pitch blocks 0Dump stats when moaning 0Do current test 0.08Fraction of xheight for sameness 0.5Max initial cluster size 0.15Min initial cluster spacing 0.25Fraction of xheight 0.75Fraction of xheight 0.6Allowed size variance 0.3Non-fuzzy spacing region 2.8Min ratio space/nonspace 2Min ratio space/nonspace 1.5Pitch IQR/Gap IQR threshold 0.2Xh fraction noise in pitch 0.5Min width of decent blobs 0.1Fraction of x to ignore 0Debug level for unichar ambiguities 0Classify debug level 1Normalization Method ... 0Matcher Debug Level 0Matcher Debug Flags 0Learning Debug Level: 1Min # of permanent classes 3Reliable Config Threshold 5Enable adaption even if the ambiguities have not been seen 230Threshold for good protos during adaptive 0-255 230Threshold for good features during adaptive 0-255 229Class Pruner Threshold 0-255 15Class Pruner Multiplier 0-255: 7Class Pruner CutoffStrength: 10Integer Matcher Multiplier 0-255: 0Set to 1 for general debug info, to 2 for more details, to 3 to see all the debug messages 0Debug level for hyphenated words. 2Size of dict word to be treated as non-dict word 0Stopper debug level 10Max words to keep in list 10000Maximum number of different character choices to consider during permutation. This limit is especially useful when user patterns are specified, since overly generic patterns can result in dawg search exploring an overly large number of options. 
1Fix blobs that aren't chopped 0Chop debug 10000Split Length 2Same distance 6Min Number of Points on Outline 150Max number of seams in seam_pile -50Min Inside Angle Bend 2000Min Outline Area 90Width of (smaller) chopped blobs above which we don't care that a chop is not near the center. 3X / Y length weight 0Debug level for wordrec 4Max number of broken pieces to associate 0SegSearch debug level 2000Maximum number of pain points stored in the queue 0Language model debug level 8Maximum order of the character ngram model 10Maximum number of prunable (those for which PrunablePath() is true) entries in each viterbi list recorded in BLOB_CHOICEs 500Maximum size of viterbi lists recorded in BLOB_CHOICEs 3Minimum length of compound words 0Display Segmentations 6Page seg mode: 0=osd only, 1=auto+osd, 2=auto_only, 3=auto, 4=column, 5=block_vert, 6=block, 7=line, 8=word, 9=word_circle, 10=char,11=sparse_text, 12=sparse_text+osd, 13=raw_line (Values from PageSegMode enum in tesseract/publictypes.h) 2Which OCR engine(s) to run (Tesseract, LSTM, both). Defaults to loading and running the most accurate available. 0Whether to use the top-line splitting process for Devanagari documents while performing page-segmentation. 0Whether to use the top-line splitting process for Devanagari documents while performing ocr. 0Debug level for BiDi 1Debug level 0Page number to apply boxes from 0Amount of debug output for bigram correction. 0Debug reassignment of small outlines 8Max diacritics to apply to a blob 16Max diacritics to apply to a word 0Reestimate debug 2alphas in a good word 39Adaptation decision algorithm for tess 0Print multilang debug info. 0Print paragraph debug info. 2Only preserve wds longer than this 10For adj length in rating per ch 1How many potential indicators needed 4Don't crunch words with long lower case strings 4Don't crunch words with long lower case strings 3Crunch words with long repetitions 1How many non-noise blbs either side? 
1What constitues done for spacing 0Contextual fixspace debug 8Max allowed deviation of blob top outside of font data 8Min change in xht before actually trying it 0Debug level for sub & superscript fixer 85Set JPEG quality level 0Specify DPI for input image 50Specify minimum characters to try during OSD 0Rejection algorithm 2Rej blbs near image edge limit 8Reject any x-ht lt or eq than this -1-1 -> All pages, else specific page to process 1Run in parallel where possible 2Allows to include alternative symbols choices in the hOCR output. Valid input values are 0, 1 and 2. 0 is the default value. With 1 the alternative symbol choices per timestep are included. With 2 alternative symbol choices are extracted from the CTC process instead of the lattice. The choices are mapped per character. 5Sets the number of cascading iterations for the Beamsearch in lstm_choice_mode. Note that lstm_choice_mode must be set to a value greater than 0 to produce results. 0Debug data 3or should we use mean 10No.samples reqd to reestimate for row 40No.gaps reqd with 1 large gap to treat as a table 20No.gaps reqd with few cert spaces to use certs 1How to avoid being silly 7Pixel size of noise 0Baseline debug level 10Fraction of size for maxima 16Transitions for normal blob 1super norm blobs to save row 0Use ambigs for deciding whether to adapt to a character 0Prioritize blob division over chopping 1Enable adaptive classifier 0Character Normalized Matching 0Baseline Normalized Matching 1Enable adaptive classifier 0Use pre-adapted classifier templates 0Save adapted templates to a file 0Enable match debugger 0Non-linear stroke-density normalization 0Bring up graphical debugging windows for fragments training 0Use two different windows for debugging the matching: One for the protos and one for the features. 0Assume the input is numbers [0-9]. 1Load system word dawg. 1Load frequent word dawg. 1Load unambiguous word dawg. 1Load dawg with punctuation patterns. 1Load dawg with number patterns. 
1Load dawg with special word bigrams. 0Use only the first UTF8 step of the given string when computing log probabilities. 0Make AcceptableChoice() always return false. Useful when there is a need to explore all segmentations 0Don't use any alphabetic-specific tricks. Set to true in the traineddata config file for scripts that are cursive or inherently fixed-pitch 0Save Document Words 1Merge the fragments in the ratings matrix and delete them after merging 1Associator Enable 0force associator to run regardless of what enable_assoc is. This is used for CJK where component grouping is necessary. 1Chop enable 0Vertical creep 1Use new seam_pile 0include fixed-pitch heuristics in char segmentation 0Only run OCR for words that had truth recorded in BlamerBundle 0Print blamer debug messages 0Try to set the blame for errors 1Save alternative paths found during chopping and segmentation search 1Words are delimited by space 0Use sigmoidal score for certainty 0Take segmentation and labeling from box file 0Conversion of word/line box file to char box file 0Generate training data from boxed chars 0Generate more boxes from boxed chars 0Break input into lines and remap boxes if present 0Dump intermediate images made during page segmentation 1Try inverting the image in LSTMRecognizeWord 0Perform training for ambiguities 0Generate and print debug information for adaption 0Learn both character fragments (as is done in the special low exposure mode) as well as unfragmented characters. 0Each bounding box is assumed to contain ngrams. Only learn the ngrams whose outlines overlap horizontally. 0Draw output words 0Dump char choices 0Print timing stats 1Try to improve fuzzy spaces 0Don't bother with word plausibility 1Crunch double hyphens? 1Add words to the document dictionary 0Output font info per char 0Block and Row stats 1Enable correction based on the word bigram dictionary. 0Enable single word correction based on the dictionary. 
1Remove and conditionally reassign small outlines when they confuse layout analysis, determining diacritics vs noise 0Do minimal rejection on pass 1 output 0Test adaption criteria 0Test for point 1Run paragraph detection on the post-text-recognition (more accurate) 1Use ratings matrix/beam search with lstm 1Reduce rejection on good docs 1Reject spaces? 0Add font info to hocr output 0Add coordinates for each character to hocr output 1Before word crunch? 0Take out ~^ early? 1As it says 1Don't touch sensible strings 0Use dictword test 1Individual rejection control 1Individual rejection control 1Individual rejection control 0Extend permuter check 0Extend permuter check 0Output text with boxes 0Capture the image from the IPE 0Run interactively? 1According to dict_word 0In multilingual mode use params model of the primary language 0Debug line finding 0Use CJK fixed pitch model 0Allow feature extractors to see the original outline 0Only initialize with the config file. Useful if the instance is not going to be used for OCR but say only for layout analysis. 0Turn on equation detector 1Enable vertical detection 0Force using vertical text page mode 0Preserve multiple interword spaces 1Detect music staff and remove intersecting components 0Script has no xheight, so use a single mode 0Space stats use prechopping? 0Constrain relative values of inter and intra-word gaps for old_to_method. 1Block stats to use fixed pitch rows? 0Force word breaks on punct to break long lines in non-space delimited langs 0Space stats use prechopping? 0Fix suspected bug in old code 1Only stat OBVIOUS spaces 1Only stat OBVIOUS spaces 1Only stat OBVIOUS spaces 1Only stat OBVIOUS spaces 1Use row alone when inadequate cert spaces 0Better guess 0Pass ANY flip to context? 
1Don't restrict kn->sp fuzzy limit to tables 0Don't remove noise blobs 0Display unsorted blobs 0Display unsorted blobs 1Reject noise-like words 1Reject noise-like rows 0Debug row garbage detector Class str to debug learning A filename of user-provided words. A suffix of user-provided words located in tessdata. A filename of user-provided patterns. A suffix of user-provided patterns located in tessdata. Output file for ambiguities found in the dictionary Word for which stopper debug information should be printed to stdout Blacklist of chars not to recognize Whitelist of chars to recognize List of chars to override tessedit_char_blacklist .expExposure value follows this pattern in the image filename. The name of the image files are expected to be in the form [lang].[fontname].exp [num].tif 引號).,;:?!1st Trailing punctuation 2nd Trailing punctuationPage separator (default is form feed control character) 0.2Character Normalization Range ... 1.5Veto ratio between classifier ratings 5.5Veto difference between classifier certainties 0.125Good Match (0-1) 0Great Match (0-1) 0.02Perfect Match (0-1) 0.15Bad Match Pad (0-1) 0.1New template margin (0-1) 12Avg. noise blob length 0.015Maximum angle delta for prototype clustering 0Penalty to apply when a non-alnum is vertically out of its expected textline position 1.5Rating scaling factor 20Certainty scaling factor 0.00390625Scale factor for features not used 2.5Prune poor adapted results this much worse than best result -1Threshold at which classify_adapted_pruning_factor starts -3Exclude fragments that do not look like whole characters from training and adaption 0.3Max large speckle size 10Penalty to add to worst rating for noise 0.125Score penalty (0.1 = 10%) added if there are subscripts or superscripts in a word, but it is otherwise OK. 0.25Score penalty (0.1 = 10%) added if an xheight is inconsistent. 1Score multiplier for word matches which have good case and are frequent in the given language (lower is better). 
1.1Score multiplier for word matches that have good case (lower is better). 1.3125Default score multiplier for word matches, which may have case issues (lower is better). 1.25Score multiplier for glyph fragment segmentations which do not match a dictionary word (lower is better). -2.25Worst certainty for words that can be inserted into the document dictionary -2.25Good blob limit 2最大字元寬高比

Tesseract 配置變數	Default	含義
classify_num_cp_levels	3	類別修剪層級數
textord_debug_tabfind	0	"除錯"索引標籤定位
textord_debug_bugs	0	開啟與標籤定位相關的錯誤輸出
textord_testregion_left	-1	除錯報告矩形的左邊緣
textord_testregion_top	-1	除錯報告矩形的頂部邊緣
textord_testregion_right	2147483647	除錯矩形的右邊緣
textord_testregion_bottom	2147483647	除錯矩形的底部邊緣
textord_tabfind_show_partitions	0	顯示區段邊界，若大於 1 則等待
devanagari_split_debuglevel	0	分段白線處理的除錯層級。
edges_max_children_per_outline	10	字元輪廓內的子元素最大數量
edges_max_children_layers	5	字元輪廓內的子元素最大嵌套層級
edges_children_per_grandchild	10	大綱刪除的重要性權重
edges_children_count_limit	45	Blob 中允許的最大孔洞數
edges_min_nonhole	12	方框內潛在字元的最小像素數
邊緣路徑面積比	40	Max lensq/area for acceptable child outline
textord_fp_chop_error	2	切削單元的最大允許彎曲量
textord_tabfind_show_images	0	Show image blobs
textord_skewsmooth_offset	4	為確保流暢度
textord_skewsmooth_offset2	1	為確保流暢度
textord_test_x	-2147483647	測試點座標
textord_test_y	-2147483647	測試點座標
textord_min_blobs_in_row	4	計算漸變效果前的最小 Blob 數量
textord_spline_minblobs	8	Min blobs in each spline segment
textord_spline_medianwin	6	Size of window for spline segmentation
textord_max_blob_overlaps	4	Max number of blobs a big blob can overlap
textord_min_xheight	10	Min credible pixel xheight
textord_lms_line_trials	12	Number of linew fits to do
oldbl_holed_losscount	10	Max lost before fallback line used
pitsync_linear_version	6	Use new fast algorithm
pitsync_fake_depth	1	Max advance fake generation
textord_tabfind_show_strokewidths	0	Show stroke widths
textord_dotmatrix_gap	3	Max pixel gap for broken pixed pitch
textord_debug_block	0	Block to do debug on
textord_pitch_range	2	Max range test on pitch
textord_words_veto_power	5	Rows required to outvote a veto
equationdetect_save_bi_image	0	Save input bi image
equationdetect_save_spt_image	0	Save special character image
equationdetect_save_seed_image	0	Save the seed image
equationdetect_save_merged_image	0	Save the merged image
poly_debug	0	Debug old poly
poly_wide_objects_better	1	More accurate approx on wide things
wordrec_display_splits	0	Display splits
textord_debug_printable	0	Make debug windows printable
textord_space_size_is_variable	0	If true, word delimiter spaces are assumed to have variable width, even though characters have fixed pitch.
textord_tabfind_show_initial_partitions	0	Show partition bounds
textord_tabfind_show_reject_blobs	0	Show blobs rejected as noise
textord_tabfind_show_columns	0	Show column bounds
textord_tabfind_show_blocks	0	Show final block bounds
textord_tabfind_find_tables	1	run table detection
devanagari_split_debugimage	0	Whether to create a debug image for split shiro-rekha process.
textord_show_fixed_cuts	0	Draw fixed pitch cell boundaries
edges_use_new_outline_complexity	0	Use the new outline complexity module
edges_debug	0	turn on debugging for this module
edges_children_fix	0	Remove boxy parents of char-like children
gapmap_debug	0	Say which blocks have tables
gapmap_use_ends	0	Use large space at start and end of rows
gapmap_no_isolated_quanta	0	Ensure gaps not less than 2quanta wide
textord_heavy_nr	0	Vigorously remove noise
textord_show_initial_rows	0	Display row accumulation
textord_show_parallel_rows	0	Display page correlated rows
textord_show_expanded_rows	0	Display rows after expanding
textord_show_final_rows	0	Display rows after final fitting
textord_show_final_blobs	Display blob bounds after pre-ass
textord_test_landscape	0	Tests refer to land/port
textord_parallel_baselines	1	Force parallel baselines
textord_straight_baselines	0	Force straight baselines
textord_old_baselines	1
textord_old_xheight	0	Use old xheight algorithm
textord_fix_xheight_bug	1	Use spline baseline
textord_fix_makerow_bug	1	Prevent multiple baselines
textord_debug_xheights	0	Test xheight algorithms
textord_biased_skewcalc	1	Bias skew estimates with line length
textord_interpolating_skew	1	Interpolate across gaps
textord_new_initial_xheight	1	Use test xheight mechanism
textord_debug_blob	0	Print test blob information
textord_really_old_xheight	0	Use original wiseowl xheight
textord_oldbl_debug	0	Debug old baseline generation
textord_debug_baselines	0	Debug baseline generation
textord_oldbl_paradef	1	Use para default mechanism
textord_oldbl_split_splines	1	Split stepped splines
textord_oldbl_merge_parts	1	Merge suspect partitions
oldbl_corrfix	1	Improve correlation of heights
oldbl_xhfix	0	Fix bug in modes threshold for xheights
textord_ocropus_mode	0	Make baselines for ocropus
textord_tabfind_only_strokewidths	0	Only run stroke widths
textord_tabfind_show_initialtabs	0	Show tab candidates
textord_tabfind_show_finaltabs	0	Show tab vectors
textord_show_tables	0	Show table regions
textord_tablefind_show_mark	0	Debug table marking steps in detail
textord_tablefind_show_stats	0	Show page stats used in table finding
textord_tablefind_recognize_tables	0	Enables the table recognizer for table layout and filtering.
textord_all_prop
textord_debug_pitch_test
textord_disable_pitch_test
textord_fast_pitch_test
textord_debug_pitch_metric
textord_show_row_cuts
textord_show_page_cuts
textord_pitch_cheat
textord_blockndoc_fixed
textord_show_initial_words
textord_show_new_words
textord_show_fixed_words
textord_blocksall_fixed
textord_blocksall_prop
textord_blocksall_testing
textord_test_mode
textord_pitch_rowsimilarity
words_initial_lower
words_initial_upper
words_default_prop_nonspace
words_default_fixed_space
words_default_fixed_limit
textord_words_definite_spread
textord_spacesize_ratiofp
textord_spacesize_ratioprop
textord_fpiqr_ratio
textord_max_pitch_iqr
textord_fp_min_width
textord_underline_offset
ambigs_debug_level
classify_debug_level
classify_norm_method
matcher_debug_level
matcher_debug_flags
classify_learning_debug_level
matcher_permanent_classes_min
matcher_min_examples_for_ 原型設計
matcher_sufficient_examples_ for_prototyping
classify_adapt_proto_threshold
classify_adapt_feature_threshold
classify_class_pruner_threshold
classify_class_pruner_multiplier
classify_cp_cutoff_strength
classify_integer_matcher_multiplier
dawg_debug_level
hyphen_debug_level
stopper_smallword_size
stopper_debug_level
tessedit_truncate_wordchoice_log
max_permuter_attempts
repair_unchopped_blobs
chop_debug
chop_split_length
chop_same_distance
chop_min_outline_points
chop_seam_pile_size
chop_inside_angle
chop_min_outline_area
chop_centered_maxwidth
chop_x_y_weight
wordrec_debug_level
wordrec_max_join_chunks
segsearch_debug_level
segsearch_max_pain_points
segsearch_max_futile_classifications20Maximum number of pain point classifications per chunk that did not result in finding a better word choice.
language_model_debug_level
language_model_ngram_order
language_model_viterbi_list_ max_num_prunable
language_model_viterbi_list_max_size
language_model_min_compound_length
wordrec_display_segmentations
tessedit_pageseg_mode
tessedit_ocr_engine_mode
pageseg_devanagari_split_strategy
ocr_devanagari_split_strategy
bidi_debug
applybox_debug
applybox_page
tessedit_bigram_debug
除錯雜訊移除
noise_maxperblob
noise_maxperword
debug_x_ht_level
quality_min_initial_alphas_reqd
tessedit_tess_adaption_mode
multilang_debug_level
paragraph_debug_level
tessedit_preserve_min_wd_len
crunch_rating_max
crunch_pot_indicators
crunch_leave_lc_strings
crunch_leave_uc_strings
crunch_long_repetitions
crunch_debug0As it says
fixsp_non_noise_limit
fixsp_done_mode
debug_fix_space_level
x_ht_acceptance_tolerance
x_ht_min_change
superscript_debug
jpg_quality
user_defined_dpi
min_characters_to_try
suspect_level99Suspect marker level
suspect_short_words2Don't suspect dict wds longer than this
tessedit_reject_mode
tessedit_image_border
min_sane_x_ht_pixels
tessedit_page_number
tessedit_parallelize
lstm_choice_mode
lstm_choice_iterations
tosp_debug_level
tosp_enough_space_samples_for_median
tosp_redo_kern_limit
tosp_few_samples
tosp_short_row
tosp_sanity_method
textord_max_noise_size
textord_baseline_debug
textord_noise_sizefraction
textord_noise_translimit
textord_noise_sncount
use_ambigs_for_adaption
prioritize_division
classify_enable_learning
tess_cn_matching
tess_bn_matching
classify_enable_adaptive_matcher
classify_use_pre_adapted_templates
classify_save_adapted_templates
classify_enable_adaptive_debugger
classify_nonlinear_norm
disable_character_fragments1Do not include character fragments in the results of the classifier
classify_debug_character_fragments
matcher_debug_separate_windows
classify_bln_numeric_mode
load_system_dawg
load_freq_dawg
load_unambig_dawg
load_punc_dawg
load_number_dawg
load_bigram_dawg
use_only_first_uft8_step
stopper_no_acceptable_choices
segment_nonalphabetic_script
save_doc_words
merge_fragments_in_matrix
wordrec_enable_assoc
force_word_assoc
chop_enable
chop_vertical_creep
chop_new_seam_pile
assume_fixed_pitch_char_segment
wordrec_skip_no_truth_words
wordrec_debug_blamer
wordrec_run_blamer
save_alt_choices
language_model_ngram_on0Turn on/off the use of character ngram model
language_model_ngram_use_ only_first_uft8_step0Use only the first UTF8 step of the given string when computing log probabilities.
language_model_ngram_space_ delimited_language
language_model_use_sigmoidal_certainty
tessedit_resegment_from_boxes
tessedit_resegment_from_line_boxes
tessedit_train_from_boxes
tessedit_make_boxes_from_boxes
tessedit_train_line_recognizer
tessedit_dump_pageseg_images
tessedit_do_invert
tessedit_ambigs_training
tessedit_adaption_debug
applybox_learn_chars_and_char_frags_mode
applybox_learn_ngrams_mode
tessedit_display_outwords
tessedit_dump_choices
tessedit_timing_debug
tessedit_fix_fuzzy_spaces
tessedit_unrej_any_wd
tessedit_fix_hyphens
tessedit_enable_doc_dict
tessedit_debug_fonts
tessedit_debug_block_rejection
tessedit_enable_bigram_correction
tessedit_enable_dict_correction
enable_noise_removal
tessedit_minimal_rej_pass1
tessedit_test_adaption
test_pt
段落文本
lstm_use_matrix
tessedit_good_quality_unrej
tessedit_use_reject_spaces
tessedit_preserve_blk_rej_perfect_wds1Only rej partially rejected words in block rejection
tessedit_preserve_row_rej_perfect_wds1Only rej partially rejected words in row rejection
tessedit_dont_blkrej_good_wds0Use word segmentation quality metric
tessedit_dont_rowrej_good_wds0Use word segmentation quality metric
tessedit_row_rej_good_docs1Apply row rejection to good docs
tessedit_reject_bad_qual_wds1Reject all bad quality wds
tessedit_debug_doc_rejection0Page stats
tessedit_debug_quality_metrics0Output data to debug file
bland_unrej0unrej potential with no checks
unlv_tilde_crunching0Mark v.bad words for tilde crunch
hocr_font_info
hocr_char_boxes
crunch_early_merge_tess_fails
crunch_early_convert_bad_unlv_chs
crunch_terrible_garbage
crunch_leave_ok_strings
crunch_accept_ok1Use acceptability in okstring
crunch_leave_accept_strings0Don't pot crunch sensible strings
crunch_include_numerals0Fiddle alpha figures
tessedit_prefer_joined_punct0Reward punctuation joins
tessedit_write_block_separators0Write block separators in output
tessedit_write_rep_codes0Write repetition char code
tessedit_write_unlv0Write .unlv output file
tessedit_create_txt0Write .txt output file
tessedit_create_hocr0Write .html hOCR output file
tessedit_create_alto0Write .xml ALTO file
tessedit_create_lstmbox0Write .box file for LSTM training
tessedit_create_tsv0Write .tsv output file
tessedit_create_wordstrbox0Write WordStr format .box output file
tessedit_create_pdf0Write .pdf output file
textonly_pdf0Create PDF with only one invisible text layer
suspect_constrain_1Il0UNLV keep 1Il chars rejected
tessedit_minimal_rejection0Only reject tess failures
tessedit_zero_rejection0Don't reject ANYTHING
tessedit_word_for_word0Make output have exactly one word per WERD
tessedit_zero_kelvin_rejection0Don't reject ANYTHING AT ALL
tessedit_rejection_debug0Adaption debug
tessedit_flip_0O1Contextual 0O O0 flips
rej_trust_doc_dawg0Use DOC dawg in 11l conf. detector
rej_1Il_use_dict_word
rej_1Il_trust_permuter_type1Don't double check
rej_use_tess_accepted
rej_use_tess_blanks
rej_use_good_perm
rej_use_sensible_wd
rej_alphas_in_number_perm
tessedit_create_boxfile
tessedit_write_images
interactive_display_mode
tessedit_override_permuter
tessedit_use_primary_params_model
textord_tabfind_show_vlines
textord_use_cjk_fp_model
poly_allow_detailed_fx
tessedit_init_config_only
textord_equation_detect
textord_tabfind_vertical_text
textord_tabfind_force_vertical_text
preserve_interword_spaces
pageseg_apply_music_mask
textord_single_height_mode
tosp_old_to_method
tosp_old_to_constrain_sp_kn
tosp_only_use_prop_rows
tosp_force_wordbreak_on_punct
tosp_use_pre_chopping
tosp_old_to_bug_fix
tosp_block_use_cert_spaces
tosp_row_use_cert_spaces
tosp_narrow_blobs_not_cert
tosp_row_use_cert_spaces1
tosp_recovery_isolated_row_stats
tosp_only_small_gaps_for_kern
tosp_all_flips_fuzzy
tosp_fuzzy_limit_all
textord_no_rejects
textord_show_blobs
textord_show_boxes
textord_noise_rejwords
textord_noise_rejrows
textord_noise_debug
classify_learn_debug_str
user_words_file
user_words_suffix
user_patterns_file
user_patterns_suffix
output_ambig_words_file
word_to_debug
tessedit_char_blacklist
tessedit_char_whitelist
tessedit_char_unblacklist
tessedit_write_params_to_fileWrite all parameters to the given file.
applybox_exposure_pattern
chs_leading_punct('`"
chs_trailing_punct1
chs_trailing_punct2)'`"
outlines_odd	%\|	大綱數量不符合標準
outlines_2ij!?%":;	大綱數量不符合標準
numeric_punctuation	.,	Punct. chs expected WITHIN numbers
unrecognised_char	\|	Output char for unidentified blobs
ok_repeated_ch_non_alphanum_wds	-?*=	Allow NN to unrej
conflict_set_I_l_1	Il1 []	Il1 conflict set
file_type	.tif	Filename extension
tessedit_load_sublangs	List of languages to load with this one
page_separator
classify_char_norm_range
classify_max_rating_ratio
classify_max_certainty_margin
matcher_good_threshold
matcher_reliable_adaptive_result
matcher_perfect_threshold
matcher_bad_match_pad
matcher_rating_margin
matcher_avg_noise_size
matcher_clustering_max_angle_delta
classify_misfit_junk_penalty
rating_scale
certainty_scale
tessedit_class_miss_scale
classify_adapted_pruning_factor
classify_adapted_pruning_threshold
classify_character_fragments_garbage_certainty_threshold
speckle_large_max_size
speckle_rating_penalty
xheight_penalty_subscripts
xheight_penalty_inconsistent
segment_penalty_dict_frequent_word
segment_penalty_dict_case_ok
segment_penalty_dict_case_bad
segment_penalty_dict_nonword
certainty_scale	20	Certainty scaling factor
stopper_nondict_certainty_base	-2.5	Certainty threshold for non-dict words
stopper_phase2_certainty_rejection_offset	1	Reject certainty offset
stopper_certainty_per_char	-0.5	Certainty to add for each dict char above small word size.
stopper_allowable_character_badness	3	Max certainty variation allowed in a word (in sigma)
doc_dict_pending_threshold	0	Worst certainty for using pending dictionary
doc_dict_certainty_threshold
tessedit_certainty_threshold
chop_split_dist_knob	0.5	Split length adjustment
chop_overlap_knob	0.9	Split overlap adjustment
chop_center_knob	0.15	Split center adjustment
chop_sharpness_knob	0.06	Split sharpness adjustment
chop_width_change_knob	5	Width change adjustment
chop_ok_split	100	OK split limit
chop_good_split	50	Good split limit
segsearch_max_char_wh_ratio

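For best results, apply IronOCR's image preprocessing filters before running OCR. These filters can significantly improve accuracy, especially on low-quality scans or complex documents such as tables.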
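As a rough illustration of that advice, the sketch below applies two filters before reading. It assumes that `OcrInput` exposes `Deskew()` and `DeNoise()` filter methods, as recent IronOCR releases appear to; `scan.png` is a hypothetical file name.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// "scan.png" is a placeholder path; substitute your own image.
using var input = new OcrInput("scan.png");

// Assumed filter methods: straighten a rotated scan, then reduce digital noise.
input.Deskew();
input.DeNoise();

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

Which filters help depends on the input, so it is worth comparing results with and without each filter on a sample of your own documents.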

Frequently Asked Questions

How do I configure IronTesseract for OCR in C#?

To configure IronTesseract, create an IronTesseract instance and set properties such as Language and Configuration. You can choose the OCR language (from the 125 supported languages), enable barcode reading, turn on searchable PDF output, and set a character whitelist. For example: var tesseract = new IronOcr.IronTesseract { Language = IronOcr.OcrLanguage.English, Configuration = new IronOcr.TesseractConfiguration { ReadBarCodes = false, RenderSearchablePdf = true } };

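Which input formats does IronTesseract support?

IronTesseract accepts several input formats through the OcrInput class. You can process images (PNG, JPG, and others), PDF files, and scanned documents. OcrInput provides methods for loading each of these formats, so most documents that contain text can be passed to OCR.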

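Can I read barcodes and text at the same time with IronTesseract?

Yes. IronTesseract includes barcode reading. Enable barcode detection by setting ReadBarCodes to true in TesseractConfiguration. A single OCR pass then extracts both the text and the barcode data from the same document.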
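A minimal sketch of reading text and barcodes in one pass might look like this. It assumes the result object exposes a `Barcodes` collection whose items have a `Value` property; `invoice.png` is a hypothetical file name.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        ReadBarCodes = true, // enable barcode detection alongside text
    },
};

// "invoice.png" is a placeholder path.
using var input = new OcrInput("invoice.png");
OcrResult result = ocr.Read(input);

Console.WriteLine(result.Text);

// Assumed API: each detected barcode exposes its decoded Value.
foreach (var barcode in result.Barcodes)
{
    Console.WriteLine(barcode.Value);
}
```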

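How do I convert scanned documents into searchable PDFs?

Set RenderSearchablePdf = true in TesseractConfiguration to have IronTesseract convert scanned documents and images into searchable PDFs. The resulting PDF has selectable, searchable text while keeping the appearance of the original document.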
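The following sketch shows one way this could be wired up. It assumes the OCR result offers a `SaveAsSearchablePdf` method; both file names are hypothetical.

```csharp
using IronOcr;

var ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        RenderSearchablePdf = true, // ask the engine to produce searchable PDF data
    },
};

// "scan.png" is a placeholder input path.
using var input = new OcrInput("scan.png");
OcrResult result = ocr.Read(input);

// Assumed API: writes a PDF with a text layer over the original page image.
result.SaveAsSearchablePdf("searchable.pdf");
```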

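Which languages does IronTesseract support for OCR?

IronTesseract recognizes text in 125 international languages. Select a language by setting the Language property of the IronTesseract instance, for example IronOcr.OcrLanguage.English, or Spanish, Chinese, Arabic, and many others.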

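Can I restrict which characters are recognized during OCR?

Yes. TesseractConfiguration exposes the WhiteListCharacters and BlackListCharacters properties for whitelisting and blacklisting characters. When the expected character set is known in advance (for example, alphanumeric characters only), restricting recognition this way can improve accuracy.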
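As a small sketch, the configuration below limits recognition to digits and hyphens, which might suit something like a serial-number label. The character set and the file name `label.png` are illustrative assumptions.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        // Only these characters may appear in the output.
        WhiteListCharacters = "0123456789-",
        // Alternatively, set BlackListCharacters to exclude specific characters.
    },
};

// "label.png" is a placeholder path.
using var input = new OcrInput("label.png");
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```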

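How do I run OCR on multiple documents at once?

IronTesseract supports multithreading for batch processing. You can process several documents in parallel, which can significantly improve throughput when handling large numbers of images or PDF files.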

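Which version of Tesseract does IronOCR use?

IronOCR uses a customized and optimized build of Tesseract 5 called Iron Tesseract. Compared with the standard Tesseract implementation, this enhanced engine offers higher accuracy and performance while remaining compatible with .NET applications.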

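How does IronOCR improve data accuracy?

IronOCR improves accuracy through its recognition algorithms and image correction features, which make text extraction more reliable and precise.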

Is there a free trial of IronOCR?

Yes. Iron Software offers a free trial of IronOCR so that users can test its features and performance before deciding to purchase.

Curtis Chau

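Technical Writer

Curtis Chau holds a bachelor's degree in computer science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. He is passionate about building intuitive, attractive user interfaces, enjoys working with modern frameworks, and creates well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores new ways to integrate hardware and software. In his spare time he enjoys gaming and building Discord bots, combining his love of technology with creativity.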

Jeffrey T. Fritz

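Principal Program Manager - .NET Community Team

Jeff is also a Principal Program Manager for the .NET and Visual Studio teams. He is the executive producer of the .NET Conf virtual conference series and hosts "Fritz and Friends", a developer live stream that airs twice a week, where he discusses technology and writes code together with viewers. Jeff writes workshops, prepares presentations, and plans content for Microsoft's largest developer events, including Microsoft Build, Microsoft Ignite, .NET Conf, and the Microsoft MVP Summit.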
