How to Use Tesseract in C#

Updated: June 3, 2026

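To use Iron Tesseract in C#, create an IronTesseract instance, configure its language and OCR settings, and call the Read() method on an OcrInput object that contains your images or PDFs. This converts images of text into searchable PDFs using an optimized Tesseract 5 engine.

IronOCR provides an intuitive API for a customized and optimized build of Tesseract 5 known as Iron Tesseract. With IronOCR and IronTesseract, you can convert images of text and scanned documents into text and searchable PDFs. The library supports 125 international languages and includes advanced features such as barcode reading and computer vision.

Quickstart: Configuring IronTesseract in C#

This example shows how to configure IronTesseract with specific settings and run OCR in a single line of code.

Install IronOCR with the NuGet Package Manager
PM > Install-Package IronOcr

Copy and run this code snippet.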

var result = new IronOcr.IronTesseract { Language = IronOcr.OcrLanguage.English, Configuration = new IronOcr.TesseractConfiguration { ReadBarCodes = false, RenderSearchablePdf = true, WhiteListCharacters = "ABCabc123" } }.Read(new IronOcr.OcrInput("image.png"));

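Basic OCR Workflow

Install an OCR library via NuGet to read images
Use the customized `Tesseract 5` engine to perform OCR
Load the documents you want to process, such as images or PDF files
Output the extracted text to the console or a file
Save the result as a searchable PDF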
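
The five steps above can be sketched end to end as follows. This is a minimal sketch, not the library's canonical sample: the file names are placeholders, and it assumes the result object exposes a SaveAsSearchablePdf method for the final step.

```csharp
using System;
using System.IO;
using IronOcr;

// Step 1 happens outside the code: PM > Install-Package IronOcr

// Step 2: create the customized Tesseract 5 engine
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.English;
ocr.Configuration.RenderSearchablePdf = true; // requested for step 5

// Step 3: load the document to process ("scan.png" is a placeholder)
using var input = new OcrInput();
input.LoadImage("scan.png");

// Step 4: run OCR, then write the text to the console and to a file
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
File.WriteAllText("scan.txt", result.Text);

// Step 5: save a searchable PDF (assumed method name: SaveAsSearchablePdf)
result.SaveAsSearchablePdf("scan-searchable.pdf");
```

Keeping the engine setup separate from the input makes it easy to reuse one configured IronTesseract for several documents.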

How Do I Create an IronTesseract Instance?

Initialize a Tesseract object with this code:

using IronOcr;

IronTesseract ocr = new IronTesseract();

Imports IronOcr

Dim ocr As New IronTesseract()

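You can customize IronTesseract's behavior by selecting different languages, enabling barcode reading, and whitelisting or blacklisting characters. IronOCR offers comprehensive configuration options for fine-tuning the OCR process: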

IronTesseract ocr = new IronTesseract
{
    Configuration = new TesseractConfiguration
    {
        ReadBarCodes = false,
        RenderHocr = true,
        TesseractVariables = null,
        WhiteListCharacters = null,
        BlackListCharacters = "`ë|^",
    },
    MultiThreaded = false,
    Language = OcrLanguage.English,
    EnableTesseractConsoleMessages = true, // False as default
};

Dim ocr As New IronTesseract With {
	.Configuration = New TesseractConfiguration With {
		.ReadBarCodes = False,
		.RenderHocr = True,
		.TesseractVariables = Nothing,
		.WhiteListCharacters = Nothing,
		.BlackListCharacters = "`ë|^"
	},
	.MultiThreaded = False,
	.Language = OcrLanguage.English,
	.EnableTesseractConsoleMessages = True
}

Once configured, you can use the Tesseract functionality to read OcrInput objects. The OcrInput class provides flexible methods for loading a variety of input formats:

IronTesseract ocr = new IronTesseract();

using OcrInput input = new OcrInput();
input.LoadImage("attachment.png");
OcrResult result = ocr.Read(input);
string text = result.Text;

Dim ocr As New IronTesseract()

Using input As New OcrInput()
	input.LoadImage("attachment.png")
	Dim result As OcrResult = ocr.Read(input)
	Dim text As String = result.Text
End Using

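In more complex scenarios, you can use the multithreading capability to process multiple documents at the same time, which can significantly improve batch-processing performance.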
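
As a rough sketch of batch processing, the snippet below loads several images into a single OcrInput and turns on the MultiThreaded property that appears in the configuration example. The file names are placeholders, and how much of the work actually runs in parallel depends on the library, so treat the speed-up as an assumption to verify against your own workload.

```csharp
using System;
using IronOcr;

// Enable multithreading on the engine (property shown in the configuration example)
var ocr = new IronTesseract { MultiThreaded = true };

// Hypothetical batch of scanned pages
string[] files = { "page1.png", "page2.png", "page3.png" };

using var input = new OcrInput();
foreach (string file in files)
{
    input.LoadImage(file); // add each image to the same input
}

// A single Read() call processes the whole batch
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```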

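What Are the Advanced Tesseract Configuration Variables?

The IronOCR Tesseract interface gives you full control over Tesseract configuration variables through the IronOcr.TesseractConfiguration class. These advanced settings let you tune OCR for specific use cases, such as correcting low-quality scans or reading particular document types.

How Do I Use Tesseract Configuration in Code?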

using IronOcr;
using System;

IronTesseract Ocr = new IronTesseract();

Ocr.Language = OcrLanguage.English;
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd;

// Configure Tesseract Engine
Ocr.Configuration.TesseractVariables["tessedit_parallelize"] = false;

using var input = new OcrInput();
input.LoadImage("/path/file.png");

OcrResult Result = Ocr.Read(input);
Console.WriteLine(Result.Text);

Imports IronOcr
Imports System

Dim Ocr As New IronTesseract()

Ocr.Language = OcrLanguage.English
Ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.AutoOsd

' Configure Tesseract Engine
Ocr.Configuration.TesseractVariables("tessedit_parallelize") = False

' Using disposes the input when done, matching the C# "using var" above
Using input As New OcrInput()
	input.LoadImage("/path/file.png")

	Dim Result As OcrResult = Ocr.Read(input)
	Console.WriteLine(Result.Text)
End Using

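IronOCR also provides settings specialized for different document types. For example, when reading passports or processing MICR checks, you can apply specific preprocessing filters and region detection to improve accuracy.

Example Configuration for Financial Documents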

using IronOcr;
using System.Collections.Generic; // needed for Dictionary<string, object>

// Example: Configure for financial documents
IronTesseract ocr = new IronTesseract
{
    Language = OcrLanguage.English,
    Configuration = new TesseractConfiguration
    {
        PageSegmentationMode = TesseractPageSegmentationMode.SingleBlock,
        TesseractVariables = new Dictionary<string, object>
        {
            ["tessedit_char_whitelist"] = "0123456789.$,",
            ["textord_heavy_nr"] = false,
            ["edges_max_children_per_outline"] = 10
        }
    }
};

// Apply preprocessing filters for better accuracy
using OcrInput input = new OcrInput();
input.LoadPdf("financial-document.pdf");
input.Deskew();
input.EnhanceResolution(300);

OcrResult result = ocr.Read(input);

Imports IronOcr
Imports System.Collections.Generic

' Example: Configure for financial documents
Dim ocr As New IronTesseract With {
    .Language = OcrLanguage.English,
    .Configuration = New TesseractConfiguration With {
        .PageSegmentationMode = TesseractPageSegmentationMode.SingleBlock,
        .TesseractVariables = New Dictionary(Of String, Object) From {
            {"tessedit_char_whitelist", "0123456789.$,"},
            {"textord_heavy_nr", False},
            {"edges_max_children_per_outline", 10}
        }
    }
}

' Apply preprocessing filters for better accuracy
Using input As New OcrInput()
    input.LoadPdf("financial-document.pdf")
    input.Deskew()
    input.EnhanceResolution(300)

    Dim result As OcrResult = ocr.Read(input)
End Using

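What Is the Complete List of Tesseract Configuration Variables?

These can be set using IronTesseract.Configuration.TesseractVariables["key"] = value;. Configuration variables let you fine-tune OCR behavior to get the best results on specific documents. For detailed guidance on optimizing OCR performance, see the fast OCR configuration guide.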
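
A minimal sketch of that pattern is shown below. The three variable names are taken from the list on this page; the values chosen and the input file name are illustrative assumptions, not recommended settings.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Each entry uses the TesseractVariables["key"] = value form described above
ocr.Configuration.TesseractVariables["textord_tabfind_find_tables"] = false;   // skip table detection
ocr.Configuration.TesseractVariables["hocr_char_boxes"] = true;                // per-character boxes in hOCR output
ocr.Configuration.TesseractVariables["tessedit_char_whitelist"] = "0123456789"; // digits only

using var input = new OcrInput();
input.LoadImage("invoice-number.png"); // placeholder file name

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

Changing one variable at a time and comparing the output makes it easier to see which setting actually affects a given document.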

Tesseract Configuration Variable	Default	Meaning
classify_num_cp_levels	3	Number of Class Pruner Levels
textord_debug_tabfind	0	Debug tab finding
textord_debug_bugs	0	Turn on output related to bugs in tab finding
textord_testregion_left	-1	Left edge of debug reporting rectangle
textord_testregion_top	-1	Top edge of debug reporting rectangle
textord_testregion_right	2147483647	Right edge of debug rectangle
textord_testregion_bottom	2147483647	Bottom edge of debug rectangle
textord_tabfind_show_partitions	0	Show partition bounds, waiting if >1
devanagari_split_debuglevel	0	Debug level for split shiro-rekha process.
edges_max_children_per_outline	10	Max number of children inside a character outline
edges_max_children_layers	5	Max layers of nested children inside a character outline
edges_children_per_grandchild	10	Importance ratio for chucking outlines
edges_children_count_limit	45	Max holes allowed in blob
edges_min_nonhole	12	Min pixels for potential char in box
edges_patharea_ratio	40	Max lensq/area for acceptable child outline
textord_fp_chop_error	2	Max allowed bending of chop cells
textord_tabfind_show_images	0	Show image blobs
textord_skewsmooth_offset	4	For smooth factor
textord_skewsmooth_offset2	1	For smooth factor
textord_test_x	-2147483647	coord of test pt
textord_test_y	-2147483647	coord of test pt
textord_min_blobs_in_row	4	Min blobs before gradient counted
textord_spline_minblobs	8	Min blobs in each spline segment
textord_spline_medianwin	6	Size of window for spline segmentation
textord_max_blob_overlaps	4	Max number of blobs a big blob can overlap
textord_min_xheight	10	Min credible pixel xheight
textord_lms_line_trials	12	Number of line fits to do
oldbl_holed_losscount	10	Max lost before fallback line used
pitsync_linear_version	6	Use new fast algorithm
pitsync_fake_depth	1	Max advance fake generation
textord_tabfind_show_strokewidths	0	Show stroke widths
textord_dotmatrix_gap	3	Max pixel gap for broken pixed pitch
textord_debug_block	0	Block to do debug on
textord_pitch_range	2	Max range test on pitch
textord_words_veto_power	5	Rows required to outvote a veto
equationdetect_save_bi_image	0	Save input bi image
equationdetect_save_spt_image	0	Save special character image
equationdetect_save_seed_image	0	Save the seed image
equationdetect_save_merged_image	0	Save the merged image
poly_debug	0	Debug old poly
poly_wide_objects_better	1	More accurate approx on wide things
wordrec_display_splits	0	Display splits
textord_debug_printable	0	Make debug windows printable
textord_space_size_is_variable	0	If true, word delimiter spaces are assumed to have variable width, even though characters have fixed pitch.
textord_tabfind_show_initial_partitions	0	Show partition bounds
textord_tabfind_show_reject_blobs	0	Show blobs rejected as noise
textord_tabfind_show_columns	0	Show column bounds
textord_tabfind_show_blocks	0	Show final block bounds
textord_tabfind_find_tables	1	run table detection
devanagari_split_debugimage	0	Whether to create a debug image for split shiro-rekha process.
textord_show_fixed_cuts	0	Draw fixed pitch cell boundaries
edges_use_new_outline_complexity	0	Use the new outline complexity module
edges_debug	0	turn on debugging for this module
edges_children_fix	0	Remove boxy parents of char-like children
gapmap_debug	0	Say which blocks have tables
gapmap_use_ends	0	Use large space at start and end of rows
gapmap_no_isolated_quanta	0	Ensure gaps not less than 2quanta wide
textord_heavy_nr	0	Vigorously remove noise
textord_show_initial_rows	0	Display row accumulation
textord_show_parallel_rows	0	Display page correlated rows
textord_show_expanded_rows	0	Display rows after expanding
textord_show_final_rows	0	Display rows after final fitting
textord_show_final_blobs	0	Display blob bounds after pre-ass
textord_test_landscape	0	Tests refer to land/port
textord_parallel_baselines	1	Force parallel baselines
textord_straight_baselines	0	Force straight baselines
textord_old_baselines	1	Use old baseline algorithm
textord_old_xheight	0	Use old xheight algorithm
textord_fix_xheight_bug	1	Use spline baseline
textord_fix_makerow_bug	1	Prevent multiple baselines
textord_debug_xheights	0	Test xheight algorithms
textord_biased_skewcalc	1	Bias skew estimates with line length
textord_interpolating_skew	1	Interpolate across gaps
textord_new_initial_xheight	1	Use test xheight mechanism
textord_debug_blob	0	Print test blob information
textord_really_old_xheight	0	Use original wiseowl xheight
textord_oldbl_debug	0	Debug old baseline generation
textord_debug_baselines	0	Debug baseline generation
textord_oldbl_paradef	1	Use para default mechanism
textord_oldbl_split_splines	1	Split stepped splines
textord_oldbl_merge_parts	1	Merge suspect partitions
oldbl_corrfix	1	Improve correlation of heights
oldbl_xhfix	0	Fix bug in modes threshold for xheights
textord_ocropus_mode	0	Make baselines for ocropus
textord_tabfind_only_strokewidths	0	Only run stroke widths
textord_tabfind_show_initialtabs	0	Show tab candidates
textord_tabfind_show_finaltabs	0	Show tab vectors
textord_show_tables	0	Show table regions
textord_tablefind_show_mark	0	Debug table marking steps in detail
textord_tablefind_show_stats	0	Show page stats used in table finding
textord_tablefind_recognize_tables	0	Enables the table recognizer for table layout and filtering.
textord_all_prop	0	All doc is proportional text
textord_debug_pitch_test	0	Debug on fixed pitch test
textord_disable_pitch_test	0	Turn off dp fixed pitch algorithm
textord_fast_pitch_test	0	Do even faster pitch algorithm
textord_debug_pitch_metric	0	Write full metric stuff
textord_show_row_cuts	0	Draw row-level cuts
textord_show_page_cuts	0	Draw page-level cuts
textord_pitch_cheat	0	Use correct answer for fixed/prop
textord_blockndoc_fixed	0	Attempt whole doc/block fixed pitch
textord_show_initial_words	0	Display separate words
textord_show_new_words	0	Display separate words
textord_show_fixed_words	0	Display forced fixed pitch words
textord_blocksall_fixed	0	Moan about prop blocks
textord_blocksall_prop	0	Moan about fixed pitch blocks
textord_blocksall_testing	0	Dump stats when moaning
textord_test_mode	0	Do current test
textord_pitch_rowsimilarity	0.08	Fraction of xheight for sameness
words_initial_lower	0.5	Max initial cluster size
words_initial_upper	0.15	Min initial cluster spacing
words_default_prop_nonspace	0.25	Fraction of xheight
words_default_fixed_space	0.75	Fraction of xheight
words_default_fixed_limit	0.6	Allowed size variance
textord_words_definite_spread	0.3	Non-fuzzy spacing region
textord_spacesize_ratiofp	2.8	Min ratio space/nonspace
textord_spacesize_ratioprop	2	Min ratio space/nonspace
textord_fpiqr_ratio	1.5	Pitch IQR/Gap IQR threshold
textord_max_pitch_iqr	0.2	Xh fraction noise in pitch
textord_fp_min_width	0.5	Min width of decent blobs
textord_underline_offset	0.1	Fraction of x to ignore
ambigs_debug_level	0	Debug level for unichar ambiguities
classify_debug_level	0	Classify debug level
classify_norm_method	1	Normalization Method ...
matcher_debug_level	0	Matcher Debug Level
matcher_debug_flags	0	Matcher Debug Flags
classify_learning_debug_level	0	Learning Debug Level:
matcher_permanent_classes_min	1	Min # of permanent classes
matcher_min_examples_for_prototyping	3	Reliable Config Threshold
matcher_sufficient_examples_for_prototyping	5	Enable adaption even if the ambiguities have not been seen
classify_adapt_proto_threshold	230	Threshold for good protos during adaptive 0-255
classify_adapt_feature_threshold	230	Threshold for good features during adaptive 0-255
classify_class_pruner_threshold	229	Class Pruner Threshold 0-255
classify_class_pruner_multiplier	15	Class Pruner Multiplier 0-255:
classify_cp_cutoff_strength	7	Class Pruner CutoffStrength:
classify_integer_matcher_multiplier	10	Integer Matcher Multiplier 0-255:
dawg_debug_level	0	Set to 1 for general debug info, to 2 for more details, to 3 to see all the debug messages
hyphen_debug_level	0	Debug level for hyphenated words.
stopper_smallword_size	2	Size of dict word to be treated as non-dict word
stopper_debug_level	0	Stopper debug level
tessedit_truncate_wordchoice_log	10	Max words to keep in list
max_permuter_attempts	10000	Maximum number of different character choices to consider during permutation. This limit is especially useful when user patterns are specified, since overly generic patterns can result in dawg search exploring an overly large number of options.
repair_unchopped_blobs	1	Fix blobs that aren't chopped
chop_debug	0	Chop debug
chop_split_length	10000	Split Length
chop_same_distance	2	Same distance
chop_min_outline_points	6	Min Number of Points on Outline
chop_seam_pile_size	150	Max number of seams in seam_pile
chop_inside_angle	-50	Min Inside Angle Bend
chop_min_outline_area	2000	Min Outline Area
chop_centered_maxwidth	90	Width of (smaller) chopped blobs above which we don't care that a chop is not near the center.
chop_x_y_weight	3	X / Y length weight
wordrec_debug_level	0	Debug level for wordrec
wordrec_max_join_chunks	4	Max number of broken pieces to associate
segsearch_debug_level	0	SegSearch debug level
segsearch_max_pain_points	2000	Maximum number of pain points stored in the queue
segsearch_max_futile_classifications	20	Maximum number of pain point classifications per chunk that did not result in finding a better word choice.
language_model_debug_level	0	Language model debug level
language_model_ngram_order	8	Maximum order of the character ngram model
language_model_viterbi_list_max_num_prunable	10	Maximum number of prunable (those for which PrunablePath() is true) entries in each viterbi list recorded in BLOB_CHOICEs
language_model_viterbi_list_max_size	500	Maximum size of viterbi lists recorded in BLOB_CHOICEs
language_model_min_compound_length	3	Minimum length of compound words
wordrec_display_segmentations	0	Display Segmentations
tessedit_pageseg_mode	6	Page seg mode: 0=osd only, 1=auto+osd, 2=auto_only, 3=auto, 4=column, 5=block_vert, 6=block, 7=line, 8=word, 9=word_circle, 10=char,11=sparse_text, 12=sparse_text+osd, 13=raw_line (Values from PageSegMode enum in tesseract/publictypes.h)
tessedit_ocr_engine_mode	2	Which OCR engine(s) to run (Tesseract, LSTM, both). Defaults to loading and running the most accurate available.
pageseg_devanagari_split_strategy	0	Whether to use the top-line splitting process for Devanagari documents while performing page-segmentation.
ocr_devanagari_split_strategy	0	Whether to use the top-line splitting process for Devanagari documents while performing ocr.
bidi_debug	0	Debug level for BiDi
applybox_debug	1	Debug level
applybox_page	0	Page number to apply boxes from
tessedit_bigram_debug	0	Amount of debug output for bigram correction.
debug_noise_removal	0	Debug reassignment of small outlines
noise_maxperblob	8	Max diacritics to apply to a blob
noise_maxperword	16	Max diacritics to apply to a word
debug_x_ht_level	0	Reestimate debug
quality_min_initial_alphas_reqd	2	alphas in a good word
tessedit_tess_adaption_mode	39	Adaptation decision algorithm for tess
multilang_debug_level	0	Print multilang debug info.
paragraph_debug_level	0	Print paragraph debug info.
tessedit_preserve_min_wd_len	2	Only preserve wds longer than this
crunch_rating_max	10	For adj length in rating per ch
crunch_pot_indicators	1	How many potential indicators needed
crunch_leave_lc_strings	4	Don't crunch words with long lower case strings
crunch_leave_uc_strings	4	Don't crunch words with long lower case strings
crunch_long_repetitions	3	Crunch words with long repetitions
crunch_debug	0	As it says
fixsp_non_noise_limit	1	How many non-noise blbs either side?
fixsp_done_mode	1	What constitutes done for spacing
debug_fix_space_level	0	Contextual fixspace debug
x_ht_acceptance_tolerance	8	Max allowed deviation of blob top outside of font data
x_ht_min_change	8	Min change in xht before actually trying it
superscript_debug	0	Debug level for sub & superscript fixer
jpg_quality	85	Set JPEG quality level
user_defined_dpi	0	Specify DPI for input image
min_characters_to_try	50	Specify minimum characters to try during OSD
suspect_level	99	Suspect marker level
suspect_short_words	2	Don't suspect dict wds longer than this
tessedit_reject_mode	0	Rejection algorithm
tessedit_image_border	2	Rej blbs near image edge limit
min_sane_x_ht_pixels	8	Reject any x-ht lt or eq than this
tessedit_page_number	-1	-1 -> All pages, else specific page to process
tessedit_parallelize	1	Run in parallel where possible
lstm_choice_mode	2	Allows to include alternative symbols choices in the hOCR output. Valid input values are 0, 1 and 2. 0 is the default value. With 1 the alternative symbol choices per timestep are included. With 2 alternative symbol choices are extracted from the CTC process instead of the lattice. The choices are mapped per character.
lstm_choice_iterations	5	Sets the number of cascading iterations for the Beamsearch in lstm_choice_mode. Note that lstm_choice_mode must be set to a value greater than 0 to produce results.
tosp_debug_level	0	Debug data
tosp_enough_space_samples_for_median	3	or should we use mean
tosp_redo_kern_limit	10	No.samples reqd to reestimate for row
tosp_few_samples	40	No.gaps reqd with 1 large gap to treat as a table
tosp_short_row	20	No.gaps reqd with few cert spaces to use certs
tosp_sanity_method	1	How to avoid being silly
textord_max_noise_size	7	Pixel size of noise
textord_baseline_debug	0	Baseline debug level
textord_noise_sizefraction	10	Fraction of size for maxima
textord_noise_translimit	16	Transitions for normal blob
textord_noise_sncount	1	super norm blobs to save row
use_ambigs_for_adaption	0	Use ambigs for deciding whether to adapt to a character
prioritize_division	0	Prioritize blob division over chopping
classify_enable_learning	1	Enable adaptive classifier
tess_cn_matching	0	Character Normalized Matching
tess_bn_matching	0	Baseline Normalized Matching
classify_enable_adaptive_matcher	1	Enable adaptive classifier
classify_use_pre_adapted_templates	0	Use pre-adapted classifier templates
classify_save_adapted_templates	0	Save adapted templates to a file
classify_enable_adaptive_debugger	0	Enable match debugger
classify_nonlinear_norm	0	Non-linear stroke-density normalization
disable_character_fragments	1	Do not include character fragments in the results of the classifier
classify_debug_character_fragments	0	Bring up graphical debugging windows for fragments training
matcher_debug_separate_windows	0	Use two different windows for debugging the matching: One for the protos and one for the features.
classify_bln_numeric_mode	0	Assume the input is numbers [0-9].
load_system_dawg	1	Load system word dawg.
load_freq_dawg	1	Load frequent word dawg.
load_unambig_dawg	1	Load unambiguous word dawg.
load_punc_dawg	1	Load dawg with punctuation patterns.
load_number_dawg	1	Load dawg with number patterns.
load_bigram_dawg	1	Load dawg with special word bigrams.
use_only_first_uft8_step	0	Use only the first UTF8 step of the given string when computing log probabilities.
stopper_no_acceptable_choices	0	Make AcceptableChoice() always return false. Useful when there is a need to explore all segmentations
segment_nonalphabetic_script	0	Don't use any alphabetic-specific tricks. Set to true in the traineddata config file for scripts that are cursive or inherently fixed-pitch
save_doc_words	0	Save Document Words
merge_fragments_in_matrix	1	Merge the fragments in the ratings matrix and delete them after merging
wordrec_enable_assoc	1	Associator Enable
force_word_assoc	0	force associator to run regardless of what enable_assoc is. This is used for CJK where component grouping is necessary.
chop_enable	1	Chop enable
chop_vertical_creep	0	Vertical creep
chop_new_seam_pile	1	Use new seam_pile
assume_fixed_pitch_char_segment	0	include fixed-pitch heuristics in char segmentation
wordrec_skip_no_truth_words	0	Only run OCR for words that had truth recorded in BlamerBundle
wordrec_debug_blamer	0	Print blamer debug messages
wordrec_run_blamer	0	Try to set the blame for errors
save_alt_choices	1	Save alternative paths found during chopping and segmentation search
language_model_ngram_on	0	Turn on/off the use of character ngram model
language_model_ngram_use_only_first_uft8_step	0	Use only the first UTF8 step of the given string when computing log probabilities.
language_model_ngram_space_delimited_language	1	Words are delimited by space
language_model_use_sigmoidal_certainty	0	Use sigmoidal score for certainty
tessedit_resegment_from_boxes	0	Take segmentation and labeling from box file
tessedit_resegment_from_line_boxes	0	Conversion of word/line box file to char box file
tessedit_train_from_boxes	0	Generate training data from boxed chars
tessedit_make_boxes_from_boxes	0	Generate more boxes from boxed chars
tessedit_train_line_recognizer	0	Break input into lines and remap boxes if present
tessedit_dump_pageseg_images	0	Dump intermediate images made during page segmentation
tessedit_do_invert	1	Try inverting the image in `LSTMRecognizeWord`
tessedit_ambigs_training	0	Perform training for ambiguities
tessedit_adaption_debug	0	Generate and print debug information for adaption
applybox_learn_chars_and_char_frags_mode	0	Learn both character fragments (as is done in the special low exposure mode) as well as unfragmented characters.
applybox_learn_ngrams_mode	0	Each bounding box is assumed to contain ngrams. Only learn the ngrams whose outlines overlap horizontally.
tessedit_display_outwords	0	Draw output words
tessedit_dump_choices	0	Dump char choices
tessedit_timing_debug	0	Print timing stats
tessedit_fix_fuzzy_spaces	1	Try to improve fuzzy spaces
tessedit_unrej_any_wd	0	Don't bother with word plausibility
tessedit_fix_hyphens	1	Crunch double hyphens?
tessedit_enable_doc_dict	1	Add words to the document dictionary
tessedit_debug_fonts	0	Output font info per char
tessedit_debug_block_rejection	0	Block and Row stats
tessedit_enable_bigram_correction	1	Enable correction based on the word bigram dictionary.
tessedit_enable_dict_correction	0	Enable single word correction based on the dictionary.
enable_noise_removal	1	Remove and conditionally reassign small outlines when they confuse layout analysis, determining diacritics vs noise
tessedit_minimal_rej_pass1	0	Do minimal rejection on pass 1 output
tessedit_test_adaptation	0	Test adaption criteria
test_pt	0	Test for point
paragraph_text_based	1	Run paragraph detection on the post-text-recognition (more accurate)
lstm_use_matrix	1	Use ratings matrix/beam search with lstm
tessedit_good_quality_unrej	1	Reduce rejection on good docs
tessedit_use_reject_spaces	1	Reject spaces?
tessedit_preserve_blk_rej_perfect_wds	1	Only rej partially rejected words in block rejection
tessedit_preserve_row_rej_perfect_wds	1	Only rej partially rejected words in row rejection
tessedit_dont_blkrej_good_wds	0	Use word segmentation quality metric
tessedit_dont_rowrej_good_wds	0	Use word segmentation quality metric
tessedit_row_rej_good_docs	1	Apply row rejection to good docs
tessedit_reject_bad_qual_wds	1	Reject all bad quality wds
tessedit_debug_doc_rejection	0	Page stats
tessedit_debug_quality_metrics	0	Output data to debug file
bland_unrej	0	unrej potential with no checks
unlv_tilde_crunching	0	Mark v.bad words for tilde crunch
hocr_font_info	0	Add font info to hocr output
hocr_char_boxes	0	Add coordinates for each character to hocr output
crunch_early_merge_tess_fails	1	Before word crunch?
crunch_early_convert_bad_unlv_chs	0	Take out ~^ early?
crunch_terrible_garbage	1	As it says
crunch_leave_ok_strings	1	Don't touch sensible strings
crunch_accept_ok	1	Use acceptability in okstring
crunch_leave_accept_strings	0	Don't pot crunch sensible strings
crunch_include_numerals	0	Fiddle alpha figures
tessedit_prefer_joined_punct	0	Reward punctuation joins
tessedit_write_block_separators	0	Write block separators in output
tessedit_write_rep_codes	0	Write repetition char code
tessedit_write_unlv	0	Write .unlv output file
tessedit_create_txt	0	Write .txt output file
tessedit_create_hocr	0	Write .html hOCR output file
tessedit_create_alto	0	Write .xml ALTO file
tessedit_create_lstmbox	0	Write .box file for LSTM training
tessedit_create_tsv	0	Write .tsv output file
tessedit_create_wordstrbox	0	Write WordStr format .box output file
tessedit_create_pdf	0	Write .pdf output file
textonly_pdf	0	Create PDF with only one invisible text layer
suspect_constrain_1Il	0	UNLV keep 1Il chars rejected
tessedit_minimal_rejection	0	Only reject tess failures
tessedit_zero_rejection	0	Don't reject ANYTHING
tessedit_word_for_word	0	Make output have exactly one word per WERD
tessedit_zero_kelvin_rejection	0	Don't reject ANYTHING AT ALL
tessedit_rejection_debug	0	Adaption debug
tessedit_flip_0O	1	Contextual 0O O0 flips
rej_trust_doc_dawg	0	Use DOC dawg in 11l conf. detector
rej_1Il_use_dict_word	0	Use dictword test
rej_1Il_trust_permuter_type	1	Don't double check
rej_use_tess_accepted	1	Individual rejection control
rej_use_tess_blanks	1	Individual rejection control
rej_use_good_perm	1	Individual rejection control
rej_use_sensible_wd	0	Extend permuter check
rej_alphas_in_number_perm	0	Extend permuter check
tessedit_create_boxfile	0	Output text with boxes
tessedit_write_images	0	Capture the image from the IPE
interactive_display_mode	0	Run interactively?
tessedit_override_permuter	1	According to dict_word
tessedit_use_primary_params_model	0	In multilingual mode use params model of the primary language
textord_tabfind_show_vlines	0	Debug line finding
textord_use_cjk_fp_model	0	Use CJK fixed pitch model
poly_allow_detailed_fx	0	Allow feature extractors to see the original outline
tessedit_init_config_only	0	Only initialize with the config file. Useful if the instance is not going to be used for OCR but say only for layout analysis.
textord_equation_detect	0	Turn on equation detector
textord_tabfind_vertical_text	1	Enable vertical detection
textord_tabfind_force_vertical_text	0	Force using vertical text page mode
preserve_interword_spaces	0	Preserve multiple interword spaces
pageseg_apply_music_mask	1	Detect music staff and remove intersecting components
textord_single_height_mode	0	Script has no xheight, so use a single mode
tosp_old_to_method	0	Space stats use prechopping?
tosp_old_to_constrain_sp_kn	0	Constrain relative values of inter and intra-word gaps for old_to_method.
tosp_only_use_prop_rows	1	Block stats to use fixed pitch rows?
tosp_force_wordbreak_on_punct	0	Force word breaks on punct to break long lines in non-space delimited langs
tosp_use_pre_chopping	0	Space stats use prechopping?
tosp_old_to_bug_fix	0	Fix suspected bug in old code
tosp_block_use_cert_spaces	1	Only stat OBVIOUS spaces
tosp_row_use_cert_spaces	1	Only stat OBVIOUS spaces
tosp_narrow_blobs_not_cert	1	Only stat OBVIOUS spaces
tosp_row_use_cert_spaces1	1	Only stat OBVIOUS spaces
tosp_recovery_isolated_row_stats	1	Use row alone when inadequate cert spaces
tosp_only_small_gaps_for_kern	0	Better guess
tosp_all_flips_fuzzy	0	Pass ANY flip to context?
tosp_fuzzy_limit_all	1	Don't restrict kn->sp fuzzy limit to tables
textord_no_rejects	0	Don't remove noise blobs
textord_show_blobs	0	Display unsorted blobs
textord_show_boxes	0	Display unsorted blobs
textord_noise_rejwords	1	Reject noise-like words
textord_noise_rejrows	1	Reject noise-like rows
textord_noise_debug	0	Debug row garbage detector
classify_learn_debug_str		Class str to debug learning
user_words_file		A filename of user-provided words.
user_words_suffix		A suffix of user-provided words located in tessdata.
user_patterns_file		A filename of user-provided patterns.
user_patterns_suffix		A suffix of user-provided patterns located in tessdata.
output_ambig_words_file		Output file for ambiguities found in the dictionary
word_to_debug		Word for which stopper debug information should be printed to stdout
tessedit_char_blacklist		Blacklist of chars not to recognize
tessedit_char_whitelist		Whitelist of chars to recognize
tessedit_char_unblacklist		List of chars to override tessedit_char_blacklist
tessedit_write_params_to_file		Write all parameters to the given file.
applybox_exposure_pattern	.exp	Exposure value follows this pattern in the image filename. The name of the image files are expected to be in the form [lang].[fontname].exp[num].tif
chs_leading_punct	('`"	Leading punctuation
chs_trailing_punct1	).,;:?!	1st Trailing punctuation
chs_trailing_punct2	)'`"	2nd Trailing punctuation
outlines_odd	%|	Non standard number of outlines
outlines_2	ij!?%":;	Non standard number of outlines
numeric_punctuation	.,	Punct. chs expected WITHIN numbers
unrecognised_char	|	Output char for unidentified blobs
ok_repeated_ch_non_alphanum_wds	-?*=	Allow NN to unrej
conflict_set_I_l_1	Il1 []	Il1 conflict set
file_type	.tif	Filename extension
tessedit_load_sublangs		List of languages to load with this one
page_separator		Page separator (default is form feed control character)
classify_char_norm_range	0.2	Character Normalization Range ...
classify_max_rating_ratio	1.5	Veto ratio between classifier ratings
classify_max_certainty_margin	5.5	Veto difference between classifier certainties
matcher_good_threshold	0.125	Good Match (0-1)
matcher_reliable_adaptive_result	0	Great Match (0-1)
matcher_perfect_threshold	0.02	Perfect Match (0-1)
matcher_bad_match_pad	0.15	Bad Match Pad (0-1)
matcher_rating_margin	0.1	New template margin (0-1)
matcher_avg_noise_size	12	Avg. noise blob length
matcher_clustering_max_angle_delta	0.015	Maximum angle delta for prototype clustering
classify_misfit_junk_penalty	0	Penalty to apply when a non-alnum is vertically out of its expected textline position
rating_scale	1.5	Rating scaling factor
certainty_scale	20	Certainty scaling factor
tessedit_class_miss_scale	0.00390625	Scale factor for features not used
classify_adapted_pruning_factor	2.5	Prune poor adapted results this much worse than best result
classify_adapted_pruning_threshold	-1	Threshold at which classify_adapted_pruning_factor starts
classify_character_fragments_garbage_certainty_threshold	-3	Exclude fragments that do not look like whole characters from training and adaption
speckle_large_max_size	0.3	Max large speckle size
speckle_rating_penalty	10	Penalty to add to worst rating for noise
xheight_penalty_subscripts	0.125	Score penalty (0.1 = 10%) added if there are subscripts or superscripts in a word, but it is otherwise OK.
xheight_penalty_inconsistent	0.25	Score penalty (0.1 = 10%) added if an xheight is inconsistent.
segment_penalty_dict_frequent_word	1	Score multiplier for word matches which have good case and are frequent in the given language (lower is better).
segment_penalty_dict_case_ok	1.1	Score multiplier for word matches that have good case (lower is better).
segment_penalty_dict_case_bad	1.3125	Default score multiplier for word matches, which may have case issues (lower is better).
segment_penalty_dict_nonword	1.25	Score multiplier for glyph fragment segmentations which do not match a dictionary word (lower is better).
確実性スケール	20	Certainty scaling factor
stopper_nondict_certainty_base	-2.5	Certainty threshold for non-dict words
stopper_phase2_certainty_rejection_offset	1	Reject certainty offset
stopper_certainty_per_char	-0.5	Certainty to add for each dict char above small word size.
stopper_allowable_character_badness	3	Max certainty variation allowed in a word (in sigma)
doc_dict_pending_threshold	0	Worst certainty for using pending dictionary
doc_dict_certainty_threshold	-2.25	Worst certainty for words that can be inserted into the document dictionary
tessedit_certainty_threshold	-2.25	Good blob limit
chop_split_dist_knob	0.5	Split length adjustment
chop_overlap_knob	0.9	Split overlap adjustment
chop_center_knob	0.15	Split center adjustment
chop_sharpness_knob	0.06	Split sharpness adjustment
chop_width_change_knob	5	Width change adjustment
chop_ok_split	100	OK split limit
chop_good_split	50	Good split limit
segsearch_max_char_wh_ratio	2	Maximum character width-to-height ratio

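For best results, we recommend applying IronOCR's image preprocessing filters before running OCR. These filters can dramatically improve accuracy, especially with low-quality scans or complex documents such as tables.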
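
A short sketch of a preprocessing pass is shown below. Deskew and EnhanceResolution appear in the financial-document example earlier on this page; DeNoise is assumed here to be available on OcrInput as well, and the file name is a placeholder.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using var input = new OcrInput();
input.LoadImage("low-quality-scan.png"); // placeholder file name

// Apply filters to the loaded pages before calling Read()
input.Deskew();               // straighten a rotated scan
input.DeNoise();              // assumed filter: reduce digital noise
input.EnhanceResolution(300); // raise low-resolution input toward 300 DPI

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

Filters add processing time, so it is worth enabling only the ones that measurably help on a representative sample of your documents.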

Frequently Asked Questions

How do I configure IronTesseract for OCR in C#?

To configure IronTesseract, create an IronTesseract instance and set properties such as Language and Configuration. You can specify the OCR language (from the 125 supported languages), enable barcode reading, turn on searchable PDF output, and set a character whitelist. For example: var tesseract = new IronOcr.IronTesseract { Language = IronOcr.OcrLanguage.English, Configuration = new IronOcr.TesseractConfiguration { ReadBarCodes = false, RenderSearchablePdf = true } };

What input formats does IronTesseract support?

IronTesseract accepts a variety of input formats through the OcrInput class. It can process images (PNG, JPG, and others), PDF files, and scanned documents. The OcrInput class provides flexible methods for loading these formats, so you can run OCR on almost any document that contains text.
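
As an illustration, the sketch below loads an image and a PDF through the LoadImage and LoadPdf methods used elsewhere on this page. The file names are placeholders, and combining both formats in one OcrInput is an assumption to confirm against the library documentation.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

using var input = new OcrInput();
input.LoadImage("receipt.jpg"); // raster image (placeholder name)
input.LoadPdf("contract.pdf");  // PDF document (placeholder name)

// Read everything that was loaded and print the combined text
OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```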

Can IronTesseract read barcodes along with text?

Yes. IronTesseract includes advanced barcode reading. Enable barcode detection by setting ReadBarCodes = true in TesseractConfiguration. This lets you extract both text and barcode data from the same document in a single OCR operation.
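
The sketch below shows one way this might look. ReadBarCodes comes from the configuration shown on this page; the Barcodes collection on the result and its Value property are assumptions about the result object, and the file name is a placeholder.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Configuration.ReadBarCodes = true; // turn on barcode detection

using var input = new OcrInput();
input.LoadImage("shipping-label.png"); // placeholder file name

OcrResult result = ocr.Read(input);

// Text recognized on the page
Console.WriteLine(result.Text);

// Assumption: detected barcodes are exposed as result.Barcodes, each with a Value
foreach (var barcode in result.Barcodes)
{
    Console.WriteLine(barcode.Value);
}
```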

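How do I create a searchable PDF from a scanned document?

IronTesseract can convert scanned documents and images into searchable PDFs when you set RenderSearchablePdf = true in TesseractConfiguration. The resulting PDF keeps the appearance of the original document while making its text selectable and searchable.

Which languages does IronTesseract support for OCR?

IronTesseract supports 125 international languages for text recognition. You choose the language by setting the Language property on the IronTesseract instance, for example to IronOcr.OcrLanguage.English, or to Spanish, Chinese, or Arabic.

Can I restrict which characters are recognized during OCR?

Yes. IronTesseract supports character whitelisting and blacklisting through the WhiteListCharacters and BlackListCharacters properties of TesseractConfiguration. This helps improve accuracy when you know the expected character set, for example when limiting recognition to alphanumeric characters.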
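
For instance, a serial-number field might be restricted to digits, uppercase letters, and hyphens. The sketch below uses the WhiteListCharacters and BlackListCharacters properties from the configuration example; the character sets and the file name are illustrative assumptions.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract
{
    Language = OcrLanguage.English,
    Configuration = new TesseractConfiguration
    {
        // Only these characters may appear in the output
        WhiteListCharacters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-",
        // Characters that should never be reported
        BlackListCharacters = "`|^"
    }
};

using var input = new OcrInput();
input.LoadImage("serial-number.png"); // placeholder file name

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

A whitelist is the stricter tool; a blacklist is usually enough when only a few symbols are being misread.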

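How can I OCR multiple documents at the same time?

IronTesseract supports multithreading for batch processing. You can use parallel processing to OCR multiple documents at once, which significantly improves performance when handling large numbers of images or PDFs.

Which version of Tesseract does IronOCR use?

IronOCR uses a customized and optimized version of Tesseract 5 known as Iron Tesseract. This enhanced engine improves accuracy and performance over the standard Tesseract implementation while remaining compatible with .NET applications.

How does IronOCR improve data accuracy?

IronOCR relies on its advanced recognition algorithms and image-correction features to make text extraction reliable and accurate.

Is a free trial of IronOCR available?

Yes. Iron Software offers a free trial of IronOCR so that users can test its features and capabilities before making a purchase decision.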

Curtis Chau

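Technical Writer

Curtis Chau holds a bachelor's degree in Computer Science from Carleton University and specializes in front-end development, with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and attractive user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT) and explores ways of integrating hardware and software. In his free time, he enjoys gaming and building Discord bots, combining his love of technology with creativity.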

Jeffrey T. Fritz

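Principal Program Manager - .NET Community Team

Jeff is also a Principal Program Manager on the .NET and Visual Studio teams. He is the executive producer of the .NET Conf virtual conference series and hosts "Fritz and Friends", a live stream for developers that airs twice a week, where he talks about technology and writes code together with viewers. Jeff plans workshops, presentations, and content for the largest Microsoft developer events, including Microsoft Build, Microsoft Ignite, .NET Conf, and the Microsoft MVP Summit.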
