Reducing the File Size of PDF Output in IronOCR
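How do I reduce the size of the output PDF file in IronOCR?
IronOCR automatically upscales input images that it detects as low quality (below 150 DPI) to ensure accurate reading results.
When the detected DPI is below 150, TargetDPI (default: 225 DPI) defines the DPI at which the PDF is rendered. This is the same as manually setting TargetDPI = 225.
To reduce the output file size, set a lower TargetDPI, which produces a smaller PDF file. Setting it too low can degrade OCR accuracy, however, so it is important to strike a balance.
Suggested values are 96, 72, and 48.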
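To get a feel for why a lower DPI shrinks the output, note that the width and the height of a rendered page both scale with DPI, so the pixel count scales with its square. The sketch below is plain arithmetic, not IronOCR code: `RelativePixelCount` is a hypothetical helper, and raw pixel count is only a rough proxy for file size, since the real size also depends on compression and page content.

```csharp
using System;

// Hypothetical helper (not part of IronOCR): pixel count of a page rendered
// at `dpi`, relative to the same page rendered at `baselineDpi`.
static double RelativePixelCount(int dpi, int baselineDpi)
{
    double scale = (double)dpi / baselineDpi; // linear scale factor per axis
    return scale * scale;                     // width and height both scale
}

const int DefaultTargetDpi = 225; // default TargetDPI stated in this article

foreach (int dpi in new[] { 96, 72, 48 })
{
    double percent = 100.0 * RelativePixelCount(dpi, DefaultTargetDpi);
    Console.WriteLine($"{dpi} DPI: {percent:F1}% of the pixels at {DefaultTargetDpi} DPI");
}
```

Under these assumptions, 96, 72, and 48 DPI correspond to roughly 18%, 10%, and 5% of the pixels rendered at 225 DPI, which is why each step down reduces the file noticeably and also leaves the OCR engine with less detail to work from.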
// Example of reducing PDF output file size by lowering the DPI
// Example 1: Reducing DPI to 96
using IronOcr; // Import IronOCR namespace
var Ocr = new IronTesseract(); // Initialize IronTesseract for OCR operations
using (var Input = new OcrInput()) // Create OCR input object
{
Input.TargetDPI = 96; // Set the desired DPI; 96 is used for smaller output size
Input.AddPdf("example.pdf", "password"); // Add input PDF (with optional password)
var Result = Ocr.Read(Input); // Perform OCR on the input
Console.WriteLine(Result.Text); // Output recognized text to the console
}
// Example 2: Another way to set DPI
var ocr = new IronTesseract();
using (var ocrInput = new OcrInput()) // Create a new OCR input object
{
ocrInput.AddPdf("img/Input.pdf", 72); // Add PDF with the specified DPI of 72
var ocrResult = ocr.Read(ocrInput); // Read and process the PDF
ocrResult.SaveAsSearchablePdf(@"Output.pdf"); // Save result to a searchable PDF
}
' Example of reducing PDF output file size by lowering the DPI
' Example 1: Reducing DPI to 96
Imports IronOcr ' Import IronOCR namespace
Dim Ocr = New IronTesseract() ' Initialize IronTesseract for OCR operations
Using Input = New OcrInput() ' Create OCR input object
Input.TargetDPI = 96 ' Set the desired DPI; 96 is used for smaller output size
Input.AddPdf("example.pdf", "password") ' Add input PDF (with optional password)
Dim Result = Ocr.Read(Input) ' Perform OCR on the input
Console.WriteLine(Result.Text) ' Output recognized text to the console
End Using
' Example 2: Another way to set DPI
' (VB identifiers are case-insensitive, so this engine needs a name distinct from Ocr)
Dim pdfOcr = New IronTesseract()
Using ocrInput As New OcrInput() ' Create a new OCR input object
ocrInput.AddPdf("img/Input.pdf", 72) ' Add PDF with the specified DPI of 72
Dim ocrResult = pdfOcr.Read(ocrInput) ' Read and process the PDF
ocrResult.SaveAsSearchablePdf("Output.pdf") ' Save result to a searchable PDF
End Using
To disable automatic upscaling, set TargetDPI = 0. IronOCR will then read the input file as-is, ignoring the TargetDPI value.
For more information, see the API: IronOCR API Reference
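A minimal sketch of disabling upscaling, assuming the same `OcrInput.TargetDPI` property and `AddPdf(path, password)` overload used in the examples earlier in this article. The file names and password are placeholders, and the snippet needs the IronOCR package to run.

```csharp
using IronOcr;

var ocr = new IronTesseract();
using (var input = new OcrInput())
{
    // Assumption: 0 turns off automatic upscaling, as described in this article
    input.TargetDPI = 0;
    input.AddPdf("example.pdf", "password"); // placeholder file name and password
    var result = ocr.Read(input);            // OCR the pages at their original resolution
    result.SaveAsSearchablePdf("Output.pdf"); // placeholder output path
}
```

Because the input is read at its original resolution, it may be worth comparing the recognized text against a run at the default setting before relying on this for low-quality scans.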

