How to Build OCR in Python

Kannapat Udonpant

Updated: April 21, 2026

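The world is full of text. Making even a little more of it accessible would put a great deal of valuable content to use.

This is where Optical Character Recognition (OCR) comes in. OCR lets a computer "read" the text in an image much as a person would. It belongs to computer vision, the branch of computer science concerned with training computers to recognize and identify the contents of images.

In this tutorial, we walk you through building your own OCR system in Python, a concise and versatile programming language. With the help of libraries such as Tesseract, IronOCR, and OpenCV, you will soon be able to extract, manipulate, and process text from document images.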

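Prerequisites for the OCR (Optical Character Recognition) Engine

Before we dive into the details of building the OCR system, you will need the following:

Python: Make sure Python is installed on your computer. You can download it from the official Python website.
Tesseract OCR: Tesseract OCR is an open-source OCR engine originally developed at Hewlett-Packard and later sponsored by Google. It is a powerful tool, and we use it in this project. You can download Tesseract from GitHub, where you can also read about the installation process.
Python libraries: This project uses two important Python libraries, pytesseract and opencv-python. Install them from a command prompt or terminal with the following command: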
```
pip install pytesseract opencv-python
```
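
Before moving on, it can help to confirm that both packages are importable. The following is a minimal sketch that uses only the standard library. The helper name `missing_modules` is hypothetical and not part of either package; note that opencv-python is imported under the module name `cv2`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # opencv-python installs the module "cv2"; pytesseract installs "pytesseract".
    missing = missing_modules(["cv2", "pytesseract"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both packages are installed.")
```

This check only covers the Python packages. The Tesseract engine itself is a separate program and must still be installed on the system.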

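How to Build OCR in Python: Figure 1

Steps to Build the OCR System

With a Python OCR library and a short Python script, building OCR is straightforward.

Step 1: Import the Libraries

First, import the required libraries: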

import cv2  # OpenCV library for computer vision
import pytesseract  # Tesseract library for OCR

Step 2: Read and Preprocess the Image

Load the image with OpenCV and preprocess it to improve OCR accuracy:

# Load the image using OpenCV
image = cv2.imread('sample_image.png')
if image is None:  # cv2.imread returns None instead of raising on failure
    raise FileNotFoundError('Could not read sample_image.png')

# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding or other preprocessing techniques if needed
# This step helps in enhancing the quality for better OCR results
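
To make the thresholding idea concrete, here is a small sketch of Otsu's method written in plain Python rather than OpenCV, so the logic is visible. It is an illustration, not the tutorial's own code: the function names `otsu_threshold` and `binarize` are hypothetical, and the pixel values are made up.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]      # pixels at or below this level
        sum_bg += level * histogram[level]
        weight_fg = total - weight_bg      # pixels above this level
        if weight_bg == 0 or weight_fg == 0:
            continue
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels above the threshold to 255 (white) and the rest to 0 (black)."""
    return [255 if value > threshold else 0 for value in pixels]


if __name__ == "__main__":
    # Made-up sample: dark "ink" pixels near 30, light "paper" pixels near 220.
    sample = [28, 30, 32, 31, 29, 218, 220, 222, 221, 219]
    print(binarize(sample, otsu_threshold(sample)))
```

On a real image you would normally let OpenCV do this in one call, for example `cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)`, which returns the chosen threshold and the binarized image.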

Step 3: Perform OCR with Tesseract

Now run the Tesseract OCR engine on the preprocessed image:

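# Set the path to the Tesseract executable (Windows example; adjust it to your
# install location, or omit this line if tesseract is already on your PATH)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'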

# Use pytesseract to perform OCR on the grayscale image
text = pytesseract.image_to_string(gray_image)
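
The hard-coded path above only works when Tesseract is installed in that exact Windows location. One possible way to make the script more portable is to look for the executable on the PATH first and fall back to explicit candidates. This is a sketch under that assumption; `find_tesseract` is a hypothetical helper, not part of pytesseract.

```python
import os
import shutil


def find_tesseract(extra_candidates=()):
    """Return a path to the tesseract executable, or None if none is found."""
    # First choice: whatever "tesseract" resolves to on the PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Fallback: explicit candidate paths supplied by the caller.
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_tesseract([
        r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",  # path used above
    ])
    print(path or "Tesseract executable not found")
```

If a path is returned, assign it to `pytesseract.pytesseract.tesseract_cmd` before calling `pytesseract.image_to_string`.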

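Step 4: Display the Results

If you want to see the original image alongside the extracted text, you can display the image with OpenCV and print the text to the console: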

# Print the extracted text to the console
print("Extracted Text:", text)

# Display the original image using OpenCV; press any key to close it
cv2.imshow('Original Image', image)
cv2.waitKey(0)

cv2.destroyAllWindows()  # Close the OpenCV window
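
Printing to the console is fine for a quick look, but `cv2.imshow` needs a display and the text is often wanted in a file. As a hedged sketch, the extracted text could be tidied and saved like this. The helper names `clean_text` and `save_text` and the output filename are assumptions for illustration.

```python
from pathlib import Path


def clean_text(raw_text):
    """Strip trailing whitespace and drop empty lines from OCR output."""
    lines = (line.rstrip() for line in raw_text.splitlines())
    return "\n".join(line for line in lines if line)


def save_text(raw_text, output_path):
    """Write the cleaned text to output_path as UTF-8 and return the path."""
    path = Path(output_path)
    path.write_text(clean_text(raw_text), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Made-up stand-in for the `text` variable produced in Step 3.
    text = "Hello  \n\n  OCR world\n\x0c"
    saved = save_text(text, "extracted_text.txt")
    print(saved.read_text(encoding="utf-8"))
```

Leading whitespace is kept on purpose, since indentation in OCR output can carry layout information.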

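Original Image

How to Build OCR in Python: Figure 2

Extracted Text

How to Build OCR in Python: Figure 3

As you can see, results vary with the quality and complexity of the image. In some cases, images with complex structures such as tables may need additional training (similar to machine-learning training).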
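
One way to cope with uneven results is to look at the per-word confidence that Tesseract reports and discard low-confidence words. The sketch below filters a dictionary in the shape returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)`. The helper name `confident_words`, the cutoff of 60, and the sample data are all assumptions for illustration.

```python
def confident_words(ocr_data, min_confidence=60.0):
    """Return recognized words whose confidence is at least min_confidence."""
    words = []
    for word, confidence in zip(ocr_data["text"], ocr_data["conf"]):
        # conf may arrive as a string or a number; -1 marks non-word layout rows.
        if word.strip() and float(confidence) >= min_confidence:
            words.append(word)
    return words


if __name__ == "__main__":
    # Made-up sample mimicking the "text" and "conf" columns of image_to_data.
    sample = {
        "text": ["", "Invoice", "N0.", "12345"],
        "conf": ["-1", "96.1", "41.7", "88"],
    }
    print(confident_words(sample))
```

A suitable cutoff depends on the images, so it is worth checking a few samples by hand before settling on a value.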

IronOCR

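In a world awash with data, effortlessly converting printed text into machine-readable content is a transformative capability.

Enter IronOCR, a library that lets developers integrate powerful Optical Character Recognition (OCR) into their applications with ease.

Whether you are extracting data from scanned documents, automating data entry, or improving accessibility, IronOCR offers a comprehensive solution that goes beyond traditional text recognition.

In this section, we look at IronOCR's features and at how it can bridge the physical and digital worlds.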

Installing IronOCR

You can install IronOCR from the NuGet Package Manager Console by running the following command:

Install-Package IronOcr

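IronOCR can also be downloaded from the official NuGet website.

Extracting Text from an Image with IronOCR

In this section, we see how to use IronOCR to extract text from an image. The following C# source code does so: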

using IronOcr;
using System;

// Create the OCR engine
var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    input.AddImage("r3.png");            // Image to read
    OcrResult result = ocr.Read(input);  // Run OCR on the input
    string text = result.Text;           // Recognized text
    Console.WriteLine(text);
}

' VB.NET equivalent of the C# snippet above
Imports IronOcr
Imports System

Dim ocr = New IronTesseract()

Using input = New OcrInput()
	input.AddImage("r3.png")
	Dim result As OcrResult = ocr.Read(input)
	Dim text As String = result.Text
	Console.WriteLine(text)
End Using

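Output

How to Build OCR in Python: Figure 4

Conclusion

In this tutorial, we explored how to build an Optical Character Recognition (OCR) system in Python and showed how easily text can be extracted from images.

Using libraries such as Tesseract and OpenCV, we walked through the basic steps, from loading and preprocessing an image to extracting its text with the Tesseract OCR engine.

We also touched on potential challenges such as accuracy limits, which advanced solutions like IronOCR aim to address.

Whether you take the DIY route or adopt a more advanced tool, OCR turns images into actionable text, simplifies data entry, and improves accessibility. With this knowledge, you are ready to start bridging the visual and digital worlds.

To get started with IronOCR, visit this link. For the full tutorial on extracting text from images, visit here.

If you want to try IronOCR for free today, be sure to use the trial that IronOCR offers to explore all of its uses and potential in a commercial environment without watermarks. To keep using it after the 15 days end, simply purchase a license.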

Kannapat Udonpant

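Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Production Engineering. In 2022, he used his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who writes most of the code used in IronPDF. Besides peer learning, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.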