EasyOCR vs Tesseract (OCR Feature Comparison)

Kannapat Udonpant
Updated: June 22, 2025

Optical Character Recognition (OCR) is the technology that converts documents, such as scanned paper documents, PDF files, or camera-captured high-resolution images, into editable and searchable data. By recognizing extracted text features, aided by morphological image operations, OCR automates data entry, which speeds up information processing and makes it more accurate. An OCR engine scans the document, recognizes characters such as letters, numbers, and symbols, and translates them into a machine-readable format. Its uses include book digitization, form processing, document workflow automation, and improved accessibility for blind and visually impaired people.

With the development of deep learning and AI, OCR engines have become very accurate at recognizing complex formats, multi-language documents, and even poor-quality images. Popular OCR tools and libraries, such as EasyOCR, Tesseract OCR, Keras-OCR, and IronOCR, are commonly used to integrate this functionality into modern applications.

EasyOCR

EasyOCR is an open-source Python library that aims to make text extraction from images simple and efficient. It uses deep learning techniques and supports over 80 languages, including Latin-script languages, Chinese, Arabic, and many others. Its API is simple enough that developers can integrate OCR into their applications with little setup. With EasyOCR, one can do simple document digitization, license plate recognition, or general text extraction from pictures.
EasyOCR is well known for robust text recognition, especially with multi-line text and low-quality images, which makes it suitable for real-world use cases while relying on only a few dependencies. It runs on a CPU without requiring a GPU, although a GPU improves performance, which makes it attractive for developers who need flexible OCR capabilities.

Features of EasyOCR

Several features make EasyOCR a comprehensive and powerful OCR utility:

Recognizes over 80 languages: EasyOCR can read Chinese, Japanese, Korean, Arabic, Latin-script languages, and many more, including languages with complex scripts.
Deep learning-based recognition: It uses deep learning models that deliver high performance and precision, especially on noisy or distorted text and images.
Simple API: The easy-to-use API lets users add OCR to an application quickly, without further configuration.
Multi-line text detection: It recognizes multiple lines of text, which is useful for documents, books, or multi-line signs.
Lightweight: It runs well on the CPU and can leverage a GPU for improved performance, so it remains workable on basic hardware.
Image pre-processing: Basic image pre-processing tools are available for improving OCR results on noisy or low-resolution images.
Flexible deployment: It works on various platforms and is relatively simple to embed in Python applications.

Installation

EasyOCR can be installed using pip, Python's package manager. Its essential dependencies are the PyTorch libraries torch and torchvision, which can be installed together with EasyOCR.

Install EasyOCR: open a terminal or command line and enter the command:

    pip install easyocr

Install PyTorch, if not installed (required by EasyOCR): EasyOCR runs on PyTorch. If it was not installed automatically in your environment, follow the official PyTorch installation guide.
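Before running any OCR code, it can help to confirm that the packages named in the installation steps are actually visible to the Python interpreter in use. The following is a minimal sketch using only the standard library; the names `REQUIRED_PACKAGES` and `missing_packages` are hypothetical helpers introduced for this illustration and are not part of EasyOCR or PyTorch.

```python
from importlib.util import find_spec

# Packages named in the installation steps (assumed top-level import names).
REQUIRED_PACKAGES = ("easyocr", "torch", "torchvision")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("EasyOCR and its PyTorch dependencies are installed.")
```

The check only looks the modules up without importing them, so it stays fast even though importing PyTorch can take a while. It does not verify versions or GPU support.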
Once installed, you'll be ready to use EasyOCR for text extraction tasks.

OCR Image using EasyOCR

The following sample Python code demonstrates how to use EasyOCR to perform OCR on an image:

    import cv2
    import easyocr
    import matplotlib.pyplot as plt

    # Initialize the EasyOCR reader; the list names the languages to recognize
    reader = easyocr.Reader(['en'])  # 'en' for English

    # Load the image
    image_path = 'sample_image.png'  # Path to the image
    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Perform OCR on the image
    result = reader.readtext(image_path)

    # Print detected text and its confidence score
    for (bbox, text, prob) in result:
        print(f"Detected Text: {text} (Confidence: {prob:.4f})")

    # Optionally, display the image with bounding boxes around the detected text
    for (bbox, text, prob) in result:
        # Unpack the four corner points of the bounding box
        top_left, top_right, bottom_right, bottom_left = bbox
        top_left = tuple(map(int, top_left))
        bottom_right = tuple(map(int, bottom_right))
        # Draw a rectangle around the text
        cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)

    # Convert the image to RGB (OpenCV loads images in BGR by default)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Display the image with bounding boxes
    plt.imshow(image_rgb)
    plt.axis('off')
    plt.show()

The output of the above code is the input image with a green box drawn around each detected text region, along with the detected strings and their confidence scores printed to the console.

Tesseract

Tesseract is one of the most popular open-source optical character recognition engines and offers many configuration options for customization. It can be accessed from Python applications through the pytesseract wrapper. Tesseract was originally developed by Hewlett-Packard and later enhanced by Google. It is highly versatile, capable of extracting text in more than 100 languages from images, including PDF pages that have been converted to images.

Tesseract is renowned for its ability to detect and extract machine-printed text. It offers multi-language recognition, supports training on new fonts, and performs text layout analysis. Tesseract is extensively used for digitizing documents, scanning receipts, automating data entry, and producing searchable PDFs. Combined with pytesseract, it is a powerful tool for Python developers working on OCR-related tasks.

Features of Tesseract OCR

Notable features of Tesseract and pytesseract include:

Multi-language support: Tesseract can read over 100 languages, and pytesseract provides easy multi-language OCR support within Python scripts. Tesseract also allows training for additional custom fonts and languages, extending its capabilities.
Image-to-text conversion: pytesseract extracts text from various image formats such as PNG, JPEG, BMP, GIF, and TIFF, enabling OCR on diverse sources.
Searchable PDF output: Tesseract can recognize the text on scanned pages and write the result as a searchable PDF, allowing users to index the content of scanned documents.
Complex text layout recognition: Its layout analysis can read complex layouts, including multi-column documents and tables, which helps when extracting text from non-standard formats.
Custom configuration: Users can pass custom Tesseract configuration parameters through pytesseract to fine-tune OCR performance, for example by choosing an appropriate recognition mode.
Simple API: The simple API in pytesseract makes it easy for developers to add OCR to Python projects with minimal code.

pytesseract works well with other libraries, such as OpenCV, PIL (Python Imaging Library), and NumPy, which can be used for image preprocessing to improve OCR accuracy.

Installation

pytesseract is only a wrapper, so the Tesseract engine itself must be installed first. After installing Tesseract, install the pytesseract package using pip:

    pip install pytesseract

OCR image using pytesseract

Here is sample Python code that uses pytesseract to perform OCR on an image:

    import pytesseract
    from PIL import Image

    # Set the path to the Tesseract executable (needed when it is not on your PATH)
    pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'

    # Open the image and perform OCR
    image = Image.open('sample_image.png')
    text = pytesseract.image_to_string(image)

    # Print the extracted text
    print(text)

The output of the above code is the text extracted from the image, printed to the console.

IronOCR

IronOCR is a powerful Optical Character Recognition library that lets .NET developers extract text efficiently from images, PDFs, and other document formats.
Its algorithms provide high accuracy even for complex layouts or multi-language documents, and it supports the JPEG, PNG, GIF, and TIFF formats. The library offers configurable settings, so the OCR process can be fine-tuned with parameters such as image resolution or text orientation. Built-in image preprocessing improves the quality of input images, which raises recognition accuracy, and results can be exported as searchable PDFs for easier information retrieval. Because it integrates readily into web applications, IronOCR is a strong choice for developers looking to implement reliable text extraction and document digitization solutions across various fields.

Features of IronOCR

High Accuracy: Uses advanced algorithms to provide high accuracy in text recognition across varied document complexity and fonts.
Multiple Formats Support: Accepts image formats such as JPEG, PNG, GIF, and TIFF, in addition to PDFs, for versatility across applications.
Multilingual Recognition: Supports multilingual OCR, yielding accurate results in diverse linguistic contexts.
Text Layout Preservation: Maintains the original document layout, so extracted text retains its formatted structure.
Configurable OCR: Offers configurable parameters for image resolution, text orientation, and more, allowing developers to optimize OCR performance for specific images.
Image Preprocessing: Includes basic tools for enhancing images, such as noise removal, contrast adjustment, and resizing, to improve OCR accuracy.
Searchable PDF Conversion: Converts scanned images and documents directly into searchable PDFs for efficient data management and retrieval.
Easy Integration: Integrates straightforwardly into .NET applications, so OCR functionality is easy to add.
Batch Processing: Supports processing multiple images or documents at once, which is useful for handling large volumes of data.
Installation

To install IronOCR, create or open a project in Visual Studio, open the NuGet Package Manager, search for "IronOCR," select the latest version, and click Install.

Sample Code using IronOCR

The following C# code demonstrates how to use IronOCR for OCR processing:

    using System;
    using IronOcr;

    class Program
    {
        static void Main(string[] args)
        {
            // Initialize the IronTesseract engine
            var Ocr = new IronTesseract();

            // Set the recognition language
            Ocr.Language = OcrLanguage.English;

            // Define the path to the input image
            var inputFile = @"path\to\your\image.png";

            // Read the image and perform OCR
            using (var input = new OcrInput(inputFile))
            {
                var result = Ocr.Read(input);

                // Display the extracted text
                Console.WriteLine("Text:");
                Console.WriteLine(result.Text);
            }
        }
    }

The equivalent VB.NET code:

    Imports System
    Imports IronOcr

    Friend Class Program
        Shared Sub Main(ByVal args() As String)
            ' Initialize the IronTesseract engine
            Dim Ocr = New IronTesseract()

            ' Set the recognition language
            Ocr.Language = OcrLanguage.English

            ' Define the path to the input image
            Dim inputFile = "path\to\your\image.png"

            ' Read the image and perform OCR
            Using input = New OcrInput(inputFile)
                Dim result = Ocr.Read(input)

                ' Display the extracted text
                Console.WriteLine("Text:")
                Console.WriteLine(result.Text)
            End Using
        End Sub
    End Class

Comparative Assessment

High Accuracy

IronOCR stands out for its accuracy on complex layouts, noisy images, and low-resolution text when compared to Tesseract or EasyOCR.
Its built-in image preprocessing tools, such as noise reduction and contrast adjustment, contribute to high accuracy in real-world applications.

Multi-format and Layout Preservation

IronOCR excels at processing various image formats, PDF files, and multi-column layouts while preserving the original document structure and formatting, so it is well suited to projects where layout preservation is paramount. Its ability to convert images and scanned documents directly into fully searchable PDFs, without relying on additional tools or libraries, gives it an advantage over Tesseract and EasyOCR.

IronOCR Provides Advanced Preprocessing

With IronOCR's advanced preprocessing features, even poor-quality images can be recognized with high accuracy. This reduces the need for additional libraries like OpenCV and makes IronOCR a comprehensive solution for text extraction.

Scalability and Performance

IronOCR is optimized for high-speed, resource-efficient OCR and scales to large document processing tasks, which is a priority for enterprise applications.

Support and Updates

As a commercially supported product, IronOCR benefits from regular updates, bug fixes, and dedicated assistance, offering long-term reliability and the latest advancements in OCR, whereas open-source options such as Tesseract and EasyOCR depend on community support.

Conclusion

Among the OCR libraries compared here, IronOCR is distinguished by its accuracy, ease of integration, pre-processing capabilities, and creation of searchable PDFs. It handles complex layouts and noisy images while preserving document structure, and it supports multiple languages out of the box. These features make it preferable to open-source solutions like Tesseract and EasyOCR. With its seamless integration into .NET applications, IronOCR serves as a comprehensive package for developers seeking high-quality OCR in diverse projects.
Given its performance, scalability, and commercial support, IronOCR is well suited to both small and large-scale document digitization initiatives, offering reliable and efficient text recognition. To learn more about IronOCR and its functionality, visit the documentation page. For further details about Iron Software products, refer to the library suite page.

Kannapat Udonpant
Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, he also became a member of the Vehicle Robotics Laboratory, part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join the engineering team at Iron Software, where he focuses on IronPDF. Kannapat values his job because he can learn directly from the developer who writes most of the IronPDF code. Besides peer learning, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, he can usually be found gaming on his PS5 or rewatching The Last of Us.