OCR Tools

Install Tesseract (Step-by-Step Tutorial with Pictures)

Kannapat Udonpant
Updated: August 20, 2025

What is Tesseract OCR?

Tesseract is an open-source software library released under the Apache License 2.0. It was originally developed by Hewlett Packard in the 1980s. It is a text recognition tool used primarily to identify and extract text from images, and it provides a command-line interface for this purpose.

IronOCR is built on top of Tesseract. It reads text from images and PDFs with 99.8% accuracy in just a few lines of code, without relying on external web services. IronOCR also extracts content from poor-quality images and scans, so you can say goodbye to cumbersome performance tuning and tedious preprocessing work. Trust IronOCR to do the job quickly when speed, accuracy, and ease of use matter. Learn more about IronOCR's features or sign up for a trial today!

How to Download Tesseract OCR in Windows

1. Download Tesseract Installer for Windows
2. Install Tesseract OCR
3. Add installation path to Environment Variables
4. Run Tesseract OCR

1. Download Tesseract Installer for Windows

To use the tesseract command on Windows, we first need to download the Tesseract OCR Windows installer (.exe). The latest version of Tesseract OCR can be downloaded from several places. One of them is UB Mannheim, whose repository is forked from tesseract-ocr/tesseract (the main repository).

Figure: Tesseract Wiki

Download the 64-bit Windows installer, tesseract-ocr-w64-setup-5.3.0.20221222.exe.
For macOS users, Tesseract can be installed from the terminal with either of the commands below.

With Homebrew:

    brew install tesseract

With MacPorts:

    sudo port install tesseract

2. Install Tesseract OCR

Next, we'll install Tesseract using the .exe file we downloaded in the previous step. Launch the .exe installer to start the Tesseract installation.

Installer Language

Once the setup has finished unpacking, the Installer Language dialog will appear. You can install Tesseract to use multiple languages by selecting additional language packs, but here we'll just install the language data for English.

Figure: Tesseract Installer

Click OK, and the installer language for Tesseract OCR is set.

Tesseract OCR Setup

Next, the setup wizard will appear. It guides you through the Tesseract installation for Windows.

Figure: Tesseract OCR Setup Wizard

Click Next to continue the installation.

Accept License Agreement

Tesseract OCR is licensed under the Apache License, Version 2.0. Because it is open source and free to use, you can redistribute and modify Tesseract without any royalty concerns.

Figure: Tesseract OCR is licensed under Apache License v2.0. Please accept this license to continue with the installation.

Click I Agree to proceed with the installation.

Choose Users

You can choose to install Tesseract for all users or for a single user.

Figure: Choose to install Tesseract OCR for the current user (you) or for all user accounts.

Click Next to choose the components to install with Tesseract.

Choose Components

In the list of components, ScrollView, Training Tools, Shortcuts creation, and Language data are all selected by default. We will keep the default selection. You can include or skip any component depending on your needs, but usually all of them are worth installing.

Figure: Here, you can choose to include or exclude Tesseract OCR components. For the best results, continue the installation with the default components selected.
Click Next to choose the installation location.

Choose Installation Location

Next, we'll choose where to install Tesseract. Make sure you copy the destination folder path. We will need it later to add the installation location to the machine's Path environment variable.

Figure: Select an install location for the Tesseract OCR library, and remember this location for later.

Click Next to continue setting up the installation.

Choose the Start Menu Folder

This is the last step, in which we create shortcuts in the Start menu. You can give the folder any name, but I've kept the default.

Figure: Choose the name of Tesseract OCR's Start Menu folder.

Now, click Install and wait for the installation to complete. Once it is done, the completion screen will appear. Click Finish, and Tesseract OCR is installed on Windows.

Figure: Tesseract OCR installation is now complete.

3. Add Installation Path to System Environment Variables

Now, we will add the Tesseract installation path to the Windows environment variables. In the Start menu, type "environment variables" or "advanced system settings".

Figure: The Windows System Properties dialog box

System Properties

Once the System Properties dialog box opens, click the Advanced tab, and then click the Environment Variables button, located towards the bottom right of the dialog. The Environment Variables dialog box will open.

Environment Variables

Under System variables, click the Path variable.

Figure: Access the Windows system environment variables.

Now, click Edit.

Add Tesseract OCR for Windows Installation Directory to Environment Variables

In the Edit environment variable dialog box, click New. Paste the installation path that was copied during the second step, and click OK.

Figure: Edit the Windows Path system environment variable by adding an entry that contains the absolute path to the Tesseract OCR installation.

That's it!
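Before moving on, the new Path entry can be checked from a terminal. The following is a minimal sketch, assuming a POSIX-style shell (for example Git Bash on Windows, or the macOS terminal); find_on_path and report_tool are hypothetical helper names introduced here for illustration and are not part of Tesseract. Open a new terminal window first, since terminals that were already open do not pick up changes to environment variables.

```shell
#!/bin/sh
# Sketch: report whether an executable can be found on PATH.
# find_on_path and report_tool are hypothetical helper names (assumptions).

find_on_path() {
    # command -v prints the resolved location and returns non-zero if not found
    command -v "$1" 2>/dev/null
}

report_tool() {
    if tool_path=$(find_on_path "$1"); then
        echo "$1 found at: $tool_path"
    else
        echo "$1 is not on PATH; re-check the Path entry added above"
        return 1
    fi
}

# Check for the Tesseract executable; "|| true" lets the script continue either way
report_tool tesseract || true
```

In the Windows Command Prompt, running `where tesseract` performs a comparable lookup.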
We have successfully downloaded, installed, and set the environment variable for Tesseract OCR on a Windows machine.

4. Run Tesseract OCR

To check that Tesseract OCR for Windows was installed and added to the environment variables, open the Command Prompt (cmd) on your Windows machine and run the "tesseract" command. If everything worked, a quick usage guide should be displayed, listing the OCR options and single options such as the one that prints the Tesseract version.

Figure: Run the tesseract command in the Windows command line (or Windows PowerShell) to make sure that the installation steps above were done correctly. The console output is the expected result of a successful Windows installation.

Congratulations! We have successfully installed Tesseract OCR for Windows.

IronOCR Library

IronOCR is a Tesseract-based C# library that allows .NET software developers to identify and extract text from images and PDF documents. It is built purely in .NET on top of the Tesseract engine.

Install with NuGet Package Manager

Installing IronOCR in Visual Studio or from the command line with the NuGet Package Manager is straightforward. In Visual Studio, open the console from the menu:

Tools > NuGet Package Manager > Package Manager Console

Then type the following command in the console:

    Install-Package IronOcr

This installs IronOCR, and you can now use it to its full potential.

You can also download other IronOCR NuGet packages for different platforms:

Windows: https://www.nuget.org/packages/IronOcr
Linux: https://www.nuget.org/packages/IronOcr.Linux
MacOS: https://www.nuget.org/packages/IronOcr.MacOs
MacOS ARM: https://www.nuget.org/packages/IronOcr.MacOs.ARM

IronOCR with Tesseract 5

The sample code below shows how easy it is to use IronOCR Tesseract to read text from an image and perform OCR using C#.
C#:

    // Import the IronOCR library
    using System;
    using IronOcr;

    // Create an instance of IronTesseract
    var Ocr = new IronTesseract();

    // Read the image and take the recognized text
    string Text = Ocr.Read(@"test-files/redacted-employmentapp.png").Text;

    // Output the extracted text to the console
    Console.WriteLine(Text);

VB.NET (the statements below belong inside a Sub):

    ' Import the IronOCR library
    Imports IronOcr

    ' Create an instance of IronTesseract
    Dim Ocr = New IronTesseract()

    ' Read the image and take the recognized text
    Dim Text As String = Ocr.Read("test-files/redacted-employmentapp.png").Text

    ' Output the extracted text to the console
    Console.WriteLine(Text)

If you want more robust code, the following achieves the same task:

C#:

    // Import the IronOCR library
    using System;
    using IronOcr;

    // Create an instance of IronTesseract
    var Ocr = new IronTesseract();

    // Use the OcrInput class to handle multiple images
    using (var Input = new OcrInput())
    {
        // Add an image to the input collection
        Input.AddImage("test-files/redacted-employmentapp.png");
        // You can add any number of images

        // Read the OCR text from the input
        var Result = Ocr.Read(Input);

        // Output the extracted text to the console
        Console.WriteLine(Result.Text);
    }

VB.NET (the statements below belong inside a Sub):

    ' Import the IronOCR library
    Imports IronOcr

    ' Create an instance of IronTesseract
    Dim Ocr = New IronTesseract()

    ' Use the OcrInput class to handle multiple images
    Using Input = New OcrInput()
        ' Add an image to the input collection
        Input.AddImage("test-files/redacted-employmentapp.png")
        ' You can add any number of images

        ' Read the OCR text from the input
        Dim Result = Ocr.Read(Input)

        ' Output the extracted text to the console
        Console.WriteLine(Result.Text)
    End Using

Input Image

Figure: Sample input image for IronOCR processing

Output Image

The output is printed on the console.

Figure: The console output returned from running IronOCR on the sample image.

Why Choose IronOCR?

IronOCR is very easy to install. It provides a complete and well-documented .NET software library. IronOCR achieves a 99.8% text-detection accuracy rate without the need for other third-party libraries or web services. It also provides multithreading support. Most importantly, IronOCR works with well over 125 international languages.

Install IronOCR from NuGet for your next OCR project to see its full capabilities for yourself. A trial license gives free, unrestricted access to IronOCR's full capabilities for 30 days.

Conclusion

In this tutorial, we learned how to download and install Tesseract OCR on a Windows machine. Tesseract OCR is excellent software for C++ developers, but it does have some limits. It is not fully developed for .NET. Scanned or photographed images need to be processed and standardized to a high resolution and kept free from digital noise. Only then can Tesseract work on them accurately.

In contrast, IronOCR can work with any image provided, whether scanned or photographed, with just a single line of code. IronOCR also uses Tesseract as its internal OCR engine, but it is finely tuned to get the best out of Tesseract and built especially for C#, with high performance and improved features.

You can download the IronOCR software product from this link.
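Kannapat Udonpant
Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he can learn directly from the developers who write most of the code used in IronPDF. Besides peer learning, Kannapat enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.

Related Articles

Power Automate OCR (Developer Tutorial). Updated June 22, 2025. Optical character recognition technology is used in document digitization, automated PDF data extraction and entry, invoice processing, and making scanned PDFs searchable.

EasyOCR vs Tesseract (OCR Feature Comparison). Updated June 22, 2025. Popular OCR tools and libraries such as EasyOCR, Tesseract OCR, Keras-OCR, and IronOCR are commonly used to integrate this functionality into modern applications.

How to Convert a Picture to Text. Updated June 22, 2025. In the current digital era, converting image-based content into easy-to-read, editable, searchable text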