Invoice OCR API Using IronOCR (Developer Tutorial)

Kannapat Udonpant. Updated: June 22, 2025

An invoice OCR API uses machine learning and computer vision to transform invoice data into a format suitable for automated processing. It addresses the delays, costs, and errors of manual data entry by extracting details such as vendor information, invoice numbers, and prices from both digital and scanned invoices. This article uses IronOCR as the invoice OCR API.

How to Create Invoice OCR API

- Download and install the invoice OCR API (IronOCR).
- Create a new C# project in Visual Studio or open an existing one.
- Load an existing image file using the OcrInput class.
- Extract the text from the image using the IronTesseract.Read method.
- Print the extracted text to the console using Console.WriteLine.

1. IronOCR

IronOCR, developed by Iron Software, is an OCR library offering a range of tools for developers. It uses machine learning and computer vision to extract text from scanned documents, images, and PDFs, enabling automated processing. Its APIs integrate into various languages and platforms, reducing manual data entry errors and improving efficiency. Extracted data can be analyzed and integrated into existing systems, aiding decision-making and productivity. Features such as image preprocessing, barcode recognition, and file parsing increase its versatility.

2. Prerequisites

Before you can start working with IronOCR, a few prerequisites need to be in place.
These prerequisites include:

- A suitable development environment on your computer. This typically means an Integrated Development Environment (IDE) such as Visual Studio.
- A basic understanding of the C# programming language, so that you can follow and modify the code examples in this article.
- The IronOCR library installed in your project, either through the NuGet Package Manager in Visual Studio or through the command line.

With these prerequisites in place, you are ready to start working with IronOCR.

3. Creating a New Visual Studio Project

To get started with IronOCR, first create a new Visual Studio project.

- Open Visual Studio, go to the File menu, hover over New, and click Project.
- In the new window, select Console Application and click Next.
- In the next window, enter the name and location of your new project and click Next.
- Finally, select the target framework and click Create.

Your new Visual Studio project is now created. Next, install IronOCR.

4. Installing IronOCR

There are several ways to download and install the IronOCR library. These are the two simplest:

- Using the Visual Studio NuGet Package Manager
- Using the Visual Studio Command-Line

4.1. Using the Visual Studio NuGet Package Manager

IronOCR can be added to a C# project with the Visual Studio NuGet Package Manager. Open the NuGet Package Manager graphical user interface by selecting Tools > NuGet Package Manager > Manage NuGet Packages for Solution.

A new window will appear. Search for IronOCR and install the package in the project.
Additional language packs for IronOCR can also be installed using the same method.

4.2. Using the Visual Studio Command-Line

In Visual Studio, go to Tools > NuGet Package Manager > Package Manager Console.

Enter the following line in the Package Manager Console tab to install IronOCR:

    Install-Package IronOcr

The package will now download and install in the current project and be ready to use.

5. Extract data from Invoices using IronOCR

With IronOCR, you can extract data from invoices in a few lines of code and pass that data on to further processes such as data entry, replacing manual entry.

Here is an example invoice to extract text from.

[Image: the sample invoice]

Now, let's write the code to extract all the data from this invoice.

    using IronOcr;
    using System;

    // Initialize a new instance of the IronTesseract class
    var ocr = new IronTesseract();

    // Use the OcrInput object to load the image file
    using (var input = new OcrInput(@"r2.png"))
    {
        // Read the image using the Read method, which performs OCR
        var result = ocr.Read(input);

        // Output the extracted text to the console
        Console.WriteLine(result.Text);
    }

VB.NET equivalent:

    Imports IronOcr
    Imports System

    Module Program
        Sub Main()
            ' Initialize a new instance of the IronTesseract class
            Dim ocr = New IronTesseract()

            ' Use the OcrInput object to load the image file
            Using input = New OcrInput("r2.png")
                ' Read the image using the Read method, which performs OCR
                Dim result = ocr.Read(input)

                ' Output the extracted text to the console
                Console.WriteLine(result.Text)
            End Using
        End Sub
    End Module

The code above loads an image through OcrInput and then extracts its text with the Read method of the IronTesseract class.

[Image: Invoice Parser]

5.1. Invoice Processing to extract specific data from invoices

You can also extract specific data from an invoice, such as the customer invoice number. The code below runs OCR on the image and then searches the extracted text with a regular expression.

    using IronOcr;
    using System;
    using System.Text.RegularExpressions;

    // Initialize a new instance of the IronTesseract class
    var ocr = new IronTesseract();

    // Use the OcrInput object to load the image file
    using (var input = new OcrInput(@"r2.png"))
    {
        // Perform OCR on the image
        var result = ocr.Read(input);

        // Regular expression for the invoice number: "INV/", 4 digits, "/", 5 digits
        var linePattern = @"INV/\d{4}/\d{5}";

        // Match the pattern in the extracted text
        var lineMatch = Regex.Match(result.Text, linePattern);

        if (lineMatch.Success)
        {
            // If a match is found, print the invoice number
            Console.WriteLine("Customer Invoice number: " + lineMatch.Value);
        }
        else
        {
            // Report a miss instead of exiting silently
            Console.WriteLine("No invoice number found.");
        }
    }

VB.NET equivalent:

    Imports IronOcr
    Imports System
    Imports System.Text.RegularExpressions

    Module Program
        Sub Main()
            ' Initialize a new instance of the IronTesseract class
            Dim ocr = New IronTesseract()

            ' Use the OcrInput object to load the image file
            Using input = New OcrInput("r2.png")
                ' Perform OCR on the image
                Dim result = ocr.Read(input)

                ' Regular expression for the invoice number: "INV/", 4 digits, "/", 5 digits
                Dim linePattern = "INV/\d{4}/\d{5}"

                ' Match the pattern in the extracted text
                Dim lineMatch = Regex.Match(result.Text, linePattern)

                If lineMatch.Success Then
                    ' If a match is found, print the invoice number
                    Console.WriteLine("Customer Invoice number: " & lineMatch.Value)
                Else
                    ' Report a miss instead of exiting silently
                    Console.WriteLine("No invoice number found.")
                End If
            End Using
        End Sub
    End Module

[Image: Invoice Scanning]

6. Conclusion

IronOCR's invoice OCR API uses machine learning and computer vision to extract data from invoices. It converts invoice text and numbers into a machine-readable format, which simplifies data extraction for analysis, integration, and process improvement. This makes it a practical way to automate invoice processing, improve accuracy, and streamline workflows such as accounts payable, including automated data entry from scanned invoices.

IronOCR builds on Tesseract and aims for high accuracy without additional settings. It supports multi-frame TIFF files, PDF files, and all popular image formats. It can also read barcode values from images.

For more information on IronOCR, visit the IronOCR homepage. For more on invoice OCR, see the detailed invoice OCR tutorial. To learn how to use computer vision to find text such as invoice fields, see the computer vision how-to.

Frequently Asked Questions

How can invoice data processing be automated with OCR?

You can automate invoice data processing with IronOCR by using its machine learning algorithms. IronOCR extracts details such as vendor information, invoice numbers, and prices from digital and scanned invoices, reducing manual entry errors and improving efficiency.

What steps are involved in setting up an invoice OCR API?

To set up an invoice OCR API with IronOCR, first download and install the library through the NuGet Package Manager in Visual Studio. Next, create a new C# project, integrate IronOCR, and use its methods to load and read image files for text extraction.

Can IronOCR extract specific data such as invoice numbers?
Yes. Once IronOCR has extracted the text, you can match patterns in it with regular expressions, as in section 5.1, to pull specific information such as the invoice number out of an invoice.

Which IronOCR features benefit invoice processing?

IronOCR includes image preprocessing, barcode recognition, and file parsing. These features improve its ability to extract and process text accurately from a variety of invoice formats, which improves data capture and workflow efficiency.

How does image preprocessing improve OCR results?

Image preprocessing in IronOCR improves OCR results by optimizing image quality before text extraction. It includes operations such as contrast adjustment and noise reduction, which allow more accurate data to be extracted from invoices.

Can IronOCR be used for both digital and scanned invoices?

Yes, IronOCR can process both digital and scanned invoices. It uses machine learning and computer vision techniques to extract text accurately across a range of formats and image qualities.

How does IronOCR handle multi-page formats and file types?

IronOCR supports multi-page formats as well as popular image and PDF file types. It can extract text from complex documents, which makes it versatile across invoice processing applications.

Where can developers find tutorials on using IronOCR?

Developers can find tutorials and other resources on the IronOCR website, which offers learning materials such as how-to guides and blog posts covering IronOCR in different scenarios.
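The regular-expression approach from section 5.1 can be extended to pull several fields out of the OCR text at once. The following is a minimal sketch in plain C# that works on a string such as result.Text and needs no IronOCR call. The class name InvoiceFieldParser, the dictionary keys, and the "Total" pattern are assumptions made for illustration and are not part of IronOCR; only the invoice number pattern comes from the article.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of IronOCR): parses fields from OCR text.
public static class InvoiceFieldParser
{
    // Pattern from section 5.1: "INV/", 4 digits, "/", 5 digits.
    private static readonly Regex InvoiceNumber =
        new Regex(@"INV/\d{4}/\d{5}");

    // Assumed layout: a "Total" label, an optional colon and "$",
    // then an amount such as 1,234.50. \b keeps "Subtotal" from matching.
    private static readonly Regex Total = new Regex(
        @"\bTotal\s*:?\s*\$?\s*(?<amount>\d+(?:,\d{3})*(?:\.\d{2})?)",
        RegexOptions.IgnoreCase);

    // Returns only the fields that were actually found in the text.
    public static Dictionary<string, string> Parse(string ocrText)
    {
        var fields = new Dictionary<string, string>();

        var number = InvoiceNumber.Match(ocrText);
        if (number.Success)
        {
            fields["InvoiceNumber"] = number.Value;
        }

        var total = Total.Match(ocrText);
        if (total.Success)
        {
            fields["Total"] = total.Groups["amount"].Value;
        }

        return fields;
    }
}
```

In the section 5.1 example, a call such as InvoiceFieldParser.Parse(result.Text) could replace the single Regex.Match. Returning only the fields that matched lets the caller decide how to handle a missing value, which matters because OCR output varies with scan quality and the patterns would need adjusting to each invoice layout.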