Extracting Table Data from Scanned Images with IronOCR: Live Demo Recap

Kannapat Udonpant
Updated: June 22, 2025

Extracting data from scanned images is a common challenge, especially when it involves structured data such as tables. With IronOCR's machine learning capabilities, you can extract table data from scans, including cell values and their positions. In this demo, Shadman Majid, Software Sales Engineer, walks through the code implementation step by step, while Anne Lazarakis, Sales and Marketing Director, shares real-world use cases from Iron Software customers.

Real-World Use Cases

Explained by Anne Lazarakis, Sales and Marketing Director

Insurance Claim Processing (Opyn Market)

In the highly regulated U.S. healthcare insurance industry, companies like Opyn Market still receive many documents by fax. These scanned documents often contain tabular data that must be accurately extracted and entered into internal systems. With IronOCR, they can automate this process, reducing manual work and the errors that come with manual data entry.

Logistics & Food Distribution (iPAP)

iPAP, the largest cheese distributor in the U.S., uses IronOCR to manage over 200 client orders. Their invoices come in various formats with inconsistent table layouts. IronOCR helps them extract purchase order numbers, shipment dates, and item details from scanned documents efficiently, even when the formatting varies. This automation has saved them between $40,000 and $45,000 annually.
Technical Overview

Live Coding Session With Shadman Majid, Software Sales Engineer

IronOCR uses proprietary machine learning models to detect and extract table data from scanned documents. This feature supports:

- Extraction of table cells and coordinates
- OCR of scanned images and multi-frame PDFs
- Compatibility with C#, VB.NET, .NET Standard, .NET Framework, and .NET Core

To access this functionality, you'll need:

- The IronOCR NuGet package
- The IronOcr.Extensions.AdvancedScanning NuGet package, for table detection via ML models

These packages include the trained ML models necessary for table structure detection and accurate OCR.

Example Code for Extracting Tables

Below is a sample C# code snippet that demonstrates how to use IronOCR to extract table data from images:

    // Import the IronOCR namespace, plus System (Console) and System.Linq (Select)
    using System;
    using System.Linq;
    using IronOcr;

    // Initialize the IronTesseract engine that runs the OCR
    var ocr = new IronTesseract();

    // Load the image containing the table
    using (var input = new OcrInput("invoice.jpg"))
    {
        // Perform OCR and extract text data, including tables
        var result = ocr.Read(input);

        // Iterate through each page in the document
        foreach (var page in result.Pages)
        {
            // Iterate through each table found on the page
            foreach (var table in page.Tables)
            {
                Console.WriteLine("Table found:");

                // Iterate through each row in the table
                foreach (var row in table.Rows)
                {
                    // Join the text of the row's cells into a comma-separated string
                    var cells = string.Join(", ", row.Cells.Select(cell => cell.Text));
                    Console.WriteLine(cells);
                }
            }
        }
    }

The same example in VB.NET:

    ' Import the IronOCR namespace, plus System (Console) and System.Linq (Select)
    Imports System
    Imports System.Linq
    Imports IronOcr

    Module Program
        Sub Main()
            ' Initialize the IronTesseract engine that runs the OCR
            Dim ocr = New IronTesseract()

            ' Load the image containing the table
            Using input = New OcrInput("invoice.jpg")
                ' Perform OCR and extract text data, including tables
                Dim result = ocr.Read(input)

                ' Iterate through each page in the document
                For Each page In result.Pages
                    ' Iterate through each table found on the page
                    For Each table In page.Tables
                        Console.WriteLine("Table found:")

                        ' Iterate through each row in the table
                        For Each row In table.Rows
                            ' Join the text of the row's cells into a comma-separated string
                            Dim cells = String.Join(", ", row.Cells.Select(Function(cell) cell.Text))
                            Console.WriteLine(cells)
                        Next row
                    Next table
                Next page
            End Using
        End Sub
    End Module

- Loading an Image: The script begins by initializing the IronTesseract engine and loading the image file to process, here named invoice.jpg.
- OCR Execution: It performs OCR on the input to extract text data, including any tables.
- Table Extraction: The script iterates through each detected table and its rows, printing the text of each row's cells as one comma-separated line.

Ensure you have installed the necessary NuGet packages for IronOCR before running this script.

Conclusion

IronOCR makes it easy to automate the extraction of complex table data from scanned documents. Whether you're in healthcare, logistics, finance, or manufacturing, this solution offers reliability, accuracy, and cost savings. With just a few lines of code, you can cut manual data entry and reduce human error.

Want to see it in action? Book a live demo with one of our engineers.

Frequently Asked Questions

How can I extract table data from scanned images using C#?
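You can extract table data from scanned images using IronOCR's advanced machine learning capabilities. The process involves running OCR on the image with the IronTesseract engine and extracting information, including cell values and their coordinates.

What are the real-world applications of extracting table data from scanned documents?

Real-world applications include automating insurance claim processing by extracting table data from faxed documents, and managing customer orders in logistics, where invoices arrive in varied formats with inconsistent table layouts, as shown by companies such as Opyn Market and iPAP.

What technical capabilities does IronOCR offer for table data extraction?

IronOCR offers extraction of table cells and their coordinates, OCR of scanned images and multi-frame PDFs, and compatibility with C#, VB.NET, .NET Standard, .NET Framework, and .NET Core.

What steps does the code for extracting table data with IronOCR involve?

The process involves initializing the IronTesseract engine, loading the image, running OCR to extract text data, and iterating through each detected table and its rows to output the cell contents.

Which packages are required to extract table data?

You need the IronOCR NuGet package together with the IronOcr.Extensions.AdvancedScanning package, which provides the trained ML models needed for table detection and accurate OCR.

How does IronOCR improve efficiency in the healthcare and logistics industries?

IronOCR reduces manual work and human error by automating the extraction of complex table data from scanned documents, giving industries such as healthcare and logistics significant efficiency gains and cost savings.

Can I see a live demo of IronOCR's features?

Yes. You can book a live demo with an Iron Software engineer to see IronOCR in action and explore its table data extraction capabilities.

About the Author

Kannapat Udonpant, Software Engineer

Before becoming a software engineer, Kannapat completed a PhD in Environmental Resources at Hokkaido University in Japan. While pursuing his degree, he also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he used his C# skills to join Iron Software's engineering team, where he focuses on IronPDF. Kannapat values his job because he learns directly from the developer who wrote most of the IronPDF code. Beyond learning from his peers, he enjoys the social side of working at Iron Software. When he is not writing code or documentation, Kannapat can usually be found gaming on his PS5 or rewatching The Last of Us.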