A receipt scanning API extracts key data from receipts using OCR technology. It streamlines data entry by reducing manual errors and improving productivity. A versatile, accurate API supports multiple languages, currencies, and formats. By automating receipt parsing, businesses can gain insights into spending patterns and make data-driven decisions. This article demonstrates how to use the C# OCR library IronOCR to extract important information from a receipt.
IronOCR is a versatile OCR library and API developed by Iron Software. It gives developers a way to extract text from sources such as scanned documents, images, and PDFs. Using OCR algorithms, computer vision, and machine learning models, IronOCR is designed for high accuracy and reliability, even in challenging scenarios. The library supports multiple languages and font styles, making it suitable for global applications. By integrating IronOCR into their applications, developers can automate data entry, text analysis, and other tasks.
With IronOCR, developers can effortlessly fetch text from a variety of sources, including documents, photographs, screenshots, and even live camera feeds as JSON responses. By utilizing sophisticated algorithms and machine learning models, IronOCR analyzes the image data, recognizes individual characters, and converts them into machine-readable text. This extracted text can then be used for various purposes, such as data entry, information retrieval, text analysis, and automation of manual tasks.
Before you start working with IronOCR, make sure the following prerequisites are in place: a suitable development environment such as Visual Studio, a basic understanding of C# programming, and the IronOCR library installed in your project.
With these prerequisites in place, you are ready to start working with IronOCR.
To get started with IronOCR, the first step is to create a new Visual Studio project.
Open Visual Studio and go to File > New > Project.
New Project Image
In the new window, select Console Application and click on Next.
Console Application
A new window will appear. Enter the name and location of your new project and click on Next.
Project Configuration
Finally, select the Target Framework and click on Create.
Target Framework
Now that your new Visual Studio project is created, let's install IronOCR.
There are several ways to download and install the IronOCR library. The two simplest are the NuGet Package Manager and the Package Manager Console.
To add IronOCR to a C# project with the Visual Studio NuGet Package Manager:
Open the NuGet Package Manager graphical user interface by selecting Tools > NuGet Package Manager > Manage NuGet Packages for Solution.
NuGet Package Manager
After this, a new window will appear. Search for IronOCR and install the package in the project.
IronOCR
Additional language packs for IronOCR can also be installed using the same method described above.
Alternatively, enter the following line in the Package Manager Console tab:
Install-Package IronOcr
Package Manager Console
The package will now download and install in the current project and be ready to use.
Extracting data from receipt images and saving it in structured form is a common need for developers. With IronOCR, you can do that in a few lines of code. The same approach can extract line items, prices, tax amounts, totals, and other fields from different document types, as long as the parsing pattern matches the receipt layout.
using IronOcr;
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

class ReceiptScanner
{
    static void Main()
    {
        var ocr = new IronTesseract();

        // Load the image of the receipt
        using (var input = new OcrInput(@"r2.png"))
        {
            // Perform OCR on the input image
            var result = ocr.Read(input);

            // Regular expression for one line item. Capture groups:
            // 1 = description, 2 = quantity, 3 = unit price, 4 = printed amount
            var descriptionPattern = @"\w+\s+(.*?)\s+(\d+\.\d+)\s+Units\s+(\d+\.\d+)\s+Tax15%\s+\$(\d+\.\d+)";

            // Variables to store extracted data
            var descriptions = new List<string>();
            var quantities = new List<decimal>();
            var unitPrices = new List<decimal>();
            var taxes = new List<decimal>();
            var amounts = new List<decimal>();

            var lines = result.Text.Split('\n');
            foreach (var line in lines)
            {
                // Match each line against the line-item pattern
                var descriptionMatch = Regex.Match(line, descriptionPattern);
                if (descriptionMatch.Success)
                {
                    descriptions.Add(descriptionMatch.Groups[1].Value.Trim());

                    // Parse with the invariant culture so "45.50" is read the same on every machine
                    var quantity = decimal.Parse(descriptionMatch.Groups[2].Value, CultureInfo.InvariantCulture);
                    var unitPrice = decimal.Parse(descriptionMatch.Groups[3].Value, CultureInfo.InvariantCulture);
                    quantities.Add(quantity);
                    unitPrices.Add(unitPrice);

                    // Calculate the 15% tax and total amount for each item
                    var subtotal = quantity * unitPrice;
                    var tax = subtotal * 0.15m;
                    taxes.Add(tax);
                    amounts.Add(subtotal + tax);
                }
            }

            // Output the extracted data
            for (int i = 0; i < descriptions.Count; i++)
            {
                Console.WriteLine($"Description: {descriptions[i]}");
                Console.WriteLine($"Quantity: {quantities[i]:0.00} Units");
                Console.WriteLine($"Unit Price: ${unitPrices[i]:0.00}");
                Console.WriteLine($"Taxes: ${taxes[i]:0.00}");
                Console.WriteLine($"Amount: ${amounts[i]:0.00}");
                Console.WriteLine("-----------------------");
            }
        }
    }
}
Imports Microsoft.VisualBasic
Imports IronOcr
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Friend Class ReceiptScanner
    Shared Sub Main()
        Dim ocr = New IronTesseract()

        ' Load the image of the receipt
        Using input = New OcrInput("r2.png")
            ' Perform OCR on the input image
            Dim result = ocr.Read(input)

            ' Regular expression for one line item. Capture groups:
            ' 1 = description, 2 = quantity, 3 = unit price, 4 = printed amount
            Dim descriptionPattern = "\w+\s+(.*?)\s+(\d+\.\d+)\s+Units\s+(\d+\.\d+)\s+Tax15%\s+\$(\d+\.\d+)"

            ' Variables to store extracted data
            Dim descriptions = New List(Of String)()
            Dim quantities = New List(Of Decimal)()
            Dim unitPrices = New List(Of Decimal)()
            Dim taxes = New List(Of Decimal)()
            Dim amounts = New List(Of Decimal)()

            Dim lines = result.Text.Split(ControlChars.Lf)
            For Each line In lines
                ' Match each line against the line-item pattern
                Dim descriptionMatch = Regex.Match(line, descriptionPattern)
                If descriptionMatch.Success Then
                    descriptions.Add(descriptionMatch.Groups(1).Value.Trim())

                    ' Parse with the invariant culture so "45.50" is read the same on every machine
                    Dim quantity = Decimal.Parse(descriptionMatch.Groups(2).Value, CultureInfo.InvariantCulture)
                    Dim unitPrice = Decimal.Parse(descriptionMatch.Groups(3).Value, CultureInfo.InvariantCulture)
                    quantities.Add(quantity)
                    unitPrices.Add(unitPrice)

                    ' Calculate the 15% tax and total amount for each item
                    Dim subtotal = quantity * unitPrice
                    Dim tax = subtotal * 0.15D
                    taxes.Add(tax)
                    amounts.Add(subtotal + tax)
                End If
            Next line

            ' Output the extracted data
            For i As Integer = 0 To descriptions.Count - 1
                Console.WriteLine($"Description: {descriptions(i)}")
                Console.WriteLine($"Quantity: {quantities(i):0.00} Units")
                Console.WriteLine($"Unit Price: ${unitPrices(i):0.00}")
                Console.WriteLine($"Taxes: ${taxes(i):0.00}")
                Console.WriteLine($"Amount: ${amounts(i):0.00}")
                Console.WriteLine("-----------------------")
            Next i
        End Using
    End Sub
End Class
As you can see below, IronOCR can easily extract the required text from the receipt.
Output
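To experiment with the line-item pattern without running OCR, the following sketch applies the same regular expression to a hard-coded string and serializes the parsed items to JSON. The sample text, the `LineItem` record, and the `ReceiptLineParser` and `LineItemDemo` classes are hypothetical names introduced for illustration and are not part of IronOCR. Real OCR output will differ, so the pattern will likely need adjusting for your receipts.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed line item (not an IronOCR type)
public record LineItem(string Description, decimal Quantity, decimal UnitPrice, decimal PrintedAmount);

public static class ReceiptLineParser
{
    // Same pattern as in the article.
    // Groups: 1 = description, 2 = quantity, 3 = unit price, 4 = printed amount
    private static readonly Regex LinePattern = new Regex(
        @"\w+\s+(.*?)\s+(\d+\.\d+)\s+Units\s+(\d+\.\d+)\s+Tax15%\s+\$(\d+\.\d+)");

    public static List<LineItem> Parse(string ocrText)
    {
        var items = new List<LineItem>();
        foreach (var line in ocrText.Split('\n'))
        {
            var match = LinePattern.Match(line);
            if (!match.Success)
            {
                continue; // skip headers, totals, and OCR noise
            }

            items.Add(new LineItem(
                match.Groups[1].Value.Trim(),
                decimal.Parse(match.Groups[2].Value, CultureInfo.InvariantCulture),
                decimal.Parse(match.Groups[3].Value, CultureInfo.InvariantCulture),
                decimal.Parse(match.Groups[4].Value, CultureInfo.InvariantCulture)));
        }
        return items;
    }
}

public static class LineItemDemo
{
    public static void Main()
    {
        // Made-up text standing in for result.Text from the OCR step
        var sample = "Invoice INV/2023/0001\n" +
                     "ITEM01 Office Chair 2.00 Units 45.50 Tax15% $91.00\n" +
                     "Total $104.65";

        var items = ReceiptLineParser.Parse(sample);

        // Write the structured result as indented JSON
        var options = new JsonSerializerOptions { WriteIndented = true };
        Console.WriteLine(JsonSerializer.Serialize(items, options));
    }
}
```

Keeping the parsing in a method that takes plain text makes it easy to tune the pattern against saved OCR output before wiring it to `ocr.Read`.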
If you want to extract the text of the whole receipt, a few lines of code are enough.
using IronOcr;
using System;

class WholeReceiptExtractor
{
    static void Main()
    {
        var ocr = new IronTesseract();
        using (var input = new OcrInput(@"r3.png"))
        {
            // Perform OCR on the entire receipt and print text output to console
            var result = ocr.Read(input);
            Console.WriteLine(result.Text);
        }
    }
}
Imports IronOcr
Imports System

Friend Class WholeReceiptExtractor
    Shared Sub Main()
        Dim ocr = New IronTesseract()
        Using input = New OcrInput("r3.png")
            ' Perform OCR on the entire receipt and print text output to console
            Dim result = ocr.Read(input)
            Console.WriteLine(result.Text)
        End Using
    End Sub
End Class
Scan receipt API output
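Once the full text is available, individual fields can be pulled out with ordinary string processing. The sketch below looks for a line that starts with "Total" and reads the amount that follows. The sample text, the `ReceiptTotals` and `TotalDemo` classes, and the assumed "Total $104.65" layout are illustrative assumptions rather than IronOCR features.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ReceiptTotals
{
    // Assumed layout: a line beginning with "Total", an optional colon and "$", then an amount.
    // Lines such as "Subtotal ..." do not match because "Total" must start the line.
    private static readonly Regex TotalPattern = new Regex(
        @"^\s*Total\s*:?\s*\$?\s*(\d+(?:,\d{3})*(?:\.\d{2})?)\s*$",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Returns the total if a matching line is found, otherwise null
    public static decimal? FindTotal(string ocrText)
    {
        var match = TotalPattern.Match(ocrText);
        if (!match.Success)
        {
            return null;
        }

        // NumberStyles.Number accepts thousands separators and a decimal point
        return decimal.Parse(match.Groups[1].Value, NumberStyles.Number, CultureInfo.InvariantCulture);
    }
}

public static class TotalDemo
{
    public static void Main()
    {
        // Made-up text standing in for result.Text from the OCR step
        var sample = "Subtotal $91.00\nTax 15% $13.65\nTotal $104.65\n";

        var total = ReceiptTotals.FindTotal(sample);
        Console.WriteLine(total.HasValue
            ? $"Total found: {total.Value.ToString("0.00", CultureInfo.InvariantCulture)}"
            : "No total line found");
    }
}
```

Returning a nullable value makes the "not found" case explicit, which matters because OCR output can drop or misread the total line.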
A receipt scanning API such as IronOCR offers a powerful way to automate data extraction from receipts. Using OCR, businesses can extract important information from receipt images or scans, including vendor names, purchase dates, itemized lists, prices, taxes, and total amounts. With support for multiple languages, currencies, receipt formats, and barcodes, businesses can streamline their receipt management, save time, gain insights into spending patterns, and make data-driven decisions. As a versatile OCR library and API, IronOCR gives developers the tools to extract text from various sources accurately and efficiently. With the prerequisites in place and IronOCR integrated into their applications, developers can automate receipt data processing and improve their workflows.
For more information on IronOCR licensing, visit the licensing page. To learn how to use computer vision to find text, visit the computer vision how-to page. For more tutorials on receipt OCR, visit the OCR C# tutorial.