This tutorial is designed to help beginners build an OCR receipt scanner in C# using IronOCR, an OCR library for .NET. By the end of this guide, you will understand how to use optical character recognition (OCR) to convert different types of receipt files into editable, searchable text. This can be a significant time-saver for businesses looking to automate expense management and minimize manual data entry. Let's get started!
Before we dive into the coding part, make sure you have the following:
Open Visual Studio: Locate the Visual Studio icon on your desktop or in your applications menu and double-click it to open the program.
Create a New Project: Once Visual Studio is open, you’ll find a launch window. Click on the "Create a new project" button. If you have already opened Visual Studio and don’t see the launch window, you can access this by clicking File > New > Project from the top menu.
Select Project Type: In the “Create a new project” window, you’ll see a variety of project templates. In the search box, type “Console App” to filter the options, then select Console App (.NET Core) or Console App (.NET Framework), depending on your preference and compatibility. Then click the Next button.
Configure Your New Project: Now, you’ll see a screen titled "Configure your new project".
Before we can use the IronOCR library, we need to include it in our project. Follow these steps:
After selecting the IronOCR package, you will notice a panel on the right side displaying the package's information, including its description and version. There is also an Install button in this panel.
After installing IronOCR, your next step is to configure your project. Here's how:
Add Namespaces: At the top of your Program.cs file, include the following namespaces:
using IronOcr;
using System;
Configuration Settings: If you have any configuration settings like an API key or a license key, make sure to include them. For IronOCR, you'll need to set the license key as shown in the provided code:
License.LicenseKey = "License-Key"; // replace 'License-Key' with your key
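A key typed directly into Program.cs is easy to leak through version control. As a minimal sketch, assuming the key is stored in an environment variable called IRONOCR_LICENSE_KEY (a hypothetical name chosen for this example, not something IronOCR defines), it could be read like this:

```csharp
using System;

static class LicenseKeyLoader
{
    // Hypothetical variable name used only for this example.
    public const string VariableName = "IRONOCR_LICENSE_KEY";

    // Returns the trimmed key, or null when the variable is unset or blank.
    public static string ReadFromEnvironment()
    {
        string value = Environment.GetEnvironmentVariable(VariableName);
        return string.IsNullOrWhiteSpace(value) ? null : value.Trim();
    }
}
```

In Program.cs the helper could then feed the assignment shown above, for example `License.LicenseKey = LicenseKeyLoader.ReadFromEnvironment() ?? "License-Key";`.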
Now, let's write the code to read the receipt.
Define the Path to Your Receipt: Specify the path to the receipt file you want to scan. The variable is named pdfFilePath, but the sample file used here is a JPEG image.
string pdfFilePath = "Sample_Receipt.jpg";
Try-Catch Block: Implement error handling using a try-catch block. This will help you manage any exceptions that occur during the OCR process.
try
{
    // OCR code will go here
}
catch (Exception ex)
{
    // Handle exceptions here
    Console.WriteLine($"An error occurred: {ex.Message}");
}
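The catch block reports whatever went wrong, but a missing or mistyped file is common enough that checking for it before starting OCR can give a clearer message. The sketch below is one way to do that with the standard library only. The class name, method name, and extension list are all assumptions made for illustration; consult the IronOCR documentation for the formats it actually accepts.

```csharp
using System;
using System.IO;

static class ReceiptFileCheck
{
    // Illustrative list only, not taken from the IronOCR documentation.
    static readonly string[] AssumedExtensions =
        { ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".pdf" };

    // Returns null when the file looks usable, otherwise a short description of the problem.
    public static string FindProblem(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
            return "No file path was provided.";

        if (!File.Exists(path))
            return $"File not found: {path}";

        string extension = Path.GetExtension(path).ToLowerInvariant();
        if (Array.IndexOf(AssumedExtensions, extension) < 0)
            return $"Unexpected file type '{extension}'.";

        return null;
    }
}
```

Calling `ReceiptFileCheck.FindProblem(pdfFilePath)` before the try block and printing any non-null result keeps simple input mistakes separate from genuine OCR failures.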
In Step 5, we delve into the core functionality of our application: implementing OCR to read and interpret the data from our receipt. This involves initializing the OCR engine, configuring the input, performing the OCR operation, and displaying the results.
The first part of the code creates an instance of the IronTesseract class:
var ocr = new IronTesseract();
By creating an instance of IronTesseract, we are essentially setting up our OCR tool, gearing it up to perform the text recognition tasks. It's like starting the engine of a car before you can drive it. This object will be used to control the OCR process, including reading the input and extracting text from it.
Next, we define the input for our OCR process:
using (var input = new OcrInput(pdfFilePath))
{
    // OCR processing will go here
}
In this segment, OcrInput specifies the file we want to process. pdfFilePath is the variable that contains the path to our receipt file. By passing it to OcrInput, we are telling the OCR engine, "Here's the file I want you to read." The using statement is a C# construct that calls Dispose on the OcrInput object when the block exits, even if an exception is thrown inside it, so the resources it holds (like file handles) are released as soon as processing is done.
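To see what the using statement guarantees, here is a small sketch that does not touch IronOCR at all. TrackedResource is a hypothetical stand-in for a resource-holding type such as OcrInput; it only records when it is opened and disposed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a resource-holding type such as OcrInput.
sealed class TrackedResource : IDisposable
{
    private readonly List<string> _log;

    public TrackedResource(List<string> log)
    {
        _log = log;
        _log.Add("opened");
    }

    public void Dispose()
    {
        _log.Add("disposed");
    }
}

static class UsingDemo
{
    // Returns the order of events; Dispose runs even though the body throws.
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            using (var resource = new TrackedResource(log))
            {
                log.Add("working");
                throw new InvalidOperationException("simulated OCR failure");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

`UsingDemo.Run()` returns the events in the order opened, working, disposed, caught: the resource is released before the exception reaches the catch block, which is the same protection the using block gives the OCR input.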
Within the using block, we call the Read method on our ocr instance:
var result = ocr.Read(input);
The Read method takes the OcrInput object as its parameter, not the file path itself. This line starts the receipt scan: it runs OCR on the input, extracts the text, and stores the outcome in the result variable. The extracted text can then be used for any further text processing.
Finally, we output the text that was recognized by the OCR process:
Console.WriteLine(result.Text);
The result variable contains the output of the OCR process, and result.Text contains the actual text extracted from the receipt. The Console.WriteLine method then prints this text to the console so you can see and verify the results. Here is the complete Program.cs file code:
using IronOcr;
using System;

class Program
{
    static void Main(string[] args)
    {
        // Replace with your own license key.
        License.LicenseKey = "Your-License-Key";

        // Path to the receipt to scan (a JPEG image, despite the variable name).
        string pdfFilePath = "Sample_Receipt.jpg";

        try
        {
            // Set up the OCR engine.
            var ocr = new IronTesseract();

            // Load the receipt; the using block releases its resources afterwards.
            using (var input = new OcrInput(pdfFilePath))
            {
                // Run OCR and print the recognized text.
                var result = ocr.Read(input);
                Console.WriteLine(result.Text);
            }
        }
        catch (Exception ex)
        {
            // Handle exceptions (e.g., file not found, OCR errors) and log them if necessary.
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
When you run the program, the text from your receipt is printed to the console. This text is the data extracted from your receipt image, and it shows how receipts can be scanned with IronOCR. This is a simple, generic example of using OCR to extract data from paper receipts; you can modify the code to match the layout of your own receipt images.
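Console output is gone once the window closes, so it can be handy to keep the extracted text in a file as well. The following is a minimal sketch using only the standard library; ReceiptTextWriter and its method are hypothetical names introduced for this example.

```csharp
using System;
using System.IO;
using System.Text;

static class ReceiptTextWriter
{
    // Writes the text beside the receipt image, e.g. "Sample_Receipt.jpg" -> "Sample_Receipt.txt",
    // and returns the path of the file that was written.
    public static string SaveNextToImage(string imagePath, string extractedText)
    {
        string outputPath = Path.ChangeExtension(imagePath, ".txt");
        File.WriteAllText(outputPath, extractedText ?? string.Empty, Encoding.UTF8);
        return outputPath;
    }
}
```

Inside the using block of the program, this could be called as `ReceiptTextWriter.SaveNextToImage(pdfFilePath, result.Text);` right after the text is printed.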
The scan produces unstructured text. From there, you can pull important information out of a particular section of the receipt, or present the receipt data in a more organized way. In this way, IronOCR can serve as the basis of an OCR receipt scanning application that extracts the individual receipt fields.
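As a rough illustration of pulling fields out of the raw text, the sketch below searches a string for a total and a date using regular expressions. It is not part of IronOCR: ReceiptFields and both patterns are assumptions made for this example, and it assumes amounts use '.' as the decimal separator and ',' as the thousands separator. Real receipts vary widely, so the patterns would need adjusting to your own layouts.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class ReceiptFields
{
    // Matches a line that starts with "total", such as "TOTAL: $12.34",
    // but not a line that starts with "Subtotal".
    static readonly Regex TotalPattern = new Regex(
        @"^\s*total\b[^\d\r\n]*(\d[\d,]*(?:\.\d{2})?)",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Matches numeric dates such as 12/03/2024 or 1-2-24 (day/month order is not interpreted).
    static readonly Regex DatePattern = new Regex(@"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b");

    // Returns the first total found, or null when no total line is recognized.
    public static decimal? FindTotal(string text)
    {
        Match match = TotalPattern.Match(text ?? string.Empty);
        if (!match.Success)
            return null;

        // Drop thousands separators before parsing.
        string digits = match.Groups[1].Value.Replace(",", "");
        decimal amount;
        return decimal.TryParse(digits, NumberStyles.AllowDecimalPoint,
                                CultureInfo.InvariantCulture, out amount)
            ? amount
            : (decimal?)null;
    }

    // Returns the first date-like string found, or null when there is none.
    public static string FindDate(string text)
    {
        Match match = DatePattern.Match(text ?? string.Empty);
        return match.Success ? match.Value : null;
    }
}
```

With the scanner from this tutorial, the helpers would be applied to the OCR output, for example `ReceiptFields.FindTotal(result.Text)` and `ReceiptFields.FindDate(result.Text)`.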
Congratulations! You've successfully built an OCR receipt scanner using C# and IronOCR. A scanner like this can speed up data extraction and reduce transcription errors for business needs such as expense tracking and supply chain management, and it reduces the need to review scanned receipts and key in their data manually.
IronOCR offers a free trial, allowing users to explore and assess its capabilities at no initial cost. For those seeking to integrate and leverage the full spectrum of features in a professional setting, licenses begin at $749, providing a comprehensive solution for robust OCR receipt scanning and data extraction needs.
Remember, this is just the beginning. You can expand this application to support various file types, improve data privacy, or integrate additional features like receipt recognition for specific fields such as tax amount, date, line items, and more. With OCR technology, the possibilities are vast, paving the way for more efficient and intelligent business processes. Happy coding!