Financial OCR in C# and .NET
IronOCR is a C# software library that lets .NET developers read text from images and PDF documents in 126 languages, including Financial.
It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Financial
This package contains the following OCR language for .NET:
- Financial
Download
Financial Language Pack [Financial]
Installation
First, install the Financial OCR package in your .NET project.
Install-Package IronOCR.Languages.Financial
Code Example
This C# code example reads financial text from an image or PDF document.
// Import the IronOcr namespace
using IronOcr;
// Instantiate the IronTesseract OCR engine
var Ocr = new IronTesseract();
// Set the OCR language to Financial
Ocr.Language = OcrLanguage.Financial;
// Create an OCR input object, specifying the path to the image or PDF
using (var Input = new OcrInput(@"images\Financial.png"))
{
    // Perform OCR to read text from the input
    var Result = Ocr.Read(Input);
    // Retrieve the extracted text
    var AllText = Result.Text;
}
' The same example in VB.NET
' Import the IronOcr namespace
Imports IronOcr
' Instantiate the IronTesseract OCR engine
Dim Ocr = New IronTesseract()
' Set the OCR language to Financial
Ocr.Language = OcrLanguage.Financial
' Create an OCR input object, specifying the path to the image or PDF
Using Input = New OcrInput("images\Financial.png")
    ' Perform OCR to read text from the input
    Dim Result = Ocr.Read(Input)
    ' Retrieve the extracted text
    Dim AllText = Result.Text
End Using
Explanation:
- using IronOcr: This directive imports the namespace that contains all the classes needed for the OCR process.
- IronTesseract Class: This is the main class that performs OCR tasks.
- Language Setting: Setting the language to Financial enables the OCR engine to recognize financial terminology.
- OcrInput Class: It takes a file path that specifies the image or PDF file to be processed.
- Read Method: Called as Ocr.Read(Input), it processes the input and retrieves the text based on the provided input and language settings.
- Result.Text: It holds the recognized text from the image or PDF.
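Once the recognized text is available as a string, a common next step with financial documents is to pull out monetary amounts. The following is a minimal sketch of one way to do that with the standard .NET Regex class. It is not part of IronOCR: the AmountExtractor class, the ExtractAmounts method, and the sample string are hypothetical names and data introduced only for illustration, and the pattern assumes amounts written with "," as the thousands separator and "." followed by exactly two decimal digits.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of IronOCR.
public static class AmountExtractor
{
    // Assumption: amounts look like 1,234.56 or 99.00.
    // The lookbehind/lookahead prevent matching inside a longer number.
    private static readonly Regex AmountPattern =
        new Regex(@"(?<![\d,.])\d+(?:,\d{3})*\.\d{2}(?!\d)", RegexOptions.Compiled);

    // Returns every amount found in the text, in order of appearance.
    public static List<decimal> ExtractAmounts(string text)
    {
        var amounts = new List<decimal>();
        foreach (Match match in AmountPattern.Matches(text))
        {
            // NumberStyles.Number accepts thousands separators and a decimal point.
            amounts.Add(decimal.Parse(match.Value, NumberStyles.Number, CultureInfo.InvariantCulture));
        }
        return amounts;
    }

    public static void Main()
    {
        // Hypothetical sample standing in for the value of Result.Text.
        string allText = "Invoice total: 1,234.56 USD\nTax: 99.00 USD";

        foreach (decimal amount in ExtractAmounts(allText))
        {
            Console.WriteLine(amount.ToString("F2", CultureInfo.InvariantCulture));
        }
    }
}
```

In a real project, the sample string would be replaced by the AllText value from the example above. Documents that use other number formats (for instance "." as the thousands separator) would need a different pattern and culture.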