Passport OCR SDK (Developer Tutorial)

A passport is proof of an individual's identity; we use passports to travel and to register essential aspects of our lives. However, the passport format is not always easy to read. Imagine a surge of travelers arriving during the holiday season. How can immigration agents handle that volume of data and retrieve the correct information through manual data entry alone?

Hence, many applications and enterprises are turning to optical character recognition (OCR), which allows developers to quickly extract printed text from digital images.

Passport OCR applies OCR software to passports to extract meaningful information. It reads the machine-readable zone printed on passports to quickly identify the individual passing through immigration. Wherever passport information must be recognized quickly or passport data extraction must be automated, Passport OCR is a cornerstone of efficiency and speed at airports and border crossings.

Although Passport OCR software has matured over the years, several factors can still affect the document scanning process. Noise in the digital image or smudges on the passport can significantly reduce the accuracy of the extracted data. General-purpose OCR libraries can also be awkward to use on passports, because the machine-readable zone is a uniquely structured data set: developers may manage to extract the text but then have to parse the individual fields themselves. IronOCR, by contrast, offers specialized methods optimized for reading passports, and its results let developers obtain and manipulate the information quickly, which is ideal for high-volume scanning and automation.

In this article, we'll briefly discuss how to use IronOCR to obtain and manipulate passport information, automate data extraction, and look in more detail at how IronOCR reads the passport.

IronOCR: A C# OCR Library

Passport OCR SDK (Developer Tutorial): Figure 1 - IronOCR: A C# OCR Library

IronOCR is a C# library that offers easy-to-use methods and flexible functionality for OCR-related needs. In addition to the standard techniques, IronOCR lets developers fully utilize and customize a version of Tesseract for these tasks.

Here's a quick rundown of its most notable features:

  1. Cross-compatibility: IronOCR is compatible with most .NET platforms, including .NET 8, 7, 6, and 5, and supports .NET Framework 4.6.2 and upwards. It runs on Windows, macOS, and Linux, as well as on Azure, so developers don't have to worry about cross-platform compatibility.
  2. Flexibility: OCR input comes in many formats, so a truly flexible library has to handle them all. IronOCR accepts all popular image formats (jpg, png, and gif) and also supports native System.Drawing objects in C#, allowing easier integration into existing codebases.
  3. Support and ease of use: IronOCR is well documented, with an extensive API reference and tutorials covering its functionality. Furthermore, 24/5 support is available to developers.
  4. Multi-language support: IronOCR supports up to 125 languages as well as custom languages, making it versatile for international document processing.

Reading the Passport with IronOCR

License Key

Please remember that IronOCR requires a license key to operate. You can get a key as part of a free trial from IronOCR's free trial license page.

// Replace the license key variable with the trial key you obtained
IronOcr.License.LicenseKey = "REPLACE-WITH-YOUR-KEY";

After receiving a trial key, set this variable in your project.

Code example

The code below showcases how IronOCR takes a passport image and extracts all relevant information using the library's passport OCR SDK.

Input image

Passport OCR SDK (Developer Tutorial): Figure 2 - Input image

using IronOcr;
using System;

class Program {
    public static void Main() {
        // Instantiate OCR engine
        var ocr = new IronTesseract();
        using var inputPassport = new OcrInput();
        inputPassport.AddImage("Passport.jpg");

        // Perform OCR to read the passport
        OcrResult result = ocr.Read(inputPassport);

        // Output passport information
        Console.WriteLine("Given Names: " + result.Passport?.GivenNames);
        Console.WriteLine("Country: " + result.Passport?.Country);
        Console.WriteLine("Passport Number: " + result.Passport?.PassportNumber);
        Console.WriteLine("Surname: " + result.Passport?.Surname);
        Console.WriteLine("Date of Birth: " + result.Passport?.DateOfBirth.ToString("yyyy-MM-dd"));
        Console.WriteLine("Date of Expiry: " + result.Passport?.DateOfExpiry.ToString("yyyy-MM-dd"));
    }
}

Code explanation

  1. Import Libraries: We first import the IronOcr namespace along with the other namespaces the code needs.
  2. Instantiate OCR Engine: We create a new IronTesseract object to initialize the OCR engine.
  3. Load Passport Image: We then create a new OcrInput and load the image containing the passport using AddImage().
  4. Read Passport Using OCR: We use the Read() method to perform the OCR operation on the input image and save the result.
  5. Output Results: We output the extracted passport information such as given names, country, passport number, surname, date of birth, and date of expiry.

Console Output

Passport OCR SDK (Developer Tutorial): Figure 3 - Console output

Machine Readable Zone

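IronOCR can extract the Machine Readable Zone (MRZ) from the bottom two rows of any passport that follows the International Civil Aviation Organization (ICAO) standard. In a passport booklet, the MRZ consists of two rows of 44 characters each: the first row carries the document type, issuing country, and name, and the second carries the passport number, nationality, date of birth, sex, and expiry date, along with check digits.

The table below summarizes the fields: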

Passport OCR SDK (Developer Tutorial): Figure 4 - Table of MRZ
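
To make the MRZ layout concrete, here is a minimal sketch of how the two rows could be split into fields and how the check digits could be verified. It does not use IronOCR: MrzSketch and its methods are hypothetical names introduced for illustration, and the field positions and the 7-3-1 weighting follow the ICAO Doc 9303 layout for passport booklets. The sample rows are the well-known ICAO specimen for the fictional state "UTO". The composite check digit and the personal number field are left out to keep the sketch short.

```csharp
using System;

// Hypothetical helper (not part of IronOCR): reads fields from the two
// 44-character MRZ rows of a passport booklet (ICAO Doc 9303, TD3 layout).
public static class MrzSketch
{
    // Check-digit character values: '0'-'9' are 0-9, 'A'-'Z' are 10-35, '<' is 0.
    private static int CharValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
        if (c == '<') return 0;
        throw new FormatException("Unexpected MRZ character: " + c);
    }

    // Sum of character values times the repeating weights 7, 3, 1, modulo 10.
    public static int CheckDigit(string field)
    {
        int[] weights = { 7, 3, 1 };
        int sum = 0;
        for (int i = 0; i < field.Length; i++)
        {
            sum += CharValue(field[i]) * weights[i % 3];
        }
        return sum % 10;
    }

    // True when the digit printed after a field equals the computed check digit.
    public static bool Matches(string field, char printed)
    {
        return printed >= '0' && printed <= '9' && CheckDigit(field) == printed - '0';
    }

    // Row 1: names start at index 5; "<<" separates surname from given names.
    public static (string Surname, string GivenNames) ParseNames(string row1)
    {
        if (row1.Length != 44) throw new FormatException("Row 1 must be 44 characters.");
        string names = row1.Substring(5).TrimEnd('<');
        int sep = names.IndexOf("<<", StringComparison.Ordinal);
        string surname = sep >= 0 ? names.Substring(0, sep) : names;
        string given = sep >= 0 ? names.Substring(sep + 2) : "";
        return (surname.Replace('<', ' '), given.Replace('<', ' '));
    }

    // Row 2: fixed-width fields, each followed by its own check digit.
    public static (string Number, string Nationality, string Birth, char Sex, string Expiry, bool ChecksPass)
        ParseRow2(string row2)
    {
        if (row2.Length != 44) throw new FormatException("Row 2 must be 44 characters.");
        string number = row2.Substring(0, 9);
        string nationality = row2.Substring(10, 3);
        string birth = row2.Substring(13, 6);   // YYMMDD
        char sex = row2[20];
        string expiry = row2.Substring(21, 6);  // YYMMDD
        bool ok = Matches(number, row2[9]) && Matches(birth, row2[19]) && Matches(expiry, row2[27]);
        return (number.TrimEnd('<'), nationality, birth, sex, expiry, ok);
    }
}

public static class Program
{
    public static void Main()
    {
        // Specimen rows from the ICAO Doc 9303 sample passport (fictional state "UTO").
        string row1 = "P<UTOERIKSSON<<ANNA<MARIA".PadRight(44, '<');
        string row2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10";

        var names = MrzSketch.ParseNames(row1);
        var fields = MrzSketch.ParseRow2(row2);

        Console.WriteLine("Surname: " + names.Surname);
        Console.WriteLine("Given Names: " + names.GivenNames);
        Console.WriteLine("Passport Number: " + fields.Number);
        Console.WriteLine("Check digits valid: " + fields.ChecksPass);
    }
}
```

A check of this kind can serve as a cheap sanity test on OCR output: a single misread character in the passport number or a date will usually make the corresponding check digit fail.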

Challenges for Passport OCR and Debugging

Image quality is always a concern when scanning digital images. A distorted or low-quality image obscures the information and makes it harder to confirm the accuracy of the data. Furthermore, developers must consider data security and compliance when dealing with sensitive information such as passport data.

IronOCR also exposes diagnostic information about each read, such as a confidence score and the raw recognized text. These let developers troubleshoot problems and gain confidence in the extracted data.

Here's a brief example:

using IronOcr;
using System;

class DebugExample {
    public static void Main() {
        // Instantiate OCR engine
        var ocr = new IronTesseract();
        using var inputPassport = new OcrInput();
        inputPassport.AddImage("Passport.jpg");

        // Perform OCR
        OcrResult result = ocr.Read(inputPassport);

        // Output Confidence level and raw extracted text
        Console.WriteLine("OCR Confidence: " + result.Confidence);
        Console.WriteLine("Extracted Text: ");
        Console.WriteLine(result.Text);
    }
}

Explanation of Debugging Code

  1. Confidence: The Confidence property in the OcrResult is a floating-point number representing the OCR's statistical accuracy confidence, calculated as an average over every character. One represents the highest confidence level, while zero represents the lowest. A lower value suggests that the passport image may be blurry or contain extraneous content.
  2. Text: The Text property in the OcrResult holds the unprocessed text extracted from the passport image. Developers can use it in unit tests to validate the extracted text with equality assertions.
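
As a rough sketch of how these two values might be used together, the helper below routes a scan to manual review when the confidence is low or the text is empty, and normalizes whitespace so that equality assertions in tests are not tripped up by spacing differences. OcrTriage, Decision, Evaluate, and Normalize are hypothetical names, not part of IronOCR; the 0.90 threshold is an arbitrary illustrative value, and the code assumes the zero-to-one confidence scale described above. In a real project the two arguments would come from result.Confidence and result.Text.

```csharp
using System;

// Hypothetical triage helper (not part of IronOCR).
public static class OcrTriage
{
    public enum Decision { Accept, ManualReview }

    // The default threshold of 0.90 is an illustrative assumption, not a library default.
    public static Decision Evaluate(double confidence, string text, double threshold = 0.90)
    {
        if (double.IsNaN(confidence) || confidence < threshold) return Decision.ManualReview;
        if (string.IsNullOrWhiteSpace(text)) return Decision.ManualReview;
        return Decision.Accept;
    }

    // Collapses runs of whitespace so equality assertions ignore spacing and line-break differences.
    public static string Normalize(string text)
    {
        char[] whitespace = { ' ', '\t', '\r', '\n' };
        string[] tokens = (text ?? "").Split(whitespace, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", tokens);
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in values; in practice these would be read from the OCR result.
        double confidence = 0.97;
        string text = "P<UTOERIKSSON<<ANNA<MARIA\r\nL898902C36UTO7408122F1204159ZE184226B<<<<<10";

        Console.WriteLine("Decision: " + OcrTriage.Evaluate(confidence, text));
        Console.WriteLine("Normalized: " + OcrTriage.Normalize(text));
    }
}
```

The right threshold depends on the scanner, the lighting, and how costly a misread is, so it is worth tuning against a set of known passports rather than fixing it up front.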

Conclusion

Passport OCR SDK (Developer Tutorial): Figure 5 - IronOCR

Passport OCR technology significantly enhances document processing by automating data extraction and improving operational efficiency. It streamlines identity verification and KYC processes, ensuring high accuracy while handling sensitive personal information. Immigration borders and airports can reduce processing time and improve workflow efficiency by choosing IronOCR as their Passport OCR API.

IronOCR provides developers with flexibility and scalability through its easy-to-use methods. It allows developers to sort information quickly through the OcrResult object. Furthermore, IronOCR provides debugging tools, including confidence levels and raw, unparsed text, for developers to use in unit tests. For more advanced usage, developers can also reduce digital noise by cleaning up the passport image before passing it to the read method.

To try it out, visit IronOCR's free trial license page.

Frequently Asked Questions

How can I use OCR to extract information from passports in C#?

You can use IronOCR to extract passport information by processing images of passports and extracting data from the machine-readable zone using its powerful OCR capabilities.

What are the benefits of using OCR for passport data processing?

OCR for passport data processing automates the extraction of information, significantly increasing efficiency and accuracy in high-traffic areas such as airports and border controls.

Is it possible to process multiple languages with OCR technology?

Yes, IronOCR supports up to 125 languages and allows for the addition of custom languages, making it versatile for processing international documents.

How does IronOCR ensure accurate data extraction from passports?

IronOCR provides a 'Confidence' property in the OcrResult to indicate the statistical accuracy, allowing developers to verify the reliability of the extracted data.

What image formats are supported by IronOCR for passport scanning?

IronOCR supports all popular image formats, including jpg, png, and gif, and it can also work with native System.Drawing objects in C# for easy integration.

What challenges might developers face with Passport OCR implementation?

Challenges include dealing with low-quality images, ensuring data security, and compliance with handling sensitive passport information.

How can developers get started with using IronOCR for passport OCR?

Developers can start using IronOCR by obtaining a trial license key from the provider's website and following the detailed documentation to integrate it into their C# applications.

What platforms are compatible with IronOCR?

IronOCR is compatible with most .NET platforms, including .NET 8, 7, 6, and 5, as well as .NET Framework 4.6.2 and upwards. It runs on Windows, macOS, and Linux, and can be deployed to Azure.

Kannapat Udonpant
Software Engineer
Before becoming a Software Engineer, Kannapat completed an Environmental Resources PhD from Hokkaido University in Japan. While pursuing his degree, Kannapat also became a member of the Vehicle Robotics Laboratory, which is part of the Department of Bioproduction Engineering. In 2022, he leveraged his C# skills to join Iron Software's engineering ...