How to Read Photos Using IronOCR

By Curtis Chau

February 16, 2025

Updated June 22, 2025

When dealing with large volumes of documents, particularly scanned images like TIFF files, manually extracting text can be time-consuming and prone to human error. This is where Optical Character Recognition (OCR) comes in, offering an automated method to accurately convert text from images into digital data. OCR technology can handle the complexity of images, such as scanned documents or photographs, and turn them into searchable, editable text. This not only speeds up document processing but also ensures more accurate data extraction compared to manual transcription.

Using OCR on formats like TIFF, which may be hard to read due to their size, color depth, or compression, enables businesses and developers to quickly digitize and manage vast amounts of data. With OCR solutions like IronOCR's ReadPhoto function, developers can extract text from images and even perform advanced operations such as searching for keywords or converting scanned data into searchable PDFs. This technology is especially useful for industries that deal with legal documents, archives, or receipts, where efficient data retrieval is critical.

In this tutorial, we walk through a sample input and a code example that uses ReadPhoto, then show how to work with the result object. We also discuss scenarios where developers might prefer ReadPhoto over IronOCR's standard Read method.

How to Read Photos Using IronOCR

Download the C# library for reading photos
Import the images for processing
Use the appropriate import method based on the image type
Use the ReadPhoto method to extract data from the image
Access the OcrPhotoResult object to view and manipulate the extracted data


First Step:

To use this function, you must also install the IronOcr.Extension.AdvancedScan package.

Read Photos Example

Reading high-quality photo formats such as TIFF and GIF is relatively simple with IronOCR. First, we create an OcrInput and load the image into it with LoadImageFrame. Then we call the ReadPhoto method and work with the result.

Please note

Because a TIFF can contain multiple frames within a single image, the frameNumber parameter is required. The index starts at 0, not 1.
The method currently supports only English, Chinese, Japanese, Korean, and LatinAlphabet.
Using advanced scan on .NET Framework requires the project to run on x64 architecture.

Input

Since most browsers do not natively support the TIFF format, you can download the TIFF input here. To display the TIFF file, I converted it to WEBP.


Code


using IronOcr;
using IronSoftware.Drawing;
using System;

// Instantiate a Tesseract OCR engine
var ocr = new IronTesseract();

// Create a new OcrInput object for handling OCR input data
using var inputPhoto = new OcrInput();

// Load the first frame (index 0) of the image "ocr.tiff" into the input for OCR processing
inputPhoto.LoadImageFrame("ocr.tiff", 0);

// Perform OCR on the input photo with ReadPhoto, which returns an OcrPhotoResult
OcrPhotoResult result = ocr.ReadPhoto(inputPhoto);

// Only read region details if at least one text region was detected
if (result.TextRegions.Count > 0)
{
    // Index 0 selects the first detected text region; FrameNumber is the frame it came from
    int frameNumber = result.TextRegions[0].FrameNumber;

    // Extract the text in the first region
    string textInRegion = result.TextRegions[0].TextInRegion;

    // Extract the coordinates of the first text region
    Rectangle region = result.TextRegions[0].Region;

    // Format the output string with details about the first text region and the entire scanned text
    var output = $"Frame Number: {frameNumber}\n"
                 + $"Text in First Region: {textInRegion}\n"
                 + $"Text Region:\n"
                 + $"Starting X: {region.X}\n"
                 + $"Starting Y: {region.Y}\n"
                 + $"Region Width: {region.Width}\n"
                 + $"Region Height: {region.Height}\n"
                 + $"Result Confidence: {result.Confidence:P}\n\n"
                 + $"Full Scanned Photo Text: {result.Text}";

    // Print the output to the console
    Console.WriteLine(output);
}
else
{
    Console.WriteLine("No text regions found in the provided image.");
}

Imports Microsoft.VisualBasic
Imports IronOcr
Imports IronSoftware.Drawing
Imports System

Module Program
	Sub Main()
		' Instantiate a Tesseract OCR engine
		Dim ocr = New IronTesseract()

		' Create a new OcrInput object for handling OCR input data
		Using inputPhoto As New OcrInput()
			' Load the first frame (index 0) of the image "ocr.tiff" into the input for OCR processing
			inputPhoto.LoadImageFrame("ocr.tiff", 0)

			' Perform OCR on the input photo with ReadPhoto, which returns an OcrPhotoResult
			Dim result As OcrPhotoResult = ocr.ReadPhoto(inputPhoto)

			' Only read region details if at least one text region was detected
			If result.TextRegions.Count > 0 Then
				' Index 0 selects the first detected text region; FrameNumber is the frame it came from
				Dim frameNumber As Integer = result.TextRegions(0).FrameNumber

				' Extract the text in the first region
				Dim textInRegion As String = result.TextRegions(0).TextInRegion

				' Extract the coordinates of the first text region
				Dim region As Rectangle = result.TextRegions(0).Region

				' Format the output string with details about the first text region and the entire scanned text
				Dim output = $"Frame Number: {frameNumber}" & vbLf &
					$"Text in First Region: {textInRegion}" & vbLf &
					$"Text Region:" & vbLf &
					$"Starting X: {region.X}" & vbLf &
					$"Starting Y: {region.Y}" & vbLf &
					$"Region Width: {region.Width}" & vbLf &
					$"Region Height: {region.Height}" & vbLf &
					$"Result Confidence: {result.Confidence:P}" & vbLf & vbLf &
					$"Full Scanned Photo Text: {result.Text}"

				' Print the output to the console
				Console.WriteLine(output)
			Else
				Console.WriteLine("No text regions found in the provided image.")
			End If
		End Using
	End Sub
End Module


Output


Text: The text extracted from the OCR input.
Confidence: A double that indicates the statistical confidence in the result, averaged over every character, where 1 is the highest and 0 is the lowest.
TextRegions: A list of text regions, each describing a piece of recognized text and its location within the input.

In the example above, we printed the frame number as well as the rectangle containing the text.
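The example above reads a single frame. Because frame indices start at 0, a multi-frame TIFF can be processed by looping from 0 to the frame count minus 1. The following is a minimal sketch of that loop only, not IronOCR's own API: ReadAllFrames, readFrameStub, and the frameCount value are hypothetical names introduced for illustration, and the stub stands in for the real OCR call so the snippet runs without the library.

```csharp
using System;
using System.Collections.Generic;

// Stub standing in for the real OCR call (hypothetical). In a real project this
// delegate would load the given frame with LoadImageFrame and return the
// Text of the ReadPhoto result.
Func<int, string> readFrameStub = frameNumber => $"text of frame {frameNumber}";

// Assumed to be known in advance, for example from the scanning software.
int frameCount = 3;

List<string> frameTexts = ReadAllFrames(frameCount, readFrameStub);

for (int i = 0; i < frameTexts.Count; i++)
{
    Console.WriteLine($"Frame {i}: {frameTexts[i]}");
}

// Calls readFrame once per frame, using 0-based indices (0 .. frameCount - 1),
// and returns the texts in frame order.
static List<string> ReadAllFrames(int frameCount, Func<int, string> readFrame)
{
    if (frameCount < 0)
    {
        throw new ArgumentOutOfRangeException(nameof(frameCount));
    }

    var texts = new List<string>(frameCount);
    for (int frameNumber = 0; frameNumber < frameCount; frameNumber++)
    {
        texts.Add(readFrame(frameNumber));
    }
    return texts;
}
```

Passing the OCR step in as a delegate keeps the indexing logic separate from the library call, which makes an off-by-one mistake (starting at 1 instead of 0) easy to catch in a unit test.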

Difference between `ReadPhoto` and `Read`

The main differences between ReadPhoto and the standard Read method are the result object each returns and the file formats each accepts. LoadImageFrame accepts only TIFF and GIF and does not support formats like JPEG, for the reasons outlined below.
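In an application that receives mixed file types, this distinction can be made up front by checking the file extension. The sketch below is one possible way to do that; IsFrameBasedFormat is a hypothetical helper, and the extension list is an assumption based on the two formats named above rather than anything taken from IronOCR.

```csharp
using System;
using System.IO;

string[] samples = { "ocr.tiff", "scan.TIF", "animation.gif", "receipt.jpg" };

foreach (string file in samples)
{
    // Route TIFF and GIF to the frame-based path, everything else to the standard one.
    string method = IsFrameBasedFormat(file)
        ? "LoadImageFrame + ReadPhoto"
        : "standard Read";
    Console.WriteLine($"{file}: {method}");
}

// Returns true for TIFF and GIF files, judged by extension only (case-insensitive).
// The extension list is an assumption for illustration; adjust it as needed.
static bool IsFrameBasedFormat(string path)
{
    string extension = Path.GetExtension(path).ToLowerInvariant();
    return extension == ".tif" || extension == ".tiff" || extension == ".gif";
}
```

Checking the extension is cheap but trusts the file name; an application that accepts untrusted uploads may prefer to inspect the file header instead.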

Comparison between TIFF and JPEG Images

TIFF is a typically lossless file format that is often used to store multiple pages or frames in a single file. It is commonly used for high-quality, multi-image storage (for example, legal documents and medical images). Its structure is more complex than JPEG's, so extracting all of its text requires a different method.

Furthermore, TIFF images use different compression methods from JPEG, so IronOCR has to use a specialized method to read the text.

Here's a further breakdown between TIFF and JPEG for comparison.

Feature	TIFF (Tagged Image File Format)	JPG/JPEG (Joint Photographic Experts Group)
Compression	Lossless or uncompressed (preserves quality)	Lossy compression (reduces quality for smaller file size)
File Size	Large (due to high quality and optional lack of compression)	Smaller, optimized for web use and fast loading
Image Quality	High (ideal for professional use, retains all details)	Lower (due to lossy compression, some quality is sacrificed)
Color Depth	Supports high color depth (up to 16-bit or 32-bit per channel)	24-bit color (16.7 million colors)
Use Case	Professional photography, publishing, scanning, archiving	Web images, social media, everyday photos
Transparency	Supports transparency and alpha channels	Does not support transparency
Editing	Good for multiple edits (no quality loss with resaving)	Quality degrades with repeated edits and saves
Compatibility	Widely supported in professional software	Universally supported across all platforms and devices
Animation	Does not support animation	Does not support animation
Metadata	Stores extensive metadata (EXIF, layers, etc.)	Stores EXIF metadata but is more limited

Different scenarios

Developers need to weigh each production use case so that their applications run effectively. ReadPhoto is suited to complex images such as the TIFF above, but processing is slower. JPEG images may be lower in quality, but the operation is generally faster. In either case, poor image quality, such as noise, lowers the OCR confidence.

The Confidence property of OcrPhotoResult, or of any class that implements the IOcrResult interface, gives you an idea of how accurate the results are, allowing developers to test, iterate, and optimize as desired.

Developers should strike a balance between efficiency and accuracy, ensuring that input images meet a minimum quality threshold so that results stay consistent.
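One simple way to apply such a threshold is to compare the reported confidence against a minimum value and flag anything below it for review or rescanning. The sketch below assumes the 0-to-1 confidence scale described earlier; IsAcceptable and the 0.80 cutoff are hypothetical choices for illustration, and the sample values are made up rather than measured.

```csharp
using System;

// Hypothetical cutoff; tune it against your own sample images.
const double MinConfidence = 0.80;

// Made-up confidence values standing in for results read from OCR output.
double[] sampleConfidences = { 0.97, 0.80, 0.42 };

foreach (double confidence in sampleConfidences)
{
    string decision = IsAcceptable(confidence, MinConfidence)
        ? "accept"
        : "review or rescan";
    Console.WriteLine($"Confidence {confidence}: {decision}");
}

// Returns true when a confidence on the 0-to-1 scale meets the threshold.
static bool IsAcceptable(double confidence, double threshold)
{
    if (confidence < 0.0 || confidence > 1.0)
    {
        throw new ArgumentOutOfRangeException(nameof(confidence), "Expected a value between 0 and 1.");
    }
    return confidence >= threshold;
}
```

A higher cutoff sends more documents to manual review, and a lower one lets more recognition errors through, so the right value depends on how costly each kind of mistake is for the application.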

Frequently Asked Questions

What is the ReadPhoto feature?

IronOCR's ReadPhoto is a method that allows developers to extract text from image files such as TIFF and GIF, turning them into digital, searchable text.

How does the ReadPhoto method differ from standard image processing methods?

The main differences lie in the types of image formats they process and the result objects they return. IronOCR's ReadPhoto is specifically designed for complex formats like TIFF and GIF, while standard methods may support formats like JPEG.

Why is TIFF preferred for high-quality image storage?

TIFF is a lossless format that supports high color depth, making it ideal for professional use where image quality is critical, such as in legal documents and medical imaging.

What are the system requirements to use advanced image processing features?

To use IronOCR's ReadPhoto, you must have the IronOCR library and the IronOcr.Extension.AdvancedScan package installed. On .NET Framework, advanced scan also requires the project to run on x64 architecture.

Which languages does the ReadPhoto feature support?

The ReadPhoto method in IronOCR supports English, Chinese, Japanese, Korean, and LatinAlphabet languages.

What is the significance of the 'confidence' property in OCR results?

The 'confidence' property indicates the statistical accuracy of the OCR results, helping developers assess and optimize the text extraction's reliability.

Can the ReadPhoto feature be used for JPEG images?

No, IronOCR's ReadPhoto is not designed for JPEG images. It is optimized for complex formats like TIFF and GIF, which require different handling due to their compression and quality characteristics.

How can developers manipulate OCR results from the ReadPhoto feature?

Developers can use the OcrPhotoResult object returned by ReadPhoto to view extracted text, confidence scores, and text regions, allowing them to further process and utilize the data in their applications.

What is the advantage of using OCR technology in document processing?

OCR technology automates the conversion of text from images into digital data, increasing processing speed and accuracy compared to manual transcription, especially in handling large volumes of documents.

What factors should be considered when choosing between different image processing methods?

Consider the image format, quality, and the specific use case. IronOCR's ReadPhoto is suited for high-quality, complex images like TIFF, while other methods may be faster for simpler formats like JPEG, depending on the application's needs.

Curtis Chau


Technical Writer

Curtis Chau holds a Bachelor’s degree in Computer Science (Carleton University) and specializes in front-end development with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.

Beyond development, Curtis has a strong interest in the Internet of Things (IoT), exploring innovative ways to integrate hardware and software. In his free time, he enjoys gaming and building Discord bots, combining his love for technology with creativity.