How to Read Text From Image in Blazor

Blazor is a framework built by the ASP.NET team for developing interactive web UIs with HTML and C# instead of JavaScript. Blazor can run C# code directly in the web browser using WebAssembly, or on the server, as in the Blazor Server app built in this article. This makes it easy to build components with logic and reuse them across an application. It is a popular framework among developers for building UI in C#.

In this article, we are going to create a Blazor Server app for reading text from image files using Optical Character Recognition (OCR) with IronOCR.

How to Read Text from an Image Using Optical Character Recognition in Blazor

Prerequisites

  1. Have the latest version of Visual Studio. You can download it from this link.
  2. ASP.NET and Web Development workload. While installing Visual Studio, select ASP.NET and Web Development workload for installation as it is required for this project.
  3. IronOCR C# Library. We are going to use IronOCR to convert image data to machine-readable text. You can download the IronOCR package .dll file directly from the Iron website or download it from the NuGet website. A more convenient way to download and install IronOCR is from the NuGet package manager in Visual Studio.

Create a Blazor Server App

Open Visual Studio 2022 and follow these steps to create a Blazor Server App:

  1. Click Create a New Project and then select "Blazor Server App" from the listed project templates.

    How to Read Text From Image in Blazor: Figure 1

  2. Next, name your project appropriately. Here, we are naming it "BlazorReadText".

    How to Read Text From Image in Blazor: Figure 2

  3. Finally, set the additional information and click Create.

    How to Read Text From Image in Blazor: Figure 3

The Blazor Server App is now created. Next, we need to install the necessary packages before extracting image data using IronOCR.

Adding Necessary Packages

BlazorInputFile

First, we will install the BlazorInputFile package. It is a component for Blazor applications that uploads single or multiple files to the server. We are going to use it to upload an image file from the Razor page in our application. Open the NuGet Package Manager for the solution and search for BlazorInputFile.

How to Read Text From Image in Blazor: Figure 4

Select the checkbox for the project and click Install.

Now, open the _Host.cshtml file in the Pages folder and add the following script reference:

 <script src="_content/BlazorInputFile/inputfile.js"></script>

How to Read Text From Image in Blazor: Figure 5

Finally, add the following code in the _Imports.razor file.

@using BlazorInputFile

IronOCR

IronOCR is a C# library for scanning and reading images in different formats. It can recognize text in more than 127 languages.

To install IronOCR, open the NuGet Package Manager and browse for IronOCR. Select the project and click on the Install button.

How to Read Text From Image in Blazor: Figure 6

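Add the IronOcr namespace to the _Imports.razor file. C# namespaces are case-sensitive, so the casing must match the one used in the code-behind class:

@using IronOcr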

Create Blazor UI Component

A component represents a piece of user interface together with the business logic that gives it dynamic behavior. Blazor uses Razor Components to build its apps. These components can be nested, reused, and shared among projects. By default, the Counter and FetchData pages are provided in the application. We are going to delete these pages because our application does not need them.

Right-click on the Pages folder under the BlazorReadText application, and then select Add > Razor Component. If you do not find Razor Component, click on New Item and select "Razor Component" from the C# components. Name the component "OCR.razor" and click Add.

How to Read Text From Image in Blazor: Figure 7

To keep the application easy to maintain, we are going to keep the code for this Razor page in a separate class. Again, right-click on the Pages folder and select Add > Class. Name the file after the page, OCR.razor.cs, and click Add. Visual Studio nests a file with this name under the page, and the page is linked to the class it contains through the @inherits directive.

How to Read Text From Image in Blazor: Figure 8

Now, let's move to the actual code implementation which will read image data using IronOCR.

Blazor OCR.razor UI Component Source code to Read Image Data

To recognize text in an image, we will need to first upload the image, convert it to binary data and then apply the IronOCR method to extract text.

Open the OCR.razor.cs class and write the following example source code:


using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using BlazorInputFile;
using IronOcr;
using Microsoft.AspNetCore.Components;

namespace BlazorReadText.Pages
{
    // Code-behind for OCR.razor; the page links to it with @inherits OCRModel.
    public class OCRModel : ComponentBase
    {
        protected string imageText;
        protected string imagePreview;
        private byte[] imageFileBytes;

        private const string DefaultStatus = "Maximum size allowed for the image is 4 MB";
        protected string status = DefaultStatus;

        private const int MaxFileSize = 4 * 1024 * 1024; // 4 MB

        // Validates the selected file, keeps its bytes for OCR, and builds the preview.
        protected async Task ViewImage(IFileListEntry[] files)
        {
            var file = files.FirstOrDefault();
            if (file == null)
            {
                return;
            }

            if (file.Size > MaxFileSize)
            {
                status = $"The file size is {file.Size} bytes, this is more than the allowed limit of {MaxFileSize} bytes.";
                return;
            }

            if (string.IsNullOrEmpty(file.Type) ||
                !file.Type.StartsWith("image/", StringComparison.OrdinalIgnoreCase))
            {
                status = "Please upload a valid image file";
                return;
            }

            // Copy the upload into memory; the stream is disposed when the method exits.
            using var memoryStream = new MemoryStream();
            await file.Data.CopyToAsync(memoryStream);
            imageFileBytes = memoryStream.ToArray();

            // Use the reported MIME type so that non-PNG images also preview correctly.
            imagePreview = $"data:{file.Type};base64,{Convert.ToBase64String(imageFileBytes)}";
            status = DefaultStatus;
        }

        // Runs OCR on the uploaded bytes and stores the recognized text.
        // Nothing is awaited here, so the method does not need to be async.
        protected void GetText()
        {
            if (imageFileBytes == null)
            {
                return;
            }

            var ocr = new IronTesseract();
            using (var input = new OcrInput(imageFileBytes))
            {
                OcrResult result = ocr.Read(input);
                imageText = result.Text;
            }
        }
    }
}


In the code above, the ViewImage method takes the file selected in the input file component and checks that it is an image and that its size does not exceed the 4 MB limit. If either check fails, the method sets an error message in status and returns. Otherwise, the image is copied to a MemoryStream and converted to a byte array, because IronOcr.OcrInput accepts image data in binary form. The same bytes are also Base64-encoded into a data URI for the preview image.
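The two checks and the preview string can also be tried in isolation. The following is a minimal sketch, not part of BlazorInputFile or IronOCR: UploadChecks, Validate, and ToDataUri are hypothetical names introduced here for illustration, and the sketch assumes that the file type is reported as a MIME type such as "image/png".

```csharp
using System;

// Hypothetical helper for illustration; not part of BlazorInputFile or IronOCR.
public static class UploadChecks
{
    public const int MaxFileSize = 4 * 1024 * 1024; // 4 MB, the same limit as in OCRModel

    // Returns an error message, or null when the file passes both checks.
    public static string Validate(long sizeInBytes, string contentType)
    {
        if (sizeInBytes > MaxFileSize)
        {
            return $"The file size is {sizeInBytes} bytes, this is more than the allowed limit of {MaxFileSize} bytes.";
        }

        // Assumption: contentType is a MIME type such as "image/png".
        if (string.IsNullOrEmpty(contentType) ||
            !contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase))
        {
            return "Please upload a valid image file";
        }

        return null;
    }

    // Builds the data URI that an <img> tag can use as its src.
    public static string ToDataUri(byte[] imageBytes, string contentType)
    {
        return $"data:{contentType};base64,{Convert.ToBase64String(imageBytes)}";
    }
}
```

For example, UploadChecks.ToDataUri(new byte[] { 1, 2, 3 }, "image/png") returns "data:image/png;base64,AQID". Keeping these checks free of any Blazor types is one possible design choice that makes them easy to unit test.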

The GetText method uses IronOCR to read text from the input image. IronOCR uses the Tesseract 5 engine and supports more than 127 languages. The image's byte array is wrapped in an OcrInput, and the result is retrieved with the IronTesseract Read method. IronTesseract, developed by the IronOCR team, is an extended version of Google Tesseract. For more information, visit the C# Tesseract OCR example.

Finally, the extracted text is saved in the imageText variable for display. Here, we are going to work with English text images. You can have a look at how to use different languages on this code example page.
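As a rough illustration of switching languages, the fragment below sets the Language property on IronTesseract before reading. It is a sketch under assumptions rather than a tested recipe: LanguageExample and ReadArabic are hypothetical names, Arabic is only an example choice, and the fragment assumes that the matching language pack for the chosen language has been added to the project.

```csharp
using IronOcr;

// Hypothetical helper for illustration only.
public static class LanguageExample
{
    // imageBytes is the uploaded image as a byte array, as in OCRModel.
    public static string ReadArabic(byte[] imageBytes)
    {
        var ocr = new IronTesseract();

        // Assumption: the language pack for the selected language
        // is installed in the project alongside IronOCR.
        ocr.Language = OcrLanguage.Arabic;

        using (var input = new OcrInput(imageBytes))
        {
            return ocr.Read(input).Text;
        }
    }
}
```

Check the IronOCR documentation for the language packs and the exact setup that your version requires.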

Blazor Frontend UI Component Source code

Now, we need to create the UI for the application. Open the OCR.razor file and write the following code:

@page "/IronOCR"
@inherits OCRModel

<h2>Optical Character Recognition (OCR) Using Blazor and IronOCR Software</h2>

<div class="row">
    <div class="col-md-5">
        <textarea disabled class="form-control" rows="10" cols="15">@imageText</textarea>
    </div>
    <div class="col-md-5">
        <div class="image-container">
            <img class="preview-image" width="800" height="500" src="@imagePreview" alt="Uploaded image preview">
        </div>
        <BlazorInputFile.InputFile OnChange="@ViewImage" />
        <p>@status</p>
        <hr />
        <button class="btn btn-primary btn-lg" @onclick="GetText">
            Extract Text
        </button>
    </div>
</div>

In the code above, we have an input file to choose an image file and an image tag to display the image. There is a button below the input field that triggers the GetText method. There is a text area that is used to display the extracted text from image data.

Lastly, add a link to the OCR.razor page in the "NavMenu.razor" file under the Shared folder.

<div class="nav-item px-3">
    <NavLink class="nav-link" href="IronOCR">
        <span class="oi oi-plus" aria-hidden="true"></span> Read Image Text
    </NavLink>
</div>

Remove the links to Counter and FetchData, as we don't need them.

Everything is now completed and ready to use. Press F5 to run the application.

The frontend should appear as shown below:

How to Read Text From Image in Blazor: Figure 9

Let's upload an image and extract text to visualize the output.

How to Read Text From Image in Blazor: Figure 10

The output text is clean, and it can be copied from the text area.

Summary

In this article, we looked at how to create a Blazor UI component with a code-behind class in a Blazor Server app to read text from images. We used IronOCR, a versatile library for extracting text in any C#-based application. It supports the latest .NET Framework and works well with Razor applications. IronOCR is a cross-platform library supported on Windows, Linux, macOS, Docker, Azure, and AWS.

You can also try IronOCR for free in a 30-day trial. Download the software library from here.