OCR Image Color Editing
OCR works faster and more accurately when we read black text on a white background.
If we have, for example, blue text on a pink background, we will want to swap blue to black and pink to white before performing OCR.
Doing this by hand with System.Drawing is slow and tedious, but it is completely automated with IronOCR.
The OcrInput.ReplaceColor method replaces one color with another in a document. The method is adaptive: you can specify a percentage tolerance for how closely a color must match the exact RGB color you name. This removes the need to use Photoshop or ImageMagick scripts to prepare images for OCR.
Here is an example of how to use IronOCR to replace colors:
using System;  // Needed for Console
using IronOcr; // Import IronOCR namespace

class Program
{
    static void Main()
    {
        // Create an instance of OcrInput from the image to be read
        var imageInput = new OcrInput(@"path/to/your/image.png"); // Specify the image path

        // Replace blue with black.
        // The tolerance determines how much a color can differ from an exact match.
        imageInput.ReplaceColor(System.Drawing.Color.Blue, System.Drawing.Color.Black, 10f); // 10f specifies 10% tolerance

        // Replace pink with white, again with a tolerance of 10%
        imageInput.ReplaceColor(System.Drawing.Color.Pink, System.Drawing.Color.White, 10f);

        // Perform OCR with the adjusted colors
        var result = new IronTesseract().Read(imageInput);

        // Output the text recognized by OCR
        Console.WriteLine(result.Text);
    }
}
The same example in VB.NET:
Imports System  ' Needed for Console
Imports IronOcr ' Import IronOCR namespace

Friend Class Program
    Shared Sub Main()
        ' Create an instance of OcrInput from the image to be read
        Dim imageInput As New OcrInput("path/to/your/image.png") ' Specify the image path

        ' Replace blue with black.
        ' The tolerance determines how much a color can differ from an exact match.
        imageInput.ReplaceColor(System.Drawing.Color.Blue, System.Drawing.Color.Black, 10F) ' 10F specifies 10% tolerance

        ' Replace pink with white, again with a tolerance of 10%
        imageInput.ReplaceColor(System.Drawing.Color.Pink, System.Drawing.Color.White, 10F)

        ' Perform OCR with the adjusted colors
        Dim result = (New IronTesseract()).Read(imageInput)

        ' Output the text recognized by OCR
        Console.WriteLine(result.Text)
    End Sub
End Class
Explanation
- IronOCR: A .NET library that simplifies the process of OCR (Optical Character Recognition) for .NET languages.
- OcrInput: Represents the image input for OCR processing. The ReplaceColor method is used to adjust colors within this image.
- ReplaceColor Method Parameters:
  - The first parameter is the color to be replaced.
  - The second parameter is the replacement color.
  - The third, optional parameter specifies the tolerance, as a percentage, for how closely a color must match the first parameter, so that near variations of that color are replaced as well.
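To make the tolerance parameter more concrete, the sketch below shows one plausible way a percentage tolerance could be applied to an RGB match. It is an illustration only: ColorTolerance and Matches are hypothetical names that are not part of IronOCR, and the per-channel comparison is an assumption rather than IronOCR's documented algorithm.

```csharp
using System;

// Illustrative only: a hypothetical helper, not part of IronOCR.
public static class ColorTolerance
{
    // Returns true when every RGB channel of `pixel` is within
    // `tolerancePercent` of the full 0-255 range from `target`.
    // ASSUMPTION: per-channel comparison; the real library may differ.
    public static bool Matches(
        (int R, int G, int B) target,
        (int R, int G, int B) pixel,
        float tolerancePercent)
    {
        // Convert the percentage into an absolute per-channel distance.
        float maxDelta = 255f * tolerancePercent / 100f;

        return Math.Abs(target.R - pixel.R) <= maxDelta
            && Math.Abs(target.G - pixel.G) <= maxDelta
            && Math.Abs(target.B - pixel.B) <= maxDelta;
    }
}
```

Under this interpretation, a 10% tolerance allows each channel to differ by up to 25.5, so a pixel of (10, 20, 240) would count as a match for pure blue (0, 0, 255), while (0, 0, 200) would not. This is why a small tolerance is usually needed for scanned or compressed images, where the text is rarely a single exact color.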
By replacing colors so that your images contain black text on a white background, you give the OCR engine the input it handles best, which can improve both its speed and its accuracy.