Guide to using IronOCR Filters

IronOCR provides Filters for preprocessing images that cannot be read reliably as they are. You can choose from a wide range of filters that manipulate an image so that it becomes processable.

List of OCR Image Filters

The following image filters can improve OCR speed and accuracy:

  • Filters to change the Image Orientation
    • Rotate - Rotates images by a number of degrees clockwise. For anti-clockwise, use negative numbers.
    • Deskew - Rotates an image so it is the right way up and orthogonal. This is very useful for OCR because Tesseract's tolerance for skewed scans can be as low as 5 degrees.
  • Filters to manipulate Image Colors
    • Binarize - Turns every pixel black or white, with no middle ground. May improve OCR performance when the contrast between text and background is very low.
    • ToGrayScale - Turns every pixel into a shade of gray. Unlikely to improve OCR accuracy, but may improve speed.
    • Invert - Inverts every color, e.g. white becomes black and black becomes white.
  • Filters to try to improve Contrast in an Image
    • Contrast - Increases contrast automatically. This filter often improves OCR speed and accuracy in low contrast scans.
    • Dilate - Advanced Morphology. Dilation adds pixels to the boundaries of objects in an image. Opposite of Erode.
    • Erode - Advanced Morphology. Erosion removes pixels on object boundaries. Opposite of Dilate.
  • Filters to try to reduce Image Noise
    • DeNoise - Removes digital noise. This filter should only be used where noise is expected.
    • DeepCleanBackgroundNoise - Heavy background noise removal. Only use this filter when extreme background noise is known to be present, because it risks reducing OCR accuracy on clean documents and is very CPU expensive.
    • EnhanceResolution - Enhances the resolution of low quality images. This filter is not often needed because OcrInput.MinimumDPI and OcrInput.TargetDPI will automatically catch and resolve low resolution inputs.
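
Several of the filters listed above can be applied to the same input before reading it. The following is a minimal sketch, not a recommended pipeline. It assumes that each filter is exposed as an OcrInput method with the same name as in the list (this guide itself only shows Deskew being called), and noisy_scan.png is a hypothetical file name.

```csharp
using System;
using IronOcr;

var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"noisy_scan.png")) // hypothetical file name
{
    // Assumption: method names mirror the filter names in the list above.
    Input.Deskew();   // straighten the page first
    Input.DeNoise();  // only because this scan is expected to be noisy
    Input.Contrast(); // raise contrast for faint text

    var result = Ocr.Read(Input);
    Console.WriteLine(result.Text);
}
```

Start with as few filters as possible and add one at a time, since several of them can reduce accuracy or speed on inputs that do not need them.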

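To make the Dilate and Erode descriptions concrete, here is a small self-contained illustration of 3x3 dilation and erosion on a binary grid. It is not IronOCR's implementation, and the Morphology class is a hypothetical helper written only for this sketch. It treats true as an object pixel and anything outside the grid as background.

```csharp
using System;

var image = new bool[5, 5];
image[2, 2] = true; // a single object pixel in the centre

var dilated = Morphology.Apply(image, dilate: true);
var eroded = Morphology.Apply(dilated, dilate: false);

Console.WriteLine(Morphology.Count(dilated)); // 9: the pixel grew into a 3x3 block
Console.WriteLine(Morphology.Count(eroded));  // 1: the block shrank back to its centre

// Hypothetical helper for illustration only.
static class Morphology
{
    // True if (r, c) is an object pixel; out-of-bounds counts as background.
    static bool At(bool[,] img, int r, int c) =>
        r >= 0 && r < img.GetLength(0) &&
        c >= 0 && c < img.GetLength(1) && img[r, c];

    // Dilate: a pixel is set if ANY pixel in its 3x3 neighbourhood is set.
    // Erode:  a pixel is set only if ALL pixels in its 3x3 neighbourhood are set.
    public static bool[,] Apply(bool[,] img, bool dilate)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1);
        var output = new bool[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                bool any = false, all = true;
                for (int dr = -1; dr <= 1; dr++)
                {
                    for (int dc = -1; dc <= 1; dc++)
                    {
                        bool set = At(img, r + dr, c + dc);
                        any |= set;
                        all &= set;
                    }
                }
                output[r, c] = dilate ? any : all;
            }
        }
        return output;
    }

    // Number of object pixels in the grid.
    public static int Count(bool[,] img)
    {
        int n = 0;
        foreach (bool p in img) if (p) n++;
        return n;
    }
}
```

In this tiny case the erosion exactly undoes the dilation. That does not hold for arbitrary shapes, which is why the two filters are best tried on a saved debug image before being used for OCR.
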
Filter Example and Usage

In the following example, we demonstrate how to apply filters within your code:

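using System;
using IronOcr;

var Ocr = new IronTesseract();
using (var Input = new OcrInput(@"my_image.png"))
{
    // Apply the filter before reading
    Input.Deskew();

    // Read the filtered input and print the recognized text
    var result = Ocr.Read(Input);
    Console.WriteLine(result.Text);
}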

The same example in VB.NET:

Imports System
Imports IronOcr

Dim Ocr = New IronTesseract()
Using Input = New OcrInput("my_image.png")
	' Apply the filter before reading
	Input.Deskew()
	Dim result = Ocr.Read(Input)
	Console.WriteLine(result.Text)
End Using

Debug Filter / What is the filter doing?

If you are having difficulty reading images or barcodes within your program, you can save an image of the filtered result. This lets you debug and see exactly what each filter does and how it manipulates your image.

using System;
using System.Linq; // needed for ToList()
using IronOcr;

var file = @"skewed_image.tiff";
var Ocr = new IronTesseract();
using (var Input = new OcrInput(file))
{
    // Here we apply a Deskew
    Input.Deskew();

    // The following line saves the input with filters applied to an image
    Input.Pages.ToList().ForEach(page => page.SaveAsImage($"deskewed_page_{page.Index}.png"));

    // We read, then print the text to the console
    var result = Ocr.Read(Input);
    Console.WriteLine(result.Text);
}

The same example in VB.NET:

Imports System
Imports System.Linq
Imports IronOcr

Dim file = "skewed_image.tiff"
Dim Ocr = New IronTesseract()
Using Input = New OcrInput(file)
	' Here we apply a Deskew
	Input.Deskew()

	' The following line saves the input with filters applied to an image
	Input.Pages.ToList().ForEach(Sub(page) page.SaveAsImage($"deskewed_page_{page.Index}.png"))

	' We read, then print the text to the console
	Dim result = Ocr.Read(Input)
	Console.WriteLine(result.Text)
End Using

Here is an example input image that will require a Deskew:

Using the code above, this is the output image with a Deskew applied:

Console Output of the text after a Deskew: