How to use the Filter Wizard

Preprocessing an image for OCR can be daunting. Multiple filters can be applied to an image, but finding the combination that suits a given image best is often a case-by-case exercise. Testing combinations by hand and comparing the results each time is also incredibly time-consuming.

However, IronOCR provides an effective and easy way to handle this by introducing the OcrInputFilterWizard. The Filter Wizard automatically evaluates combinations of preprocessing filters to maximize OCR confidence and accuracy. It performs a "brute-force" scan for optimal settings and additionally returns the best filter combination as a code snippet, allowing developers to reproduce the result easily.

In this how-to guide, we'll quickly go through an example of how the Filter Wizard works and showcase the code snippets and parameters it uses.

Quickstart: Enhance OCR with Filter Wizard

Get started quickly with IronOCR's Filter Wizard to improve OCR accuracy. The wizard tests combinations of preprocessing filters, such as contrast and noise adjustments, and reports the combination that gives the best confidence score for your image. This makes it well suited to diverse or difficult images. Apply the wizard to your image input and compare the extraction results.

Get started with IronOCR using NuGet now:

  1. Install IronOCR with NuGet Package Manager

    PM > Install-Package IronOcr

  2. Copy and run this code snippet.

    var result = new IronOcr.OcrInput("image.png").UseFilterWizard().Read();

  3. Deploy to test on your live environment

    Start using IronOCR in your project today with a free trial

Filter Wizard Example

The OcrInputFilterWizard.Run method takes in three parameters: the input image, an out parameter for the resulting confidence level, and the Tesseract Engine instance.

It works by brute force: it tries different combinations of preprocessing filters and compares the confidence score each one achieves. The combination with the highest confidence score at the end is the set of preprocessing filters you should ideally apply to your input image.

Do note that there are no presets in the filter wizard, and there's no limit on the combinations it can try. The primary focus of the filter wizard is to achieve the best possible confidence score by testing various combinations of image filters.
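To make the brute-force idea concrete, here is a minimal, self-contained sketch. It is not IronOCR's implementation: the `BruteForceSketch` class, the `MockScore` function, and its weights are all hypothetical stand-ins for a real OCR pass, and the sketch ignores filter ordering and filter parameters, which a real search may also vary.

```csharp
using System;
using System.Collections.Generic;

public static class BruteForceSketch
{
    // Enumerate every subset of the candidate filter names (2^n combinations).
    public static IEnumerable<List<string>> AllCombinations(IReadOnlyList<string> filters)
    {
        for (int mask = 0; mask < (1 << filters.Count); mask++)
        {
            var combo = new List<string>();
            for (int i = 0; i < filters.Count; i++)
            {
                if ((mask & (1 << i)) != 0) combo.Add(filters[i]);
            }
            yield return combo;
        }
    }

    // Score each combination and keep the one with the highest score.
    public static (List<string> Best, double Score) FindBest(
        IReadOnlyList<string> filters, Func<List<string>, double> score)
    {
        var best = new List<string>();
        double bestScore = double.NegativeInfinity;
        foreach (var combo in AllCombinations(filters))
        {
            double s = score(combo);
            if (s > bestScore) { best = combo; bestScore = s; }
        }
        return (best, bestScore);
    }

    // Hypothetical scoring function standing in for a real OCR confidence value.
    public static double MockScore(List<string> combo) =>
        (combo.Contains("Invert") ? 5 : 0)
        + (combo.Contains("DeNoise") ? 40 : 0)
        + (combo.Contains("Contrast") ? 20 : 0)
        - (combo.Contains("Sharpen") ? 10 : 0);

    public static void Main()
    {
        var filters = new[] { "Invert", "DeNoise", "Contrast", "Sharpen" };
        var (best, score) = FindBest(filters, MockScore);
        // Prints: Best: Invert, DeNoise, Contrast (score 65)
        Console.WriteLine($"Best: {string.Join(", ", best)} (score {score})");
    }
}
```

The number of combinations doubles with every filter added to the candidate list, which is why an exhaustive search of this kind is slow.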

Here's a list of all the filters that it can use in its combinations. Note that these are all filter methods available within the IronOCR library:

  • input.Contrast()
  • input.Sharpen()
  • input.Binarize()
  • input.ToGrayScale()
  • input.Invert()
  • input.Deskew()
  • input.Scale(...)
  • input.DeNoise()
  • input.DeepCleanBackgroundNoise()
  • input.EnhanceResolution()
  • input.Dilate(), input.Erode()

For a more in-depth look at what each filter does, please refer to our extensive tutorial on image filters.
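For comparison, this is roughly what testing a single combination by hand looks like. The particular chain below (Deskew, DeNoise, Binarize) is an arbitrary illustration, not a recommendation, and "noise.png" is simply the sample file name used later in this guide.

```csharp
using IronOcr;
using System;

var ocr = new IronTesseract();

// "noise.png" is the sample file name used in this guide
using (var input = new OcrImageInput("noise.png"))
{
    // A hand-picked chain; the choice and order of filters are illustrative only
    input.Deskew();
    input.DeNoise();
    input.Binarize();

    // Read the filtered image and report how confident the engine is
    OcrResult result = ocr.Read(input);
    Console.WriteLine($"Confidence: {result.Confidence}");
}
```

Repeating this for every candidate chain is the manual work the Filter Wizard automates.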

Warning: Because this is a brute-force method, the operation takes some time to complete, as it tests many possibilities to find the best result for your image.
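If you want to know how long the search takes on your own images, one option is to wrap the call in a Stopwatch. This is only a sketch: it assumes the same `OcrInputFilterWizard.Run` call and "noise.png" sample shown in the example below, and the elapsed time will depend on your image and machine.

```csharp
using IronOcr;
using System;
using System.Diagnostics;

var ocr = new IronTesseract();

// Time a single wizard run
var stopwatch = Stopwatch.StartNew();
string code = OcrInputFilterWizard.Run("noise.png", out double confidence, ocr);
stopwatch.Stop();

Console.WriteLine($"Wizard finished in {stopwatch.Elapsed.TotalSeconds:F1} s with confidence {confidence}");
```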

Input

For this input, we'll use a screenshot with heavy artificial noise to illustrate the functionality of the filter wizard.

Input Image

Code

using IronOcr;
using System;

// Initialize the Tesseract engine
var ocr = new IronTesseract();

// 1. Pass the image path ("noise.png").
// 2. Pass an 'out' variable to store the best confidence score found.
// 3. Pass the tesseract instance to be used for testing.
string codeToRun = OcrInputFilterWizard.Run("noise.png", out double confidence, ocr);

// The 'confidence' variable is now populated with the highest score achieved.
Console.WriteLine($"Best Confidence Score: {confidence}");

// 'codeToRun' holds the exact C# code snippet that achieved this score.
// The returned string is the code you can use to filter similar images.
Console.WriteLine("Recommended Filter Code:");
Console.WriteLine(codeToRun);

Output

Output from Filter Wizard

As you can see from its output, the filter wizard determined that, through all the combinations, 65% confidence is the best it can achieve with this specific image.

Warning: The input image is highly distorted and heavily impacted by artificial noise. This is an extreme case intended to illustrate how IronOCR's filter wizard can help, even under challenging scenarios.

Filter Wizard Best Combination

After the filter wizard runs, we can then follow the code snippet it provided. We apply those exact settings to our input image to verify the result and confidence.

Code

using IronOcr;
using System;

// Initialize the Tesseract engine
var ocrTesseract = new IronTesseract();

// Load the image into an OcrImageInput object
using (var input = new OcrImageInput("noise.png"))
{
    // Apply the exact filter chain recommended by the Wizard's output
    input.Invert();
    input.DeNoise();
    input.Contrast();
    input.AdaptiveThreshold();

    // Run OCR on the pre-processed image
    OcrResult result = ocrTesseract.Read(input);

    // Print the final result and confidence
    Console.WriteLine($"Result: {result.Text}");
    Console.WriteLine($"Confidence: {result.Confidence}");
}

Output

Output from image

As you can see, IronOCR can make out most of the text even under these heavily distorted conditions, and the confidence level matches what was reported by the filter wizard.

Frequently Asked Questions

What is the Filter Wizard in IronOCR?

The Filter Wizard in IronOCR is a tool that tests combinations of preprocessing filters on an image and returns the best-scoring combination as a code snippet you can reuse.

How does the Filter Wizard improve OCR accuracy?

The Filter Wizard improves OCR accuracy by finding the combination of existing IronOCR image filters that yields the highest confidence score for an image, so that those filters can be applied before OCR is performed.

Can I use the Filter Wizard for different image types?

Yes, the Filter Wizard can be used to generate custom processing code for a variety of image types, making it versatile for different OCR needs.

Is the Filter Wizard difficult to use for beginners?

The Filter Wizard is designed with user-friendliness in mind, making it accessible for both beginners and experienced users to enhance their OCR projects.

What are the benefits of using custom image processing code in OCR?

Using custom image processing code can significantly improve text extraction accuracy, especially in challenging conditions such as low contrast images or images with noise.

Do I need programming skills to use the Filter Wizard?

While programming skills can be beneficial, the Filter Wizard simplifies the process of generating image processing code, making it approachable for users with varying technical backgrounds.

Can the Filter Wizard handle batch processing?

The Filter Wizard is capable of generating code that can be integrated into batch processing workflows, allowing for the efficient handling of multiple images.
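As a rough sketch of what that integration could look like, the loop below applies the filter chain recommended earlier in this guide to every PNG in a folder. The "scans" folder name is hypothetical, and the sketch assumes the images are similar enough to the one the wizard was run on for the same chain to help.

```csharp
using IronOcr;
using System;
using System.IO;

var ocr = new IronTesseract();

// "scans" is a hypothetical folder of images similar to the wizard's test image
foreach (string path in Directory.EnumerateFiles("scans", "*.png"))
{
    using (var input = new OcrImageInput(path))
    {
        // Filter chain copied from the wizard's recommendation in this guide
        input.Invert();
        input.DeNoise();
        input.Contrast();
        input.AdaptiveThreshold();

        // Read each image and report its confidence
        OcrResult result = ocr.Read(input);
        Console.WriteLine($"{Path.GetFileName(path)}: {result.Confidence}");
    }
}
```

Running the wizard once on a representative image and reusing its output this way avoids repeating the slow search for every file.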

What types of images benefit most from the Filter Wizard?

Images with low contrast, noise, or complex backgrounds benefit significantly from the custom processing capabilities of the Filter Wizard, leading to improved OCR results.

How do I access the Filter Wizard in IronOCR?

The Filter Wizard is part of the IronOCR library and is run from code by calling the OcrInputFilterWizard.Run method with an input image, an out parameter for the confidence score, and an IronTesseract instance.

Is there support available for using the Filter Wizard?

IronOCR offers documentation and support resources to assist users in effectively utilizing the Filter Wizard for their OCR tasks.

Curtis Chau
Technical Writer

Curtis Chau holds a Bachelor’s degree in Computer Science (Carleton University) and specializes in front-end development with expertise in Node.js, TypeScript, JavaScript, and React. Passionate about crafting intuitive and aesthetically pleasing user interfaces, Curtis enjoys working with modern frameworks and creating well-structured, visually appealing manuals.
