Reading Identity Documents with IronOCR

Identity documents are, by design, very difficult for OCR engines to read because of the anti-copying and anti-fraud features (holograms, watermark images, variable digital noise, and so on) added to the background of the card.

This does not make it impossible, but results will depend heavily on image quality. Lossless image formats such as TIFF or PNG, which carry less digital noise, are recommended over lossy formats such as JPEG.

Please also try the following image optimization filters:

  • DeNoise(): Removes digital noise. Use this filter only where noise is expected. It flattens alpha channels to white.
  • DeepCleanBackgroundNoise(): Performs heavy background noise removal. Use this filter only when the document is known to have extreme background noise, because it is very CPU-intensive and can reduce OCR accuracy on clean documents.
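
As a rough illustration of the filters listed above, the following minimal sketch applies DeNoise() before reading an image. It assumes the IronOcr NuGet package is installed and uses a hypothetical file name, id-card.png. The LoadImage call reflects recent IronOCR releases; older releases may use AddImage instead, so check it against the version you have installed.

```csharp
using System;
using IronOcr;

// Minimal sketch, not a definitive implementation.
// Assumption: "id-card.png" is a hypothetical path to a lossless scan of the card.
var ocr = new IronTesseract();

using var input = new OcrInput();
input.LoadImage("id-card.png"); // assumption: older releases may expose AddImage instead

// Light noise removal; only worthwhile when noise is expected.
input.DeNoise();

// Heavy background cleanup; CPU expensive and may hurt accuracy on clean images,
// so it is left commented out until DeNoise() alone proves insufficient.
// input.DeepCleanBackgroundNoise();

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

It is usually worth comparing the output with and without each filter on a few sample cards, since a filter that helps a noisy scan can degrade a clean one.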

You may also try crop rectangles, which limit OCR to a chosen region of the image: Crop Rectangles Example.
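
A crop rectangle might be used along the following lines. This is a sketch under assumptions: the file name and pixel coordinates are hypothetical, and the Rectangle type from IronSoftware.Drawing together with the LoadImage overload that accepts it reflects recent IronOCR releases (earlier releases used a CropRectangle type), so verify both against your installed version.

```csharp
using System;
using IronOcr;
using IronSoftware.Drawing;

// Minimal sketch, not a definitive implementation.
var ocr = new IronTesseract();

using var input = new OcrInput();

// Hypothetical region, in pixels, covering only the printed text on the card
// so that holograms and background patterns elsewhere are ignored.
var contentArea = new Rectangle(x: 40, y: 120, width: 600, height: 200);

// Assumption: this overload restricts OCR to the given region.
input.LoadImage("id-card.png", contentArea);

OcrResult result = ocr.Read(input);
Console.WriteLine(result.Text);
```

Cropping to the text region also reduces the amount of image the engine has to process, which can help when combined with the heavier filters.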

Curtis Chau
Technical Writer

Curtis Chau holds a bachelor's degree in Computer Science (Carleton University) and specializes in front-end development, with experience in Node.js, TypeScript, JavaScript, and React. Passionate about creating intuitive and aesthetically pleasing user interfaces, he enjoys working with modern frameworks and creating manuals that are well ...