How to Redact Text in a PDF with IronSecureDoc

This article shows how to redact text in a PDF using IronSecureDoc. A service or process can redact sensitive information by sending a single POST request, with the PDF attached, to a running IronSecureDoc server. We demonstrate the request through the Swagger docs. The request accepts both required and optional parameters, and the response returns the PDF with the text redacted.

Pull and Start IronSecureDoc

If you don't have IronSecureDoc running yet, please follow the links below to get it set up:

Host Locally | Deploy to Cloud

The [POST] Redact Text API

The [POST] Redact Text API endpoint allows you to hide sensitive text within a PDF document using redaction. This functionality is essential for applications that handle confidential documents, such as legal contracts, medical records, or financial reports. Using this API ensures that specific text is permanently removed, providing enhanced security and ensuring compliance with data protection standards.

Please note: Once text is redacted, the content cannot be recovered.

Swagger

Swagger is a powerful tool that enables developers to interact with RESTful APIs through a user-friendly web interface. Whether you're using languages like Python, Java, or others, Swagger offers a convenient way to test and implement this API.

Steps to Redact Text with Swagger

  1. Access the Swagger UI:

    If your API server is running locally, you can access Swagger by navigating to http://localhost:8080/swagger/index.html in your web browser.

    [Image: Swagger docs]

  2. Locate the [POST] Redact Text API:

    Within the Swagger UI, find the [POST] /v1/document-services/pdfs/redact-text endpoint.

    [Image: Redact text]

  3. Specify Configurations:

    In this example, the POST request carries both the PDF file and the words to redact. We will redact the word "we" and overlay a black box on each match. For this demonstration, we use the 'sample.pdf' file with the following configurations:

    • draw_black_box: true
    • match_whole_word: true
    • words_to_redact: we

  4. Upload a Sample PDF:

    In the request body, upload a sample PDF file where you want to apply the redaction. Ensure that the file is added as pdf_file.

  5. Execute the Request:

    Click "Execute" to run the request. The response will include the redacted PDF. This Swagger UI interaction allows you to easily test the redaction process, providing immediate feedback.
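
When the same request is issued from code rather than Swagger, the client has to store the returned document itself. The following is a minimal Python sketch, assuming the response body is the raw redacted PDF; the helper name save_redacted_pdf and the output path are hypothetical.

```python
from pathlib import Path


def save_redacted_pdf(body: bytes, out_path: str) -> Path:
    """Write a response body to disk after a basic sanity check.

    Assumption: the endpoint returns the redacted PDF as raw bytes.
    """
    # Every PDF file begins with the "%PDF-" marker; anything else is
    # most likely an error payload rather than a document.
    if not body.startswith(b"%PDF-"):
        raise ValueError("response body does not look like a PDF")
    path = Path(out_path)
    path.write_bytes(body)
    return path
```

Checking the leading marker before writing avoids saving an error message under a .pdf name.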


Use CURL Request through Command Prompt

Alternatively, we can send the same POST request with curl from the command line. The command below uses Unix-style shell syntax (single quotes and backslash line continuations).

curl -X POST 'http://localhost:8080/v1/document-services/pdfs/redact-text' \
 -H 'accept: */*' \
 -H 'Content-Type: multipart/form-data' \
 -F 'pdf_file=@sample.pdf;type=application/pdf' \
 -F 'words_to_redact="we"' \
 -F 'draw_black_box=true' \
 -F 'match_whole_word=true'

Please note: By default, PowerShell may interpret curl as an alias for Invoke-WebRequest, a built-in PowerShell cmdlet. If that happens, use curl.exe instead of curl.

curl.exe --version
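
For callers working in Python rather than curl, the same request can be sketched with the standard library alone. This is an illustration under stated assumptions, not an official client: the function name build_redact_request is hypothetical, and sending each word as its own words_to_redact form field is an assumption about how the array parameter is encoded. The endpoint, field names, and the lowercase true/false values are taken from the curl example above.

```python
import uuid
import urllib.request

# Endpoint taken from the curl example; adjust host and port to your server.
ENDPOINT = "http://localhost:8080/v1/document-services/pdfs/redact-text"


def _bool(value):
    # The curl example sends booleans as lowercase "true"/"false".
    return "true" if value else "false"


def build_redact_request(pdf_bytes, words, draw_black_box=True,
                         match_whole_word=True, filename="sample.pdf"):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    # Assumption: each word is sent as its own words_to_redact field.
    text_fields = [("words_to_redact", w) for w in words]
    text_fields.append(("draw_black_box", _bool(draw_black_box)))
    text_fields.append(("match_whole_word", _bool(match_whole_word)))

    parts = []
    for name, value in text_fields:
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))

    # The file part carries the PDF bytes under the pdf_file field name.
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="pdf_file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    )
    parts.append(file_head.encode("utf-8") + pdf_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        ENDPOINT,
        data=b"".join(parts),
        method="POST",
        headers={
            "Accept": "*/*",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, read sample.pdf as bytes, pass the returned request to urllib.request.urlopen, and read the response body.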

Required Request Body Parameters

Name | Data Type | Description
pdf_file | application/pdf | The PDF file you want to manipulate.
words_to_redact | array[string] | A list of words to redact; text matching any entry is redacted.

Optional Request Body Parameters

Name | Data Type | Description
user_password | string | Required if the input PDF has a user password. The operation fails if the PDF is password-protected and no password is provided.
owner_password | string | Required if the input PDF has an owner password. The operation fails if the PDF is password-protected and no password is provided.
specific_pages | array[int] | Specifies which pages to redact text on. By default, the value is null, meaning the provided words are redacted on all pages.
draw_black_box | boolean | Specifies whether to draw a black box over the redacted text. By default, this is set to True.
match_whole_word | boolean | Specifies whether only whole words are matched. When set to False, partial matches within words are also redacted; for example, if the provided word is "are", any word containing "are", such as "hare", will have the "are" redacted as well. By default, this is set to True.
match_case | boolean | Specifies whether the provided word must match exactly in terms of case. By default, this value is null. When set to True, text that differs in case is not matched; for example, if the provided word is "WE", the lowercase "we" is not redacted.
overlay_text | string | Text, such as words or symbols, to overlay on the redacted area. By default, this string is empty.
save_as_pdfa | boolean | Saves the modified PDF with PDF/A-3 compliance. By default, this is set to False.
save_as_pdfua | boolean | Saves the modified PDF with PDF/UA compliance. By default, this is set to False.
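
The difference between whole-word and partial matching, and the effect of case sensitivity, can be previewed locally with a regular expression. The Python sketch below only approximates the semantics described in the table; it is not IronSecureDoc's implementation, the function name find_matches is hypothetical, and defaulting match_case to False is an assumption made for the illustration.

```python
import re


def find_matches(text, word, match_whole_word=True, match_case=False):
    """Return (start, end) spans where `word` occurs in `text`.

    Local illustration only; the server may tokenize text differently.
    """
    pattern = re.escape(word)
    if match_whole_word:
        # \b anchors the match to word boundaries, so "are" skips "hare".
        pattern = rf"\b{pattern}\b"
    flags = 0 if match_case else re.IGNORECASE
    return [m.span() for m in re.finditer(pattern, text, flags)]
```

With the text "We are here; the hare ran.", searching for "are" as a whole word finds only the standalone word, while partial matching also finds the "are" inside "hare". Searching for "WE" with match_case enabled finds nothing, because the text contains "We".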

Optional Header Parameters

Name | Data Type | Description
author | string | Useful for identifying you as the author of the PDF document. By default, this field is empty.
title | string | Displays the title of the PDF document. By default, this field is empty.
subject | string | Useful for identifying the content of the PDF document at a glance. By default, this field is empty.
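
As a rough sketch, these metadata headers could be assembled in Python as follows. It assumes the HTTP header names are exactly the names in the table (author, title, subject); the helper metadata_headers is hypothetical.

```python
def metadata_headers(author=None, title=None, subject=None):
    """Build a dict of optional metadata headers, skipping unset values.

    Assumption: header names match the parameter names in the table.
    """
    candidates = {"author": author, "title": title, "subject": subject}
    # Leave out empty or missing values so the server defaults apply.
    return {name: value for name, value in candidates.items() if value}
```

The resulting dictionary can be merged into the headers of whichever HTTP client sends the request; under the same assumption, the curl equivalent is one -H option per header.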

Frequently Asked Questions

How can I redact text in a PDF using a POST request?

You can redact text in a PDF by making a POST request to the IronSecureDoc server with the PDF file and the words you want to redact. The server processes the request and returns a PDF with redacted text.

What are the steps to use the IronSecureDoc API for PDF redaction?

To use the IronSecureDoc API for PDF redaction, pull and start the IronSecureDoc Docker image, test the API using Swagger, specify the text to redact, execute the API call, and finally export the redacted PDF document.

How can I test the IronSecureDoc API before using it in production?

You can test the IronSecureDoc API using Swagger by accessing the Swagger UI, which allows you to use the provided endpoints to simulate the redaction process.

What parameters can be customized in a PDF redaction request?

In a PDF redaction request, you can customize the optional parameters user_password, owner_password, specific_pages, draw_black_box, match_whole_word, match_case, overlay_text, save_as_pdfa, and save_as_pdfua.
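
One way to keep these options together in client code is a small container that emits only the values that were explicitly set. The Python sketch below is an assumption-laden illustration: the class name RedactOptions and its method are hypothetical, booleans are written as lowercase true/false to mirror the curl example, and repeating a field once per list item is an assumption about how array parameters are encoded.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RedactOptions:
    # Field names mirror the optional request body parameters.
    user_password: Optional[str] = None
    owner_password: Optional[str] = None
    specific_pages: Optional[List[int]] = None
    draw_black_box: Optional[bool] = None
    match_whole_word: Optional[bool] = None
    match_case: Optional[bool] = None
    overlay_text: Optional[str] = None
    save_as_pdfa: Optional[bool] = None
    save_as_pdfua: Optional[bool] = None

    def to_form_fields(self):
        """Return (name, value) string pairs for the options that are set."""
        pairs = []
        for field in fields(self):
            value = getattr(self, field.name)
            if value is None:
                continue  # unset: let the server default apply
            if isinstance(value, bool):
                pairs.append((field.name, "true" if value else "false"))
            elif isinstance(value, list):
                # Assumption: one form field per list item.
                pairs.extend((field.name, str(item)) for item in value)
            else:
                pairs.append((field.name, str(value)))
        return pairs
```

Leaving unset options out of the request means the documented defaults stay in effect instead of being overridden by accident.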

How do I execute a PDF redaction request using curl?

To execute a PDF redaction request using curl, you can use a curl POST request command, specifying the necessary parameters and file path in your command prompt.

What should I do if my PDF is password-protected during redaction?

If your PDF is password-protected, you need to include the user_password or owner_password in the optional parameters to ensure the redaction process can access and modify the document.

What is the purpose of the 'draw_black_box' parameter in text redaction?

The 'draw_black_box' parameter specifies whether to cover the redacted text with a black box. This option is useful for visualizing the redacted areas and is enabled by default.

How can I host IronSecureDoc locally for redaction purposes?

You can host IronSecureDoc locally by following the tutorials provided for various operating systems like Windows, Mac, or Linux, allowing you to manage the redaction process on your local server.

Is it possible to redact specific pages in a PDF?

Yes, you can specify which pages to redact by using the 'specific_pages' parameter. If it is left unset, the provided words are redacted on every page.

Can I overlay text on redacted areas in a PDF?

Yes, you can overlay text on redacted areas by using the 'overlay_text' parameter, which allows you to replace the redacted text with a custom message or placeholder.

Chaknith Bin
Software Engineer
Chaknith works on IronXL and IronBarcode. He has deep expertise in C# and .NET, helping improve the software and support customers. His insights from user interactions contribute to better products, documentation, and overall experience.