How to Redact Text in PDFs Using IronSecureDoc

Chaknith Bin

October 20, 2024

Updated December 17, 2024

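This article explains how to redact text in PDFs using IronSecureDoc. A service or process can redact sensitive information quickly by sending a simple POST request, with the PDF attached, to a running IronSecureDoc server. The walkthrough uses the Swagger documentation to demonstrate this visually. The POST request accepts both required and optional parameters and is highly customizable; the response returns the PDF with the text redacted.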

How to Redact Text in PDFs Using IronSecureDoc

Pull and start the IronSecureDoc Docker image
Test the API using Swagger
Specify the text to redact
Run the API call with the provided details
Export the redacted PDF document

Pull and Start IronSecureDoc

If you are not yet running IronSecureDoc, set it up using one of the links below:

Host Locally	Deploy to Cloud
Hosting on Windows, Hosting on Mac, Hosting on Linux	Deploy on Azure Container, Deploy on AWS Container

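[POST] Redact Text API

The [POST] Redact Text API endpoint hides sensitive text in a PDF document by redacting it. This capability is essential for applications that handle confidential documents such as legal contracts, medical records, and financial reports. With this API, the specified text is permanently removed, which strengthens security and helps meet data protection standards.

Please note:

Once text has been redacted, its content cannot be recovered.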

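Swagger

Swagger is a powerful tool that lets developers interact with RESTful APIs through a user-friendly web interface. Whether you work in Python, Java, or another language, Swagger offers a convenient way to test and implement this API.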

Steps to Redact Text with Swagger

Access the Swagger UI:

If the API server is running locally, open http://localhost:8080/swagger/index.html in a web browser to access Swagger.
Find the [POST] Redact Text API:

In the Swagger UI, locate the [POST] /v1/document-services/pdfs/redact-text endpoint.
Specify the settings:

In this example, the POST request provides both the PDF file and the word to redact. We redact the word "we" and draw a black box over it. This demonstration uses the file 'sample.pdf' with the following configuration:
- draw_black_box: true
- match_whole_word: true
- words_to_redact: we
Upload the sample PDF:

In the request body, upload the sample PDF file you want to redact. Make sure the file is added as pdf_file.
Execute the request:

Click "Execute" to run the request. The response contains the redacted PDF. Working through the Swagger UI makes it easy to test the redaction process and get immediate feedback.

Using a curl Request from the Command Prompt

Alternatively, you can get the same result by sending a curl POST request from the command prompt.

curl -X POST 'http://localhost:8080/v1/document-services/pdfs/redact-text' \
 -H 'accept: */*' \
 -H 'Content-Type: multipart/form-data' \
 -F 'pdf_file=@sample.pdf;type=application/pdf' \
 -F 'words_to_redact="we"' \
 -F 'draw_black_box=true' \
 -F 'match_whole_word=true'


Note: By default, PowerShell may interpret curl as an alias for Invoke-WebRequest, a built-in PowerShell cmdlet. If that happens, try curl.exe instead of curl, for example:

curl.exe --version
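For services written in Python, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper names `encode_multipart` and `redact_text` are made up for illustration, the endpoint URL and field names are taken from the curl example above, and sending each word as its own `words_to_redact` field is an assumption about how the server reads the array.

```python
import os
import urllib.request
import uuid

# Endpoint taken from the curl example; adjust host and port for your server.
REDACT_URL = "http://localhost:8080/v1/document-services/pdfs/redact-text"


def encode_multipart(fields, pdf_name, pdf_bytes, boundary=None):
    """Build a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain text fields such as words_to_redact or draw_black_box.
    for name, value in fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # The PDF itself, sent under the field name pdf_file.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="pdf_file"; '
        f'filename="{pdf_name}"\r\nContent-Type: application/pdf\r\n\r\n'.encode("utf-8")
        + pdf_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def redact_text(pdf_path, out_path, words, url=REDACT_URL):
    """POST the PDF for redaction and write the returned PDF to out_path."""
    # Assumption: one words_to_redact field per word.
    fields = [("words_to_redact", word) for word in words]
    fields += [("draw_black_box", "true"), ("match_whole_word", "true")]
    with open(pdf_path, "rb") as pdf:
        body, content_type = encode_multipart(
            fields, os.path.basename(pdf_path), pdf.read()
        )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "accept": "*/*"},
    )
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

With a server running locally, calling `redact_text("sample.pdf", "redacted.pdf", ["we"])` would mirror the curl command and save the response as `redacted.pdf`. Because redaction cannot be undone, writing to a separate output path keeps the original file intact.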

Required Request Body Parameters

Name	Data Type	Description
pdf_file	application/pdf	The PDF file you want to manipulate.
words_to_redact	array[string]	This parameter takes a list of words and redacts the text matching the input.
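Since both parameters are required, a client may want to check its inputs before contacting the server. The sketch below is one hypothetical way to do that in Python; `check_required` is an illustrative name, the `%PDF-` signature test is only a rough client-side heuristic, and treating an empty word list as an error is an assumption rather than documented server behavior.

```python
def check_required(pdf_bytes, words):
    """Return a list of problems with the required inputs (empty if none)."""
    problems = []
    # PDF files normally begin with the "%PDF-" signature.
    if not pdf_bytes.startswith(b"%PDF-"):
        problems.append("pdf_file does not look like a PDF")
    # Assumption: at least one non-blank word is needed for a useful request.
    if not [word for word in words if word.strip()]:
        problems.append("words_to_redact needs at least one non-empty word")
    return problems
```

An empty result means the request is worth sending; otherwise the messages can be shown to the caller without a round trip to the server.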

Optional Request Body Parameters

Name	Data Type	Description
user_password	string	This is required if the input PDF has a user password. The operation will fail if no password is provided for the password-protected PDF.
owner_password	string	This is required if the input PDF has an owner password. The operation will fail if no password is provided for the password-protected PDF.
specific_pages	array[int]	Allows you to specify which pages to redact text on. By default, the value is null, meaning the provided word in all the pages will be redacted.
draw_black_box	boolean	Allows you to specify whether to draw a black box over the redacted text. By default, this value is set to True.
match_whole_word	boolean	Specifies whether a match must be a whole word. When partial matches are redacted as well, a provided word such as "are" is also redacted inside longer words such as "hare". By default, this is set to True.
match_case	boolean	Specifies whether matching is case-sensitive. By default, this value is null. Note: When set to True, text that differs only in case is not matched. For example, if the provided word is "WE," the lowercase "we" is not redacted.
overlay_text	string	It specifies the overlay text, such as words or symbols, over the redacted text. By default, this string is empty.
save_as_pdfa	boolean	Saves the modified PDF with PDF/A-3 compliance. By default, this is set to False.
save_as_pdfua	boolean	Saves the modified PDF with PDF/UA compliance. By default, this is set to False.
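To show how the optional parameters combine with the required ones, here is a hedged Python sketch that turns them into a list of form fields. The function name `build_form_fields` is hypothetical. Two points are assumptions: array values (`words_to_redact`, `specific_pages`) are sent as repeated fields, and booleans are sent as lowercase `true`/`false` as in the curl example. The table does not say whether page numbers start at 0 or 1, so the values used here are placeholders.

```python
def build_form_fields(words, specific_pages=None, draw_black_box=True,
                      match_whole_word=True, match_case=None,
                      overlay_text="", user_password=None,
                      owner_password=None):
    """Return (name, value) form fields for the redact-text request."""

    def as_flag(value):
        # Lowercase booleans, matching the curl example.
        return "true" if value else "false"

    # Assumption: arrays are encoded as one field per element.
    fields = [("words_to_redact", word) for word in words]
    fields.append(("draw_black_box", as_flag(draw_black_box)))
    fields.append(("match_whole_word", as_flag(match_whole_word)))
    # Parameters left at None or empty are omitted so the server
    # falls back to its documented defaults.
    if match_case is not None:
        fields.append(("match_case", as_flag(match_case)))
    for page in specific_pages or []:
        fields.append(("specific_pages", str(page)))
    if overlay_text:
        fields.append(("overlay_text", overlay_text))
    if user_password:
        fields.append(("user_password", user_password))
    if owner_password:
        fields.append(("owner_password", owner_password))
    return fields
```

For example, `build_form_fields(["we"], specific_pages=[1, 3], overlay_text="[REDACTED]")` yields the two required-style fields for the word and flags, followed by two `specific_pages` fields and one `overlay_text` field. The list can be handed to any HTTP client that accepts form fields alongside the `pdf_file` upload.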

Optional Header Parameters

Name	Data Type	Description
author	string	Useful for identifying you as the author of the PDF document. By default, this field is empty.
title	string	Displays the title of the PDF document. By default, this field is empty.
subject	string	Useful for identifying the content of the PDF document at a glance. By default, this field is empty.
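Unlike the body parameters, these three values travel as HTTP request headers. A small Python sketch of assembling them follows; `build_metadata_headers` is an illustrative name, and rejecting line breaks is a defensive choice made here (to avoid malformed headers), not a documented requirement of the API.

```python
def build_metadata_headers(author=None, title=None, subject=None):
    """Return a dict of optional metadata headers, skipping unset values."""
    candidates = {"author": author, "title": title, "subject": subject}
    headers = {}
    for name, value in candidates.items():
        if not value:
            # Leave the header out so the field stays empty by default.
            continue
        # Line breaks inside a header value would corrupt the request.
        if "\r" in value or "\n" in value:
            raise ValueError(f"{name} must not contain line breaks")
        headers[name] = value
    return headers
```

The resulting dict can be merged into the headers of whichever HTTP client sends the request. With curl, the equivalent is an extra `-H 'author: Jane Doe'` argument per header.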