using IronWord;
using System.Linq;
using System;

// Load docx
WordDocument doc = new WordDocument("multi-paragraph.docx");

// Returns text from the entire file
Console.WriteLine(doc.ExtractText());

// Returns text from the first paragraph only
Console.WriteLine(doc.Paragraphs[0].ExtractText());

// Returns text from the last paragraph only
Console.WriteLine(doc.Paragraphs.Last().ExtractText());

Extract Text

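Extracting text from a large document by hand, especially one containing tables and many paragraphs, is slow and tedious. IronWord's ExtractText method saves that effort: it returns all of the text in a given part of a document in a single call, so you do not need to write extra loops or read each Text property yourself.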

This example shows several ways to use the ExtractText method to retrieve text from a document more efficiently.

A Practical Way to Extract Text from a Docx File

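using IronWord;
using System;       // Console
using System.Linq;  // Last()
WordDocument doc = new WordDocument("multi-paragraph.docx");
// Text of the entire document
Console.WriteLine(doc.ExtractText());
// Text of the first paragraph only
Console.WriteLine(doc.Paragraphs[0].ExtractText());
// Text of the last paragraph only
Console.WriteLine(doc.Paragraphs.Last().ExtractText());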

Extracting Text

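Extracting text from a Word document with the IronWord library is straightforward. We first import the library and instantiate the WordDocument class, which loads an existing document containing paragraphs. We then call the ExtractText method and print the document's full text to the console.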

Extracting Specific Text

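The example above extracts the text of the entire document, but IronWord also lets you control what gets extracted. To get only a specific section or paragraph, use the Paragraphs property of WordDocument, which returns the document's paragraphs as a collection. Because it is a generic list, you can work with it as needed, either by index, as with doc.Paragraphs[0] above, or through the built-in C# collection methods.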

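Indexing into Paragraphs with [0] returns only the first paragraph, whose text we extract and print to the console. We then call the Last method to return and extract the text of only the last paragraph in the document.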
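
Beyond indexing and Last, the same collection methods can select a range of paragraphs. The following is a minimal sketch, not part of the IronWord API: ExtractRange is a hypothetical helper built on the standard LINQ methods Skip, Take, and Select, and it runs against stand-in strings so that it works without a .docx file. The commented-out lines show how it would presumably be called with IronWord, assuming that Paragraphs is enumerable (as the use of Last above suggests) and that ExtractText returns a string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParagraphSlice
{
    // Hypothetical helper (not part of IronWord): extracts text from `count`
    // items starting at zero-based position `start` and joins the results
    // with newlines.
    public static string ExtractRange<T>(
        IEnumerable<T> paragraphs, Func<T, string> extract, int start, int count)
    {
        return string.Join("\n", paragraphs.Skip(start).Take(count).Select(extract));
    }

    public static void Main()
    {
        // Stand-in data so the sketch runs without a .docx file.
        var sample = new List<string> { "Intro", "Body one", "Body two", "Outro" };

        // Prints the second and third entries, one per line.
        Console.WriteLine(ExtractRange(sample, s => s, 1, 2));

        // With IronWord, the equivalent call would presumably be (assumption):
        // var doc = new IronWord.WordDocument("multi-paragraph.docx");
        // Console.WriteLine(ExtractRange(doc.Paragraphs, p => p.ExtractText(), 1, 2));
    }
}
```

Skip and Take do not throw when the range runs past the end of the collection; they simply return fewer items, so a short document yields a shorter result rather than an exception.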

Explore the advanced text extraction features of the IronWord API