跳至页脚内容
使用 IRONWORD

如何在 C# 中带格式读取 Word 文档

Microsoft Word documents often contain rich formatting such as fonts, styles, and various elements that make them visually appealing. IronWord is a powerful library from Iron Software that has an intuitive C# and VB.NET Word and Docx Document API. There is no need to install Microsoft Office or Word Interop to build, edit, and export Word documents. IronWord fully supports .NET 8, 7, 6, Framework, Core, and Azure. This means that the library does not require Word installed on the machine and reads the files independently. If you're working with C# and need to read Word documents while preserving their formatting, this tutorial will guide you through the process using the IronWord library.

How to (in C#) Read Word Document With Formatting

  1. Install the IronWord library to read Word documents.
  2. Load 'sample.docx', the input Word document using the WordDocument class from the IronWord library.
  3. Read the paragraphs with formatting using a loaded Word document.
  4. Show the extracted data with format information in the console output.

Prerequisites

  1. Visual Studio: Ensure you have Visual Studio or any other C# development environment installed.
  2. NuGet Package Manager: Make sure you can use NuGet to manage packages in your project

Step 1: Create a New C# Project

Create a new C# console application or use an existing project where you want to read Word documents.

Select the console application template and click next.

How to Read Word Document With Formatting in C#: Figure 1 - Creating a new C# project

Click the 'Next' Button to provide the solution name, project name, and path for the code.

How to Read Word Document With Formatting in C#: Figure 2 - Configuring the new project

Then select the desired .NET version. The best practice is always to select the latest version available, though if your project has specific requirements then use the necessary .NET version.

How to Read Word Document With Formatting in C#: Figure 3 - Choosing the necessary .NET version type

Step 2: Install the IronWord Library

Open your C# project and install the IronWord library using the NuGet Package Manager Console:

Install-Package IronWord

The NuGet package can also be installed using Visual Studio's NuGet Package Manager, as shown below.

How to Read Word Document With Formatting in C#: Figure 4 - Installing IronWord through NuGet package manager

Step 3: Read the Word Document with Formatting

To read a Word file, first, we need to create a new document and then add some content to it as below.

How to Read Word Document With Formatting in C#: Figure 5 - Created sample document

Now save the file to the project directory and change the properties of the file to copy it to the output directory.

How to Read Word Document With Formatting in C#: Figure 6 - What the file properties should look like

Now add the below code snippet to the program.cs file:

using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through each paragraph in the Word document
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through each paragraph in the Word document
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Imports Microsoft.VisualBasic
Imports IronWord

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through each paragraph in the Word document
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract Formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine($vbTab & "Text: {text}")
				End If
			Next paragraph
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class
$vbLabelText   $csharpLabel

The above code reads the Word document using the IronWord library class WordDocument constructor method.

Output

How to Read Word Document With Formatting in C#: Figure 7 - Console output from the previous code

Explanation

  1. Open the Word Document: Load the Word document using WordDocument from IronWord.
  2. Iterate Through Paragraphs and Runs: Use nested loops to iterate through paragraphs and runs. Runs represent portions of text with specific formatting.
  3. Extract Text and Formatting: Extract text content from each run and check for formatting properties. In this example, we've demonstrated how to extract the font size and bold formatting.
  4. Handle Exceptions: A try-and-catch block is used to handle any exceptions and print them.

The loaded file can be used to print documents, we can also change the font color in the style object.

Read Tables from Word Files

We can also read tables from Word documents. Add the code snippet below to the program.

using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");

            // Read Tables
            var tables = sampleDoc.Tables;
            foreach (var table in tables)
            {
                var rows = table.Rows;
                foreach (var row in rows)
                {
                    foreach (var cell in row.Cells)
                    {
                        var contents = cell.Contents;
                        contents.ForEach(x => Console.WriteLine(x));
                        // Print cell contents
                    }                    
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");

            // Read Tables
            var tables = sampleDoc.Tables;
            foreach (var table in tables)
            {
                var rows = table.Rows;
                foreach (var row in rows)
                {
                    foreach (var cell in row.Cells)
                    {
                        var contents = cell.Contents;
                        contents.ForEach(x => Console.WriteLine(x));
                        // Print cell contents
                    }                    
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Imports IronWord

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")

			' Read Tables
			Dim tables = sampleDoc.Tables
			For Each table In tables
				Dim rows = table.Rows
				For Each row In rows
					For Each cell In row.Cells
						Dim contents = cell.Contents
						contents.ForEach(Sub(x) Console.WriteLine(x))
						' Print cell contents
					Next cell
				Next row
			Next table
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class
$vbLabelText   $csharpLabel

Here we are using the Tables property on the WordDocument class to fetch all the tables in the document, then iterate through them and print the contents.

Add Style to Existing Text

We can add new style information to an existing Word document using the IronWord library as shown in the code snippet below.

using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Change the formatting of the text
            var style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].FirstTextRun.Style = style;

            // Save the document with the new style applied
            sampleDoc.SaveAs("sample2.docx");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Change the formatting of the text
            var style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].FirstTextRun.Style = style;

            // Save the document with the new style applied
            sampleDoc.SaveAs("sample2.docx");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through paragraphs
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract Formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine($vbTab & "Text: {text}")
				End If
			Next paragraph

			' Change the formatting of the text
			Dim style = New TextStyle() With {
				.FontFamily = "Caveat",
				.FontSize = 72,
				.TextColor = New IronColor(System.Drawing.Color.Blue),
				.IsBold = True,
				.IsItalic = True,
				.IsUnderline = True,
				.IsSuperscript = False,
				.IsStrikethrough = True,
				.IsSubscript = False
			}
			paragraphs(1).FirstTextRun.Style = style

			' Save the document with the new style applied
			sampleDoc.SaveAs("sample2.docx")
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class
$vbLabelText   $csharpLabel

Here we are creating a TextStyle and adding it to the existing paragraph object.

Adding New Styled Content to the Word document

We can add new content to a loaded Word document as shown in the code snippet below.

using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load Word Document
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract the formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Add TextRun with Style to Paragraph
            TextRun blueTextRun = new TextRun();
            blueTextRun.Text = "Add text using IronWord";
            blueTextRun.Style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].AddTextRun(blueTextRun);

            // Add New Content to the Word file and save
            Paragraph newParagraph = new Paragraph();
            TextRun newTextRun = new TextRun("New Add Information");
            newParagraph.AddTextRun(newTextRun);

            // Configure the text with different styles
            TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
            TextStyle italicStyle = new TextStyle()
            {
                IsItalic = true
            };
            TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
            TextStyle boldStyle = new TextStyle()
            {
                IsBold = true
            };
            TextRun boldText = new TextRun("Bold example sentence.", boldStyle);

            // Add the styled text to the paragraph
            newParagraph.AddTextRun(introText);
            newParagraph.AddTextRun(italicText);
            newParagraph.AddTextRun(boldText);

            // Save the modified document
            sampleDoc.SaveAs("sample2.docx");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load Word Document
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract the formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Add TextRun with Style to Paragraph
            TextRun blueTextRun = new TextRun();
            blueTextRun.Text = "Add text using IronWord";
            blueTextRun.Style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].AddTextRun(blueTextRun);

            // Add New Content to the Word file and save
            Paragraph newParagraph = new Paragraph();
            TextRun newTextRun = new TextRun("New Add Information");
            newParagraph.AddTextRun(newTextRun);

            // Configure the text with different styles
            TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
            TextStyle italicStyle = new TextStyle()
            {
                IsItalic = true
            };
            TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
            TextStyle boldStyle = new TextStyle()
            {
                IsBold = true
            };
            TextRun boldText = new TextRun("Bold example sentence.", boldStyle);

            // Add the styled text to the paragraph
            newParagraph.AddTextRun(introText);
            newParagraph.AddTextRun(italicText);
            newParagraph.AddTextRun(boldText);

            // Save the modified document
            sampleDoc.SaveAs("sample2.docx");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models

Friend Class Program
	Shared Sub Main()
		Try
			' Load Word Document
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through paragraphs
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract the formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine($vbTab & "Text: {text}")
				End If
			Next paragraph

			' Add TextRun with Style to Paragraph
			Dim blueTextRun As New TextRun()
			blueTextRun.Text = "Add text using IronWord"
			blueTextRun.Style = New TextStyle() With {
				.FontFamily = "Caveat",
				.FontSize = 72,
				.TextColor = New IronColor(System.Drawing.Color.Blue),
				.IsBold = True,
				.IsItalic = True,
				.IsUnderline = True,
				.IsSuperscript = False,
				.IsStrikethrough = True,
				.IsSubscript = False
			}
			paragraphs(1).AddTextRun(blueTextRun)

			' Add New Content to the Word file and save
			Dim newParagraph As New Paragraph()
			Dim newTextRun As New TextRun("New Add Information")
			newParagraph.AddTextRun(newTextRun)

			' Configure the text with different styles
			Dim introText As New TextRun("This is an example paragraph with italic and bold styling.")
			Dim italicStyle As New TextStyle() With {.IsItalic = True}
			Dim italicText As New TextRun("Italic example sentence.", italicStyle)
			Dim boldStyle As New TextStyle() With {.IsBold = True}
			Dim boldText As New TextRun("Bold example sentence.", boldStyle)

			' Add the styled text to the paragraph
			newParagraph.AddTextRun(introText)
			newParagraph.AddTextRun(italicText)
			newParagraph.AddTextRun(boldText)

			' Save the modified document
			sampleDoc.SaveAs("sample2.docx")
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class
$vbLabelText   $csharpLabel

Here we are creating new TextRun and Paragraph objects with style information and adding them to the loaded Word document.

Licensing (Free Trial Available)

Obtain your IronWord free trial license key. This key needs to be placed in appsettings.json.

{
    "IronWord.LicenseKey": "IRONWORD.MYLICENSE.KEY.TRIAL"
}

Provide your email to get a trial license. After you submit your email ID, the key will be delivered via email.

How to Read Word Document With Formatting in C#: Figure 8 - Successfully submitted trial form

Conclusion

IronWord provides a convenient way to read Word documents with formatting in C#. Extend the provided code based on your specific requirements and the complexity of the documents you're working with. This tutorial serves as a starting point for integrating IronWord into your C# applications for Word document processing.

常见问题解答

我如何在 C# 中读取具有格式的 Word 文档?

要在 C# 中读取具有格式的 Word 文档,请使用 IronWord 库。首先通过 NuGet 包管理器安装 IronWord。使用 WordDocument 类加载文档,并迭代段落以提取文本和格式详细信息。

设置用于读取 Word 文档的 C# 项目的步骤是什么?

要设置用于读取 Word 文档的 C# 项目,安装 Visual Studio 或其他 C# 开发环境。使用 NuGet 包管理器将 IronWord 添加到您的项目中。使用 WordDocument 类加载 Word 文档以访问其内容。

在 C# 中读取 Word 文档时如何处理异常?

在 C# 中使用 IronWord 读取 Word 文档时,通过在文档处理代码周围实现 try-catch 块来处理异常。这将有助于管理运行时错误并确保应用程序的稳健性。

我可以使用 C# 从 Word 文档中读取表格吗?

可以,您可以在 C# 中使用 IronWord 从 Word 文档中读取表格。通过 WordDocument 类的 Tables 属性访问表格,并根据需要迭代表格数据。

如何在 C# 中修改 Word 文档的文本样式?

通过创建 TextStyle 对象并将其应用于特定的文本运行或段落,在使用 IronWord 的 Word 文档中修改文本样式。这使您可以自定义字体、大小和其他样式属性。

是否可以在 C# 中向 Word 文档添加新内容?

可以,您可以在 C# 中使用 IronWord 向 Word 文档添加新内容。创建 TextRunParagraph 对象,以在保存更改之前向文档添加样式化内容。

如何在 C# 中保存对 Word 文档的修改?

在使用 IronWord 编辑 Word 文档后,通过调用 WordDocument 实例的 Save 方法保存更改。指定文件路径以创建一个包含修改内容的新文档。

在 C# 中处理 Word 文档是否需要安装 Microsoft Office?

不,使用 IronWord 处理 C# 中的 Word 文档无需安装 Microsoft Office。该库独立于 Microsoft Office 运行,允许您直接处理 Word 文件。

哪些 .NET 版本与 Word 处理库兼容?

IronWord 与许多 .NET 版本兼容,包括 .NET 8、7、6、Framework、Core 和 Azure。这确保它满足各种项目要求和环境。

如何获取 C# 中 Word 处理库的试用许可证?

要获取 IronWord 的试用许可证,请访问 Iron Software 网站并提供您的电子邮件地址。您将通过电子邮件收到试用许可证密钥,可以将其添加到 appsettings.json 文件中。

Jordi Bardia
软件工程师
Jordi 最擅长 Python、C# 和 C++,当他不在 Iron Software 利用这些技能时,他就在游戏编程。分享产品测试、产品开发和研究的责任,Jordi 在持续的产品改进中增加了巨大的价值。多样的经验使他面临挑战并保持投入,他表示这是在 Iron Software 工作的最喜欢的方面之一。Jordi 在佛罗里达州迈阿密长大,并在佛罗里达大学学习计算机科学和统计学。