How to Read Word Document With Formatting in C#
Microsoft Word documents often contain rich formatting such as fonts, styles, and various elements that make them visually appealing. IronWord is a powerful library from Iron Software that has an intuitive C# and VB.NET Word and Docx Document API. There is no need to install Microsoft Office or Word Interop to build, edit, and export Word documents. IronWord fully supports .NET 8, 7, 6, Framework, Core, and Azure. This means that the library does not require Word installed on the machine and reads the files independently. If you're working with C# and need to read Word documents while preserving their formatting, this tutorial will guide you through the process using the IronWord library.
How to (in C#) Read Word Document With Formatting
- Install the IronWord library to read Word documents.
- Load 'sample.docx', the input Word document using the
WordDocument
class from the IronWord library. - Read the paragraphs with formatting using a loaded Word document.
- Show the extracted data with format information in the console output.
Prerequisites
- Visual Studio: Ensure you have Visual Studio or any other C# development environment installed.
- NuGet Package Manager: Make sure you can use NuGet to manage packages in your project
Step 1: Create a New C# Project
Create a new C# console application or use an existing project where you want to read Word documents.
Select the console application template and click next.
Click the 'Next' Button to provide the solution name, project name, and path for the code.
Then select the desired .NET version. The best practice is always to select the latest version available, though if your project has specific requirements then use the necessary .NET version.
Step 2: Install the IronWord Library
Open your C# project and install the IronWord library using the NuGet Package Manager Console:
Install-Package IronWord
The NuGet package can also be installed using Visual Studio's NuGet Package Manager, as shown below.
Step 3: Read the Word Document with Formatting
To read a Word file, first, we need to create a new document and then add some content to it as below.
Now save the file to the project directory and change the properties of the file to copy it to the output directory.
Now add the below code snippet to the program.cs
file:
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through each paragraph in the Word document
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through each paragraph in the Word document
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through each paragraph in the Word document
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract Formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
The above code reads the Word document using the IronWord library class WordDocument
constructor method.
Output
Explanation
- Open the Word Document: Load the Word document using
WordDocument
from IronWord. - Iterate Through Paragraphs and Runs: Use nested loops to iterate through paragraphs and runs. Runs represent portions of text with specific formatting.
- Extract Text and Formatting: Extract text content from each run and check for formatting properties. In this example, we've demonstrated how to extract the font size and bold formatting.
- Handle Exceptions: A try-and-catch block is used to handle any exceptions and print them.
The loaded file can be used to print documents, we can also change the font color in the style object.
Read Tables from Word Files
We can also read tables from Word documents. Add the code snippet below to the program.
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
// Read Tables
var tables = sampleDoc.Tables;
foreach (var table in tables)
{
var rows = table.Rows;
foreach (var row in rows)
{
foreach (var cell in row.Cells)
{
var contents = cell.Contents;
contents.ForEach(x => Console.WriteLine(x));
// Print cell contents
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
// Read Tables
var tables = sampleDoc.Tables;
foreach (var table in tables)
{
var rows = table.Rows;
foreach (var row in rows)
{
foreach (var cell in row.Cells)
{
var contents = cell.Contents;
contents.ForEach(x => Console.WriteLine(x));
// Print cell contents
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports IronWord
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
' Read Tables
Dim tables = sampleDoc.Tables
For Each table In tables
Dim rows = table.Rows
For Each row In rows
For Each cell In row.Cells
Dim contents = cell.Contents
contents.ForEach(Sub(x) Console.WriteLine(x))
' Print cell contents
Next cell
Next row
Next table
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are using the Tables
property on the WordDocument
class to fetch all the tables in the document, then iterate through them and print the contents.
Add Style to Existing Text
We can add new style information to an existing Word document using the IronWord library as shown in the code snippet below.
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Change the formatting of the text
var style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].FirstTextRun.Style = style;
// Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Change the formatting of the text
var style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].FirstTextRun.Style = style;
// Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through paragraphs
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract Formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
' Change the formatting of the text
Dim style = New TextStyle() With {
.FontFamily = "Caveat",
.FontSize = 72,
.TextColor = New IronColor(System.Drawing.Color.Blue),
.IsBold = True,
.IsItalic = True,
.IsUnderline = True,
.IsSuperscript = False,
.IsStrikethrough = True,
.IsSubscript = False
}
paragraphs(1).FirstTextRun.Style = style
' Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx")
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are creating a TextStyle
and adding it to the existing paragraph object.
Adding New Styled Content to the Word document
We can add new content to a loaded Word document as shown in the code snippet below.
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load Word Document
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract the formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Add TextRun with Style to Paragraph
TextRun blueTextRun = new TextRun();
blueTextRun.Text = "Add text using IronWord";
blueTextRun.Style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].AddTextRun(blueTextRun);
// Add New Content to the Word file and save
Paragraph newParagraph = new Paragraph();
TextRun newTextRun = new TextRun("New Add Information");
newParagraph.AddTextRun(newTextRun);
// Configure the text with different styles
TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
TextStyle italicStyle = new TextStyle()
{
IsItalic = true
};
TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
TextStyle boldStyle = new TextStyle()
{
IsBold = true
};
TextRun boldText = new TextRun("Bold example sentence.", boldStyle);
// Add the styled text to the paragraph
newParagraph.AddTextRun(introText);
newParagraph.AddTextRun(italicText);
newParagraph.AddTextRun(boldText);
// Save the modified document
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load Word Document
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract the formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Add TextRun with Style to Paragraph
TextRun blueTextRun = new TextRun();
blueTextRun.Text = "Add text using IronWord";
blueTextRun.Style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].AddTextRun(blueTextRun);
// Add New Content to the Word file and save
Paragraph newParagraph = new Paragraph();
TextRun newTextRun = new TextRun("New Add Information");
newParagraph.AddTextRun(newTextRun);
// Configure the text with different styles
TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
TextStyle italicStyle = new TextStyle()
{
IsItalic = true
};
TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
TextStyle boldStyle = new TextStyle()
{
IsBold = true
};
TextRun boldText = new TextRun("Bold example sentence.", boldStyle);
// Add the styled text to the paragraph
newParagraph.AddTextRun(introText);
newParagraph.AddTextRun(italicText);
newParagraph.AddTextRun(boldText);
// Save the modified document
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models
Friend Class Program
Shared Sub Main()
Try
' Load Word Document
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through paragraphs
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract the formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
' Add TextRun with Style to Paragraph
Dim blueTextRun As New TextRun()
blueTextRun.Text = "Add text using IronWord"
blueTextRun.Style = New TextStyle() With {
.FontFamily = "Caveat",
.FontSize = 72,
.TextColor = New IronColor(System.Drawing.Color.Blue),
.IsBold = True,
.IsItalic = True,
.IsUnderline = True,
.IsSuperscript = False,
.IsStrikethrough = True,
.IsSubscript = False
}
paragraphs(1).AddTextRun(blueTextRun)
' Add New Content to the Word file and save
Dim newParagraph As New Paragraph()
Dim newTextRun As New TextRun("New Add Information")
newParagraph.AddTextRun(newTextRun)
' Configure the text with different styles
Dim introText As New TextRun("This is an example paragraph with italic and bold styling.")
Dim italicStyle As New TextStyle() With {.IsItalic = True}
Dim italicText As New TextRun("Italic example sentence.", italicStyle)
Dim boldStyle As New TextStyle() With {.IsBold = True}
Dim boldText As New TextRun("Bold example sentence.", boldStyle)
' Add the styled text to the paragraph
newParagraph.AddTextRun(introText)
newParagraph.AddTextRun(italicText)
newParagraph.AddTextRun(boldText)
' Save the modified document
sampleDoc.SaveAs("sample2.docx")
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are creating new TextRun
and Paragraph
objects with style information and adding them to the loaded Word document.
Licensing (Free Trial Available)
Obtain your IronWord free trial license key. This key needs to be placed in appsettings.json
.
{
"IronWord.LicenseKey": "IRONWORD.MYLICENSE.KEY.TRIAL"
}
Provide your email to get a trial license. After you submit your email ID, the key will be delivered via email.
Conclusion
IronWord provides a convenient way to read Word documents with formatting in C#. Extend the provided code based on your specific requirements and the complexity of the documents you're working with. This tutorial serves as a starting point for integrating IronWord into your C# applications for Word document processing.
Frequently Asked Questions
What is a powerful library for working with Word documents in C#?
IronWord is a powerful library from Iron Software that offers an intuitive C# and VB.NET Word and Docx Document API, allowing users to build, edit, and export Word documents without needing Microsoft Office or Word Interop installed.
How can I read a Word document with formatting in C#?
To read a Word document with formatting using IronWord in C#, first install the IronWord library, load the document using the `WordDocument` class, and then iterate through the paragraphs to extract text and formatting details.
What are the prerequisites for using a Word document processing library in a C# project?
The prerequisites for using IronWord in a C# project include having Visual Studio or another C# development environment installed and the ability to manage packages using NuGet Package Manager.
How do I install a library for processing Word documents in my C# project?
You can install the IronWord library in your C# project using the NuGet Package Manager Console with the command `Install-Package IronWord`. It can also be installed using Visual Studio's NuGet Package Manager.
Can a Word processing library read tables from Word documents?
Yes, IronWord can read tables from Word documents. You can use the `Tables` property on the `WordDocument` class to access and iterate through the tables.
Is it possible to add style to existing text in a Word document using a library?
Yes, you can add style to existing text in a Word document using IronWord by creating a `TextStyle` object and applying it to the desired text run or paragraph.
How can I add new styled content to a Word document using a library?
To add new styled content, create a `TextRun` with style information and add it to a paragraph in the document. You can also create new paragraphs and add them to the loaded document.
What .NET versions does a Word processing library support?
IronWord supports .NET 8, 7, 6, Framework, Core, and Azure, making it versatile for various project requirements.
How can I obtain a trial license for a Word processing library?
You can obtain a free trial license for IronWord by providing your email on the Iron Software website. The trial license key will be delivered via email and needs to be placed in the `appsettings.json` file.
Do I need Microsoft Office installed to use a Word processing library?
No, you do not need Microsoft Office installed to use IronWord, as it reads and processes Word files independently.