How to Read Word Document With Formatting in C#
Microsoft Word documents often contain rich formatting such as fonts, styles, and various elements that make them visually appealing. IronWord is a powerful library from Iron Software that has an intuitive C# and VB.NET Word and Docx Document API. There is no need to install Microsoft Office or Word Interop to build, edit, and export Word documents. IronWord fully supports .NET 8, 7, 6, Framework, Core, and Azure. This means that the library does not require Word installed on the machine and reads the files independently. If you're working with C# and need to read Word documents while preserving their formatting, this tutorial will guide you through the process using the IronWord library.
How to (in C#) Read Word Document With Formatting
- Install the IronWord library to read Word documents.
- Load 'sample.docx', the input Word document using the
WordDocument
class from the IronWord library. - Read the paragraphs with formatting using a loaded Word document.
- Show the extracted data with format information in the console output.
Prerequisites
- Visual Studio: Ensure you have Visual Studio or any other C# development environment installed.
- NuGet Package Manager: Make sure you can use NuGet to manage packages in your project
Step 1: Create a New C# Project
Create a new C# console application or use an existing project where you want to read Word documents.
Select the console application template and click next.
Click the 'Next' Button to provide the solution name, project name, and path for the code.
Then select the desired .NET version. The best practice is always to select the latest version available, though if your project has specific requirements then use the necessary .NET version.
Step 2: Install the IronWord Library
Open your C# project and install the IronWord library using the NuGet Package Manager Console:
Install-Package IronWord
The NuGet package can also be installed using Visual Studio's NuGet Package Manager, as shown below.
Step 3: Read the Word Document with Formatting
To read a Word file, first, we need to create a new document and then add some content to it as below.
Now save the file to the project directory and change the properties of the file to copy it to the output directory.
Now add the below code snippet to the program.cs
file:
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through each paragraph in the Word document
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through each paragraph in the Word document
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through each paragraph in the Word document
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract Formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
The above code reads the Word document using the IronWord library class WordDocument
constructor method.
Output
Explanation
- Open the Word Document: Load the Word document using
WordDocument
from IronWord. - Iterate Through Paragraphs and Runs: Use nested loops to iterate through paragraphs and runs. Runs represent portions of text with specific formatting.
- Extract Text and Formatting: Extract text content from each run and check for formatting properties. In this example, we've demonstrated how to extract the font size and bold formatting.
- Handle Exceptions: A try-and-catch block is used to handle any exceptions and print them.
The loaded file can be used to print documents, we can also change the font color in the style object.
Read Tables from Word Files
We can also read tables from Word documents. Add the code snippet below to the program.
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
// Read Tables
var tables = sampleDoc.Tables;
foreach (var table in tables)
{
var rows = table.Rows;
foreach (var row in rows)
{
foreach (var cell in row.Cells)
{
var contents = cell.Contents;
contents.ForEach(x => Console.WriteLine(x));
// Print cell contents
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
// Read Tables
var tables = sampleDoc.Tables;
foreach (var table in tables)
{
var rows = table.Rows;
foreach (var row in rows)
{
foreach (var cell in row.Cells)
{
var contents = cell.Contents;
contents.ForEach(x => Console.WriteLine(x));
// Print cell contents
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports IronWord
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
' Read Tables
Dim tables = sampleDoc.Tables
For Each table In tables
Dim rows = table.Rows
For Each row In rows
For Each cell In row.Cells
Dim contents = cell.Contents
contents.ForEach(Sub(x) Console.WriteLine(x))
' Print cell contents
Next cell
Next row
Next table
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are using the Tables
property on the WordDocument
class to fetch all the tables in the document, then iterate through them and print the contents.
Add Style to Existing Text
We can add new style information to an existing Word document using the IronWord library as shown in the code snippet below.
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Change the formatting of the text
var style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].FirstTextRun.Style = style;
// Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load existing docx
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract Formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Change the formatting of the text
var style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].FirstTextRun.Style = style;
// Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models
Friend Class Program
Shared Sub Main()
Try
' Load existing docx
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through paragraphs
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract Formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
' Change the formatting of the text
Dim style = New TextStyle() With {
.FontFamily = "Caveat",
.FontSize = 72,
.TextColor = New IronColor(System.Drawing.Color.Blue),
.IsBold = True,
.IsItalic = True,
.IsUnderline = True,
.IsSuperscript = False,
.IsStrikethrough = True,
.IsSubscript = False
}
paragraphs(1).FirstTextRun.Style = style
' Save the document with the new style applied
sampleDoc.SaveAs("sample2.docx")
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are creating a TextStyle
and adding it to the existing paragraph object.
Adding New Styled Content to the Word document
We can add new content to a loaded Word document as shown in the code snippet below.
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load Word Document
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract the formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Add TextRun with Style to Paragraph
TextRun blueTextRun = new TextRun();
blueTextRun.Text = "Add text using IronWord";
blueTextRun.Style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].AddTextRun(blueTextRun);
// Add New Content to the Word file and save
Paragraph newParagraph = new Paragraph();
TextRun newTextRun = new TextRun("New Add Information");
newParagraph.AddTextRun(newTextRun);
// Configure the text with different styles
TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
TextStyle italicStyle = new TextStyle()
{
IsItalic = true
};
TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
TextStyle boldStyle = new TextStyle()
{
IsBold = true
};
TextRun boldText = new TextRun("Bold example sentence.", boldStyle);
// Add the styled text to the paragraph
newParagraph.AddTextRun(introText);
newParagraph.AddTextRun(italicText);
newParagraph.AddTextRun(boldText);
// Save the modified document
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
using IronWord;
using IronWord.Models;
class Program
{
static void Main()
{
try
{
// Load Word Document
var sampleDoc = new WordDocument("sample.docx");
var paragraphs = sampleDoc.Paragraphs;
// Iterate through paragraphs
foreach (var paragraph in paragraphs)
{
var textRun = paragraph.FirstTextRun;
var text = textRun.Text; // Read text content
// Extract the formatting details if available
if (textRun.Style != null)
{
var fontSize = textRun.Style.FontSize; // Font size
var isBold = textRun.Style.IsBold;
Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
}
else
{
// Print text without formatting details
Console.WriteLine($"\tText: {text}");
}
}
// Add TextRun with Style to Paragraph
TextRun blueTextRun = new TextRun();
blueTextRun.Text = "Add text using IronWord";
blueTextRun.Style = new TextStyle()
{
FontFamily = "Caveat",
FontSize = 72,
TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
IsBold = true,
IsItalic = true,
IsUnderline = true,
IsSuperscript = false,
IsStrikethrough = true,
IsSubscript = false
};
paragraphs[1].AddTextRun(blueTextRun);
// Add New Content to the Word file and save
Paragraph newParagraph = new Paragraph();
TextRun newTextRun = new TextRun("New Add Information");
newParagraph.AddTextRun(newTextRun);
// Configure the text with different styles
TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
TextStyle italicStyle = new TextStyle()
{
IsItalic = true
};
TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
TextStyle boldStyle = new TextStyle()
{
IsBold = true
};
TextRun boldText = new TextRun("Bold example sentence.", boldStyle);
// Add the styled text to the paragraph
newParagraph.AddTextRun(introText);
newParagraph.AddTextRun(italicText);
newParagraph.AddTextRun(boldText);
// Save the modified document
sampleDoc.SaveAs("sample2.docx");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models
Friend Class Program
Shared Sub Main()
Try
' Load Word Document
Dim sampleDoc = New WordDocument("sample.docx")
Dim paragraphs = sampleDoc.Paragraphs
' Iterate through paragraphs
For Each paragraph In paragraphs
Dim textRun = paragraph.FirstTextRun
Dim text = textRun.Text ' Read text content
' Extract the formatting details if available
If textRun.Style IsNot Nothing Then
Dim fontSize = textRun.Style.FontSize ' Font size
Dim isBold = textRun.Style.IsBold
Console.WriteLine($vbTab & "Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
Else
' Print text without formatting details
Console.WriteLine($vbTab & "Text: {text}")
End If
Next paragraph
' Add TextRun with Style to Paragraph
Dim blueTextRun As New TextRun()
blueTextRun.Text = "Add text using IronWord"
blueTextRun.Style = New TextStyle() With {
.FontFamily = "Caveat",
.FontSize = 72,
.TextColor = New IronColor(System.Drawing.Color.Blue),
.IsBold = True,
.IsItalic = True,
.IsUnderline = True,
.IsSuperscript = False,
.IsStrikethrough = True,
.IsSubscript = False
}
paragraphs(1).AddTextRun(blueTextRun)
' Add New Content to the Word file and save
Dim newParagraph As New Paragraph()
Dim newTextRun As New TextRun("New Add Information")
newParagraph.AddTextRun(newTextRun)
' Configure the text with different styles
Dim introText As New TextRun("This is an example paragraph with italic and bold styling.")
Dim italicStyle As New TextStyle() With {.IsItalic = True}
Dim italicText As New TextRun("Italic example sentence.", italicStyle)
Dim boldStyle As New TextStyle() With {.IsBold = True}
Dim boldText As New TextRun("Bold example sentence.", boldStyle)
' Add the styled text to the paragraph
newParagraph.AddTextRun(introText)
newParagraph.AddTextRun(italicText)
newParagraph.AddTextRun(boldText)
' Save the modified document
sampleDoc.SaveAs("sample2.docx")
Catch ex As Exception
Console.WriteLine($"An error occurred: {ex.Message}")
End Try
End Sub
End Class
Here we are creating new TextRun
and Paragraph
objects with style information and adding them to the loaded Word document.
Licensing (Free Trial Available)
Obtain your IronWord free trial license key. This key needs to be placed in appsettings.json
.
{
"IronWord.LicenseKey": "IRONWORD.MYLICENSE.KEY.TRIAL"
}
Provide your email to get a trial license. After you submit your email ID, the key will be delivered via email.
Conclusion
IronWord provides a convenient way to read Word documents with formatting in C#. Extend the provided code based on your specific requirements and the complexity of the documents you're working with. This tutorial serves as a starting point for integrating IronWord into your C# applications for Word document processing.
Frequently Asked Questions
How can I read Word documents with formatting in C#?
To read Word documents with formatting in C#, use the IronWord library. Start by installing IronWord via the NuGet Package Manager. Load the document using the WordDocument
class and iterate through paragraphs to extract text and formatting details.
What are the steps to set up a C# project for reading Word documents?
To set up a C# project for reading Word documents, install Visual Studio or another C# development environment. Use NuGet Package Manager to add IronWord to your project. Load Word documents with the WordDocument
class to access their content.
How do I handle exceptions when reading Word documents in C#?
When reading Word documents in C# using IronWord, handle exceptions by implementing try-catch blocks around your document processing code. This will help manage runtime errors and ensure robust application behavior.
Can I read tables from Word documents using C#?
Yes, you can read tables from Word documents using IronWord in C#. Access the tables through the Tables
property of the WordDocument
class and iterate through the table data as needed.
How can I modify text styles in a Word document using C#?
Modify text styles in a Word document using IronWord by creating a TextStyle
object and applying it to specific text runs or paragraphs. This allows you to customize fonts, sizes, and other styling attributes.
Is it possible to add new content to Word documents in C#?
Yes, you can add new content to Word documents using IronWord in C#. Create TextRun
and Paragraph
objects to add styled content to the document before saving your changes.
How do I save modifications to a Word document in C#?
After editing a Word document using IronWord, save your changes by calling the Save
method on the WordDocument
instance. Specify the file path to create a new document with the applied modifications.
Do I need Microsoft Office installed to process Word documents in C#?
No, you do not need Microsoft Office installed to process Word documents in C# using IronWord. The library functions independently of Microsoft Office, allowing you to work with Word files directly.
What .NET versions are compatible with a Word processing library?
IronWord is compatible with a wide range of .NET versions, including .NET 8, 7, 6, Framework, Core, and Azure. This ensures it meets various project requirements and environments.
How can I obtain a trial license for a Word processing library in C#?
To obtain a trial license for IronWord, visit the Iron Software website and provide your email address. You will receive a trial license key by email, which you can add to your appsettings.json
file.