How to Read a Word Document With Formatting in C#

Microsoft Word documents often contain rich formatting such as fonts, styles, and other elements that make them visually appealing. IronWord is a library from Iron Software with an intuitive C# and VB.NET API for Word and DOCX documents. It reads files independently, so you do not need Microsoft Office or Word Interop installed to build, edit, and export Word documents. IronWord supports .NET 8, 7, 6, Framework, Core, and Azure. If you're working with C# and need to read Word documents while preserving their formatting, this tutorial guides you through the process using IronWord.

Steps to Read a Word Document With Formatting in C#

  1. Install the IronWord library to read Word documents.
  2. Load the input Word document, 'sample.docx', using the WordDocument class from the IronWord library.
  3. Read the paragraphs and their formatting from the loaded Word document.
  4. Print the extracted text and formatting information to the console.

Prerequisites

  1. Visual Studio: Ensure you have Visual Studio or any other C# development environment installed.
  2. NuGet Package Manager: Make sure you can use NuGet to manage packages in your project.

Step 1: Create a New C# Project

Create a new C# console application or use an existing project where you want to read Word documents.

Select the console application template and click 'Next'.

Figure 1: Creating a new C# project

On the next screen, provide the solution name, project name, and path for the code.

Figure 2: Configuring the new project

Then select the desired .NET version. The best practice is to select the latest version available, unless your project has specific requirements that call for a particular .NET version.

Figure 3: Choosing the necessary .NET version type

Step 2: Install the IronWord Library

Open your C# project and install the IronWord library using the NuGet Package Manager Console:

Install-Package IronWord

The NuGet package can also be installed using Visual Studio's NuGet Package Manager, as shown below.

Figure 4: Installing IronWord through NuGet package manager

Step 3: Read the Word Document with Formatting

To have a file to read, first create a new Word document and add some content to it, as shown below.

Figure 5: Created sample document

Now save the file as 'sample.docx' in the project directory and, in the file's properties, set 'Copy to Output Directory' so that it is copied to the output directory.

Figure 6: What the file properties should look like

Now add the code snippet below to the Program.cs file:

using System;
using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through each paragraph in the Word document
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}

The equivalent VB.NET code:

Imports Microsoft.VisualBasic
Imports IronWord

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through each paragraph in the Word document
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract Formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine(vbTab & $"Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine(vbTab & $"Text: {text}")
				End If
			Next paragraph
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class

The above code loads the Word document by passing its path to the WordDocument constructor from the IronWord library.

Output

Figure 7: Console output from the previous code

Explanation

  1. Open the Word Document: Load the Word document using WordDocument from IronWord.
  2. Iterate Through Paragraphs and Runs: Loop over the document's paragraphs and read the first text run of each one. Runs represent portions of text with specific formatting.
  3. Extract Text and Formatting: Extract the text content from each run and check for formatting properties. This example extracts the font size and bold formatting.
  4. Handle Exceptions: A try-catch block handles any exceptions and prints their messages.

The loaded document can also be used for printing, and the font color can be changed through the style object.
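
The reading loop above takes for granted that every paragraph has a first text run. The following is a minimal sketch of the same loop with a guard, assuming that FirstTextRun can be null for an empty paragraph (that behavior is an assumption, not something confirmed in this tutorial). It uses only the members already shown here (WordDocument, Paragraphs, FirstTextRun, Text, Style, FontSize, IsBold); the class name NullSafeReader is hypothetical.

```csharp
using System;
using IronWord;

class NullSafeReader
{
    static void Main()
    {
        // Load the same sample document used throughout this tutorial
        var doc = new WordDocument("sample.docx");
        int index = 0;

        foreach (var paragraph in doc.Paragraphs)
        {
            index++;
            var textRun = paragraph.FirstTextRun;

            // Assumption: an empty paragraph may expose no text run at all
            if (textRun == null)
            {
                Console.WriteLine($"Paragraph {index}: (no text run)");
                continue;
            }

            var style = textRun.Style;

            // Fall back to "n/a" when no style object is attached to the run
            string size = style != null ? $"{style.FontSize}" : "n/a";
            string bold = style != null ? $"{style.IsBold}" : "n/a";

            Console.WriteLine($"Paragraph {index}: \"{textRun.Text}\" (FontSize: {size}, Bold: {bold})");
        }
    }
}
```

The try-catch block from the original snippet is left out to keep the sketch short; wrap the body of Main in one if the file might be missing.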

Read Tables from Word Files

We can also read tables from Word documents. Add the code snippet below to the program.

using System;
using IronWord;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");

            // Read Tables
            var tables = sampleDoc.Tables;
            foreach (var table in tables)
            {
                var rows = table.Rows;
                foreach (var row in rows)
                {
                    foreach (var cell in row.Cells)
                    {
                        // Print cell contents
                        var contents = cell.Contents;
                        contents.ForEach(x => Console.WriteLine(x));
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}

The equivalent VB.NET code:

Imports IronWord

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")

			' Read Tables
			Dim tables = sampleDoc.Tables
			For Each table In tables
				Dim rows = table.Rows
				For Each row In rows
					For Each cell In row.Cells
						Dim contents = cell.Contents
						contents.ForEach(Sub(x) Console.WriteLine(x))
						' Print cell contents
					Next cell
				Next row
			Next table
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class

Here we use the Tables property of the WordDocument class to fetch all the tables in the document, then iterate through their rows and cells and print the contents of each cell.
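
Printing every content item on its own line loses the row structure. As a rough sketch, the same traversal can collect each row into a single line. It relies only on the members used above (Tables, Rows, Cells, Contents) and, like the original snippet, on whatever each content item's ToString() returns, so the exact text printed depends on the IronWord version. The class name TableRowPrinter and the " | " separator are illustrative choices.

```csharp
using System;
using System.Collections.Generic;
using IronWord;

class TableRowPrinter
{
    static void Main()
    {
        var doc = new WordDocument("sample.docx");
        int tableNumber = 0;

        foreach (var table in doc.Tables)
        {
            tableNumber++;
            Console.WriteLine($"Table {tableNumber}:");

            foreach (var row in table.Rows)
            {
                var cellTexts = new List<string>();

                foreach (var cell in row.Cells)
                {
                    var parts = new List<string>();
                    foreach (var item in cell.Contents)
                    {
                        // Uses the item's string representation, as the original snippet does
                        parts.Add($"{item}");
                    }

                    // One string per cell, with its content items separated by spaces
                    cellTexts.Add(string.Join(" ", parts));
                }

                // One output line per table row
                Console.WriteLine("  " + string.Join(" | ", cellTexts));
            }
        }
    }
}
```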

Add Style to Existing Text

We can add new style information to an existing Word document using the IronWord library as shown in the code snippet below.

using System;
using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load existing docx
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract Formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Change the formatting of the text
            var style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].FirstTextRun.Style = style;

            // Save the document with the new style applied
            sampleDoc.SaveAs("sample2.docx");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}

The equivalent VB.NET code:

Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models

Friend Class Program
	Shared Sub Main()
		Try
			' Load existing docx
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through paragraphs
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract Formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine(vbTab & $"Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine(vbTab & $"Text: {text}")
				End If
			Next paragraph

			' Change the formatting of the text
			Dim style = New TextStyle() With {
				.FontFamily = "Caveat",
				.FontSize = 72,
				.TextColor = New IronColor(System.Drawing.Color.Blue),
				.IsBold = True,
				.IsItalic = True,
				.IsUnderline = True,
				.IsSuperscript = False,
				.IsStrikethrough = True,
				.IsSubscript = False
			}
			paragraphs(1).FirstTextRun.Style = style

			' Save the document with the new style applied
			sampleDoc.SaveAs("sample2.docx")
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class

Here we create a TextStyle and assign it to the first text run of the second paragraph (index 1) in the loaded document, then save the result as 'sample2.docx'.
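
Indexing the second paragraph directly fails if the document has fewer than two paragraphs. One way to guard against that, sketched here using only members already shown in this tutorial, is to walk the paragraphs and apply the style only when the target exists. The class name SafeStyler and the reduced style (bold and italic only) are illustrative, and the null check on FirstTextRun rests on the assumption that an empty paragraph may have no text run.

```csharp
using System;
using IronWord;
using IronWord.Models;

class SafeStyler
{
    static void Main()
    {
        var doc = new WordDocument("sample.docx");
        const int targetIndex = 1; // second paragraph, as in the snippet above
        int i = 0;
        bool applied = false;

        foreach (var paragraph in doc.Paragraphs)
        {
            if (i == targetIndex)
            {
                var textRun = paragraph.FirstTextRun;

                // Assumption: FirstTextRun may be null for an empty paragraph
                if (textRun != null)
                {
                    textRun.Style = new TextStyle() { IsBold = true, IsItalic = true };
                    applied = true;
                }
                break;
            }
            i++;
        }

        if (applied)
        {
            // Save a copy only when the style was actually applied
            doc.SaveAs("sample2.docx");
            Console.WriteLine("Style applied and saved to sample2.docx");
        }
        else
        {
            Console.WriteLine("Target paragraph not found; nothing was saved.");
        }
    }
}
```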

Add New Styled Content to the Word Document

We can add new content to a loaded Word document as shown in the code snippet below.

using System;
using IronWord;
using IronWord.Models;

class Program
{
    static void Main()
    {
        try
        {
            // Load Word Document
            var sampleDoc = new WordDocument("sample.docx");
            var paragraphs = sampleDoc.Paragraphs;

            // Iterate through paragraphs
            foreach (var paragraph in paragraphs)
            {
                var textRun = paragraph.FirstTextRun;
                var text = textRun.Text; // Read text content

                // Extract the formatting details if available
                if (textRun.Style != null)
                {
                    var fontSize = textRun.Style.FontSize; // Font size
                    var isBold = textRun.Style.IsBold;
                    Console.WriteLine($"\tText: {text}, FontSize: {fontSize}, Bold: {isBold}");
                }
                else
                {
                    // Print text without formatting details
                    Console.WriteLine($"\tText: {text}");
                }
            }

            // Add TextRun with Style to Paragraph
            TextRun blueTextRun = new TextRun();
            blueTextRun.Text = "Add text using IronWord";
            blueTextRun.Style = new TextStyle()
            {
                FontFamily = "Caveat",
                FontSize = 72,
                TextColor = new IronColor(System.Drawing.Color.Blue), // Blue color
                IsBold = true,
                IsItalic = true,
                IsUnderline = true,
                IsSuperscript = false,
                IsStrikethrough = true,
                IsSubscript = false
            };
            paragraphs[1].AddTextRun(blueTextRun);

            // Add New Content to the Word file and save
            Paragraph newParagraph = new Paragraph();
            TextRun newTextRun = new TextRun("New Add Information");
            newParagraph.AddTextRun(newTextRun);

            // Configure the text with different styles
            TextRun introText = new TextRun("This is an example paragraph with italic and bold styling.");
            TextStyle italicStyle = new TextStyle()
            {
                IsItalic = true
            };
            TextRun italicText = new TextRun("Italic example sentence.", italicStyle);
            TextStyle boldStyle = new TextStyle()
            {
                IsBold = true
            };
            TextRun boldText = new TextRun("Bold example sentence.", boldStyle);

            // Add the styled text to the paragraph
            newParagraph.AddTextRun(introText);
            newParagraph.AddTextRun(italicText);
            newParagraph.AddTextRun(boldText);

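            // Add the new paragraph to the document so it is included in the saved file
            // (assumes WordDocument.AddParagraph is available in your IronWord version)
            sampleDoc.AddParagraph(newParagraph);

            // Save the modified document
            sampleDoc.SaveAs("sample2.docx");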
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}

The equivalent VB.NET code:

Imports Microsoft.VisualBasic
Imports IronWord
Imports IronWord.Models

Friend Class Program
	Shared Sub Main()
		Try
			' Load Word Document
			Dim sampleDoc = New WordDocument("sample.docx")
			Dim paragraphs = sampleDoc.Paragraphs

			' Iterate through paragraphs
			For Each paragraph In paragraphs
				Dim textRun = paragraph.FirstTextRun
				Dim text = textRun.Text ' Read text content

				' Extract the formatting details if available
				If textRun.Style IsNot Nothing Then
					Dim fontSize = textRun.Style.FontSize ' Font size
					Dim isBold = textRun.Style.IsBold
					Console.WriteLine(vbTab & $"Text: {text}, FontSize: {fontSize}, Bold: {isBold}")
				Else
					' Print text without formatting details
					Console.WriteLine(vbTab & $"Text: {text}")
				End If
			Next paragraph

			' Add TextRun with Style to Paragraph
			Dim blueTextRun As New TextRun()
			blueTextRun.Text = "Add text using IronWord"
			blueTextRun.Style = New TextStyle() With {
				.FontFamily = "Caveat",
				.FontSize = 72,
				.TextColor = New IronColor(System.Drawing.Color.Blue),
				.IsBold = True,
				.IsItalic = True,
				.IsUnderline = True,
				.IsSuperscript = False,
				.IsStrikethrough = True,
				.IsSubscript = False
			}
			paragraphs(1).AddTextRun(blueTextRun)

			' Add New Content to the Word file and save
			Dim newParagraph As New Paragraph()
			Dim newTextRun As New TextRun("New Add Information")
			newParagraph.AddTextRun(newTextRun)

			' Configure the text with different styles
			Dim introText As New TextRun("This is an example paragraph with italic and bold styling.")
			Dim italicStyle As New TextStyle() With {.IsItalic = True}
			Dim italicText As New TextRun("Italic example sentence.", italicStyle)
			Dim boldStyle As New TextStyle() With {.IsBold = True}
			Dim boldText As New TextRun("Bold example sentence.", boldStyle)

			' Add the styled text to the paragraph
			newParagraph.AddTextRun(introText)
			newParagraph.AddTextRun(italicText)
			newParagraph.AddTextRun(boldText)

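			' Add the new paragraph to the document so it is included in the saved file
			' (assumes WordDocument.AddParagraph is available in your IronWord version)
			sampleDoc.AddParagraph(newParagraph)

			' Save the modified document
			sampleDoc.SaveAs("sample2.docx")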
		Catch ex As Exception
			Console.WriteLine($"An error occurred: {ex.Message}")
		End Try
	End Sub
End Class

Here we create new TextRun and Paragraph objects with style information and add them to the loaded Word document.

Licensing (Free Trial Available)

Obtain your IronWord free trial license key. This key needs to be placed in appsettings.json.

{
    "IronWord.LicenseKey": "IRONWORD.MYLICENSE.KEY.TRIAL"
}
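
If you prefer not to keep the key in a configuration file, the key can likely also be applied in code at startup. The sketch below assumes IronWord exposes a static License.LicenseKey property, as other Iron Software libraries do; treat that as an assumption and check it against your installed version. The environment variable name IRONWORD_LICENSE_KEY and the class name LicenseSetup are hypothetical.

```csharp
using System;

class LicenseSetup
{
    static void Main()
    {
        // Hypothetical environment variable name, chosen for this example only
        var key = Environment.GetEnvironmentVariable("IRONWORD_LICENSE_KEY");

        if (string.IsNullOrWhiteSpace(key))
        {
            Console.WriteLine("No license key found in IRONWORD_LICENSE_KEY.");
            return;
        }

        // Assumption: IronWord exposes a static License.LicenseKey property
        IronWord.License.LicenseKey = key;
        Console.WriteLine("License key applied.");
    }
}
```

Set the key before creating any WordDocument instance so that it is in place when the library is first used.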

Provide your email address to get a trial license. After you submit it, the key will be delivered by email.

Figure 8: Successfully submitted trial form

Conclusion

IronWord provides a convenient way to read Word documents with formatting in C#. Extend the provided code based on your specific requirements and the complexity of the documents you're working with. This tutorial serves as a starting point for integrating IronWord into your C# applications for Word document processing.

Frequently Asked Questions

How can I read Word documents with formatting in C#?

To read Word documents with formatting in C#, use the IronWord library. Start by installing IronWord through the NuGet Package Manager. Load the document using the WordDocument class and iterate through the paragraphs to extract text and formatting details.

What are the steps to set up a C# project to read Word documents?

To set up a C# project to read Word documents, install Visual Studio or another C# development environment. Use the NuGet Package Manager to add IronWord to your project. Load Word documents with the WordDocument class to access their content.

How do I handle exceptions when reading Word documents in C#?

When reading Word documents in C# using IronWord, handle exceptions by wrapping your document-processing code in try-catch blocks. This helps you handle runtime errors and keeps the application's behavior robust.

Can I read tables from Word documents using C#?

Yes, you can read tables from Word documents using IronWord in C#. Access the tables through the Tables property of the WordDocument class and iterate through the table data as needed.

How can I modify text styles in a Word document using C#?

Modify text styles in a Word document using IronWord by creating a TextStyle object and applying it to specific text runs or paragraphs. This lets you customize fonts, sizes, and other style attributes.

Is it possible to add new content to Word documents in C#?

Yes, you can add new content to Word documents using IronWord in C#. Create TextRun and Paragraph objects to add styled content to the document before saving your changes.

How do I save modifications to a Word document in C#?

After editing a Word document using IronWord, save your changes by calling the SaveAs method on the WordDocument instance. Specify the file path to create a new document with the modifications applied.

Do I need Microsoft Office installed to process Word documents in C#?

No, you do not need Microsoft Office installed to process Word documents in C# using IronWord. The library works independently of Microsoft Office, letting you work with Word files directly.

Which .NET versions are supported by a Word processing library?

IronWord supports a wide range of .NET versions, including .NET 8, 7, 6, Framework, Core, and Azure. This helps it fit a variety of project requirements and environments.

How can I get a trial license for a Word processing library in C#?

To get a trial license for IronWord, visit the Iron Software website and provide your email address. You will receive a trial license key by email, which you can add to your appsettings.json file.

Jordi Bardia
Software Engineer
Jordi is most proficient in Python, C#, and C++. When he isn't applying his skills at Iron Software, he is programming games. Sharing responsibilities for product testing, product development, and research, Jordi adds immense value to continual product improvement. The varied experience keeps him ...