How To Read a Word File Using C#

Updated December 24, 2023

Introduction

Microsoft Word is a word processor developed by Microsoft. It was first released on October 25, 1983, for Xenix systems under the name Multi-Tool Word. Later versions were written for a variety of platforms, including IBM PCs running DOS (1983), Apple Macintosh running the Classic Mac OS (1985), AT&T UNIX PC (1985), Atari ST (1988), OS/2 (1989), Microsoft Windows (1989), SCO Unix (1990), macOS (2001), web browsers (2010), iOS (2014), and Android (2015). On Linux, earlier versions of MS Word can be run using Wine.

Commercial editions of Word are licensed as a stand-alone application or as a component of the Microsoft Office suite, which can be purchased with a perpetual license or as part of a Microsoft 365 subscription. In this article, we will use C# to work with a Word document through the Microsoft Interop assemblies, and then look at how IronXL can help us read Excel sheets.

How To Read a Word File using C#

  1. Create a new Visual Studio project.
  2. Install the required library to read Word documents.
  3. Create a new file and load the file into the object.
  4. Process and read the Word document.
  5. Dispose of all the created objects.

What is Interop?

Office Interoperability (Interop) for MS Word can be used to create or open documents (DOC, DOCX, and RTF) from C# or VB.NET programs. In real projects, however, it has several disadvantages.

This section covers common problems you could encounter when using Microsoft Office Interop (Word Automation) from C# or VB.NET.

Examples include:

  • Every client computer used for Word automation must have a Microsoft Word license.
  • The same version of MS Word must be installed on every client computer.
  • When automation is used, Word loads various files and DLLs in the background, consuming a few MB of memory.
  • The MS Word API is accessed through a COM object, so it carries the drawbacks of calling any COM object from managed code (type conversions, a required COM wrapper, poor .NET Framework integration, etc.).
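
The first two points can be checked at run time before any automation is attempted. The following is a minimal sketch, assuming a Windows machine where Word registers the COM ProgID "Word.Application"; the class name WordAvailability is hypothetical and not part of the Interop library.

```csharp
using System;

internal static class WordAvailability
{
    // Returns true when the "Word.Application" COM ProgID is registered,
    // which normally means Microsoft Word is installed on this machine.
    public static bool IsWordInstalled()
    {
        // Type.GetTypeFromProgID returns null when the ProgID cannot be resolved.
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        return wordType != null;
    }

    private static void Main()
    {
        Console.WriteLine(IsWordInstalled()
            ? "Word is available for automation."
            : "Word is not installed; Interop calls will fail.");
    }
}
```

A check like this only confirms that Word is registered. It says nothing about which version is installed or whether it is licensed.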

Creating a New Project in Visual Studio

To use the Interop library, first open Visual Studio and create a .NET project. Any version of Visual Studio will work, although the most recent one is recommended. You can create a Windows Forms application or choose another project template based on your requirements. For simplicity, this article uses a Console Application.

To do this, open Visual Studio, go to the "File" menu and select "New Project". From the various .NET project templates, choose "Console App".

How To Read a Word File Using C#: Figure 1 - Creating a New VS Project in the Console

After that, enter the project's name and location.

How To Read a Word File Using C#: Figure 2 - Configure Project Details

Select a .NET Framework version from the Framework drop-down. This project uses .NET Framework 4.7. Then press the "Create" button.

Once Visual Studio has generated the solution, open the Program.cs file to add the code, then build or run the program.

How To Read a Word File Using C#: Figure 3 - Open Program.cs File

Before we can test any code, the Microsoft.Office.Interop.Word library has to be added to the project.

To install the Interop library, type the following command in the NuGet Package Manager Console:

Install-Package Microsoft.Office.Interop.Word

How To Read a Word File Using C#: Figure 4 - Installing Interop In NuGet Console

Alternatively, we can search for "Interop" in the NuGet Package Manager. From the list of NuGet packages associated with Interop, choose the package that needs to be downloaded.

How To Read a Word File Using C#: Figure 5 - Installing the `Microsoft.Office.Interop.Word` Package through Browsing

Creating a Word Doc using Interop

To use MS Word, we must create an instance of Microsoft.Office.Interop.Word.Application. This instance handles all communication with Word documents. As the C# code snippet below demonstrates, the next step is to create a document instance using the Documents property of the Microsoft.Office.Interop.Word.Application instance we just created.

using System;
using Microsoft.Office.Interop.Word;

internal class Program
{
    static void Main(string[] args)
    {
        try
        {
            // Start a Word instance that hosts the document.
            Microsoft.Office.Interop.Word.Application wordApplication =
                new Microsoft.Office.Interop.Word.Application();

            // Create a blank document and add a paragraph to it.
            Document doc = wordApplication.Documents.Add();
            var paragraph = doc.Paragraphs.Add();
            paragraph.Range.Text = "Hello World";

            // Save in the DOC format, then close the document and quit Word.
            wordApplication.ActiveDocument.SaveAs("D:\\demo.doc", WdSaveFormat.wdFormatDocument);
            doc.Close();
            wordApplication.Quit();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}

The same program in VB.NET:

Imports System
Imports Microsoft.Office.Interop.Word

Friend Class Program
	Shared Sub Main(ByVal args() As String)
		Try
			' Start a Word instance, create a document, and add a paragraph.
			Dim wordApplication As New Microsoft.Office.Interop.Word.Application()
			Dim doc As Document = wordApplication.Documents.Add()
			Dim paragraph = doc.Paragraphs.Add()
			paragraph.Range.Text = "Hello World"
			' Save in the DOC format, then close the document and quit Word.
			wordApplication.ActiveDocument.SaveAs("D:\demo.doc", WdSaveFormat.wdFormatDocument)
			doc.Close()
			wordApplication.Quit()
		Catch ex As Exception
			Console.WriteLine(ex.ToString())
		End Try
	End Sub
End Class

The code above first creates an object for the Interop Word application. Using that object, it adds a new document through the Documents.Add method, then creates a paragraph with the Paragraphs.Add method of the document object.

Next, it assigns text to the paragraph and saves the application's active document, passing the file name with its full path as the first parameter and the Word file format (such as DOC or DOCX) as the second. The Interop library can also read existing Word files, and it supports other formats such as DOCX, DOT, and RTF.
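
Reading follows the same pattern as writing. The sketch below assumes the D:\demo.doc file saved by the previous example exists and that Word is installed. It opens the file read-only, prints each paragraph, and then closes Word and releases the COM wrappers. The WordReader class and its CleanParagraphText helper are illustrative names, not part of the Interop library.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

internal static class WordReader
{
    // Word ends each paragraph's Range.Text with a paragraph mark ('\r');
    // strip it so the text prints cleanly.
    public static string CleanParagraphText(string raw)
    {
        return (raw ?? string.Empty).TrimEnd('\r', '\n');
    }

    private static void Main()
    {
        Application wordApplication = null;
        Document doc = null;
        try
        {
            wordApplication = new Application();

            // Open the existing file in read-only mode.
            doc = wordApplication.Documents.Open(@"D:\demo.doc", ReadOnly: true);

            // Print the text of every paragraph in the document.
            foreach (Paragraph paragraph in doc.Paragraphs)
            {
                Console.WriteLine(CleanParagraphText(paragraph.Range.Text));
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
        finally
        {
            // Close without saving, quit Word, then release the COM wrappers.
            if (doc != null)
            {
                doc.Close(SaveChanges: false);
                Marshal.ReleaseComObject(doc);
            }
            if (wordApplication != null)
            {
                wordApplication.Quit();
                Marshal.ReleaseComObject(wordApplication);
            }
        }
    }
}
```

Putting the cleanup in a finally block is one way to cover the last step of the outline, disposing of all created objects. It helps avoid leaving a hidden Word process running if an exception is thrown while reading.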

IronXL Library: A Substitute for Interop

IronXL is an alternative to Interop for handling Excel sheets in .NET programs. While Microsoft Office Interop requires the Interop assemblies to connect with Excel, IronXL provides a simpler and more efficient way to manipulate Excel files programmatically in .NET environments.

There are various advantages to using IronXL rather than MS Interop. These include:

  • Performance and Resource Efficiency: IronXL is more resource-efficient and performs better than Interop because it is not dependent on the Excel application being installed on the PC.
  • Simplicity and Ease of Use: IronXL offers a simpler API that makes it easier to read, write, and manipulate an Excel file without the hassles associated with MS Interop.
  • Compatibility and Dependency: IronXL does not require the installation of Microsoft Excel on the computer, so it removes dependencies and compatibility issues that could arise with different versions of Excel or Office.
  • Platform Independence: IronXL provides more flexibility and easier deployment across a variety of environments and platforms, in contrast to Interop, which is more closely tied to specific Microsoft Office versions.

Because of its speed, ease of use, and fewer dependencies on third-party software, IronXL is often a better choice for .NET developers who need to work with Excel sheets programmatically. The specifics of the project, the existing infrastructure, and the developer's familiarity with each library may still affect the choice between IronXL and Microsoft Interop, so always consider the requirements of your application when choosing between them. To learn more, see the IronXL Excel library documentation.

Installing the IronXL Library

The code in the next section needs the IronXL library, so install it first. Open the NuGet Package Manager Console and type the following command:

Install-Package IronXL.Excel

How To Read a Word File Using C#: Figure 6 - Installing IronXL Package from NuGet Console

Another option is to search for the "IronXL" package in the NuGet Package Manager. From the list of NuGet packages related to IronXL, select the one to download.

How To Read a Word File Using C#: Figure 7 - Installing the `IronXL.Excel` Package from Browsing

Reading an Excel Sheet using IronXL

IronXL offers .NET developers a more efficient and adaptable way to work with Excel files than Microsoft Interop. It does not require Microsoft Office to be installed on the host computer, and it is quick and easy to integrate. This makes IronXL an effective choice for modern applications that must work with Excel files.

using System;
using System.Linq;
using IronXL;

internal class Program
{
    static void Main(string[] args)
    {
        // Read the Excel file using the Load method.
        WorkBook workbook = WorkBook.Load("data.xlsx");
        WorkSheet sheet = workbook.WorkSheets.First();

        // Display the cell data one by one.
        foreach (var cell in sheet["A1:B10"])
        {
            Console.WriteLine(cell.Text);
        }
    }
}

The same program in VB.NET:

Imports System
Imports System.Linq
Imports IronXL

Module Program
	Sub Main()
		' Read the Excel file using the Load method.
		Dim workbook As WorkBook = IronXL.WorkBook.Load("data.xlsx")
		Dim sheet As WorkSheet = workbook.WorkSheets.First()
		' Display the cell data one by one.
		For Each cell In sheet("A1:B10")
			Console.WriteLine(cell.Text)
		Next cell
	End Sub
End Module

The preceding code uses the WorkBook.Load method, which takes the file location and name as input, to load an existing Excel file into a WorkBook object. Next, WorkSheets.First() selects the first available worksheet. The values are then read by cell address, here the range A1:B10. For additional information on reading Excel files, see the IronXL documentation.

We can also use the same cell address to change values in the worksheet. The Excel document can be saved as an XLSX or XLS file using the SaveAs method provided by the WorkBook object, which saves the whole file in the selected format.
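
As a rough illustration of that write-and-save step, the sketch below assumes the same data.xlsx input as the reading example. The output name data-updated.xlsx and the cell value are hypothetical, and the exact behavior should be confirmed against the IronXL documentation.

```csharp
using System;
using System.Linq;
using IronXL;

internal class Program
{
    private static void Main()
    {
        // Load the workbook and pick the first worksheet, as in the reading example.
        WorkBook workbook = WorkBook.Load("data.xlsx");
        WorkSheet sheet = workbook.WorkSheets.First();

        // Change a value through the same address syntax used for reading.
        sheet["A1"].Value = "Updated value";

        // Save the whole workbook under a new (hypothetical) file name.
        workbook.SaveAs("data-updated.xlsx");

        Console.WriteLine("Saved data-updated.xlsx");
    }
}
```

Saving to a new file name keeps the original workbook untouched, which is convenient while experimenting.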

Conclusion

IronXL is a widely used Excel library for .NET. It is self-contained: it does not depend on other external libraries, and Microsoft Excel does not need to be installed. It also works across a variety of platforms. Unlike the Interop library, IronXL does not require any additional library to parse the file.

IronXL is a comprehensive solution for programmatic work with MS Excel documents. It supports numerous operations, including calculations, sorting strings or numbers, trimming, adding, finding, replacing, merging and unmerging, and saving files. You can also set cell data types and validate spreadsheet data. It makes Excel data easier to handle and simplifies reading from and writing to a file.

IronXL costs $599. Users can optionally pay a one-year subscription fee for software updates and support. For an additional fee, IronXL also provides protection against unauthorized redistribution. For exact pricing details, see the licensing page. To read more about Iron Software products, see the Iron Software website.
