How to Generate Word Document from Excel data in Python

Published December 15, 2024

Introduction

Generating Word documents from Excel spreadsheets with Python automates a task that is slow and error-prone when done by hand. Because the data is copied programmatically, the Word output stays consistent with the source spreadsheet, and the layout can be customized in code. Two libraries are used here: the IronXL Python package from Iron Software, which reads the Excel data, and python-docx, which writes the Word document.

This article will examine the steps required to generate Word documents from Excel files.

How to Generate Word Document from Excel data in Python

  1. Create a Python file named excelToWord.py.
  2. Add IronXL and python-docx packages.
  3. Create or add an Excel file to the project folder.
  4. Read Excel documents using IronXL.
  5. Create a Word document and insert Excel data using python-docx.

What is IronXL?

IronXL for Python is a library developed by Iron Software that allows developers to create, read, and edit Excel files (XLS, XLSX, and CSV) in Python projects. Here are some key features and benefits of using IronXL:

Key Features

  1. No Excel Dependency: IronXL does not require installing Microsoft Excel on your server, making it ideal for server environments without Excel.
  2. Intuitive API: IronXL provides a natural and intuitive API for working with Excel files, making it easy to integrate into your Python projects.
  3. Support for Multiple Formats: IronXL supports various Excel file formats, including XLS, XLSX, CSV, and TSV.
  4. Cell Styling: You can style cells with different fonts, sizes, backgrounds, borders, and number formats.
  5. Formula Handling: IronXL can work with Excel formulas and re-calculate them whenever a sheet is edited.
  6. Cross-Platform Compatibility: IronXL works on Windows, macOS, Linux, Docker, Azure, and AWS.

What is python-docx?

python-docx is a Python library for creating and modifying Microsoft Word `.docx` files. It provides a simple API for tasks such as adding text, applying formatting, and inserting tables and images.

Key Features

1. Creating Documents

You can generate Word documents from scratch and add content, including paragraphs, tables, headings, and more. The package can also open and edit existing documents.

2. Text Manipulation

Add and modify paragraphs of text. Format text (e.g., bold, italic, underline, etc.) using "runs" (parts of text with different styles within a paragraph). Add and style headings of various levels.

3. Adding Tables

Create tables with a specified number of rows and columns. Access and modify individual cells in a table.

4. Lists

Create bulleted or numbered lists with predefined styles.

5. Working with Styles

Apply predefined styles like "Heading 1", "Normal", etc. You can also define and apply custom styles to paragraphs or text.

6. Inserting Images

Insert images into the document at specific locations. You can resize images by specifying their width and height.

Prerequisites

Before we dive into the code, ensure that you have the following prerequisites:

  1. Python Installed: Make sure you have Python installed on your machine. You can download it from the official Python website.
  2. IronXL Installed: You need to install the IronXL package. You can do this using pip.
  3. python-docx Installed: You need to install the python-docx package. You can do this using pip.
  4. Excel File: Create a sample Excel file with data.
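
Before running the tutorial code, it can be useful to confirm that the packages are importable. The following is a small sketch that uses only the standard library; the helper name missing_modules is hypothetical, and the import names "ironxl" and "docx" are taken from the import statements used later in this article.

```python
import sys
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    return [name for name in module_names if find_spec(name) is None]


# python-docx is installed as "python-docx" but imported as "docx".
missing = missing_modules(["ironxl", "docx"])
print("Python version:", sys.version.split()[0])
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If a module is reported as missing, install it with pip as described in Step 2.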

Step 1: Create a Python file named excelToWord.py

Open your favorite IDE, such as Visual Studio Code, and create a file called excelToWord.py.

How to Generate Word Document from Excel data in Python: Figure 1 - Excel Document Generation

Step 2: Add the IronXL and python-docx packages

Use pip to install the IronXL and python-docx packages.

pip install IronXL python-docx

Step 3: Create or Add an Excel file to the project folder

Copy a sample Excel file named sample.xlsx into the project folder. The file contains the data shown below.

How to Generate Word Document from Excel data in Python: Figure 2 - Sample Excel

Step 4: Read the Excel document using IronXL

Load the Excel document with IronXL and read all of its cells using the code below.

import ironxl
from docx import Document
ironxl.License.LicenseKey = "your license"
workbook = ironxl.WorkBook.Load("sample.xlsx")
sheet = workbook.WorkSheets[0]
# read data from excel
data = []
# Iterate through rows and columns in the Excel sheet
for row in range(0, len(sheet.Rows)):
    row_data = []
    for col in range(0, len(sheet.Columns)):
        cell_value = sheet.GetCellAt(row, col)
        print(cell_value)
        row_data.append(cell_value)
    data.append(row_data)

Step 5: Create a Word Document and insert Excel data using python-docx

Next, create a Word document and write the data read from Excel in the previous step into a table. The first row of the data becomes the table header.

doc = Document()
# Add a title to the Word document
doc.add_heading('Excel Data Export Using Python Docx', 0)
table = doc.add_table(rows=1, cols=len(data[0]))
hdr_cells = table.rows[0].cells
for i, header in enumerate(data[0]):
    hdr_cells[i].text = str(header)  # Add header cells
for row in data[1:]:
    row_cells = table.add_row().cells
    for i, cell in enumerate(row):
        row_cells[i].text = str(cell)
doc.save("sample.docx")

The complete code for generating the Word document is shown below.

# Word documents from Excel data
import ironxl
from docx import Document
ironxl.License.LicenseKey = "your license"
workbook = ironxl.WorkBook.Load("sample.xlsx")
sheet = workbook.WorkSheets[0]
# read data from excel
data = []
# Iterate through rows and columns in the Excel sheet
for row in range(0, len(sheet.Rows)):
    row_data = []
    for col in range(0, len(sheet.Columns)):
        cell_value = sheet.GetCellAt(row, col)
        print(cell_value)
        row_data.append(cell_value)
    data.append(row_data)
# document generation process
doc = Document()
# Add a title to the Word document
doc.add_heading('Excel Data Export Using Python Docx', 0)
table = doc.add_table(rows=1, cols=len(data[0]))
hdr_cells = table.rows[0].cells
for i, header in enumerate(data[0]):
    hdr_cells[i].text = str(header)  # Add header cells
for row in data[1:]:
    row_cells = table.add_row().cells
    for i, cell in enumerate(row):
        row_cells[i].text = str(cell)
doc.save("sample.docx") # save as Microsoft Word document

Code Explanation

This Python script performs two main tasks.

1. Reading Data from an Excel File Using IronXL

  • The script begins by setting up a license for the IronXL library, which is used for handling Excel files in Python.
  • It then loads an Excel file (sample.xlsx) and selects the first worksheet from the file.
  • The script reads the data from the worksheet, iterating through all rows and columns. It collects the values from each cell in a 2D list (data), where each row in the Excel sheet corresponds to a sublist within the data.
  • The values of the cells are printed to the console as they are read.
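
The values collected this way are used as-is. Before writing them to Word, it may help to normalize them so that every row is a list of strings of equal length. The following is a sketch under that assumption; normalize_rows is a hypothetical helper, not part of IronXL or python-docx, and it works on any 2D list.

```python
def normalize_rows(rows):
    """Return a rectangular list of string rows with blank rows removed."""
    # Convert every value to text; treat None as an empty cell.
    text_rows = [
        ["" if value is None else str(value).strip() for value in row]
        for row in rows
    ]
    # Drop rows in which every cell is empty.
    text_rows = [row for row in text_rows if any(row)]
    # Pad short rows so every row has the same number of columns.
    width = max((len(row) for row in text_rows), default=0)
    return [row + [""] * (width - len(row)) for row in text_rows]


sample = [["Name", "Age"], ["Ada", 36], [None, None], ["Linus"]]
print(normalize_rows(sample))
# [['Name', 'Age'], ['Ada', '36'], ['Linus', '']]
```

In the script, this could be applied as data = normalize_rows(data) after the reading loop, so that the table-building code never encounters a missing or non-string value.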

2. Creating a Word Document Using python-docx

  • A new Word document is created using the python-docx library.
  • The script adds a title ("Excel Data Export Using Python Docx") at the top of the document using a heading.
  • It then creates a table in the document, where the first row of the table contains the headers from the first row of the Excel sheet, and subsequent rows contain the corresponding data from the Excel file.
  • Finally, the Word document is saved as sample.docx.

The script reads data from an Excel file (sample.xlsx), processes it, and exports the data into a table in a new Word document (sample.docx). The first row of the Excel sheet is used as table headers, and each row of data from the Excel sheet is added to the Word document as a row in the table.

Output

How to Generate Word Document from Excel data in Python: Figure 3 - Excel to Word

Word File

How to Generate Word Document from Excel data in Python: Figure 4 - Word Document Format

IronXL License (Trial Available)

IronXL requires a valid license key to be set in the code. A trial license is available from the license page.

To use the license, assign the key as shown below before calling any other part of the IronXL library.

ironxl.License.LicenseKey = "Your License Key"
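
To avoid committing the key to source control, one option is to read it from an environment variable. The following is a sketch using only the standard library; the variable name IRONXL_LICENSE_KEY and the helper load_license_key are assumptions made for illustration, not names defined by IronXL.

```python
import os


def load_license_key(env=os.environ, name="IRONXL_LICENSE_KEY"):
    """Return the license key stored in an environment variable."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage, assuming IronXL is installed and the variable is set:
#   import ironxl
#   ironxl.License.LicenseKey = load_license_key()
```

Failing early with a clear message makes a missing key easier to diagnose than a licensing error raised later in the script.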

Conclusion

The sample code demonstrates an effective way to read data from an Excel file using IronXL and then export that data into a Word document using python-docx. The process involves two main steps:

  1. Extracting Data from Excel: The script loads an Excel file and extracts the data from its first worksheet. It iterates through the rows and columns to collect cell values into a list, which can easily be manipulated or saved.
  2. Creating and Populating a Word Document: Using the python-docx library, the script creates a new Word document, adds a title, and formats the extracted Excel data into a table in the Word document. It automatically places the first row of Excel data as headers and the remaining rows as table data.

This approach allows for seamless data transfer from Excel to Word, which can be useful for tasks such as report generation, data exports, or document automation. The combination of IronXL for Excel handling and python-docx for Word document creation provides a powerful solution for working with these file formats in Python.
