
Generate Word Documents from Excel Data in Python

Generating Word documents from Excel spreadsheets using Python can improve efficiency, accuracy, and presentation in both professional and personal contexts. Automating the transfer avoids manual copying and keeps each document consistent with its source data. This article uses two libraries for the job: the IronXL Python package from Iron Software, which reads the Excel file, and the python-docx library, which writes the Word document.

This article will examine the steps required to generate Word documents from Excel files.

How to Generate Word Document from Excel Data in Python

  1. Create a Python file named excelToWord.py.
  2. Install the IronXL and python-docx packages.
  3. Create or add an Excel file to the project folder.
  4. Read Excel documents using IronXL.
  5. Create a Word document and insert Excel data using python-docx.

What is IronXL?

IronXL for Python is a robust library developed by Iron Software that allows developers to create, read, and edit Excel files (XLS, XLSX, and CSV) in Python projects. Here are some key features and benefits of using IronXL:

Key Features

  1. No Excel Dependency: IronXL does not require installing Microsoft Excel on your server, making it ideal for server environments without Excel.
  2. Intuitive API: IronXL provides a natural and intuitive API for working with Excel files, making it easy to integrate into your Python projects.
  3. Support for Multiple Formats: IronXL supports various Excel file formats, including XLS, XLSX, CSV, and TSV.
  4. Cell Styling: You can style cells with different fonts, sizes, backgrounds, borders, and number formats.
  5. Formula Handling: IronXL can work with Excel formulas and re-calculate them whenever a sheet is edited.
  6. Cross-Platform Compatibility: IronXL works on Windows, macOS, Linux, Docker, Azure, and AWS.

What is python-docx?

python-docx is a Python library for creating and modifying Microsoft Word (.docx) files. It provides a simple API for tasks such as adding text, applying formatting, and inserting tables and images.

Key Features

1. Creating Documents

You can generate Word documents from scratch and add content, including paragraphs, tables, headings, and more. The package can also open and edit existing documents.

2. Text Manipulation

Add and modify paragraphs of text. Format text (e.g., bold, italic, underline, etc.) using "runs" (parts of text with different styles within a paragraph). Add and style headings of various levels.

3. Adding Tables

Create tables with a specified number of rows and columns. Access and modify individual cells in a table.

4. Lists

Create bulleted or numbered lists with predefined styles.

5. Working with Styles

Apply predefined styles like "Heading 1", "Normal", etc. You can also define and apply custom styles to paragraphs or text.

6. Inserting Images

Insert images into the document at specific locations. You can resize images by specifying their width and height.
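
As a rough illustration of the features listed above, the following sketch builds a small document with a heading, a paragraph made of differently formatted runs, a bulleted list, and a table. It assumes python-docx is installed and relies on the "List Bullet" and "Table Grid" styles that ship with the default template. The sample text and the DEMO_ITEMS and DEMO_TABLE names are made up for this example.

```python
# Illustrative content only; none of it comes from the article's sample file.
DEMO_ITEMS = ["First point", "Second point"]
DEMO_TABLE = [["Name", "Score"], ["Alice", "90"]]


def build_feature_demo():
    """Build a small in-memory document using common python-docx calls."""
    # Imported lazily so this module loads even if python-docx is missing.
    from docx import Document

    doc = Document()
    doc.add_heading("Feature Demo", level=1)

    # A paragraph is made of runs; each run carries its own formatting.
    para = doc.add_paragraph("Plain text, ")
    para.add_run("bold text").bold = True
    para.add_run(", and ")
    para.add_run("italic text").italic = True

    # Bulleted list using a built-in paragraph style.
    for item in DEMO_ITEMS:
        doc.add_paragraph(item, style="List Bullet")

    # Table whose cells are addressed by (row, column).
    table = doc.add_table(rows=len(DEMO_TABLE), cols=len(DEMO_TABLE[0]))
    table.style = "Table Grid"
    for r, row in enumerate(DEMO_TABLE):
        for c, text in enumerate(row):
            table.cell(r, c).text = text

    return doc
```

Calling build_feature_demo().save("features_demo.docx") would write the result to disk; the file name is arbitrary.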

Prerequisites

Before we dive into the code, ensure that you have the following prerequisites:

  1. Python Installed: Make sure you have Python installed on your machine. You can download it from the official Python website.
  2. IronXL Installed: You need to install the IronXL package. You can do this using pip.
  3. python-docx Installed: You need to install the python-docx package. You can do this using pip.
  4. Excel File: Create a sample Excel file with data.
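
Before running the steps, it can help to confirm that both packages are importable. The check below is a small sketch using only the standard library. Note that python-docx is installed under that name but imported as docx; the ironxl module name is taken from the import statement used later in this article. The helper name missing_packages is our own.

```python
import importlib.util

# pip package name -> module name used in import statements.
REQUIRED = {"IronXL": "ironxl", "python-docx": "docx"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("All prerequisites are importable.")
```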

Step 1: Create a Python File Named excelToWord.py

Open your preferred IDE, such as Visual Studio Code, and create a file called excelToWord.py.

How to Generate Word Document from Excel Data in Python: Figure 1 - Excel Document Generation

Step 2: Add IronXL and python-docx Packages

Use pip to install the IronXL and python-docx packages.

pip install IronXL python-docx
SHELL

Step 3: Create or Add an Excel File to the Project Folder

Copy the sample Excel file to your code folder. The file contains the data shown below.

How to Generate Word Document from Excel Data in Python: Figure 2 - Sample Excel
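
If you do not have a spreadsheet at hand, one option is to create it in code. The sketch below is an assumption-heavy illustration: the rows are invented (the article's actual sample data is not reproduced here), and the IronXL calls (WorkBook.Create, ExcelFileFormat.XLSX, CreateWorkSheet, the sheet["A1"].Value setter, and SaveAs) follow IronXL's published Python examples as we understand them, so verify them against the current documentation. The cell_address helper is our own and does not depend on IronXL.

```python
# Hypothetical sample data, used only for illustration.
SAMPLE_ROWS = [
    ["Name", "Department", "Salary"],
    ["Alice", "Engineering", 85000],
    ["Bob", "Marketing", 62000],
]


def cell_address(row, col):
    """Convert zero-based (row, col) indices to an A1-style address."""
    letters = ""
    n = col + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{row + 1}"


def create_sample_workbook(path="sample.xlsx", rows=SAMPLE_ROWS):
    """Write rows to a new workbook (needs IronXL and a license key)."""
    import ironxl  # imported lazily; the API usage below is an assumption

    workbook = ironxl.WorkBook.Create(ironxl.ExcelFileFormat.XLSX)
    sheet = workbook.CreateWorkSheet("Sheet1")
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            sheet[cell_address(r, c)].Value = value
    workbook.SaveAs(path)
```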

Step 4: Read the Excel Document Using IronXL

Using IronXL, load the Excel document and read all the cells using the code below.

import ironxl
# Import Document class from python-docx to work with Word documents
from docx import Document

# Set the License Key for IronXL (replace 'your license' with your actual license key)
ironxl.License.LicenseKey = "your license"

# Load the Excel workbook and select the first worksheet
workbook = ironxl.WorkBook.Load("sample.xlsx")
sheet = workbook.WorkSheets[0]

# Read data from the Excel sheet
data = []

# Iterate through rows and columns in the Excel sheet
for row in range(0, len(sheet.Rows)):
    row_data = []
    for col in range(0, len(sheet.Columns)):
        cell_value = sheet.GetCellAt(row, col)
        print(cell_value)  # Print each cell value
        row_data.append(cell_value)
    data.append(row_data)
PYTHON
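
The loop above stores whatever GetCellAt returns for each position. If a position holds no cell, the stored value may be empty (an assumption about IronXL's behavior, not something this article states), and str() would then write the text "None" into the Word table. A small, library-independent helper can turn the collected values into a rectangular grid of strings first. The name to_text_grid is our own.

```python
def to_text_grid(rows, empty=""):
    """Convert a 2D list of arbitrary values into a rectangular grid of strings."""
    width = max((len(row) for row in rows), default=0)
    grid = []
    for row in rows:
        # Replace missing values with a placeholder, stringify the rest.
        cells = [empty if value is None else str(value) for value in row]
        # Pad short rows so every row has the same number of columns.
        cells.extend([empty] * (width - len(cells)))
        grid.append(cells)
    return grid
```

For example, data = to_text_grid(data) could be run once after the reading loop and before the table is built.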

Step 5: Create a Word Document and Insert Excel Data Using python-docx

The Word document generation process involves creating a Word document and inserting data that was read from the Excel file.

# Create a new Word document
doc = Document()

# Add a title to the Word document
doc.add_heading('Excel Data Export Using Python Docx', 0)

# Create a table with headers (first row of Excel data)
table = doc.add_table(rows=1, cols=len(data[0]))
hdr_cells = table.rows[0].cells
# Populate header cells with data
for i, header in enumerate(data[0]):
    hdr_cells[i].text = str(header)

# Populate table with data from Excel
for row in data[1:]:
    row_cells = table.add_row().cells
    for i, cell in enumerate(row):
        row_cells[i].text = str(cell)

# Save the generated Word document
doc.save("sample.docx")
PYTHON
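
The table code above indexes data[0] directly, which raises an IndexError when the sheet is empty. As a hedged variation, the sketch below guards against that case and renders the header row in bold. It takes an existing python-docx Document as its first argument; the helper names split_header and add_data_table are our own, and "Table Grid" is a style from python-docx's default template.

```python
def split_header(grid):
    """Return (header, body) from a 2D list, or ([], []) when there is no data."""
    if not grid or not grid[0]:
        return [], []
    return list(grid[0]), [list(row) for row in grid[1:]]


def add_data_table(doc, grid, style="Table Grid"):
    """Append grid to a python-docx document as a table with a bold header row."""
    header, body = split_header(grid)
    if not header:
        return None  # nothing to write

    table = doc.add_table(rows=1, cols=len(header))
    table.style = style

    # Each new cell starts with one empty paragraph; add a bold run to it.
    for cell, text in zip(table.rows[0].cells, header):
        cell.paragraphs[0].add_run(str(text)).bold = True

    # Values beyond the header width are ignored by zip().
    for row in body:
        cells = table.add_row().cells
        for cell, value in zip(cells, row):
            cell.text = str(value)
    return table
```

With this helper, the table-building lines could be replaced by add_data_table(doc, data) before calling doc.save("sample.docx").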

Complete Code for Generating Word Documents

# Import required libraries
import ironxl
from docx import Document

# Set the License Key for IronXL
ironxl.License.LicenseKey = "your license"

# Load the Excel workbook
workbook = ironxl.WorkBook.Load("sample.xlsx")
sheet = workbook.WorkSheets[0]

# Read data from the Excel sheet
data = []
# Iterate through rows and columns in the Excel sheet
for row in range(0, len(sheet.Rows)):
    row_data = []
    for col in range(0, len(sheet.Columns)):
        cell_value = sheet.GetCellAt(row, col)
        print(cell_value)  # Print each cell value
        row_data.append(cell_value)
    data.append(row_data)

# Document generation process
# Create a new Word document
doc = Document()
# Add a title to the Word document
doc.add_heading('Excel Data Export Using Python Docx', 0)

# Create a table in the Word document
table = doc.add_table(rows=1, cols=len(data[0]))
hdr_cells = table.rows[0].cells
for i, header in enumerate(data[0]):
    hdr_cells[i].text = str(header)  # Add header cells

for row in data[1:]:
    row_cells = table.add_row().cells
    for i, cell in enumerate(row):
        row_cells[i].text = str(cell)

# Save the Word document
doc.save("sample.docx")
PYTHON

Code Explanation

This Python script performs two main tasks.

1. Reading Data from an Excel File Using IronXL

  • The script begins by setting up a license for the IronXL library, which is used for handling Excel files in Python.
  • It then loads an Excel file (sample.xlsx) and selects the first worksheet from the file.
  • The script reads the data from the worksheet, iterating through all rows and columns. It collects the values from each cell in a 2D list (data), where each row in the Excel sheet corresponds to a sublist within the data.
  • The values of the cells are printed to the console as they are read.

2. Creating a Word Document Using python-docx

  • A new Word document is created using the python-docx library.
  • The script adds a title ("Excel Data Export Using Python Docx") at the top of the document using a heading.
  • It then creates a table in the document, where the first row of the table contains the headers from the first row of the Excel sheet, and subsequent rows contain the corresponding data from the Excel file.
  • Finally, the Word document is saved as sample.docx.

The script reads data from an Excel file (sample.xlsx), processes it, and exports the data into a table in a new Word document (sample.docx). The first row of the Excel sheet is used as table headers, and each row of data from the Excel sheet is added to the Word document as a row in the table.
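
One way to check the result described above is to reopen the generated file and read the table back. The sketch below assumes the script has already produced sample.docx; table_to_rows and read_first_table are hypothetical helper names, while Document, tables, rows, cells, and text are standard python-docx attributes.

```python
def table_to_rows(table):
    """Return the text of every cell in a table as a list of lists."""
    return [[cell.text for cell in row.cells] for row in table.rows]


def read_first_table(path="sample.docx"):
    """Open a Word file and return its first table as rows of strings."""
    from docx import Document  # imported lazily; requires python-docx

    doc = Document(path)
    if not doc.tables:
        return []
    return table_to_rows(doc.tables[0])
```

Comparing the first row returned by read_first_table() with the header row of the spreadsheet is a quick sanity check that the export worked.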

Output

How to Generate Word Document from Excel Data in Python: Figure 3 - Excel to Word

Word File

How to Generate Word Document from Excel Data in Python: Figure 4 - Word Document Format

IronXL License (Trial Available)

IronXL requires a valid license key. A trial license is available from the license page.

To apply the license, set the key in your code before calling any other IronXL functionality, as shown below.

ironxl.License.LicenseKey = "Your License Key"
PYTHON
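
Hard-coding the key works, but you may prefer to keep it out of source control. The following sketch reads it from an environment variable instead. The variable name IRONXL_LICENSE_KEY and both helper names are assumptions made for this example, not part of IronXL; only the final assignment to ironxl.License.LicenseKey comes from the article.

```python
import os

ENV_VAR = "IRONXL_LICENSE_KEY"  # hypothetical name chosen for this example


def resolve_license_key(environ=None):
    """Return the license key from the environment, or raise if it is missing."""
    environ = os.environ if environ is None else environ
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first")
    return key


def apply_license():
    """Assign the key before any other IronXL call is made."""
    import ironxl  # imported lazily; requires the IronXL package

    ironxl.License.LicenseKey = resolve_license_key()
```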

Conclusion

The sample code demonstrates an effective way to read data from an Excel file using IronXL and then export that data into a Word document using python-docx. The process involves two main steps:

  1. Extracting Data from Excel: The script loads an Excel file and extracts the data from its first worksheet. It iterates through the rows and columns to collect cell values into a list, which can easily be manipulated or saved.
  2. Creating and Populating a Word Document: Using the python-docx library, the script creates a new Word document, adds a title, and formats the extracted Excel data into a table in the Word document. It automatically places the first row of Excel data as headers and the remaining rows as table data.

This approach allows for seamless data transfer from Excel to Word, which can be useful for tasks such as report generation, data exports, or document automation. The combination of IronXL for Excel handling and python-docx for Word document creation provides a powerful solution for working with these file formats in Python.

Frequently Asked Questions

How can I convert Excel data into a Word document using Python?

You can convert Excel data into a Word document using Python by reading and processing the Excel file with the IronXL library, then creating and populating a Word document with the python-docx library.

What are the advantages of using IronXL and python-docx together?

Combining IronXL and python-docx allows smooth integration and automation of Excel-to-Word document conversions. IronXL reads and edits Excel files without requiring Microsoft Excel, while python-docx provides a simple API for manipulating Word documents.

How do you read an Excel file in Python without having Microsoft Excel installed?

You can read an Excel file in Python without Microsoft Excel installed by using the IronXL library. IronXL lets you load Excel files and access their data programmatically.

What is the process for creating a Word document from Excel data?

The process involves using IronXL to read data from an Excel file and then using python-docx to create a Word document in which the Excel data is inserted into tables or blocks of text.

Can IronXL work with multiple Excel file formats?

Yes, IronXL supports multiple Excel file formats, such as XLS, XLSX, and CSV, which gives you flexibility when handling different types of Excel files.

How can I install IronXL and python-docx in my Python environment?

You can install IronXL and python-docx in your Python environment using pip with the command: pip install IronXL python-docx.

What are the benefits of automating Excel-to-Word document conversions?

Automating Excel-to-Word document conversions can improve efficiency, accuracy, and presentation by reducing manual tasks, minimizing errors, and keeping formatting and data consistent across documents.

How are Excel cells styled using IronXL?

IronXL provides functionality for styling Excel cells, letting you customize their appearance by adjusting fonts, colors, and borders, which can be useful for formatting before transferring data to Word.

Curtis Chau
Technical Writer

Curtis Chau holds a bachelor's degree in Computer Science (Carleton University) and specializes in front-end development, with experience in Node.js, TypeScript, JavaScript, and React. Passionate about creating intuitive and aesthetically pleasing user interfaces, he enjoys working with modern frameworks and creating well ...
