USING IRONXL FOR PYTHON

How to Read Excel Files in Python

Published June 6, 2024

Working with Excel files is a common requirement in data analysis and automation tasks. Python, with its extensive set of libraries, offers various ways to interact with XLSX files. In this tutorial, we'll learn how to read Excel files using IronXL, a Python Excel library.

IronXL: Python Excel Library

IronXL for Python is designed to help developers read and write Excel files easily in Python. Beyond writing Excel files, you can work with multiple worksheets at once, and none of this requires Microsoft Excel to be installed on your machine.

When you need to import data directly into an Excel spreadsheet, IronXL comes to the rescue. It makes handling Excel spreadsheets simple and helps you manage data across multiple sheets in an XLSX file.

Let's see how we can read Excel files in Python using IronXL.

How to Read Excel Files in Python

The following steps go over the process involved in reading Excel files in Python.

  1. Create a Python file in Visual Studio Code.
  2. Install the Python Excel library using pip.
  3. Load the Excel file that you want to read.
  4. Iterate over the Excel data using loops.
  5. Show the iterated data on the console.

Prerequisites

Before diving into the tutorial, ensure you have the following prerequisites installed on your system:

  • .NET 6.0 SDK: IronXL is built on .NET, requiring the .NET 6.0 SDK to be installed on your system.
  • Python 3.0+: The tutorial assumes you have Python 3.0 or higher installed.
  • pip: Ensure pip, Python's package installer, is installed as it will be used to install IronXL.
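
The prerequisites above can be checked from Python before going further. The following is a minimal sketch using only the standard library; the helper name check_prerequisites is hypothetical, and it only tests whether a dotnet executable is on PATH, not which SDK version is installed.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict describing which prerequisites appear to be present."""
    return {
        # The tutorial assumes Python 3.0 or higher.
        "python_3": sys.version_info >= (3, 0),
        # pip is importable when it is installed for this interpreter.
        "pip": importlib.util.find_spec("pip") is not None,
        # True when a `dotnet` executable is found on PATH (version not checked).
        "dotnet": shutil.which("dotnet") is not None,
    }


for name, present in check_prerequisites().items():
    print(f"{name}: {'found' if present else 'missing'}")
```

If dotnet is reported as found, running dotnet --list-sdks in a terminal shows which SDK versions are installed.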

Setting Up Your Environment

1. Creating a File in Visual Studio Code

Open Visual Studio Code and create a new Python file named ReadExcel.py. This file will contain our script to read Excel files using IronXL.

How to Read Excel Files in Python: Figure 1

2. Installing IronXL

Open the command line within Visual Studio Code by selecting Terminal > New Terminal from the menu.

How to Read Excel Files in Python: Figure 2

Install IronXL by running the following command:

 pip install IronXL

How to Read Excel Files in Python: Figure 3
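
To confirm that the installation succeeded, you can ask Python whether it can locate the module. This is a small sketch using the standard library only; is_installed is a hypothetical helper name, and ironxl is the module name taken from the import statement used later in this tutorial.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located by the import system."""
    return importlib.util.find_spec(module_name) is not None


# `ironxl` is the module name used by the import statement in this tutorial.
print("ironxl installed:", is_installed("ironxl"))
```

The check only locates the module without importing it, so it does not verify that the .NET dependency is working.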

Writing the Code

Now, let's break down the code into sections and explain each part.

Importing Libraries

from ironxl import *
import sys

Here, we import everything from the ironxl library, which provides the functionality to work with Excel files. sys is used to set the path where Python is installed, which is necessary for IronXL to function correctly.

Setting Python Installation Path

sys.prefix = r'C:\Users\User Name\AppData\Local\Programs\Python\Python312'

sys.prefix is assigned the path to your Python installation directory. It's a preparatory step that might not be necessary in every environment but is essential in some configurations to prevent path-related issues.
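
If you are unsure which path applies to your machine, Python can report where the current interpreter lives. The sketch below only prints standard sys attributes; which of these values IronXL needs in a given setup is not stated in this tutorial, so treat the output as a starting point.

```python
import sys

# Directory of the environment the current interpreter runs in
# (inside a virtual environment, this is the virtual environment folder).
print("sys.prefix:     ", sys.prefix)
# Directory of the underlying Python installation.
print("sys.base_prefix:", sys.base_prefix)
# Full path of the interpreter executable itself.
print("sys.executable: ", sys.executable)

# The two prefixes differ only when a virtual environment is active.
in_virtualenv = sys.prefix != sys.base_prefix
print("Running inside a virtual environment:", in_virtualenv)
```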

Setting License Key

License.LicenseKey = "License-Key"

The IronXL library requires a valid license key to unlock all its features. In the line License.LicenseKey = "License-Key", replace the "License-Key" placeholder with the key you obtained. Without a valid license, IronXL will not work.
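
Rather than hardcoding the key in the script, one option is to read it from an environment variable. This is a sketch of that idea, not something IronXL requires; the variable name IRONXL_LICENSE_KEY and the helper read_license_key are assumptions made for illustration.

```python
import os


def read_license_key(env_var="IRONXL_LICENSE_KEY"):
    """Return the license key stored in `env_var`, or None if it is unset or empty."""
    key = os.environ.get(env_var, "").strip()
    return key or None


license_key = read_license_key()
if license_key is None:
    print("No license key found in the environment.")
else:
    # With IronXL imported as in this tutorial, the key would then be assigned with:
    # License.LicenseKey = license_key
    print("License key loaded from the environment.")
```

Keeping the key out of the source file makes it harder to commit it to version control by accident.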

Loading the Workbook

workbook = WorkBook.Load("data.xlsx")

WorkBook.Load is the IronXL method that opens an existing Excel file, identified by its file path. Here it loads data.xlsx and stores the resulting workbook in the workbook variable.

The example uses "data.xlsx", assuming this file is located in the same directory as the script. If your file is elsewhere, you'll need to provide the complete file path.
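
When the file lives elsewhere, building the complete path explicitly avoids surprises about which folder a relative name is resolved against. The following sketch uses pathlib from the standard library; resolve_workbook_path is a hypothetical helper, and passing the result to WorkBook.Load as a string is an assumption based on the call shown above.

```python
from pathlib import Path


def resolve_workbook_path(file_name, base_dir=None):
    """Return an absolute path for `file_name`, resolved against `base_dir`.

    When `base_dir` is omitted, the current working directory is used.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / file_name).resolve()


path = resolve_workbook_path("data.xlsx")
if path.is_file():
    print("Workbook found at:", path)
else:
    print("Workbook not found, expected it at:", path)
# The resulting string could then be passed on, e.g. WorkBook.Load(str(path)).
```

To resolve the name against the script's own folder instead of the working directory, pass Path(__file__).parent as base_dir.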

Selecting a Worksheet

worksheet = workbook.WorkSheets[0]

After loading the workbook, the next step is to select a worksheet. The workbook.WorkSheets[0] line accesses the first worksheet in the Excel file. Worksheets are indexed starting at 0, so this code snippet effectively selects the first sheet.

Reading and Displaying Data

print("Sample data from A2:A5:")
for cell in worksheet["A2:A5"]:
    print("Cell {} has value '{}'".format(cell.AddressString, cell.Text))

This part of the script shows how to read and display data from specific cells. Iterating over the range worksheet["A2:A5"] yields each cell in turn, giving access to its address (cell.AddressString) and its text value (cell.Text). The same approach retrieves data from any predefined area of a sheet.
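
To make it concrete which cells a range address such as "A2:A5" covers, here is a plain-Python sketch that expands a range into individual addresses. It does not use IronXL; the helper names are hypothetical, and the row-by-row ordering is an assumption for illustration rather than a statement about how IronXL iterates.

```python
import re


def column_to_number(letters):
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def number_to_column(number):
    """Convert a 1-based column number back to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def expand_range(address):
    """List the cell addresses covered by a range such as 'A2:A5', row by row."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", address.upper())
    if match is None:
        raise ValueError(f"Not a valid range address: {address!r}")
    first_col, first_row, last_col, last_row = match.groups()
    cols = range(column_to_number(first_col), column_to_number(last_col) + 1)
    rows = range(int(first_row), int(last_row) + 1)
    return [f"{number_to_column(c)}{r}" for r in rows for c in cols]


print(expand_range("A2:A5"))  # ['A2', 'A3', 'A4', 'A5']
```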

Performing Calculations

IronXL provides functions to perform simple calculations directly on data ranges within an Excel sheet. The following sections of the code demonstrate how to perform simple calculations like sum, minimum, and maximum on data within a specified range.

total_sum = worksheet["B2:B10"].Sum()
print("Total Sum of B2:B10: ", total_sum)
minimum_value = worksheet["C2:C10"].Min()
print("Minimum Value in C2:C10: ", minimum_value)
maximum_value = worksheet["D2:D10"].Max()
print("Maximum Value in D2:D10: ", maximum_value)
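
For a quick sanity check of these results, the same aggregates can be computed in plain Python on values you already know. The sketch below uses hypothetical numbers and a hypothetical helper named summarize; it does not call IronXL.

```python
def summarize(values):
    """Return the sum, minimum, and maximum of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    return {"sum": sum(values), "min": min(values), "max": max(values)}


# Hypothetical numbers standing in for the contents of a column range.
sample = [12, 7, 30, 4, 19]
print(summarize(sample))  # {'sum': 72, 'min': 4, 'max': 30}
```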

Finding Unique Values

This part of the script demonstrates how to identify unique values within a specified range of an Excel sheet:

unique_values = set(cell.Text for cell in worksheet["F2:F10"])
print("Unique Values in F2:F10: ", unique_values)

To achieve this, a set comprehension is utilized. Sets in Python are collections that automatically eliminate duplicate entries, making them perfect for finding unique values. The comprehension iterates over each cell in the range F2:F10, accessing the text content of each cell using cell.Text.

By placing this iteration inside a set, we ensure that only unique values are kept.
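
One thing to keep in mind is that a set does not preserve the order in which values appear in the sheet. If order matters, a common alternative is dict.fromkeys, sketched here on hypothetical cell texts rather than a real worksheet; unique_in_order is a hypothetical helper name.

```python
def unique_in_order(values):
    """Return the distinct items of `values`, keeping first-seen order."""
    return list(dict.fromkeys(values))


# Hypothetical cell texts standing in for the contents of F2:F10.
texts = ["red", "blue", "red", "green", "blue"]
print(unique_in_order(texts))  # ['red', 'blue', 'green']
print(sorted(set(texts)))      # ['blue', 'green', 'red']
```

Sorting the set, as in the last line, is another way to get a stable order for display.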

Counting Occurrences of a Specific Value

Finally, this snippet counts how many times a specific value occurs in a given range.

specific_value = '5'  # Adjust this as needed based on your data
occurrences = sum(cell.Text == specific_value for cell in worksheet["G2:G10"])
print(f"Occurrences of value '{specific_value}' in G2:G10: ", occurrences)

The goal here is to quantify the occurrence of a specific value, labelled here as specific_value. The code employs a generator expression that iterates over each cell in the range G2:G10, comparing the cell's text content to specific_value.

The expression cell.Text == specific_value evaluates to True for each cell containing the target value and False otherwise. The sum function adds up these Boolean values, treating True as 1 and False as 0, effectively counting the number of times the specific value occurs.
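
The counting technique can be tried on a plain list without opening a workbook. The sketch below uses hypothetical cell texts; it also shows collections.Counter from the standard library, which is convenient when you want the count of every value rather than a single one. Note that the values are compared as strings, matching the use of cell.Text above.

```python
from collections import Counter

# Hypothetical cell texts standing in for the contents of G2:G10.
texts = ["5", "3", "5", "8", "5", "3"]

# Same idea as in the tutorial: True counts as 1, False as 0.
occurrences = sum(text == "5" for text in texts)
print(occurrences)  # 3

# Counter tallies every distinct value in a single pass.
counts = Counter(texts)
print(counts["5"], counts["3"], counts["8"])  # 3 2 1
print(counts["42"])  # 0, because values that never occur count as zero
```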

Full Code Example

Here is the complete code for reference:

from ironxl import *
import sys
sys.prefix = r'C:\Users\User Name\AppData\Local\Programs\Python\Python312'
License.LicenseKey = "License-Key"
# Load the workbook
workbook = WorkBook.Load("data.xlsx")
# Select the worksheet at index 0
worksheet = workbook.WorkSheets[0]
# Display values in a specific range as an example
print("Sample data from A2:A5:")
for cell in worksheet["A2:A5"]:
    print("Cell {} has value '{}'".format(cell.AddressString, cell.Text))
# Calculate the sum of values in a different range
total_sum = worksheet["B2:B10"].Sum()
print("Total Sum of B2:B10: ", total_sum)
# Calculate the minimum value in another range
minimum_value = worksheet["C2:C10"].Min()
print("Minimum Value in C2:C10: ", minimum_value)
# Calculate the maximum value in a different range
maximum_value = worksheet["D2:D10"].Max()
print("Maximum Value in D2:D10: ", maximum_value)
# Find unique values in a specified range
unique_values = set(cell.Text for cell in worksheet["F2:F10"])
print("Unique Values in F2:F10: ", unique_values)
# Count occurrences of a specific value in a different range
specific_value = '5'  # Adjust this as needed based on your data
occurrences = sum(cell.Text == specific_value for cell in worksheet["G2:G10"])
print(f"Occurrences of value '{specific_value}' in G2:G10: ", occurrences)

Output

After running the ReadExcel.py Python file, you'll see the following output displayed in your console, which reflects the results of the operations carried out by the script on the Excel data.

How to Read Excel Files in Python: Figure 4

Conclusion

In this tutorial, we've explored how to set up a Python environment to use IronXL for reading Excel files. We covered installing the necessary software and writing a script that loads an Excel workbook, selects a worksheet, reads the data, and performs basic data analysis operations.

IronXL provides a powerful API for working with Excel files in Python, enabling data analysis, data cleaning, and visualization tasks without needing Microsoft Excel.

Whether you're analyzing sales data, importing data for reports, or creating visualizations from Excel data, IronXL offers a robust solution for managing Excel files in Python applications.

IronXL provides a free trial for users to explore its features, with licenses beginning at $749 for those ready to fully integrate its capabilities.
