How to Export an HTML Table to Excel in C# Using IronXL

Extracting HTML table data and converting it to an Excel sheet is a common requirement in business applications -- whether for data migration, report generation, or web pages that need further analysis. This guide provides clear steps to export data from HTML tables to Excel format using IronXL and HTML Agility Pack.

Before you export an HTML table to Excel, it helps to understand how HTML table structures map onto Excel worksheets. This guide shows how to transfer data from HTML tables into Excel worksheets while maintaining data integrity.

IronXL provides a flexible way to convert HTML table content into an Excel worksheet, combining its Excel manipulation capabilities with HTML parsing to export HTML tables to Excel in C#. Whether you need to download data from a URL or process content from a database, this solution handles various input scenarios without requiring Microsoft Office to be installed on the machine.

Why Should You Use IronXL to Export HTML Table Data?

IronXL excels at creating and manipulating Excel files without requiring Microsoft Office installation, making it ideal for server environments and cross-platform applications. When paired with HTML Agility Pack, a capable HTML file and content parser, IronXL becomes a versatile solution for converting any HTML table structure to Excel sheet data. This approach works well with .NET 10 applications and can handle large datasets efficiently.

Unlike libraries such as the Syncfusion Excel library's XlsIO, which offers an ImportHtmlTable function limited to specific HTML formats and table structures, the IronXL approach gives developers complete control over the parsing and conversion process. This flexibility means developers can handle complex scenarios like nested tables, custom data formatting, and selective column extraction that rigid built-in methods cannot accommodate.

IronXL also provides a full suite of Excel features including formula support, cell styling, multiple worksheet management, and various export formats (XLSX, XLS, JSON, and CSV). You can create charts, export to PDF, and manage hidden field data, making it a complete solution for Excel automation beyond simple HTML table conversion.

How Do You Install the Required Libraries?

Install both IronXL and HTML Agility Pack through NuGet Package Manager. IronXL offers a free trial to test all features before committing to a license.

Package Manager Console

Install-Package IronXL.Excel
Install-Package HtmlAgilityPack

.NET CLI

dotnet add package IronXL.Excel
dotnet add package HtmlAgilityPack

Together, these NuGet packages let you parse HTML and create, load, and save Excel documents programmatically. After installing both packages, add the necessary using statements at the top of your C# file:

using IronXL;
using HtmlAgilityPack;
using System;
using System.Linq;

These libraries work well together, with HTML Agility Pack handling the HTML parsing while IronXL manages the Excel file creation and manipulation. This example demonstrates a clear approach to converting HTML tables to XLSX format.

How Do You Parse HTML Table Data with HTML Agility Pack?

HTML Agility Pack provides a straightforward way to navigate HTML documents using XPath expressions. The following code shows how to extract data from an HTML table and prepare it for export:

// Sample HTML table with product data
string htmlContent = @"
<table>
    <thead>
        <tr>
            <th>Product</th>
            <th>Price</th>
            <th>Stock</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Laptop</td>
            <td>$999</td>
            <td>15</td>
        </tr>
        <tr>
            <td>Mouse</td>
            <td>$25</td>
            <td>50</td>
        </tr>
        <tr>
            <td>Keyboard</td>
            <td>$75</td>
            <td>30</td>
        </tr>
    </tbody>
</table>";

// Load HTML document for parsing
var doc = new HtmlDocument();
doc.LoadHtml(htmlContent);

// Select the HTML table element using XPath
var table = doc.DocumentNode.SelectSingleNode("//table");

This code loads the HTML content into an HtmlDocument object and uses XPath to query and select the table element. The SelectSingleNode method returns the first node that matches the expression, which here is the first table in the HTML. When a page contains several tables, a more specific XPath expression targets the one you need. The table's rows are then processed one by one to extract each cell value for conversion.
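The row-by-row extraction described above can be sketched as a small helper that flattens a table node into rows of text before anything is written to Excel. This is a minimal sketch, not part of either library: the name ReadTableRows and the sample markup are hypothetical. It uses HtmlEntity.DeEntitize because InnerText returns entities such as &amp; undecoded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper: flattens a <table> node into rows of decoded cell text.
static List<string[]> ReadTableRows(HtmlNode table)
{
    var result = new List<string[]>();

    // ".//tr" matches rows whether or not they sit inside <thead> or <tbody>.
    // It also matches rows of nested tables, so narrow it if your pages nest tables.
    var rowNodes = table.SelectNodes(".//tr");
    if (rowNodes == null) return result; // SelectNodes returns null when nothing matches

    foreach (var rowNode in rowNodes)
    {
        // "th|td" selects header and data cells that are direct children of the row.
        var cellNodes = rowNode.SelectNodes("th|td");
        if (cellNodes == null) continue;

        result.Add(cellNodes
            .Select(cell => HtmlEntity.DeEntitize(cell.InnerText).Trim())
            .ToArray());
    }

    return result;
}

// Sample markup (an assumption for illustration only).
var sample = new HtmlDocument();
sample.LoadHtml(
    "<table><tr><th>Product</th><th>Price</th></tr>" +
    "<tr><td>Pens &amp; Pencils</td><td>$5</td></tr></table>");

var parsedRows = ReadTableRows(sample.DocumentNode.SelectSingleNode("//table"));
foreach (var parsedRow in parsedRows)
{
    Console.WriteLine(string.Join(" | ", parsedRow));
}
```

Keeping the parsing step separate from the Excel step makes it easier to inspect or unit test the extracted values before they reach the worksheet.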

What XPath Expressions Work Best for Table Parsing?

For standard HTML tables, the XPath expression //table matches every table in the document, and SelectSingleNode returns the first match. When working with more complex pages containing multiple tables, you can use positional selectors such as (//table)[2] to target a specific table by its position (XPath positions start at 1, so this is the second table). Attribute-based selectors like //table[@id='data-table'] or //table[@class='products'] are also useful when tables carry meaningful identifiers.
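To make the difference between these selectors concrete, here is a short sketch run against a two-table snippet. The markup, ids, and class names are assumptions for illustration. It also shows one caveat: @class='products' compares the whole attribute value, so a table with several class names needs a token-style match instead.

```csharp
using System;
using HtmlAgilityPack;

// Two small tables; the id and class values are made up for this example.
var page = new HtmlDocument();
page.LoadHtml(@"
<table id='summary'><tr><td>Summary</td></tr></table>
<table class='products striped'><tr><td>Products</td></tr></table>");

// Positional selector: the second table in document order.
var second = page.DocumentNode.SelectSingleNode("(//table)[2]");

// Attribute selector on id.
var byId = page.DocumentNode.SelectSingleNode("//table[@id='summary']");

// Exact comparison on class: no match here, because the attribute is 'products striped'.
var exactClass = page.DocumentNode.SelectSingleNode("//table[@class='products']");

// Token-style match that tolerates additional class names.
var classToken = page.DocumentNode.SelectSingleNode(
    "//table[contains(concat(' ', normalize-space(@class), ' '), ' products ')]");

Console.WriteLine(second.InnerText.Trim());
Console.WriteLine(byId.InnerText.Trim());
Console.WriteLine(exactClass == null ? "no exact class match" : "exact class match");
Console.WriteLine(classToken.InnerText.Trim());
```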

When the HTML comes from a live URL, you can load the document directly using the HtmlWeb class:

var web = new HtmlWeb();
var remoteDoc = web.Load("https://example.com/data-page");
var remoteTable = remoteDoc.DocumentNode.SelectSingleNode("//table[@class='data-table']");

This lets you pull tables directly from public web pages without manually saving the HTML first.

How Do You Export Parsed Data to Excel Using IronXL?

With IronXL, you can convert the parsed HTML table data into a professional Excel spreadsheet with proper formatting. The following code demonstrates how to export the data with custom styling:

// Create a new Excel workbook
WorkBook workBook = WorkBook.Create(ExcelFileFormat.XLSX);
WorkSheet workSheet = workBook.CreateWorkSheet("Exported Data");

// Extract and write headers
var headers = table.SelectNodes(".//thead/tr/th");
if (headers != null)
{
    for (int col = 0; col < headers.Count; col++)
    {
        workSheet.SetCellValue(0, col, headers[col].InnerText.Trim());

        // Apply header formatting
        var headerCell = workSheet.GetCellAt(0, col);
        headerCell.Style.Font.Bold = true;
        headerCell.Style.BackgroundColor = "#4CAF50";
    }
}

// Extract and write data rows
var rows = table.SelectNodes(".//tbody/tr");
if (rows != null)
{
    for (int row = 0; row < rows.Count; row++)
    {
        var cells = rows[row].SelectNodes("td");
        if (cells != null)
        {
            for (int col = 0; col < cells.Count; col++)
            {
                string cellValue = cells[col].InnerText.Trim();
                workSheet.SetCellValue(row + 1, col, cellValue);
            }
        }
    }
}

// Auto-fit columns for better readability
for (int col = 0; col < headers?.Count; col++)
{
    workSheet.AutoSizeColumn(col);
}

// Save the Excel file
workBook.SaveAs("ExportedTable.xlsx");

This code demonstrates IronXL's intuitive API for C# Excel manipulation. It creates a new WorkBook and WorkSheet, then iterates through the parsed HTML table headers, placing them in the first row while applying bold formatting and a green background color. The data rows from the HTML table are processed row by row, with each cell's text content extracted and placed in the corresponding Excel cell. The AutoSizeColumn function ensures all content is visible, and the workbook is saved as an XLSX file.

C# Export HTML Table to Excel File with IronXL: Image 1 - IronXL parsed table data output

Here you can see the original HTML table compared to the output from the code above:

C# Export HTML Table to Excel File with IronXL: Image 2 - Parsed Excel data vs. the original HTML table

How Do You Apply Cell Formatting to the Exported Data?

Beyond the basic bold and background color shown above, IronXL gives you fine-grained control over cell styling. You can set font size, font family, text alignment, borders, and number formats for any cell or range:

// Apply number formatting to a price column (column index 1)
var priceRange = workSheet[$"B2:B{rows.Count + 1}"];
priceRange.FormatString = "$#,##0.00";

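// Set font size on all header cells
// Note: this character arithmetic only yields a valid address for up to 26 columns (A-Z)
var headerRange = workSheet[$"A1:{(char)('A' + headers.Count - 1)}1"];
headerRange.Style.Font.Height = 13; // font size; IronXL's examples treat Height as points, so confirm the unit for your version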

For column widths, AutoSizeColumn handles most cases, but you can also set explicit widths using the SetColumnWidth method when you need a precise layout. These styling controls are part of the same IronXL API that handles cell font styles throughout the rest of your workbook.
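The header range in the snippet above builds its end column with character arithmetic, which only covers columns A through Z. For wider tables, a small helper can convert a zero-based column index into an Excel column name. The sketch below uses only the .NET base library; ColumnLetter is a hypothetical name, not an IronXL method.

```csharp
using System;

// Hypothetical helper: converts a zero-based column index to an Excel column name
// (0 -> A, 25 -> Z, 26 -> AA, and so on).
static string ColumnLetter(int columnIndex)
{
    if (columnIndex < 0)
        throw new ArgumentOutOfRangeException(nameof(columnIndex));

    string name = "";
    int n = columnIndex + 1; // work with a 1-based column number

    while (n > 0)
    {
        int remainder = (n - 1) % 26;
        name = (char)('A' + remainder) + name; // prepend the next letter
        n = (n - 1) / 26;
    }

    return name;
}

// Example: build the header range address for a table with 28 columns.
int headerCount = 28;
string headerAddress = $"A1:{ColumnLetter(headerCount - 1)}1";
Console.WriteLine(headerAddress);
```

The resulting address string can be passed to the worksheet indexer in place of the character arithmetic, for example workSheet[headerAddress].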

How Do You Handle Multiple Tables and Error Scenarios?

When working with multiple tables on a single page, use SelectNodes("//table") to retrieve all tables and iterate through them, creating a separate worksheet for each:

var tables = doc.DocumentNode.SelectNodes("//table");
if (tables != null)
{
    for (int t = 0; t < tables.Count; t++)
    {
        WorkSheet ws = workBook.CreateWorkSheet($"Table_{t + 1}");
        var tblHeaders = tables[t].SelectNodes(".//thead/tr/th");
        var tblRows = tables[t].SelectNodes(".//tbody/tr");

        if (tblHeaders != null)
        {
            for (int col = 0; col < tblHeaders.Count; col++)
            {
                ws.SetCellValue(0, col, tblHeaders[col].InnerText.Trim());
                ws.GetCellAt(0, col).Style.Font.Bold = true;
            }
        }

        if (tblRows != null)
        {
            for (int row = 0; row < tblRows.Count; row++)
            {
                var cells = tblRows[row].SelectNodes("td");
                if (cells != null)
                {
                    for (int col = 0; col < cells.Count; col++)
                    {
                        ws.SetCellValue(row + 1, col, cells[col].InnerText.Trim());
                    }
                }
            }
        }
    }
}

workBook.SaveAs("MultiTableExport.xlsx");

What Should You Do When the HTML Is Malformed?

Real-world HTML is not always valid. Pages scraped from external sources may have missing closing tags, inconsistent thead/tbody structures, or mixed th and td elements in the header row. HTML Agility Pack is lenient by design and will parse most malformed HTML without throwing exceptions, but your XPath selectors may not match what you expect.

A safe pattern is to wrap the parsing logic in a try-catch block and add a fallback that looks for tr elements directly under the table when no thead is found:

try
{
    var headerNodes = table.SelectNodes(".//thead/tr/th")
                     ?? table.SelectNodes(".//tr[1]/th")
                     ?? table.SelectNodes(".//tr[1]/td");

    var dataRows = table.SelectNodes(".//tbody/tr")
                  ?? table.SelectNodes(".//tr[position()>1]");

    // ... process as normal
}
catch (Exception ex)
{
    Console.WriteLine($"Table parsing failed: {ex.Message}");
}

IronXL automatically handles data type detection, converting numeric strings to numbers when appropriate. For more complex scenarios involving JavaScript-rendered content, you can combine this approach with tools like Selenium WebDriver or Playwright to first render the page and then pass the resulting HTML to HTML Agility Pack for parsing.
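If you would rather control the conversion yourself than rely on automatic detection, you can parse each cell's text before writing it. The following is a minimal sketch using only the .NET base library. ToTypedValue is a hypothetical helper, and the en-US culture is an assumption chosen to match the dollar-formatted sample data.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: returns an int or decimal when the text parses cleanly,
// otherwise returns the trimmed text unchanged.
static object ToTypedValue(string text)
{
    string trimmed = text.Trim();
    var culture = CultureInfo.GetCultureInfo("en-US"); // assumption: US-formatted values

    // Plain whole numbers such as "15".
    if (int.TryParse(trimmed, NumberStyles.Integer, culture, out int whole))
        return whole;

    // Currency-style values such as "$999" or "$1,299.50".
    if (decimal.TryParse(trimmed, NumberStyles.Currency, culture, out decimal amount))
        return amount;

    // Anything else stays as text.
    return trimmed;
}

foreach (var raw in new[] { "15", "$999", "$1,299.50", "Laptop" })
{
    object value = ToTypedValue(raw);
    Console.WriteLine($"{raw} -> {value} ({value.GetType().Name})");
}
```

In the export loop, the result could be written with workSheet.SetCellValue(row + 1, col, ToTypedValue(cellValue)), assuming the SetCellValue overload in your IronXL version accepts non-string values. Storing prices as numbers also lets a number format such as "$#,##0.00" apply to them.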

How Do You Save and Export the Excel File?

IronXL supports multiple output formats beyond XLSX. You can save as XLS, CSV, TSV, or JSON depending on the downstream requirements. You can also stream the output directly to an HTTP response in ASP.NET Core, which avoids writing a file to disk:

// Save to disk as XLSX
workBook.SaveAs("ExportedTable.xlsx");

// Save as CSV
workBook.SaveAsCsv("ExportedTable.csv");

// Stream to HTTP response (ASP.NET Core)
// Response.Headers["Content-Disposition"] = "attachment; filename=ExportedTable.xlsx";
// workBook.SaveAs(Response.BodyWriter.AsStream());

When streaming to an HTTP response, make sure the Content-Disposition header is set to attachment so the browser treats the response as a file download. This pattern works well in both MVC controllers and Razor Pages.

For scenarios where you need to export data to an existing Excel template, IronXL can load an existing workbook and populate named ranges or specific cell addresses with the parsed HTML data, preserving all the formatting in the template.

What Are the Best Practices for Production Use?

Recommended practices when exporting HTML tables to Excel in production

| Concern | Recommendation | Notes |
| --- | --- | --- |
| Large datasets | Process rows in batches | IronXL handles thousands of rows, but streaming output avoids memory pressure |
| Malformed HTML | Use fallback XPath selectors | HTML Agility Pack is lenient; add explicit null checks on all SelectNodes calls |
| Dynamic content | Pre-render with Selenium or Playwright | JavaScript-heavy pages require a headless browser before HTML parsing |
| File format | Prefer XLSX over XLS | XLSX supports more rows, larger cell values, and modern styling features |
| Column widths | Call AutoSizeColumn after writing all data | Calling it before data is written will undersize the columns |
| Licensing | Set the license key at startup | Call IronXL.License.LicenseKey = "..."; before any IronXL call |

When processing content from a URL or database query for further analysis, handle additional details such as hidden field values or special formatting requirements. The default behavior works well for standard tables, but you can customize font size, font family, and other styling properties for each column or any specific table row.

For reading data back from Excel files after export, IronXL provides the WorkBook.Load method, making round-trip workflows straightforward. You can also convert the resulting Excel file to other formats such as JSON or XML for further processing.

How Do You Manage Licensing and Deployment?

IronXL requires a license key for production use. The free trial includes all features and is ideal for evaluating the library before purchasing. Explore the pricing and licensing options to find the right tier for your team.

For deployment in Docker, Azure Functions, or Linux environments, IronXL does not depend on Microsoft Office or COM Interop, making it a straightforward dependency to manage. It supports .NET 10, .NET 9, .NET 8, and .NET Framework 4.6.2+, and it is compatible with current ASP.NET Core export patterns.

The combination of IronXL and HTML Agility Pack provides a flexible solution for exporting HTML tables to Excel in C#. The approach demonstrated here offers more control than rigid built-in methods, letting you handle complex HTML structures while taking advantage of IronXL's full Excel feature set.

Whether you are building web scrapers, migrating legacy data from a database, automating report generation, or performing data analysis on large datasets, this solution scales to meet enterprise needs. The code examples show how to handle various input sources, from static HTML strings to dynamic content retrieved via URL. Results can be exported for download or further processing in your .NET application.

Ready to transform your HTML data into professional Excel files? Start your free trial of IronXL today and experience the flexibility of programmatic Excel manipulation without Office dependencies.

Frequently Asked Questions

What is the main purpose of converting HTML tables to Excel in business applications?

The main purpose is to facilitate data migration, report generation, or further analysis of web page data by transforming HTML table data into a format that is easily manageable and analyzable in Excel.

Which library does the guide suggest for converting HTML tables to Excel in C#?

The guide suggests using IronXL for converting HTML tables to Excel in C#, as it provides a flexible approach without the need for Microsoft Office.

Why might some libraries not be suitable for converting HTML tables to Excel?

Some libraries may not be suitable because they have limitations in file formats or lack supporting features, which can restrict their effectiveness in handling diverse data conversion needs.

Is Microsoft Office required to use IronXL for exporting HTML tables to Excel?

No, Microsoft Office is not required to use IronXL. It works cross-platform and provides a flexible solution for exporting HTML tables to Excel.

Can IronXL handle cross-platform conversions of HTML tables to Excel?

Yes, IronXL can handle cross-platform conversions of HTML tables to Excel, making it a versatile tool for developers working in different environments.

What are some common use cases for converting HTML tables to Excel?

Common use cases include data migration, generating reports, and analyzing web page data in a more structured and accessible format.

Jordi Bardia
Software Engineer
Jordi is most proficient in Python, C#, and C++. When he isn't leveraging his skills at Iron Software, he's game programming. Sharing responsibilities for product testing, product development and research, Jordi adds immense value to continual product improvement. The varied experience keeps him challenged and engaged, and he ...