Scrape a Shopping Website in C

Learn how to scrape product categories and items from shopping websites using C# with the WebScraper framework, extracting structured data from HTML elements into custom models. This comprehensive guide walks you through building a robust e-commerce scraper using the IronWebScraper library.

Quickstart: Scrape Shopping Website in C#

Nuget IconGet started making PDFs with NuGet now:

  1. Install IronWebScraper with NuGet Package Manager

    PM > Install-Package IronWebScraper

  2. Copy and run this code snippet.

    using IronWebScraper;
    
    public class QuickShoppingScraper : WebScraper
    {
        public override void Init()
        {
            // Apply your license key
            License.LicenseKey = "YOUR-LICENSE-KEY";
    
            // Set the starting URL
            this.Request("https://shopping-site.com", Parse);
        }
    
        public override void Parse(Response response)
        {
            // Extract product data
            foreach (var product in response.Css(".product-item"))
            {
                var item = new
                {
                    Name = product.Css(".product-name").First().InnerText,
                    Price = product.Css(".price").First().InnerText,
                    Image = product.Css("img").First().Attributes["src"]
                };
    
                Scrape(item, "products.jsonl");
            }
        }
    }
    
    // Run the scraper
    var scraper = new QuickShoppingScraper();
    scraper.Start();
  3. Deploy to test on your live environment

    Start using IronWebScraper in your project today with a free trial
    arrow pointer
  1. Create a new Console App project named "ShoppingSiteSample"
  2. Add a class named "ShoppingScraper" that inherits from WebScraper
  3. Create models for Category and Product data
  4. Override Init() to set start URL and Parse() method for scraping
  5. Run the scraper to extract categories and products to JSONL files

How Do I Analyze the Shopping Site's HTML Structure?

Select a shopping site to analyze its content structure. Understanding the HTML structure is crucial for successful web scraping. Before writing any code, spend time analyzing the target website's structure using browser developer tools.

Jumia e-commerce homepage with Ramadan promotional banner and navigation menu

As shown in the image, the left sidebar contains links for the site's product categories. The first step is investigating the site's HTML and planning the scraping approach. This analysis phase is essential for building an effective scraping strategy.

E-commerce website navigation menu showing product categories, subcategories, and brand sections

Why Does Understanding the HTML Structure Matter?

The fashion site categories have subcategories (Men, Women, Kids). Understanding this hierarchical structure helps design appropriate data models and scraping logic. When working with advanced web scraping features, proper HTML analysis becomes even more critical.

<li class="menu-item" data-id="">
    <a href="https://domain.com/fashion-by-/" class="main-category">
        <i class="cat-icon osh-font-fashion"></i>
        <span class="nav-subTxt">FASHION </span>
        <i class="osh-font-light-arrow-left"></i><i class="osh-font-light-arrow-right"></i>
    </a>
    <div class="navLayerWrapper" style="width: 633px; display: none;">
        <div class="submenu">
            <div class="column">
                <div class="categories">
                    <a class="category" href="https://domain.com/fashion-by-/?sort=newest&amp;dir=desc&amp;viewType=gridView3">New Arrivals !</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/men-fashion/">Men</a>
                    <a class="subcategory" href="https://domain.com/mens-shoes/">Shoes</a>
                    <a class="subcategory" href="https://domain.com/mens-clothing/">Clothing</a>
                    <a class="subcategory" href="https://domain.com/mens-accessories/">Accessories</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/women-fashion/">Women</a>
                    <a class="subcategory" href="https://domain.com/womens-shoes/">Shoes</a>
                    <a class="subcategory" href="https://domain.com/womens-clothing/">Clothing</a>
                    <a class="subcategory" href="https://domain.com/womens-accessories/">Accessories</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/girls-boys-fashion/">Kids</a>
                    <a class="subcategory" href="https://domain.com/boys-fashion/">Boys</a>
                    <a class="subcategory" href="https://domain.com/girls/">Girls</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/maternity-clothes/">Maternity Clothes</a>
                </div>
            </div>
            <div class="column">
                <div class="categories">
                    <span class="category defaultCursor">Men Best Sellers</span>
                    <a class="subcategory" href="https://domain.com/mens-casual-shoes/">Casual Shoes</a>
                    <a class="subcategory" href="https://domain.com/mens-sneakers/">Sneakers</a>
                    <a class="subcategory" href="https://domain.com/mens-t-shirts/">T-shirts</a>
                    <a class="subcategory" href="https://domain.com/mens-polos/">Polos</a>
                </div>
                <div class="categories">
                    <span class="category defaultCursor">Women Best Sellers</span>
                    <a class="subcategory" href="https://domain.com/womens-sandals/">Sandals</a>
                    <a class="subcategory" href="https://domain.com/womens-sneakers/">Sneakers</a>
                    <a class="subcategory" href="https://domain.com/women-dresses/">Dresses</a>
                    <a class="subcategory" href="https://domain.com/women-tops/">Tops</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/womens-curvy-clothing/">Women's Curvy Clothing</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/fashion-bundles/v/">Fashion Bundles</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/hijab-fashion/">Hijab Fashion</a>
                </div>
            </div>
            <div class="column">
                <div class="categories">
                    <a class="category" href="https://domain.com/brands/fashion-by-/">SEE ALL BRANDS</a>
                    <a class="subcategory" href="https://domain.com/adidas/">Adidas</a>
                    <a class="subcategory" href="https://domain.com/converse/">Converse</a>
                    <a class="subcategory" href="https://domain.com/ravin/">Ravin</a>
                    <a class="subcategory" href="https://domain.com/dejavu/">Dejavu</a>
                    <a class="subcategory" href="https://domain.com/agu/">Agu</a>
                    <a class="subcategory" href="https://domain.com/activ/">Activ</a>
                    <a class="subcategory" href="https://domain.com/oxford--bellini--tie-house--milano/">Tie House</a>
                    <a class="subcategory" href="https://domain.com/shoe-room/">Shoe Room</a>
                    <a class="subcategory" href="https://domain.com/town-team/">Town Team</a>
                </div>
            </div>
        </div>
    </div>
</li>
<li class="menu-item" data-id="">
    <a href="https://domain.com/fashion-by-/" class="main-category">
        <i class="cat-icon osh-font-fashion"></i>
        <span class="nav-subTxt">FASHION </span>
        <i class="osh-font-light-arrow-left"></i><i class="osh-font-light-arrow-right"></i>
    </a>
    <div class="navLayerWrapper" style="width: 633px; display: none;">
        <div class="submenu">
            <div class="column">
                <div class="categories">
                    <a class="category" href="https://domain.com/fashion-by-/?sort=newest&amp;dir=desc&amp;viewType=gridView3">New Arrivals !</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/men-fashion/">Men</a>
                    <a class="subcategory" href="https://domain.com/mens-shoes/">Shoes</a>
                    <a class="subcategory" href="https://domain.com/mens-clothing/">Clothing</a>
                    <a class="subcategory" href="https://domain.com/mens-accessories/">Accessories</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/women-fashion/">Women</a>
                    <a class="subcategory" href="https://domain.com/womens-shoes/">Shoes</a>
                    <a class="subcategory" href="https://domain.com/womens-clothing/">Clothing</a>
                    <a class="subcategory" href="https://domain.com/womens-accessories/">Accessories</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/girls-boys-fashion/">Kids</a>
                    <a class="subcategory" href="https://domain.com/boys-fashion/">Boys</a>
                    <a class="subcategory" href="https://domain.com/girls/">Girls</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/maternity-clothes/">Maternity Clothes</a>
                </div>
            </div>
            <div class="column">
                <div class="categories">
                    <span class="category defaultCursor">Men Best Sellers</span>
                    <a class="subcategory" href="https://domain.com/mens-casual-shoes/">Casual Shoes</a>
                    <a class="subcategory" href="https://domain.com/mens-sneakers/">Sneakers</a>
                    <a class="subcategory" href="https://domain.com/mens-t-shirts/">T-shirts</a>
                    <a class="subcategory" href="https://domain.com/mens-polos/">Polos</a>
                </div>
                <div class="categories">
                    <span class="category defaultCursor">Women Best Sellers</span>
                    <a class="subcategory" href="https://domain.com/womens-sandals/">Sandals</a>
                    <a class="subcategory" href="https://domain.com/womens-sneakers/">Sneakers</a>
                    <a class="subcategory" href="https://domain.com/women-dresses/">Dresses</a>
                    <a class="subcategory" href="https://domain.com/women-tops/">Tops</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/womens-curvy-clothing/">Women's Curvy Clothing</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/fashion-bundles/v/">Fashion Bundles</a>
                </div>
                <div class="categories">
                    <a class="category" href="https://domain.com/hijab-fashion/">Hijab Fashion</a>
                </div>
            </div>
            <div class="column">
                <div class="categories">
                    <a class="category" href="https://domain.com/brands/fashion-by-/">SEE ALL BRANDS</a>
                    <a class="subcategory" href="https://domain.com/adidas/">Adidas</a>
                    <a class="subcategory" href="https://domain.com/converse/">Converse</a>
                    <a class="subcategory" href="https://domain.com/ravin/">Ravin</a>
                    <a class="subcategory" href="https://domain.com/dejavu/">Dejavu</a>
                    <a class="subcategory" href="https://domain.com/agu/">Agu</a>
                    <a class="subcategory" href="https://domain.com/activ/">Activ</a>
                    <a class="subcategory" href="https://domain.com/oxford--bellini--tie-house--milano/">Tie House</a>
                    <a class="subcategory" href="https://domain.com/shoe-room/">Shoe Room</a>
                    <a class="subcategory" href="https://domain.com/town-team/">Town Team</a>
                </div>
            </div>
        </div>
    </div>
</li>
HTML

How Do I Set Up the Web Scraping Project?

Set up a project following best practices for C# web scraping.

  1. Create a new Console App or add a new folder for the sample named "ShoppingSiteSample"
  2. Add a new class named "ShoppingScraper"
  3. Start by scraping site categories and their subcategories
  4. Install IronWebScraper via NuGet Package Manager or Package Manager Console:
Install-Package IronWebScraper
Install-Package IronWebScraper
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

What Data Model Should I Use for Categories?

Create a Categories Model that properly represents the hierarchical structure discovered:

public class Category
{
    /// <summary>
    /// Gets or sets the name.
    /// </summary>
    /// <value>
    /// The name.
    /// </value>
    public string Name { get; set; }

    /// <summary>
    /// Gets or sets the URL.
    /// </summary>
    /// <value>
    /// The URL.
    /// </value>
    public string URL { get; set; }

    /// <summary>
    /// Gets or sets the subcategories.
    /// </summary>
    /// <value>
    /// The subcategories.
    /// </value>
    public List<Category> SubCategories { get; set; }

    // Additional properties for enhanced data collection
    public int ProductCount { get; set; }
    public DateTime LastScraped { get; set; }
    public string CategoryType { get; set; }
}
public class Category
{
    /// <summary>
    /// Gets or sets the name.
    /// </summary>
    /// <value>
    /// The name.
    /// </value>
    public string Name { get; set; }

    /// <summary>
    /// Gets or sets the URL.
    /// </summary>
    /// <value>
    /// The URL.
    /// </value>
    public string URL { get; set; }

    /// <summary>
    /// Gets or sets the subcategories.
    /// </summary>
    /// <value>
    /// The subcategories.
    /// </value>
    public List<Category> SubCategories { get; set; }

    // Additional properties for enhanced data collection
    public int ProductCount { get; set; }
    public DateTime LastScraped { get; set; }
    public string CategoryType { get; set; }
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

How Do I Build the Basic Scraper Logic?

Build the scraper logic, remembering to apply your license key before running the scraper:

public class ShoppingScraper : WebScraper
{
    /// <summary>
    /// Initialize the web scraper, setting the start URLs and allowed/banned domains or URL patterns.
    /// </summary>
    public override void Init()
    {
        // Apply your license key - get one from https://ironsoftware.com/csharp/webscraper/licensing/
        License.LicenseKey = "LicenseKey";
        this.LoggingLevel = WebScraper.LogLevel.All;
        this.WorkingDirectory = AppSetting.GetAppRoot() + @"\ShoppingSiteSample\Output\";

        // Configure request settings for better performance
        this.Request("www.webSite.com", Parse);
    }

    /// <summary>
    /// Parses the HTML document of the response to scrap the necessary data.
    /// </summary>
    /// <param name="response">The HTTP Response object to parse.</param>
    public override void Parse(Response response)
    {
        var categoryList = new List<Category>();

        // Iterate through each link in the menu and extract the category data.
        foreach (var Links in response.Css("#menuFixed > ul > li > a"))
        {
            var cat = new Category
            {
                URL = Links.Attributes["href"],
                Name = Links.InnerText,
                LastScraped = DateTime.Now
            };
            categoryList.Add(cat);
        }

        // Save the scraped data into a JSONL file.
        Scrape(categoryList, "Shopping.jsonl");
    }
}
public class ShoppingScraper : WebScraper
{
    /// <summary>
    /// Initialize the web scraper, setting the start URLs and allowed/banned domains or URL patterns.
    /// </summary>
    public override void Init()
    {
        // Apply your license key - get one from https://ironsoftware.com/csharp/webscraper/licensing/
        License.LicenseKey = "LicenseKey";
        this.LoggingLevel = WebScraper.LogLevel.All;
        this.WorkingDirectory = AppSetting.GetAppRoot() + @"\ShoppingSiteSample\Output\";

        // Configure request settings for better performance
        this.Request("www.webSite.com", Parse);
    }

    /// <summary>
    /// Parses the HTML document of the response to scrap the necessary data.
    /// </summary>
    /// <param name="response">The HTTP Response object to parse.</param>
    public override void Parse(Response response)
    {
        var categoryList = new List<Category>();

        // Iterate through each link in the menu and extract the category data.
        foreach (var Links in response.Css("#menuFixed > ul > li > a"))
        {
            var cat = new Category
            {
                URL = Links.Attributes["href"],
                Name = Links.InnerText,
                LastScraped = DateTime.Now
            };
            categoryList.Add(cat);
        }

        // Save the scraped data into a JSONL file.
        Scrape(categoryList, "Shopping.jsonl");
    }
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

What Elements Am I Targeting in the Menu?

Scraping links from the menu requires precise CSS selectors. The API reference provides detailed information about available selector methods:

JSON file in Notepad showing e-commerce category structure with nested subcategories and URLs

How Do I Scrape Both Main Categories and Subcategories?

Update the code to scrape main categories and all sub-links. This approach ensures complete navigation structure capture:

public override void Parse(Response response)
{
    // List of Category Links (Root)
    var categoryList = new List<Category>();

    // Traverse each 'li' under the fixed menu
    foreach (var li in response.Css("#menuFixed > ul > li"))
    {
        // List of Main Links
        foreach (var Links in li.Css("a"))
        {
            var cat = new Category
            {
                URL = Links.Attributes["href"],
                Name = Links.InnerText,
                SubCategories = new List<Category>(),
                LastScraped = DateTime.Now
            };

            // List of Subcategories Links
            foreach (var subCategory in li.Css("a[class=subcategory]"))
            {
                var subcat = new Category
                {
                    URL = subCategory.Attributes["href"],
                    Name = subCategory.InnerText,
                    CategoryType = "Subcategory"
                };

                // Check if subcategory link already exists
                if (cat.SubCategories.Find(c => c.Name == subcat.Name && c.URL == subcat.URL) == null)
                {
                    // Add sublinks
                    cat.SubCategories.Add(subcat);
                }
            }

            // Update product count based on subcategories
            cat.ProductCount = cat.SubCategories.Count;

            // Add Main Category to the list
            categoryList.Add(cat);
        }
    }

    // Save the scraped data into a JSONL file.
    Scrape(categoryList, "Shopping.jsonl");
}
public override void Parse(Response response)
{
    // List of Category Links (Root)
    var categoryList = new List<Category>();

    // Traverse each 'li' under the fixed menu
    foreach (var li in response.Css("#menuFixed > ul > li"))
    {
        // List of Main Links
        foreach (var Links in li.Css("a"))
        {
            var cat = new Category
            {
                URL = Links.Attributes["href"],
                Name = Links.InnerText,
                SubCategories = new List<Category>(),
                LastScraped = DateTime.Now
            };

            // List of Subcategories Links
            foreach (var subCategory in li.Css("a[class=subcategory]"))
            {
                var subcat = new Category
                {
                    URL = subCategory.Attributes["href"],
                    Name = subCategory.InnerText,
                    CategoryType = "Subcategory"
                };

                // Check if subcategory link already exists
                if (cat.SubCategories.Find(c => c.Name == subcat.Name && c.URL == subcat.URL) == null)
                {
                    // Add sublinks
                    cat.SubCategories.Add(subcat);
                }
            }

            // Update product count based on subcategories
            cat.ProductCount = cat.SubCategories.Count;

            // Add Main Category to the list
            categoryList.Add(cat);
        }
    }

    // Save the scraped data into a JSONL file.
    Scrape(categoryList, "Shopping.jsonl");
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

How Do I Extract Product Information from Category Pages?

With links to all site categories available, start scraping products within each category. When dealing with product pages, thread safety becomes important for optimal performance. Navigate to any category and examine the content:

E-commerce product listing page showing shoes and accessories with prices, ratings, and filtering controls

What Does the Product HTML Structure Look Like?

Examine the HTML structure to understand product organization:

<section class="products">
    <div class="sku -gallery -validate-size " data-sku="AG249FA0T2PSGNAFAMZ" ft-product-sizes="41,42,43,44,45" ft-product-color="Multicolour">
        <a class="link" href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html">
            <div class="image-wrapper default-state">
                <img class="lazy image -loaded" alt="Bundle Of 2 Sneakers - Black & Navy Blue" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-sku="AG249FA0T2PSGNAFAMZ" data-src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg">
                <noscript><img src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
            </div>
            <h2 class="title">
                <span class="brand ">Agu&nbsp;</span>
                <span class="name" dir="ltr">Bundle Of 2 Sneakers - Black & Navy Blue</span>
            </h2>
            <div class="price-container clearfix">
                <span class="price-box">
                    <span class="price">
                        <span data-currency-iso="EGP">EGP</span>
                        <span dir="ltr" data-price="299">299</span>
                    </span>
                    <span class="price -old  -no-special"></span>
                </span>
            </div>
            <div class="rating-stars">
                <div class="stars-container">
                    <div class="stars" style="width: 62%"></div>
                </div>
                <div class="total-ratings">(30)</div>
            </div>
            <span class="shop-first-logo-container">
                <img src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" data-src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" class="lazy shop-first-logo-img -mbxs -loaded">
            </span>
            <span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
            <div class="list -sizes" data-selected-sku="">
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=41">41</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=42">42</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=43">43</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=44">44</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=45">45</span>
            </div>
        </a>
    </div>
    <div class="sku -gallery -validate-size " data-sku="LE047FA01SRK4NAFAMZ" ft-product-sizes="110,115,120,125,130,135" ft-product-color="Black">
        <a class="link" href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html">
            <div class="image-wrapper default-state">
                <img class="lazy image -loaded" alt="Genuine Leather Belt - Black" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-sku="LE047FA01SRK4NAFAMZ" data-src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg">
                <noscript><img src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
            </div>
            <h2 class="title"><span class="brand ">Leather Shop&nbsp;</span> <span class="name" dir="ltr">Genuine Leather Belt - Black</span></h2>
            <div class="price-container clearfix">
                <span class="sale-flag-percent">-29%</span>
                <span class="price-box">
                    <span class="price"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="96">96</span> </span>
                    <span class="price -old"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="135">135</span> </span>
                </span>
            </div>
            <div class="rating-stars">
                <div class="stars-container">
                    <div class="stars" style="width: 100%"></div>
                </div>
                <div class="total-ratings">(1)</div>
            </div>
            <span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
            <div class="list -sizes" data-selected-sku="">
                <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=110">110</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=115">115</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=120">120</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=125">125</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=130">130</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=135">135</span>
            </div>
        </a>
    </div>
</section>
<section class="products">
    <div class="sku -gallery -validate-size " data-sku="AG249FA0T2PSGNAFAMZ" ft-product-sizes="41,42,43,44,45" ft-product-color="Multicolour">
        <a class="link" href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html">
            <div class="image-wrapper default-state">
                <img class="lazy image -loaded" alt="Bundle Of 2 Sneakers - Black & Navy Blue" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-sku="AG249FA0T2PSGNAFAMZ" data-src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg">
                <noscript><img src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
            </div>
            <h2 class="title">
                <span class="brand ">Agu&nbsp;</span>
                <span class="name" dir="ltr">Bundle Of 2 Sneakers - Black & Navy Blue</span>
            </h2>
            <div class="price-container clearfix">
                <span class="price-box">
                    <span class="price">
                        <span data-currency-iso="EGP">EGP</span>
                        <span dir="ltr" data-price="299">299</span>
                    </span>
                    <span class="price -old  -no-special"></span>
                </span>
            </div>
            <div class="rating-stars">
                <div class="stars-container">
                    <div class="stars" style="width: 62%"></div>
                </div>
                <div class="total-ratings">(30)</div>
            </div>
            <span class="shop-first-logo-container">
                <img src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" data-src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" class="lazy shop-first-logo-img -mbxs -loaded">
            </span>
            <span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
            <div class="list -sizes" data-selected-sku="">
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=41">41</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=42">42</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=43">43</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=44">44</span>
                <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=45">45</span>
            </div>
        </a>
    </div>
    <div class="sku -gallery -validate-size " data-sku="LE047FA01SRK4NAFAMZ" ft-product-sizes="110,115,120,125,130,135" ft-product-color="Black">
        <a class="link" href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html">
            <div class="image-wrapper default-state">
                <img class="lazy image -loaded" alt="Genuine Leather Belt - Black" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-sku="LE047FA01SRK4NAFAMZ" data-src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg">
                <noscript><img src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
            </div>
            <h2 class="title"><span class="brand ">Leather Shop&nbsp;</span> <span class="name" dir="ltr">Genuine Leather Belt - Black</span></h2>
            <div class="price-container clearfix">
                <span class="sale-flag-percent">-29%</span>
                <span class="price-box">
                    <span class="price"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="96">96</span> </span>
                    <span class="price -old"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="135">135</span> </span>
                </span>
            </div>
            <div class="rating-stars">
                <div class="stars-container">
                    <div class="stars" style="width: 100%"></div>
                </div>
                <div class="total-ratings">(1)</div>
            </div>
            <span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
            <div class="list -sizes" data-selected-sku="">
                <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=110">110</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=115">115</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=120">120</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=125">125</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=130">130</span>
                <span class="js-link sku-size"  data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=135">135</span>
            </div>
        </a>
    </div>
</section>
HTML

Which Product Model Should I Create?

Build a product model for this content. When working with shopping website scraping, capture all relevant product details:

public class Product
{
    /// <summary>
    /// Gets or sets the name.
    /// </summary>
    /// <value>
    /// The name.
    /// </value>
    public string Name { get; set; }

    /// <summary>
    /// Gets or sets the price.
    /// </summary>
    /// <value>
    /// The price.
    /// </value>
    public string Price { get; set; }

    /// <summary>
    /// Gets or sets the image.
    /// </summary>
    /// <value>
    /// The image.
    /// </value>
    public string Image { get; set; }

    // Additional properties for comprehensive data collection
    public string Brand { get; set; }
    public string OldPrice { get; set; }
    public string Discount { get; set; }
    public float Rating { get; set; }
    public int ReviewCount { get; set; }
    public List<string> AvailableSizes { get; set; }
    public string ProductUrl { get; set; }
    public string SKU { get; set; }
    public DateTime ScrapedDate { get; set; }
}
public class Product
{
    /// <summary>
    /// Gets or sets the name.
    /// </summary>
    /// <value>
    /// The name.
    /// </value>
    public string Name { get; set; }

    /// <summary>
    /// Gets or sets the price.
    /// </summary>
    /// <value>
    /// The price.
    /// </value>
    public string Price { get; set; }

    /// <summary>
    /// Gets or sets the image.
    /// </summary>
    /// <value>
    /// The image.
    /// </value>
    public string Image { get; set; }

    // Additional properties for comprehensive data collection
    public string Brand { get; set; }
    public string OldPrice { get; set; }
    public string Discount { get; set; }
    public float Rating { get; set; }
    public int ReviewCount { get; set; }
    public List<string> AvailableSizes { get; set; }
    public string ProductUrl { get; set; }
    public string SKU { get; set; }
    public DateTime ScrapedDate { get; set; }
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

How Do I Add Product Scraping Functionality?

To scrape category pages, add a new scrape method with error handling and data validation:

public void ParseCategory(Response response)
{
    // List of Products
    var productList = new List<Product>();

    // Iterate through product links in the product section
    foreach (var Links in response.Css("section.products > div > a"))
    {
        try
        {
            var product = new Product
            {
                Name = Links.Css("h2.title > span.name").First().InnerText,
                Brand = Links.Css("h2.title > span.brand").FirstOrDefault()?.InnerText ?? "Unknown",
                Price = Links.Css("div.price-container > span.price-box > span.price > span[data-price]").First().InnerText,
                Image = Links.Css("div.image-wrapper.default-state > img").First().Attributes["src"],
                ProductUrl = Links.Attributes["href"],
                SKU = Links.ParentNode.Attributes["data-sku"],
                ScrapedDate = DateTime.Now
            };

            // Extract old price if available
            var oldPriceElement = Links.Css("span.price.-old > span[data-price]").FirstOrDefault();
            if (oldPriceElement != null)
            {
                product.OldPrice = oldPriceElement.InnerText;
            }

            // Extract discount percentage
            var discountElement = Links.Css("span.sale-flag-percent").FirstOrDefault();
            if (discountElement != null)
            {
                product.Discount = discountElement.InnerText;
            }

            // Extract rating information
            var ratingWidth = Links.Css("div.stars").FirstOrDefault()?.Attributes["style"];
            if (!string.IsNullOrEmpty(ratingWidth))
            {
                var width = System.Text.RegularExpressions.Regex.Match(ratingWidth, @"(\d+)%").Groups[1].Value;
                if (int.TryParse(width, out int ratingPercent))
                {
                    product.Rating = ratingPercent / 20.0f; // Convert percentage to 5-star scale
                }
            }

            // Extract review count
            var reviewText = Links.Css("div.total-ratings").FirstOrDefault()?.InnerText;
            if (!string.IsNullOrEmpty(reviewText))
            {
                var reviewCount = System.Text.RegularExpressions.Regex.Match(reviewText, @"\d+").Value;
                if (int.TryParse(reviewCount, out int count))
                {
                    product.ReviewCount = count;
                }
            }

            // Extract available sizes
            product.AvailableSizes = Links.Css("div.list.-sizes > span.sku-size")
                .Select(s => s.InnerText)
                .ToList();

            productList.Add(product);
        }
        catch (Exception ex)
        {
            // Log error and continue with next product
            Console.WriteLine($"Error parsing product: {ex.Message}");
        }
    }

    // Save the scraped product data into a JSONL file.
    Scrape(productList, "Products.jsonl");

    // Handle pagination if needed
    var nextPageLink = response.Css("a.pagination-next").FirstOrDefault();
    if (nextPageLink != null)
    {
        var nextPageUrl = nextPageLink.Attributes["href"];
        this.Request(nextPageUrl, ParseCategory);
    }
}
public void ParseCategory(Response response)
{
    // List of Products
    var productList = new List<Product>();

    // Iterate through product links in the product section
    foreach (var Links in response.Css("section.products > div > a"))
    {
        try
        {
            var product = new Product
            {
                Name = Links.Css("h2.title > span.name").First().InnerText,
                Brand = Links.Css("h2.title > span.brand").FirstOrDefault()?.InnerText ?? "Unknown",
                Price = Links.Css("div.price-container > span.price-box > span.price > span[data-price]").First().InnerText,
                Image = Links.Css("div.image-wrapper.default-state > img").First().Attributes["src"],
                ProductUrl = Links.Attributes["href"],
                SKU = Links.ParentNode.Attributes["data-sku"],
                ScrapedDate = DateTime.Now
            };

            // Extract old price if available
            var oldPriceElement = Links.Css("span.price.-old > span[data-price]").FirstOrDefault();
            if (oldPriceElement != null)
            {
                product.OldPrice = oldPriceElement.InnerText;
            }

            // Extract discount percentage
            var discountElement = Links.Css("span.sale-flag-percent").FirstOrDefault();
            if (discountElement != null)
            {
                product.Discount = discountElement.InnerText;
            }

            // Extract rating information
            var ratingWidth = Links.Css("div.stars").FirstOrDefault()?.Attributes["style"];
            if (!string.IsNullOrEmpty(ratingWidth))
            {
                var width = System.Text.RegularExpressions.Regex.Match(ratingWidth, @"(\d+)%").Groups[1].Value;
                if (int.TryParse(width, out int ratingPercent))
                {
                    product.Rating = ratingPercent / 20.0f; // Convert percentage to 5-star scale
                }
            }

            // Extract review count
            var reviewText = Links.Css("div.total-ratings").FirstOrDefault()?.InnerText;
            if (!string.IsNullOrEmpty(reviewText))
            {
                var reviewCount = System.Text.RegularExpressions.Regex.Match(reviewText, @"\d+").Value;
                if (int.TryParse(reviewCount, out int count))
                {
                    product.ReviewCount = count;
                }
            }

            // Extract available sizes
            product.AvailableSizes = Links.Css("div.list.-sizes > span.sku-size")
                .Select(s => s.InnerText)
                .ToList();

            productList.Add(product);
        }
        catch (Exception ex)
        {
            // Log error and continue with next product
            Console.WriteLine($"Error parsing product: {ex.Message}");
        }
    }

    // Save the scraped product data into a JSONL file.
    Scrape(productList, "Products.jsonl");

    // Handle pagination if needed
    var nextPageLink = response.Css("a.pagination-next").FirstOrDefault();
    if (nextPageLink != null)
    {
        var nextPageUrl = nextPageLink.Attributes["href"];
        this.Request(nextPageUrl, ParseCategory);
    }
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
$vbLabelText   $csharpLabel

This comprehensive approach to scraping shopping websites ensures capture of all relevant product information while handling errors gracefully. For more advanced scenarios, explore the advanced web scraping features available in IronWebScraper.

Frequently Asked Questions

How do I extract product data from shopping websites in C#?

IronWebScraper makes it easy to extract product data from shopping websites by using CSS selectors. You can create a WebScraper class, override the Parse method, and use response.Css() to target specific HTML elements like product names, prices, and images. The extracted data can be saved to various formats including JSON and JSONL files.

What are the basic steps to create a shopping website scraper?

To create a shopping website scraper with IronWebScraper: 1) Create a Console App project, 2) Add a class that inherits from WebScraper, 3) Create data models for categories and products, 4) Override the Init() method to set your starting URL, 5) Override the Parse() method to extract data using CSS selectors, and 6) Run the scraper to save data to your preferred format.

How can I handle hierarchical category structures when scraping e-commerce sites?

IronWebScraper allows you to handle hierarchical structures by creating appropriate data models that reflect the parent-child relationships (like Fashion > Men > Shoes). You can navigate through nested HTML elements using CSS selectors and build your category tree structure programmatically, which is especially useful when working with IronWebScraper's advanced features.

What's the best way to analyze a shopping site's HTML structure before scraping?

Before using IronWebScraper to scrape a shopping site, use browser developer tools to inspect the HTML structure. Look for consistent patterns in CSS classes and element hierarchies. This analysis helps you identify the correct CSS selectors to use in your IronWebScraper Parse() method for accurately targeting product information, categories, and other data elements.

Can I extract both product listings and category navigation from the same page?

Yes, IronWebScraper enables you to extract multiple types of data from a single page. In your Parse() method, you can use different CSS selectors to target category links (like '.category-item') and product listings (like '.product-item') simultaneously, then save them to separate output files or data structures.

How do I save scraped product data to a file?

IronWebScraper provides a built-in Scrape() method that automatically saves extracted data. Simply pass your data object and filename to Scrape(item, "products.jsonl"). The library supports various output formats including JSON, JSONL, and CSV, making it easy to export your scraped e-commerce data for further processing.

Darrius Serrant
Full Stack Software Engineer (WebOps)

Darrius Serrant holds a Bachelor’s degree in Computer Science from the University of Miami and works as a Full Stack WebOps Marketing Engineer at Iron Software. Drawn to coding from a young age, he saw computing as both mysterious and accessible, making it the perfect medium for creativity ...

Read More
Ready to Get Started?
Nuget Downloads 126,948 | Version: 2025.12 just released