Name: IronWebScraper
Brand: Iron Software
Availability: InStock
Rating: 4.72 (37 reviews)

Cross Platform Support

Designed for C#, F#, & VB.NET running on .NET 9, 8, 7, 6, Core, Standard, or Framework

Get started: C# code examples

C# Web Scraper

using IronWebScraper;

public class Program
{
    private static void Main(string[] args)
    {
        var ScrapeJob = new BlogScraper();
        ScrapeJob.Start();
    }
}

// Scrapes a blog index page: records post titles and queues a further page.
public class BlogScraper : WebScraper
{
    public override void Init()
    {
        LoggingLevel = LogLevel.All;
        // Queue the start URL; Parse is called with its response.
        Request("https://www.zyte.com/blog/", Parse);
    }

    public override void Parse(Response response)
    {
        // Record the title of every post listed on this page.
        foreach (HtmlNode title_link in response.Css(".oxy-post-title"))
        {
            string strTitle = title_link.TextContentClean;
            Scrape(new ScrapedData() { { "Title", strTitle } });
        }

        // If a pagination link exists, queue the first match with the same handler.
        if (response.CssExists("div.oxy-easy-posts-pages > a[href]"))
        {
            string next_page = response.Css("div.oxy-easy-posts-pages > a[href]")[0].Attributes["href"];
            Request(next_page, Parse);
        }
    }
}

VB.NET Web Scraper

Imports IronWebScraper

Public Class Program
	Public Shared Sub Main(ByVal args() As String)
		Dim ScrapeJob = New BlogScraper()
		ScrapeJob.Start()
	End Sub
End Class

Public Class BlogScraper
	Inherits WebScraper

	Public Overrides Sub Init()
		LoggingLevel = LogLevel.All
		Request("https://www.zyte.com/blog/", AddressOf Parse)
	End Sub

	Public Overrides Sub Parse(ByVal response As Response)
		For Each title_link As HtmlNode In response.Css(".oxy-post-title")
			Dim strTitle As String = title_link.TextContentClean
			Scrape(New ScrapedData() From {
				{ "Title", strTitle }
			})
		Next title_link

		If response.CssExists("div.oxy-easy-posts-pages > a[href]") Then
			Dim next_page As String = response.Css("div.oxy-easy-posts-pages > a[href]")(0).Attributes("href")
			Request(next_page, AddressOf Parse)
		End If
	End Sub
End Class

IronWebScraper provides a powerful framework to extract data and files from websites using C# code.

Install IronWebScraper into your project using NuGet.
Create a class that extends WebScraper.
Create an Init method that uses the Request method to queue at least one start URL.
Create a Parse method to process each response and, where needed, Request more pages. Use response.Css to select HTML elements with jQuery-style CSS selectors.
In your application, create an instance of your web scraping class and call its Start() method.
Read our C# webscraping tutorials to learn how to create advanced web crawlers using IronWebScraper.

Human Support Directly From Our Development Team

Whether your query is about the product, integration, or licensing, the Iron product development team is on hand to answer it. Get in touch and start a dialog with Iron to make the most of our library in your project.

Powerful Scraping Engine Under Your Control

Write a single C# web-scraper class to scrape thousands or even millions of web pages into C# class instances, JSON, or downloaded files. IronWebScraper lets you code concise, linear workflows that simulate human browsing behavior. It runs your code as a swarm of virtual web browsers: massively parallel, yet polite and fault-tolerant.

Simple, Flexible Logic

IronWebScraper must be programmed to know how to handle each “type” of page it encounters. This is achieved in a very concise manner using CSS Selectors or XPath expressions and can be fully customized in C#. This freedom allows you to decide which pages to scrape within a website, and what to do with the data extracted. Each method can be debugged and watched neatly in Visual Studio.

Fast and Polite Behavior

IronWebScraper handles multithreading and web requests for you, allowing hundreds of concurrent threads without the developer needing to manage them. Politeness settings throttle requests, reducing the risk of excessive load on target web servers.
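To make the idea of polite throttling concrete, here is a minimal sketch in plain C#. It is not IronWebScraper's implementation or API: the class name PoliteThrottle and its parameters are hypothetical, chosen only to illustrate capping total concurrency while spacing out requests to the same host.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of IronWebScraper): limits the number of
// requests in flight and enforces a minimum gap between requests per host.
public sealed class PoliteThrottle
{
    private readonly SemaphoreSlim _slots;
    private readonly TimeSpan _minDelayPerHost;
    private readonly Dictionary<string, DateTime> _nextAllowed = new Dictionary<string, DateTime>();
    private readonly object _gate = new object();

    public PoliteThrottle(int maxConcurrent, TimeSpan minDelayPerHost)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        _minDelayPerHost = minDelayPerHost;
    }

    public async Task<T> RunAsync<T>(Uri url, Func<Task<T>> action)
    {
        // Wait for a free concurrency slot.
        await _slots.WaitAsync();
        try
        {
            TimeSpan wait;
            lock (_gate)
            {
                // Reserve the next free time slot for this host.
                DateTime now = DateTime.UtcNow;
                DateTime slot = _nextAllowed.TryGetValue(url.Host, out var t) && t > now ? t : now;
                _nextAllowed[url.Host] = slot + _minDelayPerHost;
                wait = slot - now;
            }

            // The delay is served while holding the slot: simple, but conservative.
            if (wait > TimeSpan.Zero)
            {
                await Task.Delay(wait);
            }

            return await action();
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

A caller would wrap each download in RunAsync, passing the target URL and a delegate that performs the request. In IronWebScraper itself the equivalent behavior is configured on the scraper rather than hand-written; consult the API reference for the actual setting names.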

Create virtual user Identities

IronWebScraper can use one or multiple “identities”: sessions that simulate real-world human requests. Each request can be assigned, programmatically or at random, its own identity, user agent, cookies, logins, and even IP address. Requests are automatically treated as unique by the combination of URL, parse method, and POST variables.
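The uniqueness rule described above (URL plus parse method plus POST variables) can be illustrated with a small sketch. This is an assumption about how such a key could be built, not the library's actual code; RequestDeduplicator and TryMarkNew are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of IronWebScraper): remembers which
// (URL, parse method, POST variables) combinations have been requested.
public sealed class RequestDeduplicator
{
    private readonly HashSet<string> _seen = new HashSet<string>(StringComparer.Ordinal);

    // Overload for requests without POST variables.
    public bool TryMarkNew(string url, string parseMethodName)
    {
        return TryMarkNew(url, parseMethodName, Enumerable.Empty<KeyValuePair<string, string>>());
    }

    // Returns true the first time a combination is seen, false on repeats.
    public bool TryMarkNew(string url, string parseMethodName,
                           IEnumerable<KeyValuePair<string, string>> postVariables)
    {
        // Sort and escape POST variables so that their order does not matter.
        string post = string.Join("&", postVariables
            .OrderBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value ?? string.Empty)));

        // A newline cannot appear in a method name, so it is a safe separator here.
        string key = url + "\n" + parseMethodName + "\n" + post;
        return _seen.Add(key);
    }
}
```

With a key like this, the same URL can still be requested twice if a different parse method handles it or if different form data is posted, which matches the behavior the paragraph describes.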

Action Replay

IronWebScraper uses advanced caching to allow developers to change their code “on the fly” and replay every previous request without contacting the internet. Every scrape job is autosaved and can be resumed in the event of an exception or power outage.
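As a rough illustration of replaying requests from a cache, the sketch below stores each response body on disk and serves later requests for the same URL from the file instead of the network. It is a simplified stand-in under stated assumptions, not IronWebScraper's caching engine; ResponseCache, GetOrFetch, and the file layout are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of IronWebScraper): a file-backed cache
// that lets parsing code be re-run without contacting the internet.
public sealed class ResponseCache
{
    private readonly string _directory;

    public ResponseCache(string directory)
    {
        _directory = directory;
        Directory.CreateDirectory(directory);
    }

    // Maps a URL to a stable file name using a SHA-256 hash of the URL.
    private string PathFor(string url)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(url));
            string name = BitConverter.ToString(hash).Replace("-", "") + ".html";
            return Path.Combine(_directory, name);
        }
    }

    // Returns the cached body if present; otherwise calls fetch and stores the result.
    public string GetOrFetch(string url, Func<string, string> fetch)
    {
        string path = PathFor(url);
        if (File.Exists(path))
        {
            return File.ReadAllText(path);
        }

        string body = fetch(url);
        File.WriteAllText(path, body);
        return body;
    }
}
```

Because the cache lives on disk, it survives a crash or restart, so a later run with modified parsing code can reuse responses that were already downloaded.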

Rapid Installation with Microsoft Visual Studio

IronWebScraper puts web scraping tools in your hands quickly with a Visual Studio installer. Whether you install directly from NuGet within Visual Studio or download the DLL, you’ll be set up in no time. Just one DLL and no dependencies.

PM > Install-Package IronWebScraper

.NET Webscraping Community Tutorials

Web Scraping in C# and VB.NET Projects

See how Ahmed uses IronWebScraper in his projects to migrate content from one site to another. Sample projects and code are provided for scraping e-commerce and blog websites.