Namespace IronWebScraper
Classes
CommonUserAgents
Static helper class which lists common web-browser user-agent strings.
HtmlNode
The HtmlNode class represents a single DOM element in an HTML or XML document.
HtmlNodeExtensions
Extension methods for finding elements within an IEnumerable<HtmlNode>.
HttpIdentity
A class defining the browsing 'identity' to be used to fetch a given Url. Contains Proxy, UserAgent and Http Header information.
License
Allows IronWebScraper license keys to be applied globally across an application.
MetaData
A flexible dictionary of object values which can be used to attach your own additional data or objects to any Request. Meta can contain objects of any Type, including instances of classes, Lists and Dictionaries. This metadata can then be accessed while parsing the Response, and can even be passed forward to the next Request.
Metadata sent might include pagination page numbers, referrer Urls, User Ids, etc.
E.g.: Request["page-number"] = 2;
int pageNumber = Response.Request.Meta.Get<int>("page-number");
Request
Represents an HTTP request to be made by IronWebScraper.
Response
Represents an HTTP response received by IronWebScraper.
ScrapedData
A flexible dictionary of object values used to conveniently store scraped data of any Type in a key-value dictionary which can be saved as JSON using the Yield method. ScrapedData can hold data objects of any Type, including Classes.
E.g.: var Data = new ScrapedData();
Data["title"] = "Page Title";
Data["date"] = DateTime.Now;
WebScraper
An easy-to-use base class which developers can extend to rapidly build custom web-scraping applications.
WebScraper.LogLevel
Level of WebScraper logging to the Console. Because this Enum is a Flag type, options can be combined using a pipe, e.g. LogLevel.Critical | LogLevel.ScrapedData.
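The pipe combination described above is ordinary C# flags-enum behavior. The following is a minimal sketch of how it works, using a hypothetical stand-in enum named DemoLogLevel. Only the member names Critical and ScrapedData come from this page; the name DemoLogLevel, the member Other, and all numeric values are assumptions for illustration and are not the library's actual definition.

```csharp
using System;

// Hypothetical stand-in for a flags enum. This is NOT the real
// WebScraper.LogLevel definition: the numeric values and the
// "Other" member are assumptions made for this illustration.
[Flags]
enum DemoLogLevel
{
    None = 0,
    Critical = 1,
    ScrapedData = 2,
    Other = 4
}

static class Program
{
    static void Main()
    {
        // Combine two options with the bitwise OR ("pipe") operator.
        DemoLogLevel level = DemoLogLevel.Critical | DemoLogLevel.ScrapedData;

        // HasFlag tests whether a given option is part of the combination.
        Console.WriteLine(level.HasFlag(DemoLogLevel.Critical)); // True
        Console.WriteLine(level.HasFlag(DemoLogLevel.Other));    // False

        // A [Flags] enum formats a combined value as a comma-separated list.
        Console.WriteLine(level); // Critical, ScrapedData
    }
}
```

Each member occupies its own bit, so any subset of options can be stored in a single value and tested independently.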
WebScraper.Throttle
Throttle remote clients by their host name or by their public IP address.