Extraer contenido de un sitio web de compras
Seleccionamos un sitio de compras para raspar su contenido
Como se puede ver en la imagen, tenemos una barra izquierda que contiene enlaces para las categorías de productos del sitio
Así que nuestro primer paso es investigar el HTML del sitio y planificar cómo queremos rasparlo.
Las categorías de sitios de moda tienen subcategorías (Hombres, Mujeres, Niños)
<li class="menu-item" data-id="">
<a href="https://dominio.es/moda-por-/" class="main-category">
<i class="cat-icon osh-font-fashion"></i> <span class="nav-subTxt">FASHION </span> <i class="osh-font-light-arrow-left"></i><i class="osh-font-light-arrow-right"></i>
</a> <div class="navLayerWrapper" style="width: 633px; display: none;"><div class="submenu"><div class="column"><div class="categories"><a class="category" href="https://domain.com/fashion-by-/?sort=newest&dir=desc&viewType=gridView3">New Arrivals !</a> </div><div class="categories"><a class="category" href="https://domain.com/men-fashion/">Men</a> <a class="subcategory" href="https://domain.com/mens-shoes/">Shoes</a> <a class="subcategory" href="https://domain.com/mens-clothing/">Clothing</a> <a class="subcategory" href="https://domain.com/mens-accessories/">Accessories</a> </div><div class="categories"><a class="category" href="https://domain.com/women-fashion/">Women</a> <a class="subcategory" href="https://domain.com/womens-shoes/">Shoes</a> <a class="subcategory" href="https://domain.com/womens-clothing/">Clothing</a> <a class="subcategory" href="https://domain.com/womens-accessories/">Accessories</a> </div><div class="categories"><a class="category" href="https://domain.com/girls-boys-fashion/">Kids</a> <a class="subcategory" href="https://domain.com/boys-fashion/">Boys</a> <a class="subcategory" href="https://domain.com/girls/">Girls</a> </div><div class="categories"><a class="category" href="https://domain.com/maternity-clothes/">Maternity Clothes</a> </div></div><div class="column"><div class="categories"> <span class="category defaultCursor">Men Best Sellers</span> <a class="subcategory" href="https://domain.com/mens-casual-shoes/">Casual Shoes</a> <a class="subcategory" href="https://domain.com/mens-sneakers/">Sneakers</a> <a class="subcategory" href="https://domain.com/mens-t-shirts/">T-shirts</a> <a class="subcategory" href="https://domain.com/mens-polos/">Polos</a> </div><div class="categories"> <span class="category defaultCursor">Women Best Sellers</span> <a class="subcategory" href="https://domain.com/womens-sandals/">Sandals</a> <a class="subcategory" href="https://domain.com/womens-sneakers/">Sneakers</a> <a class="subcategory" href="https://domain.com/women-dresses/">Dresses</a> <a class="subcategory" href="https://domain.com/women-tops/">Tops</a> </div><div class="categories"><a class="category" href="https://domain.com/womens-curvy-clothing/">Women's Curvy Clothing</a> </div><div class="categories"><a class="category" href="https://domain.com/fashion-bundles/v/">Fashion Bundles</a> </div><div class="categories"><a class="category" href="https://domain.com/hijab-fashion/">Hijab Fashion</a> </div></div><div class="column"><div class="categories"><a class="category" href="https://domain.com/brands/fashion-by-/">SEE ALL BRANDS</a> <a class="subcategory" href="https://domain.com/adidas/">Adidas</a> <a class="subcategory" href="https://domain.com/converse/">Converse</a> <a class="subcategory" href="https://domain.com/ravin/">Ravin</a> <a class="subcategory" href="https://domain.com/dejavu/">Dejavu</a> <a class="subcategory" href="https://domain.com/agu/">Agu</a> <a class="subcategory" href="https://domain.com/activ/">Activ</a> <a class="subcategory" href="https://domain.com/oxford--bellini--tie-house--milano/">Tie House</a> <a class="subcategory" href="https://domain.com/shoe-room/">Shoe Room</a> <a class="subcategory" href="https://domain.com/town-team/">Town Team</a> </div></div></div></div>
</li>
<li class="menu-item" data-id="">
<a href="https://dominio.es/moda-por-/" class="main-category">
<i class="cat-icon osh-font-fashion"></i> <span class="nav-subTxt">FASHION </span> <i class="osh-font-light-arrow-left"></i><i class="osh-font-light-arrow-right"></i>
</a> <div class="navLayerWrapper" style="width: 633px; display: none;"><div class="submenu"><div class="column"><div class="categories"><a class="category" href="https://domain.com/fashion-by-/?sort=newest&dir=desc&viewType=gridView3">New Arrivals !</a> </div><div class="categories"><a class="category" href="https://domain.com/men-fashion/">Men</a> <a class="subcategory" href="https://domain.com/mens-shoes/">Shoes</a> <a class="subcategory" href="https://domain.com/mens-clothing/">Clothing</a> <a class="subcategory" href="https://domain.com/mens-accessories/">Accessories</a> </div><div class="categories"><a class="category" href="https://domain.com/women-fashion/">Women</a> <a class="subcategory" href="https://domain.com/womens-shoes/">Shoes</a> <a class="subcategory" href="https://domain.com/womens-clothing/">Clothing</a> <a class="subcategory" href="https://domain.com/womens-accessories/">Accessories</a> </div><div class="categories"><a class="category" href="https://domain.com/girls-boys-fashion/">Kids</a> <a class="subcategory" href="https://domain.com/boys-fashion/">Boys</a> <a class="subcategory" href="https://domain.com/girls/">Girls</a> </div><div class="categories"><a class="category" href="https://domain.com/maternity-clothes/">Maternity Clothes</a> </div></div><div class="column"><div class="categories"> <span class="category defaultCursor">Men Best Sellers</span> <a class="subcategory" href="https://domain.com/mens-casual-shoes/">Casual Shoes</a> <a class="subcategory" href="https://domain.com/mens-sneakers/">Sneakers</a> <a class="subcategory" href="https://domain.com/mens-t-shirts/">T-shirts</a> <a class="subcategory" href="https://domain.com/mens-polos/">Polos</a> </div><div class="categories"> <span class="category defaultCursor">Women Best Sellers</span> <a class="subcategory" href="https://domain.com/womens-sandals/">Sandals</a> <a class="subcategory" href="https://domain.com/womens-sneakers/">Sneakers</a> <a class="subcategory" href="https://domain.com/women-dresses/">Dresses</a> <a class="subcategory" href="https://domain.com/women-tops/">Tops</a> </div><div class="categories"><a class="category" href="https://domain.com/womens-curvy-clothing/">Women's Curvy Clothing</a> </div><div class="categories"><a class="category" href="https://domain.com/fashion-bundles/v/">Fashion Bundles</a> </div><div class="categories"><a class="category" href="https://domain.com/hijab-fashion/">Hijab Fashion</a> </div></div><div class="column"><div class="categories"><a class="category" href="https://domain.com/brands/fashion-by-/">SEE ALL BRANDS</a> <a class="subcategory" href="https://domain.com/adidas/">Adidas</a> <a class="subcategory" href="https://domain.com/converse/">Converse</a> <a class="subcategory" href="https://domain.com/ravin/">Ravin</a> <a class="subcategory" href="https://domain.com/dejavu/">Dejavu</a> <a class="subcategory" href="https://domain.com/agu/">Agu</a> <a class="subcategory" href="https://domain.com/activ/">Activ</a> <a class="subcategory" href="https://domain.com/oxford--bellini--tie-house--milano/">Tie House</a> <a class="subcategory" href="https://domain.com/shoe-room/">Shoe Room</a> <a class="subcategory" href="https://domain.com/town-team/">Town Team</a> </div></div></div></div>
</li>
Creemos un proyecto
- Cree una nueva aplicación de consola o añada una nueva carpeta para nuestra nueva muestra con el nombre "ShoppingSiteSample".
- Añadir nueva clase con el nombre "ShoppingScraper"
El primer paso será raspar las categorías del sitio y sus subcategorías
Vamos a crear un Modelo de Categorías:
public class Category
{
/// <summary>
/// Obtiene o establece el nombre.
/// </summary>
/// <value>
/// El nombre.
/// </value>
public string Name { get; set; }
/// <summary>
/// Obtiene o establece la URL.
/// </summary>
/// <value>
/// La URL.
/// </value>
public string URL { get; set; }
/// <summary>
/// Obtiene o establece las subcategorías.
/// </summary>
/// <value>
/// Las subcategorías.
/// </value>
public List<Category> SubCategories { get; set; }
}
public class Category
{
/// <summary>
/// Obtiene o establece el nombre.
/// </summary>
/// <value>
/// El nombre.
/// </value>
public string Name { get; set; }
/// <summary>
/// Obtiene o establece la URL.
/// </summary>
/// <value>
/// La URL.
/// </value>
public string URL { get; set; }
/// <summary>
/// Obtiene o establece las subcategorías.
/// </summary>
/// <value>
/// Las subcategorías.
/// </value>
public List<Category> SubCategories { get; set; }
}
IRON VB CONVERTER ERROR developers@ironsoftware.com
- Ahora vamos a construir nuestra lógica de raspado
public class ShoppingScraper : WebScraper
{
/// <summary>
/// Anula este método para inicializar tu web-scraper.
/// Tareas importantes serán Solicitar al menos una url de inicio... y establecer patrones de dominios o urls permitidos/prohibidos.
/// </summary>
public override void Init()
{
License.LicenseKey = "LicenseKey";
this.LoggingLevel = WebScraper.LogLevel.All;
this.WorkingDirectory = AppSetting.GetAppRoot() + @"\ShoppingSiteSample\Output\";
this.Request("www.webSite.com", Parse);
}
/// <summary>
/// Sustituya este método para crear el manejador de respuesta predeterminado para su raspador web.
/// Si tiene varios tipos de página, puede añadir métodos similares adicionales.
/// </summary>
/// <param name="response">The http Response object to parse</param>
public override void Parse(Response response)
{
var categoryList = new List<Category>();
foreach (var Links in response.Css("#menuFixed > ul > li > a "))
{
var cat = new Category();
cat.URL = Links.Attributes ["href"];
cat.Name = Links.InnerText;
categoryList.Add(cat);
}
Scrape(categoryList, "Shopping.Jsonl");
}
}
public class ShoppingScraper : WebScraper
{
/// <summary>
/// Anula este método para inicializar tu web-scraper.
/// Tareas importantes serán Solicitar al menos una url de inicio... y establecer patrones de dominios o urls permitidos/prohibidos.
/// </summary>
public override void Init()
{
License.LicenseKey = "LicenseKey";
this.LoggingLevel = WebScraper.LogLevel.All;
this.WorkingDirectory = AppSetting.GetAppRoot() + @"\ShoppingSiteSample\Output\";
this.Request("www.webSite.com", Parse);
}
/// <summary>
/// Sustituya este método para crear el manejador de respuesta predeterminado para su raspador web.
/// Si tiene varios tipos de página, puede añadir métodos similares adicionales.
/// </summary>
/// <param name="response">The http Response object to parse</param>
public override void Parse(Response response)
{
var categoryList = new List<Category>();
foreach (var Links in response.Css("#menuFixed > ul > li > a "))
{
var cat = new Category();
cat.URL = Links.Attributes ["href"];
cat.Name = Links.InnerText;
categoryList.Add(cat);
}
Scrape(categoryList, "Shopping.Jsonl");
}
}
Public Class ShoppingScraper
Inherits WebScraper
''' <summary>
''' Anula este método para inicializar tu web-scraper.
''' Tareas importantes serán Solicitar al menos una url de inicio... y establecer patrones de dominios o urls permitidos/prohibidos.
''' </summary>
Public Overrides Sub Init()
License.LicenseKey = "LicenseKey"
Me.LoggingLevel = WebScraper.LogLevel.All
Me.WorkingDirectory = AppSetting.GetAppRoot() & "\ShoppingSiteSample\Output\"
Me.Request("www.webSite.com", AddressOf Parse)
End Sub
''' <summary>
''' Sustituya este método para crear el manejador de respuesta predeterminado para su raspador web.
''' Si tiene varios tipos de página, puede añadir métodos similares adicionales.
''' </summary>
''' <param name="response">The http Response object to parse</param>
Public Overrides Sub Parse(ByVal response As Response)
Dim categoryList = New List(Of Category)()
For Each Links In response.Css("#menuFixed > ul > li > a ")
Dim cat = New Category()
cat.URL = Links.Attributes ("href")
cat.Name = Links.InnerText
categoryList.Add(cat)
Next Links
Scrape(categoryList, "Shopping.Jsonl")
End Sub
End Class
Extracción de enlaces del menú
Actualicemos nuestro código para raspar las Categorías Principales y todos sus subenlaces
public override void Parse(Response response)
{
// Lista de Categorías Enlaces (Raíz)
var categoryList = new List<Category>();
foreach (var li in response.Css("#menuFixed > ul > li"))
{
// Lista de enlaces principales
foreach (var Links in li.Css("a"))
{
var cat = new Category();
cat.URL = Links.Attributes ["href"];
cat.Name = Links.InnerText;
cat.SubCategories = new List<Category>();
// Lista de enlaces de subcategorías
foreach (var subCategory in li.Css("a [class=subcategory]"))
{
var subcat = new Category();
subcat.URL = Links.Attributes ["href"];
subcat.Name = Links.InnerText;
// Compruebe si el enlace existe antes
if (cat.SubCategories.Find(c=>c.Name== subcat.Name && c.URL == subcat.URL) == null)
{
// Añadir subenlaces
cat.SubCategories.Add(subcat);
}
}
// Añadir categorías
categoryList.Add(cat);
}
}
Scrape(categoryList, "Shopping.Jsonl");
}
public override void Parse(Response response)
{
// Lista de Categorías Enlaces (Raíz)
var categoryList = new List<Category>();
foreach (var li in response.Css("#menuFixed > ul > li"))
{
// Lista de enlaces principales
foreach (var Links in li.Css("a"))
{
var cat = new Category();
cat.URL = Links.Attributes ["href"];
cat.Name = Links.InnerText;
cat.SubCategories = new List<Category>();
// Lista de enlaces de subcategorías
foreach (var subCategory in li.Css("a [class=subcategory]"))
{
var subcat = new Category();
subcat.URL = Links.Attributes ["href"];
subcat.Name = Links.InnerText;
// Compruebe si el enlace existe antes
if (cat.SubCategories.Find(c=>c.Name== subcat.Name && c.URL == subcat.URL) == null)
{
// Añadir subenlaces
cat.SubCategories.Add(subcat);
}
}
// Añadir categorías
categoryList.Add(cat);
}
}
Scrape(categoryList, "Shopping.Jsonl");
}
Public Overrides Sub Parse(ByVal response As Response)
' Lista de Categorías Enlaces (Raíz)
Dim categoryList = New List(Of Category)()
For Each li In response.Css("#menuFixed > ul > li")
' Lista de enlaces principales
For Each Links In li.Css("a")
Dim cat = New Category()
cat.URL = Links.Attributes ("href")
cat.Name = Links.InnerText
cat.SubCategories = New List(Of Category)()
' Lista de enlaces de subcategorías
For Each subCategory In li.Css("a [class=subcategory]")
Dim subcat = New Category()
subcat.URL = Links.Attributes ("href")
subcat.Name = Links.InnerText
' Compruebe si el enlace existe antes
If cat.SubCategories.Find(Function(c) c.Name= subcat.Name AndAlso c.URL = subcat.URL) Is Nothing Then
' Añadir subenlaces
cat.SubCategories.Add(subcat)
End If
Next subCategory
' Añadir categorías
categoryList.Add(cat)
Next Links
Next li
Scrape(categoryList, "Shopping.Jsonl")
End Sub
Ahora que tenemos enlaces a todas las categorías del sitio, vamos a empezar a raspar los productos dentro de cada categoría
Naveguemos a cualquier categoría y comprobemos el contenido.
Veamos su código
<section class="products">
<div class="sku -gallery -validate-size " data-sku="AG249FA0T2PSGNAFAMZ" ft-product-sizes="41,42,43,44,45" ft-product-color="Multicolour">
<a class="link" href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html">
<div class="image-wrapper default-state">
<img class="lazy image -loaded" alt="Bundle Of 2 Sneakers - Black &amp; Navy Blue" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-sku="AG249FA0T2PSGNAFAMZ" data-src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg"><noscript><img src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
</div> <h2 class="title">
<span class="brand ">Agu </span>
<span class="name" dir="ltr">Bundle Of 2 Sneakers - Black & Navy Blue</span>
</h2><div class="price-container clearfix">
<span class="price-box">
<span class="price">
<span data-currency-iso="EGP">EGP</span>
<span dir="ltr" data-price="299">299</span>
</span> <span class="price -old -no-special"></span>
</span>
</div><div class="rating-stars"><div class="stars-container"><div class="stars" style="width: 62%"></div></div> <div class="total-ratings">(30)</div> </div> <span class="shop-first-logo-container"><img src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" data-src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" class="lazy shop-first-logo-img -mbxs -loaded"> </span>
<span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
<div class="list -sizes" data-selected-sku="">
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=41">41</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=42">42</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=43">43</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=44">44</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=45">45</span>
</div>
</a>
</div>
<div class="sku -gallery -validate-size " data-sku="LE047FA01SRK4NAFAMZ" ft-product-sizes="110,115,120,125,130,135" ft-product-color="Black">
<a class="link" href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html">
<div class="image-wrapper default-state"><img class="lazy image -loaded" alt="Genuine Leather Belt - Black" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-sku="LE047FA01SRK4NAFAMZ" data-src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg"><noscript><img src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript></div>
<h2 class="title"><span class="brand ">Leather Shop </span> <span class="name" dir="ltr">Genuine Leather Belt - Black</span></h2><div class="price-container clearfix">
<span class="sale-flag-percent">-29%</span> <span class="price-box"> <span class="price"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="96">96</span> </span> <span class="price -old "><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="135">135</span> </span> </span>
</div><div class="rating-stars"><div class="stars-container"><div class="stars" style="width: 100%"></div></div> <div class="total-ratings">(1)</div> </div>
<span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span> <div class="list -sizes" data-selected-sku="">
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=110">110</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=115">115</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=120">120</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=125">125</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=130">130</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=135">135</span>
</div>
</a>
</div>
</section>
<section class="products">
<div class="sku -gallery -validate-size " data-sku="AG249FA0T2PSGNAFAMZ" ft-product-sizes="41,42,43,44,45" ft-product-color="Multicolour">
<a class="link" href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html">
<div class="image-wrapper default-state">
<img class="lazy image -loaded" alt="Bundle Of 2 Sneakers - Black &amp; Navy Blue" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-sku="AG249FA0T2PSGNAFAMZ" data-src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg"><noscript><img src="https://static.WebSite.com/p/agu-6208-488356-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript>
</div> <h2 class="title">
<span class="brand ">Agu </span>
<span class="name" dir="ltr">Bundle Of 2 Sneakers - Black & Navy Blue</span>
</h2><div class="price-container clearfix">
<span class="price-box">
<span class="price">
<span data-currency-iso="EGP">EGP</span>
<span dir="ltr" data-price="299">299</span>
</span> <span class="price -old -no-special"></span>
</span>
</div><div class="rating-stars"><div class="stars-container"><div class="stars" style="width: 62%"></div></div> <div class="total-ratings">(30)</div> </div> <span class="shop-first-logo-container"><img src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" data-src="http://www.WebSite.com/images/local/logos/shop_first/ShoppingSite/logo_normal.png" class="lazy shop-first-logo-img -mbxs -loaded"> </span>
<span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span>
<div class="list -sizes" data-selected-sku="">
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=41">41</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=42">42</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=43">43</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=44">44</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/agu-bundle-of-2-sneakers-black-navy-blue-653884.html?size=45">45</span>
</div>
</a>
</div>
<div class="sku -gallery -validate-size " data-sku="LE047FA01SRK4NAFAMZ" ft-product-sizes="110,115,120,125,130,135" ft-product-color="Black">
<a class="link" href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html">
<div class="image-wrapper default-state"><img class="lazy image -loaded" alt="Genuine Leather Belt - Black" data-image-vertical="1" width="210" height="262" src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-sku="LE047FA01SRK4NAFAMZ" data-src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" data-placeholder="placeholder_m_1.jpg"><noscript><img src="https://static.WebSite.com/p/leather-shop-1831-030217-1-catalog_grid_3.jpg" width="210" height="262" class="image" /></noscript></div>
<h2 class="title"><span class="brand ">Leather Shop </span> <span class="name" dir="ltr">Genuine Leather Belt - Black</span></h2><div class="price-container clearfix">
<span class="sale-flag-percent">-29%</span> <span class="price-box"> <span class="price"><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="96">96</span> </span> <span class="price -old "><span data-currency-iso="EGP">EGP</span> <span dir="ltr" data-price="135">135</span> </span> </span>
</div><div class="rating-stars"><div class="stars-container"><div class="stars" style="width: 100%"></div></div> <div class="total-ratings">(1)</div> </div>
<span class="osh-icon -ShoppingSite-local shop_local--logo -block -mbs -mts"></span> <div class="list -sizes" data-selected-sku="">
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=110">110</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=115">115</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=120">120</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=125">125</span> <span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=130">130</span>
<span class="js-link sku-size" data-href="http://www.WebSite.com/leather-shop-genuine-leather-belt-black-712030.html?size=135">135</span>
</div>
</a>
</div>
</section>
Construyamos nuestro modelo de producto para este contenido.
public class Product
{
/// <summary>
/// Obtiene o establece el nombre.
/// </summary>
/// <value>
/// El nombre.
/// </value>
public string Name { get; set; }
/// <summary>
/// Obtiene o fija el precio.
/// </summary>
/// <value>
/// El precio.
/// </value>
public string Price { get; set; }
/// <summary>
/// Obtiene o establece la imagen.
/// </summary>
/// <value>
/// La imagen.
/// </value>
public string Image { get; set; }
}
public class Product
{
/// <summary>
/// Obtiene o establece el nombre.
/// </summary>
/// <value>
/// El nombre.
/// </value>
public string Name { get; set; }
/// <summary>
/// Obtiene o fija el precio.
/// </summary>
/// <value>
/// El precio.
/// </value>
public string Price { get; set; }
/// <summary>
/// Obtiene o establece la imagen.
/// </summary>
/// <value>
/// La imagen.
/// </value>
public string Image { get; set; }
}
Public Class Product
''' <summary>
''' Obtiene o establece el nombre.
''' </summary>
''' <value>
''' El nombre.
''' </value>
Public Property Name() As String
''' <summary>
''' Obtiene o fija el precio.
''' </summary>
''' <value>
''' El precio.
''' </value>
Public Property Price() As String
''' <summary>
''' Obtiene o establece la imagen.
''' </summary>
''' <value>
''' La imagen.
''' </value>
Public Property Image() As String
End Class
Para raspar páginas de categorías, añadimos un nuevo método de raspado:
public void ParseCatgory(Response response)
{
// Lista de productos Enlaces (Raíz)
var productList = new List<Product>();
foreach (var Links in response.Css("body > main > section.osh-content > section.products > div > a"))
{
var product = new Product();
product.Name = Links.InnerText;
product.Image = Links.Css("div.image-wrapper.default-state > img")[0].Attributes ["src"];
productList.Add(product);
}
Scrape(productList, "Products.Jsonl");
}
public void ParseCatgory(Response response)
{
// Lista de productos Enlaces (Raíz)
var productList = new List<Product>();
foreach (var Links in response.Css("body > main > section.osh-content > section.products > div > a"))
{
var product = new Product();
product.Name = Links.InnerText;
product.Image = Links.Css("div.image-wrapper.default-state > img")[0].Attributes ["src"];
productList.Add(product);
}
Scrape(productList, "Products.Jsonl");
}
Public Sub ParseCatgory(ByVal response As Response)
' Lista de productos Enlaces (Raíz)
Dim productList = New List(Of Product)()
For Each Links In response.Css("body > main > section.osh-content > section.products > div > a")
Dim product As New Product()
product.Name = Links.InnerText
product.Image = Links.Css("div.image-wrapper.default-state > img")(0).Attributes ("src")
productList.Add(product)
Next Links
Scrape(productList, "Products.Jsonl")
End Sub