EDN Admin
Well-known member
The code I have been using to scrape a page had been working but, as expected, the web page changed its HTML a bit.
I tried to modify my code to match, but to no avail.
Below is what needs to be scraped:
[code]
New Snowfall</h3>
<h1 style="padding:10px 0; font-size:1.3em;">
11"-18"
</h1>
[/code]
I need the two sets of digits, in this case 11 and 18.
I am trying this code:
[code]
scrapestring = @"(?<=New Snowfall</h3>\s+<h1>)((.|\n)*?)(?=""</h1)";
try
{
    reqHTML = webClient.DownloadData(NPUrl);
}
catch (WebException ex)
{
    StatusLabel1.Text = ex.Message;
}
UTF8Encoding objUTF8 = new UTF8Encoding();
Regex regex = new Regex(scrapestring, RegexOptions.IgnoreCase | RegexOptions.Multiline);
Match oM = null;
//read the page
try
{
    oM = regex.Match(objUTF8.GetString(reqHTML));
}
catch (WebException ex)
{
    StatusLabel1.Text = ex.Message;
}
result = oM.Value;
Console.WriteLine("Result: " + result);
if (oM.Success) amount = Convert.ToDecimal(oM.Value);
StatusLabel1.Text = "Snow Page Read OK. Data: " + oM.Value + " Inches";
[/code]
My result is blank, so something is wrong in my scrape string. I also need a way, once the value is scraped, to split the result and drop the " and the dash (-), like:
result1 = 11
result2 = 18
Can someone help with the scrape string and maybe point me to the best way to parse the result?
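One possible direction, sketched under the assumption that the page markup is exactly what is quoted above: the literal <h1> in the scrape string cannot match an <h1> tag that now carries a style attribute, and the lookahead expects the closing quote to sit directly before </h1 even though a line break comes between them. Capturing the two numbers in named groups also avoids splitting the string afterwards. The names SnowfallParser, TryParse, low and high are made up for illustration, and the pattern assumes the quotes arrive as literal " characters rather than &quot; entities and that a range (two numbers) is always present.

[code]
using System;
using System.Text.RegularExpressions;

static class SnowfallParser
{
    // Matches: New Snowfall</h3> <h1 ...any attributes...> 11"-18" </h1>
    // [^>]* tolerates whatever attributes the <h1> tag carries.
    // The two numbers land in the named groups "low" and "high".
    static readonly Regex Pattern = new Regex(
        @"New\s+Snowfall\s*</h3>\s*<h1[^>]*>\s*(?<low>\d+)""\s*-\s*(?<high>\d+)""\s*</h1>",
        RegexOptions.IgnoreCase);

    // Returns true and fills low/high when the snowfall block is found.
    public static bool TryParse(string html, out int low, out int high)
    {
        low = 0;
        high = 0;
        Match m = Pattern.Match(html);
        if (!m.Success) return false;
        low = int.Parse(m.Groups["low"].Value);
        high = int.Parse(m.Groups["high"].Value);
        return true;
    }

    static void Main()
    {
        // Sample markup copied from the post; the real page may differ.
        string html = "New Snowfall</h3>\n"
                    + "<h1 style=\"padding:10px 0; font-size:1.3em;\">\n"
                    + "11\"-18\"\n"
                    + "</h1>";
        int low, high;
        if (TryParse(html, out low, out high))
            Console.WriteLine("result1 = " + low + ", result2 = " + high); // prints: result1 = 11, result2 = 18
        else
            Console.WriteLine("Snowfall block not found");
    }
}
[/code]

In the original code, the string returned by objUTF8.GetString(reqHTML) would be passed to TryParse in place of the sample. Note that Convert.ToDecimal on a value such as 11"-18 would throw a FormatException, which is another reason to parse each number separately.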