Searching an HTML file using C# and the WebBrowser control

  • Thread starter: Christ Kennedy
I'm downloading a .htm file and looking for a list of links that my Google browser shows me, so that I can crawl through them. As an example, I found a file with a list of links, including one titled 'aardvark', which I can locate using the Inspect option in the Google browser's context menu. The image below shows the path from the start of the .htm file to the word 'aardvark', but when I try to navigate the HTML document with the C# debugging tools, I get lost.

1473120.png
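
Since the goal is the list of links, a flat scan of the document's anchors may be a simpler starting point than walking the tree. The following is a minimal sketch, not a confirmed fix: `LinkScan` and `FindLinks` are hypothetical names, and it assumes the links are already present in the DOM when `DocumentCompleted` fires. The WebBrowser control renders with the Internet Explorer engine, so the tree it exposes may not match what the Google browser's inspector shows.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

static class LinkScan
{
    // Hypothetical helper: collects the href of every hyperlink whose
    // visible text contains the given word (case-insensitive).
    public static List<string> FindLinks(HtmlDocument doc, string word)
    {
        var hrefs = new List<string>();
        if (doc == null) return hrefs;

        // HtmlDocument.Links lists the document's hyperlinks, so no
        // knowledge of the nesting shown in the inspector is needed.
        foreach (HtmlElement link in doc.Links)
        {
            string text = link.InnerText;
            if (text != null &&
                text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                hrefs.Add(link.GetAttribute("href"));
            }
        }
        return hrefs;
    }
}
```

It could be called from the `DocumentCompleted` handler as `LinkScan.FindLinks(web.Document, "aardvark")`. An empty result there would suggest the links are not in the document the control loaded, for example because a script adds them later.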

I've written something like a breadth-first-search routine to help me find it, and it found nothing:

private void Web_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    tmrTimeOut.Enabled = false;
    HtmlDocument htmlDoc = web.Document;
    if (htmlDoc == null)
    {
        // No document yet: let the navigation timer try again.
        tmrNavigate.Enabled = true;
        return;
    }

    // Search each top-level child of <body>, stopping at the first match.
    for (int intCounter = 0; intCounter < htmlDoc.Body.Children.Count; intCounter++)
    {
        try
        {
            HtmlElement hChild = htmlDoc.Body.Children[intCounter];
            string strPath = BFS_ChildNodes(ref hChild, "Body " + intCounter.ToString());
            if (strPath.Length > 0)
                break;
        }
        catch (Exception ex)
        {
            // Log instead of silently swallowing, so a failing node stays visible.
            System.Diagnostics.Debug.WriteLine(ex);
        }
    }
}

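// Note: despite the name, this recursion visits the nodes depth-first.
// Returns the child-index path to the first matching element, or "" if none.
string BFS_ChildNodes(ref HtmlElement hEle, string strHelper)
{
    if (hEle == null) return "";

    // InnerText includes the text of all descendants, so only an element whose
    // entire text is "aardvark" matches. Trim() tolerates surrounding whitespace.
    if (hEle.InnerText != null &&
        string.Equals(hEle.InnerText.Trim(), "aardvark", StringComparison.OrdinalIgnoreCase))
        return strHelper;

    for (int intCounter = 0; intCounter < hEle.Children.Count; intCounter++)
    {
        try
        {
            // Build each child's path from the parent's path. Appending to
            // strHelper itself would accumulate every sibling index ("+0+1+2").
            string strChildPath = strHelper + "+" + intCounter.ToString();
            HtmlElement hChild = hEle.Children[intCounter];
            string strRetVal = BFS_ChildNodes(ref hChild, strChildPath);
            if (strRetVal.Length > 0)
                return strRetVal;
        }
        catch (Exception ex)
        {
            System.Diagnostics.Debug.WriteLine(ex);
        }
    }
    return "";
}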
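
For comparison, a true breadth-first version would use a queue instead of recursion. This is a minimal sketch under stated assumptions: `HtmlSearch` and `FindPathBreadthFirst` are hypothetical names, the `Body+0+2` path format simply mirrors the routine above, and the match is still an exact (trimmed, case-insensitive) comparison against `InnerText`.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

static class HtmlSearch
{
    // Hypothetical helper: returns the child-index path from root to the
    // shallowest element whose trimmed InnerText equals target, or "".
    public static string FindPathBreadthFirst(HtmlElement root, string target)
    {
        if (root == null) return "";

        // Each queue entry pairs an element with the path that leads to it.
        var queue = new Queue<KeyValuePair<HtmlElement, string>>();
        queue.Enqueue(new KeyValuePair<HtmlElement, string>(root, "Body"));

        while (queue.Count > 0)
        {
            KeyValuePair<HtmlElement, string> item = queue.Dequeue();
            HtmlElement ele = item.Key;
            string path = item.Value;

            string text = ele.InnerText;
            if (text != null &&
                string.Equals(text.Trim(), target, StringComparison.OrdinalIgnoreCase))
            {
                return path;
            }

            // Children are enqueued after all nodes already waiting, so the
            // tree is visited level by level.
            for (int i = 0; i < ele.Children.Count; i++)
            {
                queue.Enqueue(new KeyValuePair<HtmlElement, string>(
                    ele.Children[i], path + "+" + i));
            }
        }
        return "";
    }
}
```

A call such as `HtmlSearch.FindPathBreadthFirst(web.Document.Body, "aardvark")` returns the shallowest match, which may be a wrapper element (for example a list item) rather than the link itself.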







Can someone give me a clue?



