Better way to Query an XDocument (which is an HTML page)

EDN Admin

I want to look for a number of specific items in a web page. I load the page into an XDocument, then parse it with LINQ queries. (I'm not sure this is the proper forum; please redirect me if there is a more appropriate place.)
What I have so far works and gives me the results I want, but I believe it would be better to have a single query instead of the two I have now. I should add that I also want to get a few other items (like links), and I don't want to have as many queries as there are types of items I am looking for.
Here is what I have so far:
[code]
// Requires: using System; using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
document.GetXDocument();
const string xmlns = "{http://www.w3.org/1999/xhtml}";

// First query: brings back only one LARGE element, the subset I want from the whole page.
var allElements = from anyElement in document.FullPage.Descendants(xmlns + "div")
                  let xAttribute = anyElement.Attribute("id")
                  where xAttribute != null && xAttribute.Value == "maincolumn"
                  select anyElement;

// Wrap that element in its own document so the second query only searches inside it.
XDocument subdocument = new XDocument(allElements);

// Second query: <img> elements whose src attribute contains "stories".
var myElements = from item in subdocument.Descendants(xmlns + "img")
                 let attribute = item.Attribute("src")
                 where attribute != null && attribute.Value.Contains("stories")
                 select item;

var xElements = myElements as List<XElement> ?? myElements.ToList();
foreach (var element in xElements)
{
    var xAttribute = element.Attribute("src");
    if (xAttribute != null)
    {
        Console.WriteLine(xAttribute.Value.Trim());
    }
}
[/code]
Any ideas would be greatly appreciated!
Thanks.

Bernard Grosperrin, Chardonnay, France.
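
One possible way to fold the two queries from the post into a single one, offered as a sketch rather than a definitive answer. It chains a second `from` clause onto the `maincolumn` div instead of building an intermediate `XDocument`, and it picks up links in the same pass. The names `PageQueries`, `FindItems`, `Kind`, and `Value` are hypothetical, and the rule "every `<a>` with an `href` counts as a link" is an assumption, since the post does not say which links are wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PageQueries
{
    // XHTML namespace, the same one used in the post.
    private static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Hypothetical helper: one query that returns (Kind, Value) pairs for
    //   - every <img> whose src contains "stories", and
    //   - every <a> that has an href (assumed rule for "links"),
    // restricted to descendants of <div id="maincolumn">.
    public static List<(string Kind, string Value)> FindItems(XDocument page)
    {
        var query =
            from div in page.Descendants(Xhtml + "div")
            // Casting an XAttribute to string yields null when the attribute is missing.
            where (string)div.Attribute("id") == "maincolumn"
            // Second 'from' searches inside the matching div; no sub-document needed.
            from element in div.Descendants()
            let src = (string)element.Attribute("src")
            let href = (string)element.Attribute("href")
            where (element.Name == Xhtml + "img" && src != null && src.Contains("stories"))
               || (element.Name == Xhtml + "a" && href != null)
            select element.Name == Xhtml + "img"
                ? (Kind: "img", Value: src.Trim())
                : (Kind: "a", Value: href.Trim());

        return query.ToList();
    }
}
```

Assuming `document.FullPage` is an `XDocument`, the result could be printed with `foreach (var item in PageQueries.FindItems(document.FullPage)) Console.WriteLine(item.Kind + ": " + item.Value);`. Adding another item type would then mean one more branch in the `where` and `select` clauses rather than a whole new query.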