C# - XML - Removing the nodes without using recursion

EDN Admin

I have the following recursive method, which takes an XHTML document and marks nodes based on certain conditions. It is called as shown below for a number of HTML documents:

XmlDocument document = new XmlDocument();
document.LoadXml(xmlAsString);
PrepNodesForDeletion(document.DocumentElement, document.DocumentElement);
The method definition is below:

/// <summary>
/// Recursive function to identify and mark all unnecessary nodes so that they can be removed from the document.
/// </summary>
/// <param name="nodeToCompareAgainst">The node that we are recursively comparing all of its descendant nodes against</param>
/// <param name="nodeInQuestion">The node whose children we are comparing against the "nodeToCompareAgainst" node</param>
static void PrepNodesForDeletion(XmlNode nodeToCompareAgainst, XmlNode nodeInQuestion)
{
    // Guard against runaway recursion (infinityIndex is a counter declared outside this method).
    if (infinityIndex++ > 100000)
    {
        throw new InvalidOperationException("Recursion guard tripped in PrepNodesForDeletion.");
    }
    foreach (XmlNode childNode in nodeInQuestion.ChildNodes)
    {
        // make sure we compare all of the childNode's descendants to the nodeToCompareAgainst
        PrepNodesForDeletion(nodeToCompareAgainst, childNode);
        if (AreNamesSame(nodeToCompareAgainst, childNode) && AllAttributesPresent(nodeToCompareAgainst, childNode))
        {
            // the function AnyAttributesWithDifferingValues assumes that all
            // attributes are present between the two nodes
            if (AnyAttributesWithDifferingValues(nodeToCompareAgainst, childNode)
                && InnerTextIsSame(nodeToCompareAgainst, childNode))
            {
                MarkNodeForDeletion(nodeToCompareAgainst);
            }
            else if (!AnyAttributesWithDifferingValues(nodeToCompareAgainst, childNode))
            {
                MarkNodeForDeletion(childNode);
            }
        }

        // make sure we compare all of the childNode's descendants to the childNode
        PrepNodesForDeletion(childNode, childNode);
    }
}
And then there is the following method, which deletes the marked nodes:

static void RemoveMarkedNodes(XmlDocument document)
{
    // in order for us to make sure we remove everything we meant to remove, we need to do this in a while loop
    // for instance, if the original xml is = <a><a><b><a/></b> <a/>
    // this should result in the xml being passed into this function as:
    // <a><b><a DeleteNode="TRUE" /></b><a DeleteNode="TRUE"><b><a DeleteNode="TRUE" /></b> <a DeleteNode="TRUE" />
    // then this function (without the while) will not delete the last <a/>, even though it is marked for deletion
    // if we incorporate a while loop, then we can ensure all nodes marked for deletion are removed
    // TODO: understand the reason for this -- see http://groups.google.com/group/microsoft.public.dotnet.xml/browse_thread/thread/25df058a4efb5698/7dd0a8b71739216c?lnk=st&q=xmlnode+removechild+recursive&rnum=2&hl=en#7dd0a8b71739216c
    XmlNodeList nodesToDelete = document.SelectNodes("//*[@DeleteNode='TRUE']");
    while (nodesToDelete.Count > 0)
    {
        foreach (XmlNode nodeToDelete in nodesToDelete)
        {
            nodeToDelete.ParentNode.RemoveChild(nodeToDelete);
        }
        nodesToDelete = document.SelectNodes("//*[@DeleteNode='TRUE']");
    }
}
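A possible explanation for the TODO in the comments above, stated as an assumption rather than something verified here: the list returned by SelectNodes seems to be evaluated lazily, so removing nodes while enumerating it can end the enumeration early. The sketch below copies the matches into a List<XmlNode> first and then removes them in a single pass. The class and method names (MarkedNodeRemover, RemoveMarked) are made up for illustration, and the XPath assumes the marker attribute is DeleteNode="TRUE" as in the comments.

```csharp
using System.Collections.Generic;
using System.Xml;

static class MarkedNodeRemover
{
    // Hypothetical helper: removes every element carrying DeleteNode="TRUE".
    // Returns the number of nodes that were detached from their parent.
    public static int RemoveMarked(XmlDocument document)
    {
        // Snapshot the matches so that removals cannot disturb the enumeration.
        var snapshot = new List<XmlNode>();
        foreach (XmlNode node in document.SelectNodes("//*[@DeleteNode='TRUE']"))
        {
            snapshot.Add(node);
        }

        int removed = 0;
        foreach (XmlNode node in snapshot)
        {
            // A marked node inside an already removed subtree still has a parent
            // within that detached subtree, so removing it again is harmless.
            if (node.ParentNode != null)
            {
                node.ParentNode.RemoveChild(node);
                removed++;
            }
        }
        return removed;
    }
}
```

With this shape the outer while loop should no longer be needed, since every match is collected before the first removal.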
When I use the PrepNodesForDeletion method without the infinityIndex counter, I get an OutOfMemoryException for a few HTML documents. However, if I use the infinityIndex counter, it may not delete nodes for some HTML documents.

Could anybody suggest a way to remove the recursion? Also, I am not familiar with the Html Agility Pack, so if this can be done using it, could somebody provide a code sample?
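For reference, here is a rough sketch of what a non-recursive traversal could look like, using an explicit Stack<XmlNode> instead of the call stack. As far as I can tell, the recursive method ends up comparing every node with each of its descendants, and it appears to revisit the same subtrees many times along the way. The sketch visits each (ancestor, descendant) pair once. IterativeWalk, Descendants and VisitAncestorDescendantPairs are hypothetical names, and the compare callback stands in for the AreNamesSame / AllAttributesPresent / MarkNodeForDeletion logic above. The visiting order differs from the recursive version, so results may differ if marking a node changes what the comparison helpers see.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class IterativeWalk
{
    // Yields every descendant of 'root' (not 'root' itself) in document order,
    // using an explicit stack rather than recursion.
    public static IEnumerable<XmlNode> Descendants(XmlNode root)
    {
        var stack = new Stack<XmlNode>();
        // Push children last-to-first so the first child is popped first.
        for (XmlNode c = root.LastChild; c != null; c = c.PreviousSibling)
        {
            stack.Push(c);
        }
        while (stack.Count > 0)
        {
            XmlNode current = stack.Pop();
            yield return current;
            for (XmlNode c = current.LastChild; c != null; c = c.PreviousSibling)
            {
                stack.Push(c);
            }
        }
    }

    // Calls 'compare' once for each (ancestor, descendant) pair under 'root',
    // with 'root' itself included as a possible ancestor.
    public static void VisitAncestorDescendantPairs(XmlNode root, Action<XmlNode, XmlNode> compare)
    {
        // Snapshot the candidate ancestors before any callback runs.
        var candidates = new List<XmlNode> { root };
        candidates.AddRange(Descendants(root));

        foreach (XmlNode ancestor in candidates)
        {
            foreach (XmlNode descendant in Descendants(ancestor))
            {
                compare(ancestor, descendant);
            }
        }
    }
}
```

It would be called along the lines of IterativeWalk.VisitAncestorDescendantPairs(document.DocumentElement, (against, node) => { /* comparison and marking go here */ }); the callback should only mark nodes, not remove them, because the tree is still being walked.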
