Posted on Thursday, March 18, 2004 3:51 PM by jdixon
XPath Performance in .NET (Revisited)
In a previous post, I mentioned the performance increase that can be had when using XPath queries in .NET. As I said then, you need to use an XPathDocument instead of an XmlDocument. Well, being the hard-headed person that I am, I had to learn this the hard way, AGAIN.
I needed to process an XML document that contains hundreds of nodes. These nodes themselves are quite large. (My test document was 11MB, but in real life, the documents could reach 80MB.) I apparently lost my mind and wrote some code similar to this (code changed to protect my client!):
    ReturnValue = false;
    foreach (XmlNode Node in _SingleNodeList)
    {
        if (<PSEUDOCODE: node contains required data?>)
        {
            ReturnValue = true;
            break;
        }
    }
During testing, I found that processing my one test document required 22 minutes. WAY TOO LONG! I went home and slept on the problem.
The next day, I wrote something like this:
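    // Evaluate takes the XPath expression as a string; count() yields a number, which comes back boxed as a double.
    Result = XPathNav.Evaluate("count(<PSEUDOCODE: XPath expression that returns matching nodes>)");
    NodeCount = (double)Result;
    ReturnValue = (NodeCount > 0);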
I ran the test again. It took a total of 12 seconds. That's right, I reduced the processing time from 22 minutes to 12 seconds simply by using XPath queries properly instead of iterating over an XmlDocument. When I ran the REAL test, processing 2,029 files, it took 54 minutes. Looks like I learned a lesson again. Hopefully, this will save someone else the time that I wasted.
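For anyone who wants to try the count() approach end to end, here is a minimal sketch. It is not my client's code: the class name, the sample XML, and the XPath expressions are all made up for illustration, and it assumes the XML is small enough to hold in a string (a real 80MB document would more likely be loaded from a file path or stream).

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class XPathCountSketch
{
    // Hypothetical sample data; the real documents were far larger.
    public const string SampleXml =
        "<orders>" +
        "<order id=\"1\"><status>open</status></order>" +
        "<order id=\"2\"><status>closed</status></order>" +
        "<order id=\"3\"><status>open</status></order>" +
        "</orders>";

    // Loads the XML into a read-only XPathDocument and asks the XPath
    // engine to count the nodes selected by the given expression.
    public static double CountMatches(string xml, string xpath)
    {
        using (StringReader reader = new StringReader(xml))
        {
            XPathDocument document = new XPathDocument(reader);
            XPathNavigator navigator = document.CreateNavigator();

            // count() returns an XPath number, which Evaluate boxes as a double.
            object result = navigator.Evaluate("count(" + xpath + ")");
            return (double)result;
        }
    }

    // True when at least one node matches the expression.
    public static bool HasMatch(string xml, string xpath)
    {
        return CountMatches(xml, xpath) > 0;
    }

    public static void Main()
    {
        // "open" orders exist in the sample; "shipped" orders do not.
        Console.WriteLine(HasMatch(SampleXml, "/orders/order[status='open']"));
        Console.WriteLine(HasMatch(SampleXml, "/orders/order[status='shipped']"));
    }
}
```

With the sample data above, the first check finds two matching nodes and the second finds none. The point of the design is that the matching happens inside a single XPath evaluation rather than in a hand-written loop over nodes.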