Sunday, June 24, 2012

Introduction to XPath for ASP.NET Developers




Tree-like Representation of an XML Document
In XPath, an XML document is represented as a tree of nodes. Each node can have a parent node above it and one or more child nodes below it.

XPath defines seven types of node (root, element, attribute, text, namespace, processing instruction and comment), but for practical purposes a node in XPath usually corresponds to an element or an attribute within an XML document.

For example, have a look at the following XML document:

<?xml version="1.0" encoding="utf-8" ?>
<userinfo>
 <username admin="true">someUserName1</username>
 <email>xyz@whatever.com</email>
</userinfo>

The following is XPath's tree representation of the above document:
  • userinfo
    • username
      • admin
    • email

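Each node is related to the nodes above and below it. In our XML document above, userinfo is the root element.

userinfo is the parent of the username and email nodes, and the username and email nodes are siblings.

username and email are children of the userinfo node. Similarly, admin belongs to the username node. Strictly speaking, XPath treats username as the parent of the admin attribute but does not count an attribute as a child of its element, so expressions that select child nodes will not return attributes.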
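
The relationships described above can be checked from C# by walking the tree with XmlDocument. This is only a minimal sketch: the class name TreeDemo and its Relations method are made up for illustration, and the userinfo document is inlined as a string (without whitespace between elements) instead of being read from a file.

```csharp
using System.Xml;

public static class TreeDemo
{
    // The userinfo document from above, inlined so the sketch is self-contained.
    public const string Xml =
        "<userinfo>" +
        "<username admin=\"true\">someUserName1</username>" +
        "<email>xyz@whatever.com</email>" +
        "</userinfo>";

    // Returns four facts about the username node:
    // its parent, its next sibling, its number of child nodes,
    // and the element that owns its admin attribute.
    public static string[] Relations()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        XmlNode username = doc.DocumentElement.FirstChild;
        XmlAttribute admin = username.Attributes["admin"];

        return new string[]
        {
            username.ParentNode.Name,              // parent of username
            username.NextSibling.Name,             // sibling that follows username
            username.ChildNodes.Count.ToString(),  // only the text node, not the attribute
            admin.OwnerElement.Name                // element the attribute belongs to
        };
    }
}
```

Calling TreeDemo.Relations() should report userinfo as the parent, email as the sibling, a single child node (the text inside username) and username as the owner of the admin attribute.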

Selecting Nodes within an XML Document

Now that we understand how elements and attributes are represented as nodes in XPath, we will focus on how to use XPath expressions to select one or more nodes within an XML document.

XPath Expressions
The following are some of the expressions that you can use to select one or more nodes from the XML document above:

/userinfo - Selects the root element.
/userinfo/username - Selects the username node that is a child of the userinfo root element.
//email - Selects all nodes named "email", irrespective of where they lie in the document.
//username[@admin] - Selects all nodes named "username" that have an attribute named "admin".
/userinfo/username[1] - Selects the first username node that is a child of the userinfo node.
/userinfo/username[last()] - Selects the last username node that is a child of the userinfo node. This is mainly useful when userinfo has more than one username child; with a single child it returns the same node as [1].
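
To try the expressions above without setting up a web page, a small helper along the following lines may be useful. It is a sketch only: ExpressionDemo, Count and FirstText are hypothetical names, and the userinfo document is inlined as a string. SelectNodes and SelectSingleNode are the XmlNode methods that evaluate an XPath expression.

```csharp
using System.Xml;

public static class ExpressionDemo
{
    // The userinfo document from above, inlined so the sketch is self-contained.
    public const string Xml =
        "<userinfo>" +
        "<username admin=\"true\">someUserName1</username>" +
        "<email>xyz@whatever.com</email>" +
        "</userinfo>";

    private static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    // Number of nodes the expression selects.
    public static int Count(string xpath)
    {
        return Load().SelectNodes(xpath).Count;
    }

    // Text of the first node the expression selects, or null if nothing matches.
    public static string FirstText(string xpath)
    {
        XmlNode node = Load().SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

For example, ExpressionDemo.Count("//username[@admin]") should return 1, while ExpressionDemo.Count("//email[@admin]") should return 0 because the email element has no admin attribute.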

Practical Demonstration of XPath Expressions

We will now create a sample XML document and then use XPath expressions to select and display only a few nodes from it.

Sample XML Document
Copy and paste the following text into a new text file and save it as "sample.xml" in the /App_Data folder of your ASP.NET web application:

<?xml version="1.0" encoding="utf-8" ?>
<article>
 <author isadmin="true">Faisal Khan</author>
 <title>Sample XML Document</title>
 <body>
  <page>This is page #1.</page>
  <page>This is page #2.</page>
  <page>This is page #3.</page>
 </body>
</article>


XPath.aspx ASP.NET Page
Now, we will create the ASP.NET page which will read the above sample.xml file and selectively display its contents using XPath.

Copy and paste the following text into a new text file and save it as "XPath.aspx" in your ASP.NET web application:


<%@ Page Language="C#" AutoEventWireup="true" %>

<%@ Import Namespace="System.Xml" %>

<script runat="server">
 protected void Page_Load(object source, EventArgs e)
 {
  XmlDocument doc = new XmlDocument();
  doc.Load(Server.MapPath("~/App_Data/sample.xml"));

  XmlNodeList nodes = doc.SelectNodes("/article/body/page");

  foreach (XmlNode node in nodes)
  {
   TableRow row = new TableRow();
   TableCell cell = new TableCell();
   cell.Text = node.FirstChild.InnerText;

   row.Cells.Add(cell);
   PagesTable.Rows.Add(row);
  }
 }
</script>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Using XPath Expressions</title>
    <style type="text/css">
  body { font-family: Verdana; font-size: 9pt; }
  .name { background-color: #F7F7F7; }
    </style>
</head>
<body>
    <form id="form1" runat="server">
    <div>
  <asp:Table id="PagesTable" runat="server" />
    </div>
    </form>
</body>
</html>

We will look into its code a little later. For now, when I ran this ASP.NET page on my computer, it displayed a table containing only the text of the three page elements from the XML file: "This is page #1.", "This is page #2." and "This is page #3.".

Looking into the Code

In the previous tutorial, we learned how to read an XML file and display its contents in an ASP.NET page using the XmlDocument class from the System.Xml namespace. In this tutorial, we will focus on how to use XPath expressions to return only the nodes we want to display to the user.

protected void Page_Load(object source, EventArgs e)
{
 XmlDocument doc = new XmlDocument();
 doc.Load(Server.MapPath("~/App_Data/sample.xml"));

 XmlNodeList nodes = doc.SelectNodes("/article/body/page");

 foreach (XmlNode node in nodes)
 {
  TableRow row = new TableRow();
  TableCell cell = new TableCell();
  cell.Text = node.FirstChild.InnerText;

  row.Cells.Add(cell);
  PagesTable.Rows.Add(row);
 }
}

We create a new instance of the XmlDocument class and make it load our "sample.xml" file. Next, since we only want to display the page elements, we use a simple XPath expression, "/article/body/page", to select only the page nodes.

XmlDocument doc = new XmlDocument();
doc.Load(Server.MapPath("~/App_Data/sample.xml"));

XmlNodeList nodes = doc.SelectNodes("/article/body/page");
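
It can help to know what SelectNodes gives back when an expression matches nothing. The sketch below assumes the sample article inlined as a string; SelectDemo and its two methods are hypothetical names. It relies on SelectNodes returning an empty list for no match, whereas SelectSingleNode returns null.

```csharp
using System.Xml;

public static class SelectDemo
{
    // The sample.xml article, inlined so the sketch is self-contained.
    public const string Xml =
        "<article>" +
        "<author isadmin=\"true\">Faisal Khan</author>" +
        "<title>Sample XML Document</title>" +
        "<body>" +
        "<page>This is page #1.</page>" +
        "<page>This is page #2.</page>" +
        "<page>This is page #3.</page>" +
        "</body>" +
        "</article>";

    private static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    // Number of nodes selected; 0 when the expression matches nothing.
    public static int MatchCount(string xpath)
    {
        XmlNodeList nodes = Load().SelectNodes(xpath);
        return nodes.Count;
    }

    // True when SelectSingleNode finds at least one matching node.
    public static bool HasMatch(string xpath)
    {
        return Load().SelectSingleNode(xpath) != null;
    }
}
```

With the sample document, SelectDemo.MatchCount("/article/body/page") should be 3, and an expression such as "/article/body/chapter" should give a count of 0, so a foreach loop over the result simply does nothing.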

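Next, we iterate through the returned list of XmlNode objects and insert the contents of each one into our ASP.NET table. To get the text of a page element from the XML file, we use node.FirstChild.InnerText: FirstChild is the text node inside the page element, and its InnerText property is the text itself. Note that FirstChild is null for an element with no content, so this only works when every page element contains text; node.InnerText returns the same string here and also copes with empty elements.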

foreach (XmlNode node in nodes)
{
 TableRow row = new TableRow();
 TableCell cell = new TableCell();
 cell.Text = node.FirstChild.InnerText;

 row.Cells.Add(cell);
 PagesTable.Rows.Add(row);
}
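
The same loop can be tried outside of a web page by collecting the text into a list instead of table rows. This is a sketch under a few assumptions: PageTextDemo and PageTexts are made-up names, the XML is passed in as a string, and node.InnerText is used in place of node.FirstChild.InnerText so that an empty page element yields an empty string.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class PageTextDemo
{
    // Returns the text of every /article/body/page element, in document order.
    public static List<string> PageTexts(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> texts = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("/article/body/page"))
        {
            // InnerText is the concatenated text of the node and its children;
            // it is an empty string when the element has no content.
            texts.Add(node.InnerText);
        }
        return texts;
    }
}
```

In the ASP.NET page, each string in the returned list would become the Text of one TableCell.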

Some More XPath Expressions
Had we only wanted to fetch the first page element from the XML document, what is the XPath expression we should have used? And what if we wanted to return only the last page element?
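
One possible answer follows from the [1] and [last()] predicates introduced earlier. The sketch below is only an illustration: FirstLastDemo and PageText are hypothetical names, and the sample article is inlined as a string.

```csharp
using System.Xml;

public static class FirstLastDemo
{
    // The sample.xml article, inlined so the sketch is self-contained.
    public const string Xml =
        "<article>" +
        "<author isadmin=\"true\">Faisal Khan</author>" +
        "<title>Sample XML Document</title>" +
        "<body>" +
        "<page>This is page #1.</page>" +
        "<page>This is page #2.</page>" +
        "<page>This is page #3.</page>" +
        "</body>" +
        "</article>";

    // Text of the single node selected by the expression, or null if none matches.
    public static string PageText(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

FirstLastDemo.PageText("/article/body/page[1]") should return "This is page #1." and FirstLastDemo.PageText("/article/body/page[last()]") should return "This is page #3.". In XPath.aspx, passing either expression to doc.SelectNodes would leave a single row in the table.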
