Basic XPath Theory

This chapter will provide you with a basic understanding of XPath: just enough to cover the basic requirements for writing Selenium tests.

XPath is the XML Path Language. Since all HTML, once loaded into a browser, becomes well structured and can be viewed as an XML tree, we can use XPath to traverse it.

XPath is more than a path syntax. It is an expression language, so you can perform calculations (e.g. last()-1) and use boolean operations (e.g. or, and).

NOTE: To help follow this section you might want to visit the web page:

http://compendiumdev.co.uk/selenium/basic_web_page.html

With the page loaded, use the Firebug plugin FireFinder to try out the XPath statements listed.

I’ll include the listing of the XHTML for basic_web_page.html here so you can follow along:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Basic Web Page Title</title>
</head>
<body>
<p id="para1" class="main">A paragraph of text</p>
<p id="para2">Another paragraph of text</p>
</body>
</html>

15.1 XPath Expressions

XPath expressions select “nodes” or “node-sets” from an XML document.

e.g. The XPath expression //p would select the following node-set from the example in the “Basic HTML Theory” section:

Figure 15.1: FireFinder matching p tag
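As a rough way to see a node-set outside the browser, the sketch below evaluates //p with the XPath engine bundled in the JDK (javax.xml.xpath) rather than with Selenium or FireFinder. The PAGE string is a trimmed stand-in for the example page, with the p attributes assumed from the ids used later in this chapter. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeSetSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Evaluate an XPath expression against PAGE and return the matching node-set.
    static NodeList select(String expression) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Print the text of every node in the node-set selected by //p.
        NodeList paragraphs = select("//p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            System.out.println(paragraphs.item(i).getTextContent());
        }
    }
}
```

The node-set holds both paragraph elements, which is what FireFinder highlights in the figure.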

15.2 Node Types

XPath has different types of nodes. In the example XHTML these are:

• Document node (the root of the XML tree, which contains the <html> element)
• Element node e.g.:
  – <head><title>Basic Web Page Title</title></head>
  – <title>Basic Web Page Title</title>
  – <p id="para1" class="main">A paragraph of text</p>
• Attribute node e.g.
  – id="para1"

15.3 Selections

15.3.1 Start from root

Start selection from the document node with /. This allows you to create absolute path expressions e.g.

• /html/body/p

Matches all the paragraph element nodes.

assertEquals(2, selenium.getXpathCount("/html/body/p"));

15.3.2 Start from Anywhere

Start selection matching anywhere in the document with //. This allows you to create relative path expressions. e.g.

• //p

Matches all paragraph element nodes.

assertEquals(2, selenium.getXpathCount("//p"));
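To compare / and // without a browser, here is a minimal sketch that counts matches using the JDK's javax.xml.xpath engine instead of selenium.getXpathCount. PAGE is an assumed, trimmed copy of the example page, and count is a hypothetical helper.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathStartSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Count the nodes an expression selects, loosely mirroring getXpathCount.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/html/body/p")); // full path from the root
        System.out.println(count("//p"));          // p anywhere in the document
        System.out.println(count("/body/p"));      // body is not at the root, so nothing matches
        System.out.println(count("//body/p"));     // body found anywhere, then its p children
    }
}
```

A path starting with / has to spell out every step from the root, which is why /body/p selects nothing while //body/p finds both paragraphs.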

15.3.3 By Element Attributes

Select attribute nodes with @ followed by an attribute name. e.g.

• //@id

Would select all the id attribute nodes.

assertEquals(2, selenium.getXpathCount("//@id"));
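The following sketch lists the attribute nodes that //@id selects, again using the JDK's XPath engine as a stand-in for the browser. PAGE and the values helper are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributeNodeSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Return the value of every node selected; for attribute nodes this is the attribute value.
    static List<String> values(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(values("//@id"));    // the id attribute nodes
        System.out.println(values("//@class")); // only one paragraph has a class in this stand-in
    }
}
```

Note that the nodes returned are the attributes themselves, not the elements that carry them.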

15.4 Predicates

Predicates help make selections more specific and are surrounded by square brackets.

15.4.1 Predicates can be indexes

• //p[2]

Matches the second p element node in the node-set.

assertEquals("Another paragraph of text", selenium.getText("//p[2]"));

• //p[1]

Matches the first p element node.

assertEquals("A paragraph of text", selenium.getText("//p[1]"));

15.4.2 Predicates can be attribute selections

• //p[@id='para1']

Matches the p element node where the value of the attribute id is para1.

assertEquals("A paragraph of text", selenium.getText("//p[@id='para1']"));

• //p[@class='main']

Matches the p element node where the value of the attribute class is main.

assertEquals("A paragraph of text", selenium.getText("//p[@class='main']"));

15.4.3 Predicates can be XPath functions

• //p[last()]

Select the last paragraph.

assertEquals("Another paragraph of text", selenium.getText("//p[last()]"));

15.4.4 Predicates can be comparative statements

• //p[position()>1]

This returns all but the first p element.

assertEquals("Another paragraph of text", selenium.getText("//p[position()>1]"));
assertEquals("A paragraph of text", selenium.getText("//p[position()>0]"));
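The predicate forms above can be tried together in one place. This sketch uses the JDK's XPath engine, whose string result is the text of the first matching node, so it behaves roughly like the getText calls when several nodes match. PAGE and firstText are assumptions for illustration. One detail worth knowing: //p[2] strictly means "a p that is the second p child of its parent"; on this page both paragraphs share the body parent, so it is also the second paragraph overall.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PredicateSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Return the text of the first node the expression selects (empty string if none).
    static String firstText(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "//p[2]",              // index predicate
            "//p[@id='para1']",    // attribute predicate
            "//p[last()]",         // function predicate
            "//p[position()>1]",   // comparison predicate
            "//p[position()>0]"    // matches both paragraphs; the first one is reported
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + firstText(expression));
        }
    }
}
```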

15.5 Combining Match Queries

You can combine several selections by using “|” e.g.

• //p | //head

Match any paragraph element node and also get the head element node.

assertEquals(3, selenium.getXpathCount("//p | //head"));
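A small sketch of the union operator, evaluated with the JDK's XPath engine rather than Selenium. It lists the element names that //p | //head selects. PAGE and the names helper are assumed for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class UnionSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Return the node name of every node the expression selects.
    static List<String> names(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // One head element plus two p elements.
        System.out.println(names("//p | //head"));
    }
}
```

The result is a single node-set, so a node matched by both sides of the | would only appear once.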

15.6 Wild Card Matches

You can also use wild cards:

15.6.1 node()

node() matches any type of node on the axis being searched. On the default child axis that means elements, text and comments, but not attributes. e.g.

• //node()[@id='para1']

This matches any node with an id of para1.

assertEquals(1, selenium.getXpathCount("//node()[@id='para1']"));

//node() matches all the nodes (try it, you may not get the results you expect).

//body/node() matches all the nodes in the body (again, try it to see if you get the value you expect).

15.6.2 *

Match anything, depending on its position in the expression, with * e.g.

• @*

Matches any attribute.

• //p[@*='para1']

Would match the first paragraph.

assertEquals(1, selenium.getXpathCount("//p[@*='para1']"));

* can match nodes e.g.

• //*[@*]

Matches anything with any attribute.

assertEquals(2, selenium.getXpathCount("//*[@*]"));

• //*[@id]

Matches anything with an id attribute.

assertEquals(2,selenium.getXpathCount("//*[@id]"));

• /html/*

Matches all the child elements of the root html element (head and body).

assertEquals(2, selenium.getXpathCount("/html/*"));
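One likely reason node() gives surprising counts is whitespace: line breaks between tags are text nodes, which node() matches and * does not. The sketch below compares the two on a compact and an indented copy of the page, using the JDK's XPath engine. Both page strings and the count helper are assumptions for illustration, and a browser's DOM may differ in detail.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WildcardSketch {
    // Stand-in page with no whitespace between tags (the attributes are assumptions).
    static final String COMPACT =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // The same page with a line break around each paragraph inside the body.
    static final String INDENTED =
        "<html><head><title>Basic Web Page Title</title></head><body>\n"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>\n"
        + "<p id=\"para2\">Another paragraph of text</p>\n"
        + "</body></html>";

    // Count the nodes an expression selects in the given XML.
    static int count(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(COMPACT, "//body/node()"));  // just the two p elements
        System.out.println(count(INDENTED, "//body/node()")); // p elements plus whitespace text nodes
        System.out.println(count(INDENTED, "//body/*"));      // * only matches elements
        System.out.println(count(COMPACT, "//node()"));       // elements and their text nodes
    }
}
```

For locating elements in tests, * is usually the safer wildcard because it ignores text nodes.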

15.7 Boolean Operators

You can set up matches with multiple conditions.

15.7.1 and

• //p[starts-with(@id,'para') and contains(.,'Another')]

Find all paragraphs where the id starts with para and the text contains Another i.e. the second paragraph.

assertEquals("Another paragraph of text", selenium.getText("//p[starts-with(@id,'para') and contains(.,'Another')]"));

15.7.2 or

• //*[@id='para1' or @id='para2']

Find any node where the id is para1 or the id is para2 i.e. our two paragraphs.

assertEquals(2, selenium.getXpathCount("//*[@id='para1' or @id='para2']"));
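Here is a brief sketch of both operators, run through the JDK's XPath engine instead of Selenium. PAGE and the texts helper are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BooleanOperatorSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Return the text content of every node the expression selects.
    static List<String> texts(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // and: both conditions must hold for the same element.
        System.out.println(texts("//p[starts-with(@id,'para') and contains(.,'Another')]"));
        // or: either condition is enough.
        System.out.println(texts("//*[@id='para1' or @id='para2']"));
    }
}
```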

15.8 XPath Functions

Since XPath is an expression language, it has built-in functions which we can use in our XPath statements. Some common XPath functions are listed below.

15.8.1 contains()

contains() allows you to match the value of attributes and elements based on text anywhere in the comparison item e.g.

• //p[contains(.,'text')]

Match any paragraph that has the string text somewhere in its content, i.e. both our paragraphs.

assertEquals(2, selenium.getXpathCount("//p[contains(.,'text')]"));

• //p[contains(.,'Another')]

Match any paragraph with Another in the paragraph text. In our example this would match the second paragraph.

assertEquals("Another paragraph of text", selenium.getText("//p[contains(.,'Another')]"));

• //p[contains(@id,'1')]

This would match any paragraph where the id had 1 in it. In our example this is the first paragraph.

assertEquals("A paragraph of text", selenium.getText("//p[contains(@id,'1')]"));

15.8.2 starts-with()

starts-with() allows you to match the value of attributes and elements based on text at the start of the comparison item e.g.

• //*[starts-with(.,'Basic')]

Would match any node where the contents of that node start with Basic. In our example this would match the title.

assertEquals("Basic Web Page Title", selenium.getText("//*[starts-with(.,'Basic')]"));

• //*[starts-with(@id,'p')]

This would match any node where the id started with p. In our example this would match the paragraphs.

assertEquals(2, selenium.getXpathCount("//*[starts-with(@id,'p')]"));
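The sketch below runs the contains() and starts-with() examples through the JDK's XPath engine. PAGE and names are assumptions for illustration. PAGE has no whitespace between tags, so the text of html and head also begins with Basic and //*[starts-with(.,'Basic')] matches three elements here; on a page with line breaks before the title text, only title would match. Naming the element, as in //title[...], avoids depending on that.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FunctionSketch {
    // Trimmed stand-in for the example page, with no whitespace between tags.
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Return the element name of every node the expression selects.
    static List<String> names(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(names("//p[contains(.,'text')]"));        // both paragraphs
        System.out.println(names("//p[contains(@id,'1')]"));         // first paragraph only
        System.out.println(names("//*[starts-with(@id,'p')]"));      // both paragraphs
        System.out.println(names("//*[starts-with(.,'Basic')]"));    // title and, here, its ancestors
        System.out.println(names("//title[starts-with(.,'Basic')]")); // title only
    }
}
```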

15.8.3 Many More

There are many XPath functions available to you. I have just picked a few of the most common ones that I use. I recommend that you visit some of the web sites below to learn more about XPath functions, and experiment with them.

Recommended web sites for function references:

• http://unow.be/rc/w3xpath [1]
• http://unow.be/rc/msxpathref [2]
• http://unow.be/rc/pitstop [3]

15.9 XPath optimisation

For our testing we typically want to get the shortest and least brittle XPath statement to identify elements on a page.

Some XPath optimisation strategies that I have used are:

• use the id,
• use a combination of attributes to make the XPath more specific,
• start at the first unique element

We have to make a trade off between handling change and false positives. So we want the XPath to return the correct item, but don’t want the test to break when simple changes are made to the application under test.

15.9.1 Use the ID

If the element has a known id then use that e.g.

• //*[@id='para2']

Or you probably want to be even more specific and state the type e.g.

• //p[@id='para2']

[1] http://www.w3schools.com/XPath/xpath_functions.asp
[2] http://msdn.microsoft.com/en-us/library/ms256115.aspx
[3] http://www.xmlpitstop.com/ListTutorials/DispContentType/XPath/PageNumber/1.aspx

15.9.2 Use the attributes

If it doesn’t have an id, but you can identify it with a combination of attributes, then do that. Our example XHTML doesn’t have enough nodes to make this clear, but we did this with our initial Search Engine testing.

e.g. //input[@name='q' and @title='Search']

15.9.3 Start at the first unique element

If there is really nothing to distinguish the element then look up the ancestor chain and find the first unique element.

e.g. //form/table[1]/tbody/tr[1]/td[2]/input[2]

This approach starts to introduce the chance of false positives since a new input might be added before the one we want, and the test would start using that instead.
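To make that false positive concrete, here is a sketch with two versions of an invented search form, evaluated with the JDK's XPath engine. The markup, the input names (lang, q, safe) and the nameOf helper are all hypothetical. Only the positional expression comes from the text above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class BrittleLocatorSketch {
    static final String POSITIONAL = "//form/table[1]/tbody/tr[1]/td[2]/input[2]";
    static final String BY_ATTRIBUTE = "//input[@name='q']";

    // Hypothetical form: the input we want (q) is the second input in the second cell.
    static final String BEFORE =
        "<form><table><tbody><tr><td>Search</td>"
        + "<td><input name=\"lang\"/><input name=\"q\"/></td>"
        + "</tr></tbody></table></form>";

    // The same form after a developer adds a new input at the start of the cell.
    static final String AFTER =
        "<form><table><tbody><tr><td>Search</td>"
        + "<td><input name=\"safe\"/><input name=\"lang\"/><input name=\"q\"/></td>"
        + "</tr></tbody></table></form>";

    // Return the name attribute of the first element the locator selects.
    static String nameOf(String xml, String locator) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                locator + "/@name", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + locator, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(BEFORE, POSITIONAL));   // the intended input
        System.out.println(nameOf(AFTER, POSITIONAL));    // silently a different input
        System.out.println(nameOf(AFTER, BY_ATTRIBUTE));  // still the intended input
    }
}
```

The positional locator does not fail after the change; it quietly points at another element, which is harder to notice than a broken test.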

15.10 Selenium XPath Usage

Selenium uses XPath in locators to identify elements e.g.

selenium.isElementPresent("xpath=//p[@id='para1']")

Because only XPath locators start with // it is possible to write XPath locators without adding xpath= on the front. e.g.

selenium.isElementPresent("//p[@id='para1']")

The specific XPath command getXpathCount expects an XPath statement as its argument, so you should not use xpath= in front of the XPath locator. This is possibly a good reason for not using xpath= in any of your locators, but each of us has personal coding styles so you get to make a choice as to which you prefer. e.g.

selenium.getXpathCount("//p"); //return a count of the p elements

You can combine the XPath statement in the getAttribute statement to get specific attributes from elements e.g.

assertEquals("para2", selenium.getAttribute("xpath=//p[2]@id"));
assertEquals("para2", selenium.getAttribute("//p[2]@id"));

The @id (or more specifically @<attribute-name>) means that the statement is not valid XPath, but Selenium parses the locator and knows to split off the @id on the end before using it.
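The text does not show how Selenium performs that split, so the following is only a guess at the idea, not Selenium's actual code: drop an optional xpath= prefix, cut the locator at its last @, evaluate the left part as XPath and read the named attribute from the element found. It uses the JDK's XPath engine. PAGE, split and getAttribute are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeLocatorSketch {
    // Trimmed stand-in for the example page (the attributes are assumptions).
    static final String PAGE =
        "<html><head><title>Basic Web Page Title</title></head><body>"
        + "<p id=\"para1\" class=\"main\">A paragraph of text</p>"
        + "<p id=\"para2\">Another paragraph of text</p>"
        + "</body></html>";

    // Split "xpath=//p[2]@id" into { "//p[2]", "id" } at the last @ (assumed behaviour).
    static String[] split(String attributeLocator) {
        String locator = attributeLocator.startsWith("xpath=")
            ? attributeLocator.substring("xpath=".length())
            : attributeLocator;
        int at = locator.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("No @attribute on the end of: " + attributeLocator);
        }
        return new String[] { locator.substring(0, at), locator.substring(at + 1) };
    }

    // Find the element with the XPath part, then read the named attribute from it.
    static String getAttribute(String attributeLocator) {
        String[] parts = split(attributeLocator);
        try {
            Element element = (Element) XPathFactory.newInstance().newXPath().evaluate(
                parts[0], new InputSource(new StringReader(PAGE)), XPathConstants.NODE);
            if (element == null) {
                throw new IllegalArgumentException("No element matches: " + parts[0]);
            }
            return element.getAttribute(parts[1]);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + parts[0], e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getAttribute("xpath=//p[2]@id"));
        System.out.println(getAttribute("//p[@class='main']@id"));
    }
}
```

Cutting at the last @ rather than the first keeps predicates such as [@class='main'] inside the XPath part.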
