X-Path and Namespaces

What is X-Path?

X-Path is a language for locating and processing items in XML documents. Its addressing syntax follows a path through the document’s logical structure, or hierarchy. X-Path uses path expressions to select nodes or node-sets in an XML document.

Using the following XML document:

<item>
   <book>
      <title>Cheaper by the Dozen</title>
      <number type="isbn">1568491379</number>
      <author>
         <name>John Doe</name>
      </author>
   </book>
   <note>
      <p>This is a funny book!</p>
      <author>
         <name>Jake McEvoy</name>
      </author>
   </note>
</item>

You can use the X-Path expression “/item/book/author/name” to select the name element of the book’s author:

<item>
   <book>
    …
      <author>
         <name>John Doe</name>
       …
</item>

Use the expression “/item/book/number/@type” to select the attribute type="isbn":

<item>
   <book>
   …
      <number type="isbn">1568491379</number>
      …
</item>
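
As a minimal sketch, the two selections above can be reproduced with Python's standard xml.etree.ElementTree module. This is an illustration only and is not how Caristix software evaluates X-Path. ElementTree implements a subset of X-Path, so two adaptations are assumed here: paths are written relative to the root element (“/item/book/author/name” becomes “./book/author/name”), and because a final “@type” step is not supported, the attribute is read from the selected element instead.

```python
import xml.etree.ElementTree as ET

XML = """<item>
   <book>
      <title>Cheaper by the Dozen</title>
      <number type="isbn">1568491379</number>
      <author>
         <name>John Doe</name>
      </author>
   </book>
   <note>
      <p>This is a funny book!</p>
      <author>
         <name>Jake McEvoy</name>
      </author>
   </note>
</item>"""

# fromstring() returns the root <item> element, so "." stands for /item.
root = ET.fromstring(XML)

# Equivalent of /item/book/author/name
name = root.find("./book/author/name")
print(name.text)  # John Doe

# Equivalent of /item/book/number/@type:
# select the element first, then read its attribute.
number = root.find("./book/number")
print(number.get("type"))  # isbn
```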

 

Absolute vs relative X-Path

An absolute X-Path uses the complete path from the root element to the desired element (item > book > author > name). To select both the book’s author and the note’s author with a single X-Path query, use the relative X-Path syntax “//author/name”:

<item>
   …
   <author>
      <name>John Doe</name>
   </author>
   …
   <author>
      <name>Jake McEvoy</name>
   </author>
   …
</item>

A relative X-Path of this form selects an element no matter where it is located in the XML document: the leading “//” searches every level below the document root.
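
The difference between the two forms can be sketched with Python's standard xml.etree.ElementTree module, using a condensed copy of the sample document. ElementTree supports only a subset of X-Path and evaluates paths from the root element, so the expressions are written with a leading “.” here; that adaptation is an assumption of this sketch, not part of the original queries.

```python
import xml.etree.ElementTree as ET

# Condensed version of the sample document (titles and notes omitted).
XML = """<item>
   <book><author><name>John Doe</name></author></book>
   <note><author><name>Jake McEvoy</name></author></note>
</item>"""

root = ET.fromstring(XML)

# Full path from the root: only the book's author matches.
absolute = [n.text for n in root.findall("./book/author/name")]
print(absolute)  # ['John Doe']

# "//" searches all levels: both authors match, in document order.
anywhere = [n.text for n in root.findall(".//author/name")]
print(anywhere)  # ['John Doe', 'Jake McEvoy']
```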

 

Namespaces

XML namespaces provide uniquely named elements and attributes in an XML document. An XML instance may contain element or attribute names from more than one XML vocabulary. If each vocabulary is given a namespace, the ambiguity between identically named elements or attributes can be resolved. In the following example, the prefix “lib” is used for the “library” vocabulary, and the prefix “rev” is used for the “review” vocabulary.

<item>
   <book xmlns:lib="urn:vocabulary.library">
      <title>Cheaper by the Dozen</title>
      <number type="isbn">1568491379</number>
      <lib:author>
         <lib:name>John Doe</lib:name>
      </lib:author>
   </book>
   <note xmlns:rev="urn:vocabulary.review">
      <p>This is a funny book!</p>
      <rev:author>
         <rev:name>Jake McEvoy</rev:name>
      </rev:author>
   </note>
</item>

X-Path and Namespaces

When a namespace is used in an XML document, you must use the qualified name in an X-Path query to get the desired element. A qualified name consists of the namespace prefix and the name of the element or attribute.

The X-Path “//lib:author/lib:name” selects only the name element that belongs to the library vocabulary. It does not select the review’s author.

<item>
   <book xmlns:lib="urn:vocabulary.library">
      <title>Cheaper by the Dozen</title>
      <number type="isbn">1568491379</number>
      <lib:author>
         <lib:name>John Doe</lib:name>
      </lib:author>
   </book>
   <note xmlns:rev="urn:vocabulary.review">
      <p>This is a funny book!</p>
      <rev:author>
         <rev:name>Jake McEvoy</rev:name>
      </rev:author>
   </note>
</item>

You can’t simply drop the prefix and use “//author/name”: an unprefixed name in an X-Path query matches only elements that are in no namespace, so the expression would not match any existing element. A workaround is explained later.
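
Both behaviors can be sketched with Python's standard xml.etree.ElementTree module and a condensed copy of the sample document. The query is given its own prefix-to-URI map (the NS dictionary, a name chosen for this illustration), and the expressions start with “.” because ElementTree evaluates paths from the root element.

```python
import xml.etree.ElementTree as ET

# Condensed version of the namespaced sample document.
XML = """<item>
   <book xmlns:lib="urn:vocabulary.library">
      <lib:author><lib:name>John Doe</lib:name></lib:author>
   </book>
   <note xmlns:rev="urn:vocabulary.review">
      <rev:author><rev:name>Jake McEvoy</rev:name></rev:author>
   </note>
</item>"""

root = ET.fromstring(XML)

# Prefixes used in the query are resolved through this map.
# Matching is done on the namespace URI, not on the prefix text.
NS = {"lib": "urn:vocabulary.library", "rev": "urn:vocabulary.review"}

library_names = [n.text for n in root.findall(".//lib:author/lib:name", NS)]
print(library_names)  # ['John Doe']

# Without prefixes, only elements in no namespace can match.
unprefixed = root.findall(".//author/name")
print(unprefixed)  # []
```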

 

Default namespace

Sometimes, documents declare one or more default namespaces. A default namespace is declared without any prefix (xmlns=”…”, instead of xmlns:pfx=”…”). The scope of a default namespace declaration extends from the beginning of the start-tag in which it appears to the end of the corresponding end-tag, excluding the scope of any inner default namespace declarations. A default namespace declaration applies to all unprefixed element names within its scope. It does not apply to unprefixed attribute names.

<item>
   <book xmlns="urn:vocabulary.library">
      <title>Cheaper by the Dozen</title>
      <number type="isbn">1568491379</number>
      <author>
         <name>John Doe</name>
      </author>
   </book>
   <note xmlns="urn:vocabulary.review">
      <p>This is a funny book!</p>
      <author>
         <name>Jake McEvoy</name>
      </author>
   </note>
</item>

In this particular case, no prefix explicitly distinguishes the identically named elements. However, only prefixes mapped to namespaces can be used in X-Path queries. This means that if you want to query against a namespace in an XML document, even the default namespace, you need to define a prefix for it (ref: https://docs.microsoft.com/en-us/dotnet/standard/data/xml/xpath-queries-and-namespaces).

That’s why the X-Path “//author/name” would not return any value. A prefix must be bound to prevent ambiguity when a document has some nodes in no namespace and others in a default namespace.

The software automatically adds a temporary namespace prefix for each default namespace declared in your documents. These temporary prefixes are named “ns1”, “ns2”, “ns3”, and so on. After loading the XML document in Caristix software, you will see something like:

<item>
   <ns1:book xmlns="urn:vocabulary.library">
      <ns1:title>Cheaper by the Dozen</ns1:title>
      <ns1:number type="isbn">1568491379</ns1:number>
      <ns1:author>
         <ns1:name>John Doe</ns1:name>
      </ns1:author>
   </ns1:book>
   <ns2:note xmlns="urn:vocabulary.review">
      <ns2:p>This is a funny book!</ns2:p>
      <ns2:author>
         <ns2:name>Jake McEvoy</ns2:name>
      </ns2:author>
   </ns2:note>
</item>

Here, “ns1” is the temporary prefix for the “urn:vocabulary.library” namespace and “ns2” is the temporary prefix for the “urn:vocabulary.review” namespace. That way, you can select “//ns1:author/ns1:name” and “//ns2:author/ns2:name” without ambiguity.
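
The same idea can be sketched outside Caristix software with Python's standard xml.etree.ElementTree module. This is an assumption-laden illustration: the prefixes “ns1” and “ns2” are bound in the query's own namespace map (the NS dictionary) rather than written into the document, and how Caristix binds them internally is not covered here. The document is a condensed copy of the default-namespace sample.

```python
import xml.etree.ElementTree as ET

# Condensed version of the default-namespace sample document.
XML = """<item>
   <book xmlns="urn:vocabulary.library">
      <author><name>John Doe</name></author>
   </book>
   <note xmlns="urn:vocabulary.review">
      <author><name>Jake McEvoy</name></author>
   </note>
</item>"""

root = ET.fromstring(XML)

# Unprefixed names do not reach elements in a default namespace.
no_match = root.findall(".//author/name")
print(no_match)  # []

# Bind a prefix to each default namespace, mirroring ns1 and ns2 above.
NS = {"ns1": "urn:vocabulary.library", "ns2": "urn:vocabulary.review"}

library = [n.text for n in root.findall(".//ns1:author/ns1:name", NS)]
review = [n.text for n in root.findall(".//ns2:author/ns2:name", NS)]
print(library)  # ['John Doe']
print(review)   # ['Jake McEvoy']
```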

But what if you want to select both in a single query?

Take a look at the X-Path syntax references to see what can be done:
https://www.w3schools.com/xml/xpath_intro.asp
https://devhints.io/xpath

Using those references, you can combine existing functions to build an X-Path that matches both elements: “//*[local-name()='author']/*[local-name()='name']”. The local-name() function returns the element name without the prefix, so the comparison ignores the namespace.

<item>
   <ns1:book xmlns="urn:vocabulary.library">
      <ns1:title>Cheaper by the Dozen</ns1:title>
      <ns1:number type="isbn">1568491379</ns1:number>
      <ns1:author>
         <ns1:name>John Doe</ns1:name>
      </ns1:author>
   </ns1:book>
   <ns2:note xmlns="urn:vocabulary.review">
      <ns2:p>This is a <ns2:i>funny</ns2:i> book!</ns2:p>
      <ns2:author>
         <ns2:name>Jake McEvoy</ns2:name>
      </ns2:author>
   </ns2:note>
</item>
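
For readers who want to try a namespace-agnostic match outside Caristix software, here is a hedged sketch using Python's standard xml.etree.ElementTree module. ElementTree does not implement the local-name() function, so a different technique is swapped in: the “{*}” wildcard (available from Python 3.8), which matches a tag name in any namespace or in none. The result is comparable to the local-name() expression above, but it is not the same X-Path syntax.

```python
import xml.etree.ElementTree as ET

# Condensed version of the default-namespace sample document.
XML = """<item>
   <book xmlns="urn:vocabulary.library">
      <author><name>John Doe</name></author>
   </book>
   <note xmlns="urn:vocabulary.review">
      <author><name>Jake McEvoy</name></author>
   </note>
</item>"""

root = ET.fromstring(XML)

# "{*}author" matches <author> in any namespace (or in no namespace),
# which plays the role of *[local-name()='author'] in full X-Path.
both = [n.text for n in root.findall(".//{*}author/{*}name")]
print(both)  # ['John Doe', 'Jake McEvoy']
```

Matching on the local name alone is convenient, but it also gives up the disambiguation that namespaces provide, so it is best kept for cases where identically named elements really should be treated alike.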