Windows Phone Change Resiliency – XPath

In the previous post, I laid out the reasoning behind making your application resilient to change. In this post, I will look at the easiest way to build resiliency into your application – by using XPath expressions.

XPath was originally devised as a query language for XML – a lightweight way of finding elements/attributes inside an XML document. Since then, it’s been co-opted for other technologies such as HTML. This post assumes you understand rudimentary XPath syntax. I will explain some of what goes into the expressions, but I am assuming some very basic proficiency.

Important note on standards: The XPath processor we wrote is not standards compliant – it implements many more functions than are defined in the spec and leaves out some others. It also has other capabilities that a standard XPath processor does not have.

Since the Windows Phone SDK does not come with an XPath parser, we had to write one ourselves. Consider an app that retrieves weather data from a web service. The XML retrieved contains the following fragment (note – this is made up):

XML fragment – weather
  <link>http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://weather.yahoo.com/forecast/USCA1116_f.html</link>
  <pubDate>Sat, 01 Sep 2012 1:55 pm PDT</pubDate>
  <yweather:condition unit="F" text="Fair" code="34" temp="69" date="Sat, 01 Sep 2012 1:55 pm PDT" />
  <yweather:condition unit="C" text="Fair" code="34" temp="21" date="Sat, 01 Sep 2012 1:55 pm PDT" />

To parse this XML, you could use LINQ to XML – finding the yweather:condition element that has the relevant unit attribute (F or C – depending on your user’s settings).

The LINQ query is fairly simple and would look something like this:

LINQ
  var temperature = xml.Descendants(XName.Get("condition", NS.YWeather)).First(x => x.Attribute("unit").Value == "F").Attribute("temp").Value;

This will go through the descendants of the document, find the relevant element and retrieve the temperature for the desired unit.
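For reference, here is a self-contained sketch of that LINQ to XML lookup. The wrapper element, the namespace URI bound to the yweather prefix, and the names LinqWeatherDemo and GetTemperature are assumptions made for illustration; the original fragment does not show them.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class LinqWeatherDemo
{
    // Assumed namespace URI for the yweather prefix (not shown in the fragment above).
    static readonly XNamespace YWeather = "http://xml.weather.yahoo.com/ns/rss/1.0";

    // Returns the temp attribute of the condition element whose unit matches,
    // or null when no such element exists.
    public static string GetTemperature(XDocument xml, string unit)
    {
        return xml.Descendants(YWeather + "condition")
                  .Where(x => (string)x.Attribute("unit") == unit)
                  .Select(x => (string)x.Attribute("temp"))
                  .FirstOrDefault();
    }

    static void Main()
    {
        // Hypothetical wrapper element so the fragment parses as a document.
        var xml = XDocument.Parse(
            "<channel xmlns:yweather='http://xml.weather.yahoo.com/ns/rss/1.0'>" +
            "<yweather:condition unit='F' text='Fair' code='34' temp='69' />" +
            "<yweather:condition unit='C' text='Fair' code='34' temp='21' />" +
            "</channel>");

        Console.WriteLine(GetTemperature(xml, "F")); // 69
    }
}
```

Casting the attribute to string rather than reading .Value keeps the lookup from throwing when an attribute is missing, which is the kind of small change in the feed this series is about.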

Making your app more adaptable

The problem, of course, is that it’s not trivial to update the LINQ expression on the fly* without shipping an app update. Instead, you could use XPath to retrieve the relevant value. The XPath string would look a little like this, and your app will String.Format() it with the appropriate value (“C”/“F”):

XPath
  var xpath = "//yweather:condition[@unit='{0}']/@temp";

The XPath searches for all condition elements in the XML which have a unit attribute set to the desired unit (F or C). The final step of the XPath expression tells the processor to select the temp attribute.

Using the XPath in C# is very easy as well.

Using an XPath
  XDocument doc;
  // ...
  // unit is "F" or "C", depending on the user's settings.
  var temp = doc.Root.Select(String.Format(xpath, unit)).Value;
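The Select() call above belongs to our own processor. As a rough stand-in, the sketch below runs the same expression with the standard System.Xml.XPath extensions on desktop .NET (not the Windows Phone SDK). The namespace URI, the wrapper element, and the names XPathWeatherDemo and GetAttributeValue are assumptions made for illustration.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class XPathWeatherDemo
{
    // Fills the unit into an XPath template such as
    // "//yweather:condition[@unit='{0}']/@temp" and returns the value of the
    // first matching attribute, or null when nothing matches.
    public static string GetAttributeValue(XDocument doc, string xpathTemplate, string unit)
    {
        // A standard processor needs the yweather prefix bound to a URI (assumed here).
        var ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("yweather", "http://xml.weather.yahoo.com/ns/rss/1.0");

        string xpath = string.Format(xpathTemplate, unit);

        // For a node-set expression, XPathEvaluate returns an enumerable of the matched nodes.
        var matches = (IEnumerable)doc.XPathEvaluate(xpath, ns);
        return matches.Cast<XAttribute>().Select(a => a.Value).FirstOrDefault();
    }

    static void Main()
    {
        var doc = XDocument.Parse(
            "<channel xmlns:yweather='http://xml.weather.yahoo.com/ns/rss/1.0'>" +
            "<yweather:condition unit='F' temp='69' />" +
            "<yweather:condition unit='C' temp='21' />" +
            "</channel>");

        // In the real app this string would be stored in settings and updated from the server.
        const string xpath = "//yweather:condition[@unit='{0}']/@temp";
        Console.WriteLine(GetAttributeValue(doc, xpath, "C")); // 21
    }
}
```

Because the expression is passed in as data, pointing it at a renamed attribute only means handing the method a different string.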

Now, the really interesting part is that XPath is just a string – it is easily updateable from a server – if something changes in the format of the XML, it’s easy to adjust. In the case that the temp attribute is changed to “temperature” instead of “temp”, you simply need to send the following update to your app:

XPath
  //yweather:condition[@unit='{0}']/@temperature

If, for example, the temperature was changed to be an element, that’s easy too – all you need to do is change the XPath to look like this:

XPath
  //yweather:condition[@unit='{0}']/temperature

* There are projects on the web that will let you serialize and deserialize LINQ expressions – most of them are abandoned or don’t really work that well on Windows Phone or Silverlight.

More severe changes

What if the changes to the XML are more severe? What if, for example, the unit attribute was changed to contain “Fahrenheit” and “Celsius” instead of “F” and “C”?

You would then need to change your XPath to the following:

XPath
  //yweather:condition[@unit=if('{0}'='C', 'Celsius', 'Fahrenheit')]/@temp

You are now using a function called if to place the correct string in the query, depending on the value passed in. (This is one of the extensions in our processor; standard XPath 1.0 does not define an if function.)

But wait… There’s more… What if something really bad happened? What if the API got rid of Celsius completely? What if the value was not there at all? How could you handle that without an App update?

Easy…

XPath
  if('{0}'='F', Number(//yweather:condition[@unit='F']/@temp), (5 div 9) * (Number(//yweather:condition[@unit='F']/@temp) - 32))

Wait.. What?

Generally speaking, XPath returns nodes. However, the XPath parser we built knows how to also return discrete values – specifically – numbers, strings and Booleans.

In this case, the expression again uses the if() function to check what’s requested – F or C. If F is requested, it simply returns the relevant temperature. However, in the case of C, it needs to actually calculate the value – which is what is done by this sub-expression:

(5 div 9) * (Number(//yweather:condition[@unit='F']/@temp) - 32)

XPath essentially becomes a very powerful expression engine.
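The arithmetic part of that expression also works in a standard XPath 1.0 processor, as long as the function is spelled number() in lowercase (the if() function does not, since it is our own extension). The sketch below assumes desktop .NET with System.Xml.XPath, an assumed namespace URI, and a hypothetical feed that only carries the Fahrenheit reading.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class CelsiusFromFahrenheitDemo
{
    // Standard XPath 1.0 version of the Celsius sub-expression.
    const string CelsiusXPath =
        "(5 div 9) * (number(//yweather:condition[@unit='F']/@temp) - 32)";

    public static double GetCelsius(XDocument doc)
    {
        // Assumed namespace URI for the yweather prefix.
        var ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("yweather", "http://xml.weather.yahoo.com/ns/rss/1.0");

        // A numeric XPath expression comes back as a boxed double.
        return (double)doc.XPathEvaluate(CelsiusXPath, ns);
    }

    static void Main()
    {
        // Hypothetical feed in which the Celsius element has been removed.
        var doc = XDocument.Parse(
            "<channel xmlns:yweather='http://xml.weather.yahoo.com/ns/rss/1.0'>" +
            "<yweather:condition unit='F' temp='69' />" +
            "</channel>");

        // (5 / 9) * (69 - 32) is roughly 20.6
        Console.WriteLine(Math.Round(GetCelsius(doc), 1));
    }
}
```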

What about entry points?

In the first post in this series, I discussed how entry points (the URIs for the web services) can also change significantly. In this example, our URI was this:

http://weather.yahooapis.com/forecastrss?w={0}

We passed this through String.Format() to get the final URI, where w is the full zip code of the desired location.

However, what if the URI changes – say the API suddenly stops supporting the full Zip code and requires just the basic one – the calls will fail because the URI is malformed.

So, instead of using a simple String.Format() to get our URI, we can do one better – use XPath as an expression for the URI.

“Just hold on a minute! How do I use an XPath if I don’t have an XML document? All I want is a URI string…” I hear you asking. To which I answer – you do have an XML document: <xml/>. The XPath expression can run on this mock XML document. Here’s an example of the XPath that will give the URI above:

XPath
  concat('http://weather.yahooapis.com/forecastrss?w=', '{0}')

As you can see, this XPath actually has nothing to do with the XML it’s running on. It simply uses the various XPath functions. How would you then change this expression to return just the basic zip code (5 digits), given that you actually pass it 7 digits? Again – easy:

Code Snippet
  concat('http://weather.yahooapis.com/forecastrss?w=', substring('{0}', 0, 5))

After passing this through a String.Format() call and running the expression, it will take the string in {0} and return only the first 5 characters. As you can imagine, you can go fairly far with customizing the URI to your liking.
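If you want to try the idea with a standard XPath 1.0 processor, keep in mind that substring() there counts characters from 1, so the first five characters are substring('{0}', 1, 5). The sketch below assumes desktop .NET with System.Xml.XPath; the names UriExpressionDemo and BuildUri and the sample input are made up for illustration.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

static class UriExpressionDemo
{
    // Mock document: the expression never reads it, it only needs something to run against.
    static readonly XDocument Mock = XDocument.Parse("<xml/>");

    // Fills the input into the expression template, evaluates it and returns the resulting string.
    public static string BuildUri(string expressionTemplate, string zip)
    {
        string expression = string.Format(expressionTemplate, zip);
        return (string)Mock.XPathEvaluate(expression);
    }

    static void Main()
    {
        // Standard XPath 1.0 is 1-based, hence substring(..., 1, 5).
        const string template =
            "concat('http://weather.yahooapis.com/forecastrss?w=', substring('{0}', 1, 5))";

        Console.WriteLine(BuildUri(template, "9408612"));
        // http://weather.yahooapis.com/forecastrss?w=94086
    }
}
```

Since the input is spliced into the expression as text, a value containing a single quote would break the expression, so it is worth validating the input before formatting.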

But wait.. Yahoo would never break the API in such a way…

You are absolutely right – the chances of Yahoo! breaking the APIs in the manner I describe are probably nil. However… What if you were working against a less-used API made by a less-central company? What if Yahoo decided to deprecate their APIs, or if Yahoo went down? What would you do then?

In this case, there’s a good chance you could go and find a service that’s close enough to this service and tweak your URI and Selection XPaths to use the new service.

Relying on HTML

Sometimes you need to rely on HTML output for your application. Astronomy Picture Of The Day is a good example. Again – something like the Html Agility Pack is an amazing tool for parsing and inspecting HTML – however, HTML documents are notoriously fickle and can change often. You can use the same mechanisms described above to handle HTML output. In this case, the XPath would look like this (to get the image off the page):

XPath for HTML
  //p/a/img/@src

(And of course, you can use all the tricks described above to produce much more elaborate XPath parsing logic).

In the fourth post, I will show you many more interesting XPath functions built into our parsers, such as getting the inner/outer HTML of an element, running RegEx on sub-elements or values, and variable-like functions for more compact XPath expressions, as well as other tips and tricks.

How do I prepare?

You probably want to have all of your REST-like XML calls and HTML scraping done through XPath. From the URI entry points (of which there may be many) to the expressions that return values from the fetched documents – each one should have an expression that is associated with it and each one should be updateable by your server when needed.

If you are parsing lists (such as an RSS feed with multiple items, or an HTML page that contains a list of repeating items), you will probably want the following XPaths set up:

  1. Expression for each entry point that returns the URI (sometimes you may have multiple entry points giving the same results)
  2. XPath expression that returns the parent node of each entry.
  3. XPath expression for each “property” of the parent.

For example – in the case of an RSS feed that contains meta-data about where each story originated and perhaps a perma-link as well as the story itself, you will want the following five expressions:

  1. Expression for the RSS news feed URI.
  2. Expression for the items in the RSS feed (e.g. “/rss/channel/item”)
  3. Expression for the perma-link (e.g. “link”)
  4. Expression for the source of the story (e.g. “if(attribution/@name = '', 'N/A', attribution/@name)”)
  5. Expression for the story (e.g. “description”)

Now, your flow is as follows:

  1. Download the XML by using the result of the URI expression.
  2. Run the items XPath on the XML (this will return a bunch of nodes)
  3. Do a foreach() on the returned nodes and for each:
    1. Run the perma-link expression and store the value.
    2. Run the source expression and store the value.
    3. Run the description expression and store the value.

For example:

Code Snippet
  // Resolve the entry point.
  Uri uri = ScriptHelper.EvaluateXPath(Settings.RssEntryPoint);
  XDocument doc;
  // Load the URI into an XDocument
  // Get the list of items.
  var items = doc.Root.SelectNodes(Settings.ItemXPath);
  // Iterate all items.
  foreach (var item in items)
  {
      Entry entry = new Entry();
      entry.PermaLink = item.SelectNodes(Settings.PermaLinkXPath).FirstOrDefault().Value;
      entry.Attribution = item.SelectNodes(Settings.AttributionXPath).FirstOrDefault().Value;
      entry.Description = item.SelectNodes(Settings.DescriptionXPath).FirstOrDefault().Value;
      Entries.Add(entry);
  }
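The snippet above relies on our own helpers (ScriptHelper, SelectNodes). For comparison, here is a minimal, runnable sketch of the same flow using the standard System.Xml.XPath extensions on desktop .NET. The Entry class, the constant names and the sample feed are assumptions made for illustration, and because standard XPath 1.0 has no if() function, the “N/A” fallback is done in C# instead.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.XPath;

class Entry
{
    public string PermaLink { get; set; }
    public string Attribution { get; set; }
    public string Description { get; set; }
}

static class RssFlowDemo
{
    // Hypothetical settings; in the real app these strings would be updateable from the server.
    const string ItemXPath = "/rss/channel/item";
    const string PermaLinkXPath = "string(link)";
    const string AttributionXPath = "string(attribution/@name)";
    const string DescriptionXPath = "string(description)";

    public static List<Entry> Parse(XDocument doc)
    {
        var entries = new List<Entry>();

        // Run the items XPath, then evaluate each property XPath relative to the item.
        foreach (XElement item in doc.XPathSelectElements(ItemXPath))
        {
            string source = (string)item.XPathEvaluate(AttributionXPath);
            entries.Add(new Entry
            {
                PermaLink = (string)item.XPathEvaluate(PermaLinkXPath),
                Attribution = source.Length == 0 ? "N/A" : source, // fallback done in C#
                Description = (string)item.XPathEvaluate(DescriptionXPath)
            });
        }
        return entries;
    }

    static void Main()
    {
        // Made-up feed with one attributed story and one without attribution.
        var doc = XDocument.Parse(
            "<rss><channel>" +
            "<item><link>http://example.com/1</link><attribution name='Wire'/><description>First</description></item>" +
            "<item><link>http://example.com/2</link><description>Second</description></item>" +
            "</channel></rss>");

        foreach (var e in Parse(doc))
            Console.WriteLine(e.PermaLink + " | " + e.Attribution + " | " + e.Description);
    }
}
```

Wrapping each property expression in string() means a missing node yields an empty string instead of a null reference, so one malformed item does not abort the whole loop.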

What’s next

XPath is great and can solve many problems… However, at the end of the day, it’s a fairly simple and straightforward expression language – there’s a ton of things it cannot do. That’s where we find ourselves going full-crazy and using JavaScript to actually help us with resiliency.
