Monday, 17 February 2014

Using ElasticObject to parse arbitrarily large XML documents into object graphs

ElasticObject is a great implementation of a DynamicObject, that allows the developer to work with an arbitrarily large XML document, using an object graph. The code below access the Norwegian Weather Forecast service Yr.no to get some forecast data for the location where I grew up in Norway, where a weather station nearby automatically collects meteorological data such as wind, temperature, pressure and wind direction, humidity and so on. The data is available among other ways as XML to download and ElasticObject can be used to handle the XML. When handling XML, it is possible to create an XSD from sample XML data and in for example Visual Studio create an XSD file through XML->Create Schema option in Visual Studio after opening the sample XML file. Using the Visual Studio Command line, it is possible to generate a class through the command:

xsd /classes myfile.xsd

This generates a file called myfile.cs, which will be a data contract.

The data contract generated can be used to deserialize the XML into an object. Often this is a preferred strategy, since one gets a strongly typed object (graph) to work with. ElasticObject is a dynamic object, and allows the developer to avoid having to generate a serialization class before deserializing the received XML. Sometimes, this also allows changes in the XML to occur without affecting the code. Often it is a limited part of the XML tree, which is the received XML the developer needs to retrieve for further processing. Also, not having to generate classes from the XML schema is also convenient. Still, the developer needs to refer more often to the XML to understand the structure of the object which is created. The Intellisense of ElasticObject is rather poor, so the developer needs to query the object graph using Immediate Window and query the object to find the right "path" to the information in the XML document. Similar techniques using System.Linq.Xml and XPath queries can be used. To start using ElasticObject, install first the NuGet package in Visual Studio. Type the following command in Package Manager Console.

PM> install-package AmazedSaint.ElasticObject
Installing 'AmazedSaint.ElasticObject 1.2.0'.
Successfully installed 'AmazedSaint.ElasticObject 1.2.0'.
Adding 'AmazedSaint.ElasticObject 1.2.0' to TestElasticObject.
Successfully added 'AmazedSaint.ElasticObject 1.2.0' to TestElasticObject.

The following code then illustrates the use:

using AmazedSaint.Elastic;
using System;
using System.IO;
using System.Net;
using System.Xml.Linq;

namespace TestElasticObject
{
    class Program
    {

        static void Main(string[] args)
        {
            var wc = new WebClient();
          
            using (StreamReader sr = new StreamReader(wc.OpenRead(@"http://www.yr.no/stad/Norge/Nord-Tr%C3%B8ndelag/Steinkjer/S%C3%B8ndre%20Egge/varsel.xml")))
            {
                var data = sr.ReadToEnd();
                IterateForecasts(data);          
            }

            Console.WriteLine("Press any key to continue ...");
            Console.ReadKey(); 

        }

        private static void IterateForecasts(string data)
        {
            dynamic weatherdata = XElement.Parse(data).ToElastic();
            foreach (var node in weatherdata.forecast.text.location[null])
            {
                Console.WriteLine(string.Format("Fra-Til: {0} - {1}", node.from, node.to)); 
                Console.WriteLine(~node.body);
                Console.WriteLine();
            }
        }

    }
}


The code above uses WebClient to download the target XML data, then uses a StreamReader to read the XML file into a string. Then using XElement and the ToElastic extension method in AmazedSaint.Elastic namespace, this is stored into a dynamic variable which can be worked on. One important gotcha here is how to drill down into the object graph of the ElasticObject, which I could not figure out following the documentation, since it only contained more simple examples but I found on StackOverFlow: To drill further down into the object graph than its immediate child node, type the path to the element which contains the data to work with using a dot separated syntax and in addition: use the null index to get to the child element which contains the data to process - for example to output. When traversing the child element in the foreach loop, its child elements will be stored into the loop variable node and then it is possible to get to the attributes of the node. To get the value inside the element, use the tilde (~) operator. ElasticObject implements some operators to make it easier to work with XML and object graphs. For example, to cast an ElasticObject to XML, the > operator can be used:

            dynamic store = new ElasticObject("Store");
            store.Name = "Acme Store";
            store.Location.Address = "West Avenue, Heaven Street Road, LA";
            store.Products.Count = 2;
            store.Owner.FirstName = "Jack";
            store.Owner.SecondName = "Reacher";
            store.Owner <<= "this is some internal content for owner";

            var p1 = store.Products.Product();
            p1.Name = "Acme Floor Cleaner";
            p1.Price = 20;

            var p2 = store.Products.Product();
            p2.Name = "Acme Bun";
            p2.Price = 22; 

            XElement el = store > FormatType.Xml; 
            System.Console.WriteLine(el.ToString());

The ElasticObject is defined dynamically and the <<= operator is used to set the value inside the property, which will be shown when converting the ElasticObject to XML. In addition, the code above shows how to create child elements, as the Products property contains two child Product objects, which will be Product XML elements. The < operator is used to "pipe" the object into an XElement variable, which is the converted XML object from the ElasticObject object graph. To convert the object from an XML document to an ElasticObject, the ToElastic() extension method can be used to convert the XML document or XElement variable into the ElasticObject - the object graph again. Using ElasticObject, it is possible to work with XML in object graphs and since it is dynamic, it is not necessary to create new types, such as serialization data contracts. ElasticObject source code is available on GitHub here: https://github.com/amazedsaint/ElasticObject
Its creator is Anoop Madhusudanan, AmazedSaint profile on GitHub. It originally was a project hosted on CodePlex. ElasticObject should be considered as an alternative when working with XML and object graphs. It is very tempting to avoid having to create new objects to work with the XML but work with a dynamic object that can be extended and changed. It should also be efficient. An alternative can be to use anonymous types, but why go the hard way when one can go the easy - elastic way?

No comments:

Post a Comment