Wednesday 8 July 2015

Logging the SQL of Entity Framework exceptions in EF 6

If you use EF 6, you can add logging that reveals why an exception occurred in the data layer of your app or system, and that also produces runnable SQL. You can try that SQL in your test or production environment inside a transaction that is rolled back, for a quicker diagnose-fix cycle. First, create a class that implements the interface IDbCommandInterceptor in the System.Data.Entity.Infrastructure.Interception namespace. Then register an instance of this class with the DbInterception.Add method, for example from your ObjectContext / DbContext class (this is usually a partial class that you can extend). I register it in the static constructor of my factory class, inside a try-catch block. The important part is that you call DbInterception.Add with an instance of the class you created. Let's consider a code example. I focus only on logging exceptions here; other kinds of interception can of course be performed. Here is the sample class for logging exceptions. I have replaced the namespaces of my own system with the more generic "Acme":

using System;
using System.Data;
using System.Data.Common;
using System.Data.Entity.Infrastructure.Interception;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using Acme.Common.Log;



namespace Acme.Data.EntityFramework
{
    
    /// <summary>
    /// Intercepts exceptions raised by the database when running an operation and propagates them to the event log for logging
    /// </summary>
    public class AcmeDbCommandInterceptor : IDbCommandInterceptor
    {

        public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
        {
            LogIfError(command, interceptionContext);
        }

        public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
        {
            LogIfError(command, interceptionContext);          
        }

        public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
        {
            LogIfError(command, interceptionContext);          
        }

        public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
        {
            LogIfError(command, interceptionContext);
        }

        public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
        {
            LogIfError(command, interceptionContext);            
        }

        public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
        {
            LogIfError(command, interceptionContext);           
        }

        private void LogIfError<TResult>(DbCommand command, DbCommandInterceptionContext<TResult> interceptionContext)
        {
            try
            {
                if (interceptionContext != null && interceptionContext.Exception != null)
                {
                    try
                    {
                        LogInterpolatedEfQueryString(command, interceptionContext);
                    }
                    catch (Exception err)
                    {
                        //Interpolation failed: fall back to the raw CommandText and parameters (logged once)
                        LogRawEfQueryString(command, interceptionContext, err);
                    }
                   
                }
            }
            catch (Exception err)
            {
                Debug.WriteLine(err.Message);
            }
        }

        /// <summary>
        /// Logs the raw EF query string 
        /// </summary>
        /// <typeparam name="TResult"></typeparam>
        /// <param name="command"></param>
        /// <param name="interceptionContext"></param>
        /// <param name="err"></param>
        private static void LogRawEfQueryString<TResult>(DbCommand command, DbCommandInterceptionContext<TResult> interceptionContext,
            Exception err)
        {
            if (err != null)
                Debug.WriteLine(err.Message);

            string queryParameters = LogEfQueryParameters(command);

            EventLogProvider.Log(
                string.Format(
                    "Acme serverside DB operation failed: Exception: {0}. Parameters involved in EF query: {1}. SQL involved in EF Query: {2}",
                    Environment.NewLine + interceptionContext.Exception.Message,
                    Environment.NewLine + queryParameters,
                    Environment.NewLine + command.CommandText
                    ), EventLogProviderEnum.Warning);
        }

        /// <summary>
        /// Return a string with the list of EF query parameters 
        /// </summary>
        /// <param name="command"></param>
        /// <returns></returns>
        private static string LogEfQueryParameters(DbCommand command)
        {
            var sb = new StringBuilder(); 
            for (int i = 0; i < command.Parameters.Count; i++)
            {
                if (command.Parameters[i].Value != null)
                    sb.AppendLine(string.Format(@"Query param {0}: {1}", i, command.Parameters[i].Value));
            }
            return sb.ToString();
        }

        private static DbType[] QuoteRequiringTypes
        {
            get
            {
                return new[]
                {
                        DbType.AnsiString, DbType.AnsiStringFixedLength, DbType.String,
                            DbType.StringFixedLength, DbType.Date, DbType.DateTime,
                            DbType.DateTime2, DbType.Guid
                };
            }
        }


        private static void LogInterpolatedEfQueryString<TResult>(DbCommand command,
            DbCommandInterceptionContext<TResult> interceptionContext)
        {
            var paramRegex = new Regex("@\\d+");
            string interpolatedSqlString = paramRegex.Replace(command.CommandText,
                m => GetInterpolatedString(command, m));

            EventLogProvider.Log(string.Format(
                "Acme serverside DB operation failed: Exception: {0}. SQL involved in EF Query: {1}",
                Environment.NewLine + interceptionContext.Exception.Message,
                Environment.NewLine + interpolatedSqlString),
                EventLogProviderEnum.Warning);

        }

        private static string GetInterpolatedString(DbCommand command, Match m)
        {
            try
            {
                int matchIndex;
                if (string.IsNullOrEmpty(m.Value))
                    return m.Value;
                if (!int.TryParse(m.Value.Replace("@", ""), out matchIndex))
                    return m.Value;
                //Entity Framework will usually build parameterized queries with @0, @1, @2 and so on
                if (matchIndex < 0 || matchIndex >= command.Parameters.Count)
                    return m.Value;

                //Ok matchIndex from here 
                DbParameter dbParameter = command.Parameters[matchIndex];
                var dbParameterValue = dbParameter.Value;
                if (dbParameterValue == null)
                    return m.Value;

                try
                {
                    return GetAdjustedDbParameterValue(dbParameter, dbParameterValue);
                }
                catch (Exception err)
                {
                    Debug.WriteLine(err.Message);
                }
            }
            catch (Exception err)
            {
                Debug.WriteLine(err.Message);
            }

            return m.Value;
        }

        /// <summary>
        /// In some cases the DbParameter value must be adjusted: quoted if it is a string, date or GUID, or written as 0/1 if it is a boolean
        /// </summary>
        /// <param name="dbParameter"></param>
        /// <param name="dbParameterValue"></param>
        /// <returns></returns>
        private static string GetAdjustedDbParameterValue(DbParameter dbParameter, object dbParameterValue)
        {
            if (QuoteRequiringTypes.Contains(dbParameter.DbType))
                return string.Format("'{0}'", dbParameterValue); //Remember to put quotes on parameter value 

            if (dbParameter.DbType == DbType.Boolean)
            {
                bool dbParameterBitValue;
                bool.TryParse(dbParameterValue.ToString(), out dbParameterBitValue);
                return dbParameterBitValue ? "1" : "0"; //BIT
            }

            return dbParameterValue.ToString(); //Default case (not a quoted value and not a bit value)
        }
    }
}


The code above uses a class EventLogProvider that writes to the Event Log of the machine running the Entity Framework code in the data layer, usually the application server of your system. Here is the relevant code for logging to the Event Log:

using System.Diagnostics;
using System;
using Acme.Common.Security;

namespace Acme.Common.Log
{
    public enum EventLogProviderEnum
    {
        Warning, Error
    }

    public static class EventLogProvider
    {
        private static string _applicationSource = "Acme";
        private static string _applicationEventLogName = "Application";

        public static void Log(Exception exception, EventLogProviderEnum logEvent)
        {
            Log(ErrorUtil.ConstructErrorMessage(exception), logEvent);
        }

        public static void Log(string logMessage, EventLogProviderEnum logEvent)
        {
            try
            {
                if (!EventLog.SourceExists(_applicationSource))
                    EventLog.CreateEventSource(_applicationSource, _applicationEventLogName);

                EventLog.WriteEntry(_applicationSource, logMessage, GetEventLogEntryType(logEvent));
            }
            catch { } // If the event log is unavailable, don't crash.
        }

        private static EventLogEntryType GetEventLogEntryType(EventLogProviderEnum logEvent)
        {
            switch(logEvent)
            {
                case EventLogProviderEnum.Error:
                    return EventLogEntryType.Error;
                case EventLogProviderEnum.Warning:
                default:
                    return EventLogEntryType.Warning;
            }
        }
    }
}


It is also necessary to add an instance of the db interception class in your ObjectContext or DbContext class as noted. Example:

try
{
    DbInterception.Add(new AcmeDbCommandInterceptor());
}
catch (Exception err)
{
    //Log error here (consider using the EventLogProvider above for example)
}

The interpolated string will often be the interesting one when EF queries fail. Entity Framework (EF) uses parameterized queries, which helps prevent SQL injection attacks. To get SQL you can actually run, you will usually have to interpolate the EF query CommandText with the parameter values, where the parameters are named @0, @1, @2 and so on. I use a Regex here to search for these. Note that my code uses a lot of try-catch in case something goes wrong. You also do NOT want to run any heavy code here, as the DbInterception will run on ANY query. I only do further processing IF an exception has occurred, to avoid a performance drain on the system. I first try to log the interpolated EF query string, which I can run in my Production or Test environment, usually between BEGIN TRAN and ROLLBACK statements, just to see why the database call failed. If the interpolation fails, the code falls back to logging the raw EF query, that is the CommandText and the command parameters, without the interpolation technique. I have also made some adjustments: adding single quotes around strings, dates and GUIDs, and writing booleans as the value 0 or 1 (BIT). This code is new, so additional adjustments might be needed. The bottom line is that it is important to LOG the EF query SQL to INFER the real REASON why the SQL query FAILED, i.e. a quicker DIAGNOSE-INFER-FIX cycle leading to more success on your projects, if you use .NET and Entity Framework (version 6 or newer)!
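
The interpolation idea can also be tried in isolation. The following is a minimal sketch and not the interceptor itself: SqlInterpolationSketch, Interpolate and ToSqlLiteral are hypothetical names, the parameter values are passed as a plain object array instead of a DbCommand, and unlike the code above the sketch also doubles single quotes inside strings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the interceptor above.
public static class SqlInterpolationSketch
{
    // Replaces @0, @1, ... in commandText with the SQL literal of the value at that index.
    public static string Interpolate(string commandText, object[] values)
    {
        return Regex.Replace(commandText, @"@(\d+)", m =>
        {
            int index;
            if (!int.TryParse(m.Groups[1].Value, out index) || index >= values.Length)
                return m.Value; // unknown placeholder: leave it untouched
            return ToSqlLiteral(values[index]);
        });
    }

    // Simplified literal formatting: strings are quoted, booleans become BIT values.
    public static string ToSqlLiteral(object value)
    {
        if (value == null || value is DBNull)
            return "NULL";
        if (value is bool)
            return (bool)value ? "1" : "0";
        if (value is string)
            return "'" + ((string)value).Replace("'", "''") + "'";
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

For example, Interpolate("UPDATE Person SET Name = @0 WHERE Id = @1", new object[] { "Smith", 42 }) returns UPDATE Person SET Name = 'Smith' WHERE Id = 42, which can then be pasted between BEGIN TRAN and ROLLBACK.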

Tuesday 26 May 2015

Recording network traffic between clients and WCF services to a database

Developers working with WCF services might have come across the excellent tool Fiddler, which is an HTTP web debugging proxy. In some production environments, it can be useful to log network traffic, in the form of requests and responses, to a database. There are several possible approaches; one is to implement a WCF message inspector that also acts as a WCF service behavior.

RecordingMessageBehavior

The following class implements the two interfaces IDispatchMessageInspector and IServiceBehavior. Please note that the code contains some domain-specific production code, including some claims-specific handling that is not strictly required (but perhaps your code base uses something similar?).
In other words, you have to adjust the code below to make it work with your code base, but the core structure, which is what is required for recording network traffic, should be of general interest for WCF developers.


using System;
using System.Configuration;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

using Microsoft.IdentityModel.Claims;
using System.Diagnostics;

namespace Hemit.OpPlan.Service.Host
{
    public class RecordingMessageBehavior : IDispatchMessageInspector, IServiceBehavior
    {

        private readonly bool _useTrafficRecording;

        private static DateTime _lastClearTime = DateTime.Now; 

        private readonly ITrafficLogManager _trafficLogManager;


        public RecordingMessageBehavior()
        {
            _trafficLogManager = new TrafficLogManager(); 

            try
            {
                if (ConfigurationManager.AppSettings[Constants.UseTrafficRecording] != null)
                {
                    _useTrafficRecording = bool.Parse(ConfigurationManager.AppSettings[Constants.UseTrafficRecording]); 
                }
            }
            catch (Exception err)
            {
                Debug.WriteLine(err.Message);
            }
        }


        #region IDispatchMessageInspector Members

        public object AfterReceiveRequest(ref Message request, IClientChannel channel, InstanceContext instanceContext)
        {
            MessageBuffer buffer = request.CreateBufferedCopy(Int32.MaxValue);
            request = buffer.CreateMessage();
            Message messageToWorkWith = buffer.CreateMessage(); 

            if (IsTrafficRecordingDeactivated())
                return request;
            InspectMessage(messageToWorkWith, true);
            return request;
        }

        public void BeforeSendReply(ref Message reply, object correlationState)
        {
            MessageBuffer buffer = reply.CreateBufferedCopy(Int32.MaxValue);
            reply = buffer.CreateMessage();
            Message messageToWorkWith = buffer.CreateMessage(); 

            if (IsTrafficRecordingDeactivated())
                return;

            InspectMessage(messageToWorkWith, false);
        }

        private bool IsTrafficRecordingDeactivated()
        {
            return !_useTrafficRecording || !IsCurrentUserLoggedIn();
        }

        #endregion

        private void InspectMessage(Message message, bool isRequest)
        {
            try
            {
                string serializedContent = message.ToString();
                string remoteIp = null;

                if (isRequest)
                {
                    try
                    {
                        RemoteEndpointMessageProperty remoteEndpoint = message.Properties[RemoteEndpointMessageProperty.Name] as RemoteEndpointMessageProperty;
                        if (remoteEndpoint != null) remoteIp = remoteEndpoint.Address + ":" + remoteEndpoint.Port;
                    }
                    catch (Exception err)
                    {
                        Debug.WriteLine(err.Message);
                    }
                }

                var trafficLogData = new TrafficLogItemDataContract
                {
                    IsRequest = isRequest,
                    RequestIP = remoteIp,
                    Recorded = DateTime.Now,
                    SoapMessage = serializedContent,
                    ServiceMethod = GetSoapActionPart(message.Headers.Action, 1),
                    ServiceName = GetSoapActionPart(message.Headers.Action, 2)
                };

                _trafficLogManager.InsertTrafficLogItem(trafficLogData, ResolveCurrentClaimsIdentity());

                if (DateTime.Now.Subtract(_lastClearTime).TotalHours > 1)
                {
                    _trafficLogManager.ClearOldTrafficLog(ResolveCurrentClaimsIdentity()); //clears traffic log entries older than a day; this check runs at most once per hour
                    _lastClearTime = DateTime.Now; 
                }
            }
            catch (Exception ex)
            {
                InterfaceBinding.GetInstance<ILog>().WriteError(ex.Message, ex);

            }
        }

        private string GetSoapActionPart(string soapAction, int rightOffset)
        {
            if (string.IsNullOrEmpty(soapAction) || !soapAction.Contains("/"))
                return soapAction;

            string[] soapParts = soapAction.Split('/');
            int soapPartCount = soapParts.Length;
            if (rightOffset >= soapPartCount)
                return soapAction;
            else
                return soapParts[soapPartCount - rightOffset]; 
        }

        private IClaimsIdentity ResolveCurrentClaimsIdentity()
        {
            if (ServiceSecurityContext.Current != null && ServiceSecurityContext.Current.AuthorizationContext != null)
            {
                var claimsIdentity = ServiceSecurityContext.Current.PrimaryIdentity as IClaimsIdentity; //Retrieval of current claims identity from WCF ServiceSecurityContext
                return claimsIdentity;
            }
            return null;
        }

        private bool IsCurrentUserLoggedIn()
        {
            IClaimsIdentity claimsIdentity = ResolveCurrentClaimsIdentity();
            if (claimsIdentity == null || claimsIdentity.Claims == null || claimsIdentity.Claims.Count == 0 
                || claimsIdentity.FindClaim(WellKnownClaims.AuthorizedForOrganizationalUnit) == null)
                return false;
            return true;
        }

        #region IServiceBehavior Members

        public void AddBindingParameters(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase, System.Collections.ObjectModel.Collection<ServiceEndpoint> endpoints, BindingParameterCollection bindingParameters)
        {
        }

        public void ApplyDispatchBehavior(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase)
        {
            foreach (var channelDispatcherBase in serviceHostBase.ChannelDispatchers)
            {
                var chdisp = channelDispatcherBase as ChannelDispatcher;
                if (chdisp == null)
                    continue; 

                foreach (var endpoint in chdisp.Endpoints)
                {
                    endpoint.DispatchRuntime.MessageInspectors.Add(new RecordingMessageBehavior()); 
                }
            }
        }

        public void Validate(ServiceDescription serviceDescription, ServiceHostBase serviceHostBase)
        {
        }

        #endregion
    }
}

To turn this WCF recording message behavior on or off, set the app setting UseTrafficRecording to true or false in web.config.
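
A minimal sketch of such a setting, assuming that Constants.UseTrafficRecording holds the key name UseTrafficRecording (the value of the constant is not shown in this article):

```xml
<configuration>
  <appSettings>
    <!-- Assumed key name; it must match the value of Constants.UseTrafficRecording -->
    <add key="UseTrafficRecording" value="true" />
  </appSettings>
</configuration>
```

Since the constructor reads the value with bool.Parse, only true or false (case-insensitive) are accepted. Any other value throws, the exception is caught in the constructor, and recording stays off.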



In the code above, the first thing to do in the two methods AfterReceiveRequest and BeforeSendReply is to clone the message. This is because a WCF message, much like a message in an MSMQ queue, can only be consumed once. We therefore create a buffered copy and from it two new messages: one that is handed back to WCF for further processing and one that we inspect.

        public object AfterReceiveRequest(ref Message request, IClientChannel channel, InstanceContext instanceContext)
        {

            MessageBuffer buffer = request.CreateBufferedCopy(Int32.MaxValue);
            request = buffer.CreateMessage();
            Message messageToWorkWith = buffer.CreateMessage(); 

            if (IsTrafficRecordingDeactivated())
                return request;
            InspectMessage(messageToWorkWith, true);
            return request;
        }

        public void BeforeSendReply(ref Message reply, object correlationState)
        {
 
            MessageBuffer buffer = reply.CreateBufferedCopy(Int32.MaxValue);
            reply = buffer.CreateMessage();
            Message messageToWorkWith = buffer.CreateMessage(); 


            if (IsTrafficRecordingDeactivated())
                return;

            InspectMessage(messageToWorkWith, false);
        }


Retrieving the serialized SOAP XML contents of the WCF message is easy: just call the ToString() method on the Message object. In addition, the code above requires some data access layer logic that persists the data to the database. Using Entity Framework and SQL Server here, logging is performed to the table TrafficLog. Please note that the code below retains at least the last 24 hours of network traffic (everything recorded since midnight the previous day). In a production environment the database will fill up quickly, as the requests and responses exchanged between the clients and the WCF services can quickly grow into gigabytes (GB). Usually you only want to record traffic when security demands it, or for error tracking and diagnostics.






using System;
using System.Collections.Generic;
using System.Linq;

using Microsoft.IdentityModel.Claims;


namespace Hemit.OpPlan.Service.Implementation
{

    public class TrafficLogManager : ITrafficLogManager
    {

        public List<TrafficLogItemDataContract> GetTrafficLog(DateTime startWindow, DateTime endWindow, IClaimsIdentity claimsIdentity)
        {
            using (var dbContext = ObjectContextManager.DbContextFromClaimsPrincipal(claimsIdentity))
            {
                var trafficLogs = dbContext.TrafficLogs.Where(tl => tl.Recorded >= startWindow && tl.Recorded <= endWindow).ToList();
                //List<T>.ForEach returns void, so project with Select instead
                return trafficLogs.Select(tl => CreateTrafficLogDataContract(tl)).ToList();
            }
        }

        public void ClearOldTrafficLog(IClaimsIdentity claimsIdentity)
        {
            AggregateExceptionExtensions.CallActionLogAggregateException(() =>
            {
                System.Threading.ThreadPool.QueueUserWorkItem((object state) =>
                {
                    try
                    {
                        //Run on new thread 
                        using (var dbContext = ObjectContextManager.DbContextFromClaimsPrincipal(claimsIdentity))
                        {
                            //Use your own DbContext or ObjectContext here (EF)
                            DateTime obsoleteLogDate = DateTime.Now.Date.AddDays(-1);

                            var trafficLogs = dbContext.TrafficLogs.Where(tl => tl.Recorded <= obsoleteLogDate).ToList();
                            foreach (var tl in trafficLogs)
                            {
                                dbContext.TrafficLogs.DeleteObject(tl);
                            }
                            dbContext.SaveChanges();
                        }
                    }
                    catch (Exception err)
                    {
                        InterfaceBinding.GetInstance<ILog>().WriteError(err.Message); 
                    }
                });
            });
        }

        public void InsertTrafficLogItem(TrafficLogItemDataContract trafficLogItemdata, IClaimsIdentity claimsIdentity)
        {

            //string defaultConnectionString =  ObjectContextManager.GetConnectionString();

            using (var dbContext = ObjectContextManager.DbContextFromClaimsPrincipal(claimsIdentity))
            {
                var trafficLog = new TrafficLog();
                MapTrafficLogFromDataContract(trafficLogItemdata, trafficLog);
                dbContext.TrafficLogs.AddObject(trafficLog);
                dbContext.SaveChanges(); 
            }
        }

        private static void MapTrafficLogFromDataContract(TrafficLogItemDataContract trafficLogItem, TrafficLog trafficLog)
        {
            trafficLog.Recorded = trafficLogItem.Recorded;
            trafficLog.RequestIP = trafficLogItem.RequestIP;
            trafficLog.ServiceName = trafficLogItem.ServiceName;
            trafficLog.ServiceMethod = trafficLogItem.ServiceMethod;
            trafficLog.SoapMessage = trafficLogItem.SoapMessage;
            trafficLog.TrafficLogId = trafficLogItem.TrafficLogId;
            trafficLog.IsRequest = trafficLogItem.IsRequest;
        }

        private static TrafficLogItemDataContract CreateTrafficLogDataContract(TrafficLog trafficLog)
        {
            return new TrafficLogItemDataContract
            {
                IsRequest = trafficLog.IsRequest,
                Recorded = trafficLog.Recorded,
                RequestIP = trafficLog.RequestIP,
                ServiceMethod = trafficLog.ServiceMethod,
                ServiceName = trafficLog.ServiceName,
                SoapMessage = trafficLog.SoapMessage,
                TrafficLogId = trafficLog.TrafficLogId
            };
        }

    }    
   
}


In addition, we need to add the WCF message inspector to our service. I use a ServiceHostFactory and a class deriving from the ServiceHost class, and override the OnOpening method:

        protected override void OnOpening()
        {
            ..


            if (Description.Behaviors.Find<RecordingMessageBehavior>() == null)
                Description.Behaviors.Add(new RecordingMessageBehavior()); 


            base.OnOpening();
        }

Here we register the service behavior for each WCF service to intercept the network traffic, clone the messages as required and log the contents to a database table. This gives an overview of the requests and responses and makes security audits, inspections and logging possible. You will need to adjust the code above to make it work against your codebase, but I hope this article is of general interest. The capability of recording WCF network traffic between services and clients is a very powerful feature that gives developer teams more control over their running environments, and it is also an important security boost. However, note that in many production environments huge amounts of data will accumulate. Therefore, this implementation checks once an hour and deletes traffic log entries older than a day. The following SQL query is a very convenient way to display the data in SQL Server Management Studio, where an XML column can conveniently be opened and read as a document.



select top 100 convert(xml, SoapMessage) as Payload, * from trafficlog

Obviously, adding criteria on the Recorded datestamp column will pinpoint the returned data from this table more effectively. Retrieving and analysing the data might be problematic if the amount of data reaches several gigabytes. Here is a PowerShell script that retrieves the data in chunks using bcp. The data is retrieved in 1-minute segments, which in one specific case split the data into XML files of about 40 MB each. This makes it possible to search the data using a text editor.


PowerShell for retrieving traffic log data contents in a chunked fashion


#Traffic Log Bulk Copy Util

$databaseName='MYDATABASE_INSTANCENAME'
$databaseServer='MYDATABASE_SERVER'
$dateStamp = '130415'
$starthours = 12
$minuteSegmentSize = 1
$minuteSegmentNumbers = 160
$dateToInvestigate = '13.04.2015 00:00:00'
$outputFolder='SOMESHARE_UNC_PATH'

Write-Host "Traffic Log Request Bulk Output Logger"


for($i=0; $i -le $minuteSegmentNumbers; $i++){
 # HH is the 24-hour format specifier (hh is the 12-hour one)
 $dt = [datetime]::ParseExact($dateToInvestigate, "dd.MM.yyyy HH:mm:ss", $null)
 # Offset of this segment from the start hour, in whole minutes
 $offsetMinutes = $i * $minuteSegmentSize
 $dt = $dt.AddHours($starthours).AddMinutes($offsetMinutes)
 # Half-open window [start, end) so that consecutive segments do not overlap
 $dtEnd = $dt.AddMinutes($minuteSegmentSize)
 $startWindow = $dt.ToString("yyyy-MM-dd HH:mm:ss")
 $endWindow = $dtEnd.ToString("yyyy-MM-dd HH:mm:ss")
 # ${dateStamp} needs braces, otherwise PowerShell looks for a variable named $dateStamp_bulkcopy_
 $destinationFile = Join-Path -Path $outputFolder -ChildPath "trafficLog_${dateStamp}_bulkcopy_$i.xml"
 Write-Host $("StartWindow: " + $startWindow + " EndWindow: " + $endWindow + " Destination file: " + $destinationFile)

 $bcpCmd = "bcp ""SELECT SoapMessage FROM TrafficLog WHERE IsRequest=1 AND Recorded >= '$startWindow' AND Recorded < '$endWindow'"" queryout ""$destinationFile"" -S $databaseServer -d $databaseName -c -t"","" -r""\n"" -T -x"

  Write-Host $("Bulk copy cmd: ")
  Write-Host $bcpCmd
  Write-Host "GO!"

  Invoke-Expression $bcpCmd

  Write-Host "DONE!"
 
}

By following this article, you should now be ready to write your own WCF message inspector and record network traffic to your database, giving you better security, logging and security audit capabilities, and a better overview of the raw data in the form of SOAP XML. Please read my previous article if you want to gain insight into how to compress SOAP XML and send it as binary data (byte arrays), even with GZip compression! Happy coding!

Sunday 24 May 2015

Programmatically compress data contracts to save bandwidth between clients and WCF

This article explains how to achieve compression when transferring data contracts between a WCF service and a client. Many developers using WCF have used Fiddler or a similar tool to inspect the network traffic. By default, the data transmitted between services and clients consists of data contracts serialized as SOAP XML. Note that this usually means the data is sent as uncompressed XML. This is not an issue for smaller data contracts, but when you start transmitting a lot of data, data contracts can soon grow into megabytes (MB) and the TTF (Time-To-Transfer) becomes noticeable, especially on low-bandwidth devices such as smartphones! It is possible to configure compression in IIS, but more control can be achieved with a mini-framework I have developed, which I will present next. First we need some code to compress data. The following class, GzipByteArrayCompressionUtility, uses the MemoryStream and BufferedStream classes in the System.IO namespace and the GZipStream class in the System.IO.Compression namespace:

GzipByteArrayCompressionUtility



using System;
using System.IO;
using System.IO.Compression;

namespace SomeAcme.SomeProduct.Common.Compression
{

    public static class GzipByteArrayCompressionUtility
    {

        private static readonly int bufferSize = 64 * 1024; //64kB

        public static byte[] Compress(byte[] inputData)
        {
            if (inputData == null)
                throw new ArgumentNullException("inputData", "inputData must be non-null");

            using (var compressIntoMs = new MemoryStream())
            {
                using (var gzs = new BufferedStream(new GZipStream(compressIntoMs,
                 CompressionMode.Compress), bufferSize))
                {
                    gzs.Write(inputData, 0, inputData.Length);
                }
                return compressIntoMs.ToArray();
            }
        }

        public static byte[] Decompress(byte[] inputData)
        {
            if (inputData == null)
                throw new ArgumentNullException("inputData", "inputData must be non-null");

            using (var compressedMs = new MemoryStream(inputData))
            {
                using (var decompressedMs = new MemoryStream())
                {
                    using (var gzs = new BufferedStream(new GZipStream(compressedMs,
                     CompressionMode.Decompress), bufferSize))
                    {
                        gzs.CopyTo(decompressedMs);
                    }
                    return decompressedMs.ToArray();
                }
            }
        }

        //private static void Pump(Stream input, Stream output)
        //{
        //    byte[] bytes = new byte[4096];
        //    int n;
        //    while ((n = input.Read(bytes, 0, bytes.Length)) != 0)
        //    {
        //        output.Write(bytes, 0, n); 
        //    }
        //}

    }


}
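
To see the round trip in isolation, here is a minimal sketch that can be run as a console program. It is not the utility class above: it leaves out the BufferedStream wrapper, and the class name GzipRoundTripDemo and the generated XML-like test text are only assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipRoundTripDemo
{
    // Same idea as the utility above, without the BufferedStream wrapper.
    static byte[] Compress(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before the result is read,
            // otherwise the final gzip block and footer are not written.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    static byte[] Decompress(byte[] input)
    {
        using (var source = new MemoryStream(input))
        using (var gzip = new GZipStream(source, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Repetitive XML-like text, similar in spirit to serialized data contracts.
        var xml = new StringBuilder();
        for (int i = 0; i < 1000; i++)
            xml.Append("<Item><Id>").Append(i).Append("</Id><Name>Operation</Name></Item>");

        byte[] original = Encoding.UTF8.GetBytes(xml.ToString());
        byte[] compressed = Compress(original);
        byte[] restored = Decompress(compressed);

        Console.WriteLine("Original:   {0} bytes", original.Length);
        Console.WriteLine("Compressed: {0} bytes", compressed.Length);
        Console.WriteLine("Round trip intact: {0}",
            Encoding.UTF8.GetString(restored) == xml.ToString());
    }
}
```

The important detail is that the GZipStream is closed before ToArray is called on the target MemoryStream. The utility class above gets the same effect by disposing the BufferedStream, which in turn disposes the GZipStream it wraps.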



Further, we need some utility methods that handle the data contracts and perform both the compression and the decompression. I have also added some handy methods to this class for cloning data contracts.

DataContractUtility



using System;
using System.Collections.Generic;
using System.Configuration;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;
using System.Xml.Serialization;


namespace SomeAcme.SomeProduct.Common
{

    /// <summary>
    /// Utility methods for serializing, deserializing and cloning data contracts
    /// </summary>
    /// <remarks>The serialization, deserialization and cloning here are 
    /// limited to data contracts, as DataContractSerializer is being used for these operations</remarks>
    public static class DataContractUtility
    {

        public static byte[] ToByteArray<T>(T dataContract) where T : class
        {
            if (dataContract == null)
            {
                throw new ArgumentNullException("dataContract");
            }

            using (MemoryStream memoryStream = new MemoryStream())
            {
                var serializer = new DataContractSerializer(typeof(T));
                serializer.WriteObject(memoryStream, dataContract);
                memoryStream.Position = 0;
                return memoryStream.ToArray();                
            }
        }

        public static List<TResult> GetDataContractsFromDataContractContainer<TContainer, TResult>(TContainer container, CompressionSetupDataContract setup) 
            where TContainer : class, IDataContractContainer<TResult>  
            where TResult : class
        {
            if (container == null)
                return null;
            if (!container.IsBinary || !IsUseNetworkCompression(setup))
                return container.SerialPayload;
            else
            {
                byte[] unzippedData = GzipByteArrayCompressionUtility.Decompress(container.BinaryPayload);
                var dataContracts = DataContractUtility.DeserializeFromByteArray<List<TResult>>(unzippedData);
                return dataContracts;
            }
        }

        private static bool IsUseNetworkCompression(CompressionSetupDataContract setup)
        {
            if (setup != null)
                return setup.UseNetworkCompression;

            if (ConfigurationManager.AppSettings[Constants.UseNetworkCompression] == null)
                return false; 

            bool isUseNetworkCompression = false;
            if (!bool.TryParse(ConfigurationManager.AppSettings[Constants.UseNetworkCompression], out isUseNetworkCompression))
                return false;
            else
                return isUseNetworkCompression;            
        }

        private static int GetNetworkCompressionItemTreshold(CompressionSetupDataContract setup)
        {
            if (setup != null)
                return setup.NetworkCompressionItemTreshold; 

            int standardNetworkCompressionItemTreshold = 100;
            if (ConfigurationManager.AppSettings[Constants.NetworkCompressionItemTreshold] == null)
                return standardNetworkCompressionItemTreshold;
            int networkCompressionItemTreshold = standardNetworkCompressionItemTreshold;
            if (!int.TryParse(ConfigurationManager.AppSettings[Constants.NetworkCompressionItemTreshold], out networkCompressionItemTreshold))
                return standardNetworkCompressionItemTreshold;
            else
                return networkCompressionItemTreshold;
        }



        public static TContainer GetContainerInstanceAfterResult<TContainer, TResult>(List<TResult> result, CompressionSetupDataContract setup) 
            where TContainer : class, IDataContractContainer<TResult>, new()
            where TResult : class
        {
            TContainer container = new TContainer();

            if (IsUseNetworkCompression(setup) && (result != null && result.Count >= Math.Min(container.BinaryTreshold, GetNetworkCompressionItemTreshold(setup))))
            {
                if (container.IsGzipped)
                    container.BinaryPayload = GzipByteArrayCompressionUtility.Compress(DataContractUtility.ToByteArray(result));
                else 
                    container.BinaryPayload = DataContractUtility.ToByteArray(result); 
                
                container.IsBinary = true;
            }
            else
            {
                container.SerialPayload = result;
            }

            return container; 
        }

        public static string SerializeObject<T>(T dataContract) where T : class
        {
            using (var memoryStream = new MemoryStream())
            {
                using (var streamReader = new StreamReader(memoryStream))
                {
                    var serializer = new DataContractSerializer(typeof(T));
                    serializer.WriteObject(memoryStream, dataContract);
                    memoryStream.Position = 0;
                    return streamReader.ReadToEnd();
                }
            }
        }

        public static T DeserializeObject<T>(string serializedContent) where T : class
        {
            return DeserializeObject<T>(serializedContent, Encoding.UTF8);
        }

        public static T DeserializeObject<T>(string serializedContent, Encoding encoding) where T : class
        {
            T result = null;
            using (Stream memoryStream = new MemoryStream())
            {
                var serializer = new DataContractSerializer(typeof(T));
                byte[] data = encoding.GetBytes(serializedContent);
                memoryStream.Write(data, 0, data.Length);
                memoryStream.Position = 0;
                result = (T)serializer.ReadObject(memoryStream);
            }
            return result;
        }

        public static T DeserializeFromByteArray<T>(byte[] data) where T : class
        {
            T result = null;
            using (Stream memoryStream = new MemoryStream())
            {
                var serializer = new DataContractSerializer(typeof(T));
                memoryStream.Write(data, 0, data.Length);
                memoryStream.Position = 0;
                result = (T)serializer.ReadObject(memoryStream);
            }
            return result;
        }

        public static T CloneObject<T>(T dataContract) where T : class
        {
            T result = null;
            var serializedContent = SerializeObject<T>(dataContract);
            result = DeserializeObject<T>(serializedContent);
            return result;
        }

    }

}



Note that I use two app settings here to control the config of compression in web.config:

    <add key="UseNetworkCompression" value="true" />
    <add key="NetworkCompressionItemTreshold" value="100" />

The app setting UseNetworkCompression turns the compression on and off. If compression is turned off, we fall back to the default SOAP XML serialization. If it is turned on, the data will be packed into a compressed byte array. The app setting NetworkCompressionItemTreshold is the minimum number of data contract items that will trigger compression. We also use two constants here:

        public const string UseNetworkCompression = "UseNetworkCompression";
        public const string NetworkCompressionItemTreshold = "NetworkCompressionItemTreshold";
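
To make the threshold rule concrete, here is a small sketch of the condition used in GetContainerInstanceAfterResult. The helper name ShouldUseBinaryPayload and the sample numbers are hypothetical and not part of the mini-framework; the sketch only mirrors the check that compression must be enabled and that the item count must reach the lower of the container's own threshold and the configured threshold.

```csharp
using System;

public static class CompressionDecisionDemo
{
    // Hypothetical helper mirroring the condition in GetContainerInstanceAfterResult:
    // use the binary payload only when compression is enabled and the item count
    // reaches the lower of the two thresholds.
    static bool ShouldUseBinaryPayload(bool useNetworkCompression, int itemCount,
        int containerThreshold, int configuredThreshold)
    {
        int effectiveThreshold = Math.Min(containerThreshold, configuredThreshold);
        return useNetworkCompression && itemCount >= effectiveThreshold;
    }

    public static void Main()
    {
        Console.WriteLine(ShouldUseBinaryPayload(true, 50, 100, 100));    // False
        Console.WriteLine(ShouldUseBinaryPayload(true, 100, 100, 100));   // True
        Console.WriteLine(ShouldUseBinaryPayload(true, 50, 100, 1));      // True
        Console.WriteLine(ShouldUseBinaryPayload(false, 5000, 100, 100)); // False
    }
}
```

Because the lower of the two thresholds wins, setting NetworkCompressionItemTreshold to 1 forces the binary payload for any non-empty result, which is handy in integration tests.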

That was the code that handles the compression and decompression of data contracts. Next, I show sample code for how to use it. First we need to create a "container data class" for a demo of this mini-framework. Each class that is to be a "container data contract" implements the interface IDataContractContainer:

using System.Collections.Generic;



    public interface IDataContractContainer<TResult> where TResult : class
    {

        /// <summary>
        /// Set to true if the binary payload is to be used
        /// </summary>
        bool IsBinary { get; set; }

        /// <summary>
        /// If IsBinary is false, the serial payload contains the data (SOAP XML based data contracts)
        /// </summary>
        List<TResult> SerialPayload { get; set; }

        /// <summary>
        /// Byte array which is the binary payload. Usually the byte array is also Gzipped.
        /// </summary>
        byte[] BinaryPayload { get; set; }

        /// <summary>
        /// The minimum number of items at which the binary payload should be used.
        /// </summary>
        int BinaryTreshold { get; set; }

        /// <summary>
        /// If true, the binary payload is Gzipped (compressed)
        /// </summary>
        bool IsGzipped { get; set; }

    }



The following class implements the IDataContractContainer interface:

using System.Collections.Generic;
using System.Runtime.Serialization; 


   
    [DataContract(Namespace=Constants.DataContractNamespace20091001)]
    public class OperationItemsContainerDataContract : IDataContractContainer<OperationItemDataContract>
    {

        public OperationItemsContainerDataContract()
        {
            BinaryTreshold = 100;
            IsGzipped = true;
        }

        [DataMember(Order = 1)]
        public byte[] BinaryPayload { get; set; }

        [DataMember(Order = 2)]
        public List<OperationItemDataContract> SerialPayload { get; set; }

        [DataMember(Order = 3)]
        public bool IsBinary { get; set; }

        [DataMember(Order = 4)]
        public int BinaryTreshold { get; set; }

        [DataMember(Order = 5)]
        public bool IsGzipped { get; set; }

    }




Next up is an example of how to use this code on the service side:


        public OperationItemsContainerDataContract GetOperationItems(OperationsRequestDataContract request)
        {

..
            var operations = SomeManager.GetSomeData(request);
..


            OperationItemsContainerDataContract result =
                DataContractUtility.GetContainerInstanceAfterResult<OperationItemsContainerDataContract, OperationItemDataContract>(operations, null);
            
            return result;
        }


The code above includes some production code, but what matters here is the call to DataContractUtility.GetContainerInstanceAfterResult. If compression is activated in web.config and the result set reaches the configured limit, this method packs the data contract items into a byte array, and applies GZip compression if the container class asks for it. See the constructor of the OperationItemsContainerDataContract class, where IsGzipped is set to true. On the client side, it is necessary to unpack the data again. On the server side we used the GetContainerInstanceAfterResult method; on the client side we will now use the GetDataContractsFromDataContractContainer method.

class OperationItemProvider:

  private void LoadDailyOperationsByCurrentTheater(Collection operationItems)
        {
           
            try
            {
                var container = SomeAgent.GetSomeOperationItems(request);


                var operations = DataContractUtility.GetDataContractsFromDataContractContainer<OperationItemsContainerDataContract, OperationItemDataContract>(container, Context.CompressionSetup);

        
            }
            catch (SomeAcmeClientException ex)
            {
                DispatcherUtil.InvokeAction(() => EventAggregator.GetEvent<ErrorMessageEvent>().Publish(new ErrorMessageEventArg { ErrorMessage = ex.Message }));
            }

..
        
        }

Note that the code above includes some production code. The important part is the use of the method GetDataContractsFromDataContractContainer. The object Context.CompressionSetup holds the compression settings from the web.config on the server side; the client simply retrieves them through a WCF method that reads them using the ConfigurationManager. Here is the WCF server method:


        public CompressionSetupDataContract GetCompressionSetup()
        {
            bool standardUseNetworkCompression = false;
            int standardNetworkCompressionItemTreshold = 100;       
           
            var setup = new CompressionSetupDataContract
            {
                UseNetworkCompression = standardUseNetworkCompression, 
                NetworkCompressionItemTreshold = standardNetworkCompressionItemTreshold
            };
            if (ConfigurationManager.AppSettings[Constants.UseNetworkCompression] != null)
            {
                bool useNetworkCompression = standardUseNetworkCompression;
                setup.UseNetworkCompression = bool.TryParse(ConfigurationManager.AppSettings[SomeAcme.SomeProduct.Common.Constants.UseNetworkCompression], out useNetworkCompression) ?
                    useNetworkCompression : false;
            }

            if (ConfigurationManager.AppSettings[Constants.NetworkCompressionItemTreshold] != null)
            {
                int networkCompressionItemTreshold = standardNetworkCompressionItemTreshold;
                setup.NetworkCompressionItemTreshold = 
                    int.TryParse(ConfigurationManager.AppSettings[SomeAcme.SomeProduct.Common.Constants.NetworkCompressionItemTreshold], out networkCompressionItemTreshold) ?
                    networkCompressionItemTreshold : standardNetworkCompressionItemTreshold;
            }          

            return setup; 
        }

We also need to control the use of compression to be able to run integration tests on this:

        [ServiceLog]
        [AuthorizedRole(SomeAcmeRoles.Administrator)]
        public bool SetUseNetworkCompression(CompressionSetupDataContract compressionSetup)
        {
            if (compressionSetup == null)
                throw new ArgumentNullException(GetName.Of(() => compressionSetup));
            try
            {
                //ADDITIONAL SECURITY CHECKS OMITTED FROM PUBLIC DISPLAY.

                System.Configuration.Configuration webConfigApp = WebConfigurationManager.OpenWebConfiguration("~");

                webConfigApp.AppSettings.Settings[Common.Constants.UseNetworkCompression].Value = compressionSetup.UseNetworkCompression
                    ? "true"
                    : "false";

                webConfigApp.AppSettings.Settings[Common.Constants.NetworkCompressionItemTreshold].Value = compressionSetup.NetworkCompressionItemTreshold.ToString(); 

                webConfigApp.Save();
                return true;
            }
            catch (Exception err)
            {
                InterfaceBinding.GetInstance().WriteError(err.Message);
                return false;
            }
        }

Okay, so now we have the necessary code to test the compression. To apply this mini-framework to your own codebase, you will need at least the two app settings above, the two utility classes presented, the IDataContractContainer interface and an implementation of it. To note again, you then use the two DataContractUtility methods: GetContainerInstanceAfterResult on the server side to pack the data, and GetDataContractsFromDataContractContainer on the client side to retrieve it.



Here is an integration test for testing out our mini framework:

        [Test]
        [Category(TestCategories.IntegrationTest)]
        public void GetOperationItemsWithCompressionEnabledDoesNotReturnEmpty()
        {
            var compressionSetup = new CompressionSetupDataContract
            {
                UseNetworkCompression = true,
                NetworkCompressionItemTreshold = 1
            };

            SystemServiceAgent.SetUseNetworkCompression(compressionSetup);

            //Arrange 
            var operationsRequest = new OperationsRequestDataContract()
            {
                .. // some init
            };

            //Act 
            OperationItemsContainerDataContract operationsRetrieved = ConcreteServiceAgent.GetSomeOperationItems(operationsRequest);

            //Assert 
            Assert.IsNotNull(operationsRetrieved);

            var operationItems =
                DataContractUtility.GetDataContractsFromDataContractContainer<OperationItemsContainerDataContract, OperationItemDataContract>(
                    operationsRetrieved, compressionSetup);

            Assert.IsNotNull(operationItems);
            CollectionAssert.IsNotEmpty(operationItems);

            compressionSetup.UseNetworkCompression = false;
            compressionSetup.NetworkCompressionItemTreshold = 100;
            SystemServiceAgent.SetUseNetworkCompression(compressionSetup);
        }


Note that compression means more CPU is used on the service side. In many cases this means that while saving bandwidth, you spend more CPU resources on an already busy application server running the WCF services. At the same time, more and more clients of WCF services are on low-bandwidth devices, such as smart phones. I have run some tests that retrieved about 1000 items of relatively large data contracts and witnessed the amount of data transferred drop from 5-6 megabytes (MB) to about 300 kB, a reduction of roughly 17 to 20 times! WCF serialization using SOAP XML often results in gigantic amounts of data being transferred. If you are on a Gigabit Ethernet network, it might not be noticeable on a developer computer, but if you are creating a system with many simultaneous users, network bandwidth usage starts to get really important, even for high-bandwidth clients! Finally, I must note that the code presented here is very generic. It is not limited to sending data from WCF services to clients; the only requirement is that the data you are about to send consists of data contract items (classes and properties using the DataContract and DataMember attributes).
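
If you want a rough feeling for the savings on your own data before wiring up the mini-framework, a sketch like the following can help. It is only an illustration: SampleItemDataContract is a hypothetical data contract made up for this measurement, and the numbers it prints will depend entirely on your own contracts and data, so do not expect them to match the figures from my tests.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;

// Hypothetical data contract used only for this measurement.
[DataContract]
public class SampleItemDataContract
{
    [DataMember(Order = 1)] public int Id { get; set; }
    [DataMember(Order = 2)] public string Description { get; set; }
    [DataMember(Order = 3)] public DateTime Created { get; set; }
}

public static class PayloadSizeDemo
{
    public static void Main()
    {
        var items = new List<SampleItemDataContract>();
        for (int i = 0; i < 1000; i++)
        {
            items.Add(new SampleItemDataContract
            {
                Id = i,
                Description = "Operation item number " + i,
                Created = new DateTime(2015, 7, 8).AddMinutes(i)
            });
        }

        // Serialize the list to XML with DataContractSerializer.
        byte[] xmlBytes;
        using (var ms = new MemoryStream())
        {
            var serializer = new DataContractSerializer(typeof(List<SampleItemDataContract>));
            serializer.WriteObject(ms, items);
            xmlBytes = ms.ToArray();
        }

        // GZip the serialized bytes; the GZipStream is closed before reading the result.
        byte[] gzipBytes;
        using (var ms = new MemoryStream())
        {
            using (var gzip = new GZipStream(ms, CompressionMode.Compress))
            {
                gzip.Write(xmlBytes, 0, xmlBytes.Length);
            }
            gzipBytes = ms.ToArray();
        }

        Console.WriteLine("XML:   {0} bytes", xmlBytes.Length);
        Console.WriteLine("GZip:  {0} bytes", gzipBytes.Length);
        Console.WriteLine("Ratio: {0:F1}x", (double)xmlBytes.Length / gzipBytes.Length);
    }
}
```

Keep in mind that this only measures the serialized list itself. The SOAP envelope and the Base64 encoding that a text-based WCF binding typically applies to a byte array add overhead on the wire, so inspect the real traffic with Fiddler before drawing conclusions.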