Tuesday, 10 June 2025

Logging Entity Framework database queries to Serilog

Logging database queries from Entity Framework can be done with SQL Server Profiler or with the profiling tools inside VS 2022. A problem with this approach is that SQL Server Profiler tends to log many other events in addition to the queries, and the VS profiling tools will truncate large queries. Note that logging the database traffic will quickly grow into large logs of several gigabytes. Database logging can also reveal sensitive data, which is another challenge.

Still, a tailored database log can be a helpful tool for developers to both profile and check queries, especially when using an ORM like Entity Framework, which hides the actual queries sent to the database since that part is abstracted away via IQueryable. Once you have the queries you want to inspect, you can profile them inside SQL Server Management Studio and check whether the indexes in the database are used optimally or whether more indexes should be added.

If you as a developer want to inspect the queries your application performs, perhaps activated via an app setting in config, you may be better off with a tailored way of logging them. This article looks at Entity Framework 6 for .NET Framework and a DB interceptor that logs compacted, one-lined, interpolated queries that Entity Framework runs. Note that later versions of Entity Framework also support DB interceptors, such as EF Core 8 (for .NET 8). Let's first look at the repo for the code of this article:

https://github.com/toreaurstadboss/BulkOperationsEntityFramework
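
As noted above, later Entity Framework versions have an equivalent mechanism. A minimal sketch of an EF Core command interceptor (this assumes EF Core 8 and is not part of the repo above; only the reader overload is shown) could look like this:


using System.Data.Common;
using Microsoft.EntityFrameworkCore.Diagnostics;
using Serilog;

namespace BulkOperationsEntityFrameworkCore
{

    // Sketch: DbCommandInterceptor in EF Core is a base class with virtual methods,
    // so only the overloads of interest need to be overridden
    public class SerilogEfCoreCommandInterceptor : DbCommandInterceptor
    {
        public override InterceptionResult<DbDataReader> ReaderExecuting(
            DbCommand command,
            CommandEventData eventData,
            InterceptionResult<DbDataReader> result)
        {
            Log.Information("[SQL] {Sql}", command.CommandText);
            return base.ReaderExecuting(command, eventData, result);
        }
    }

}


The interceptor would be registered in OnConfiguring via optionsBuilder.AddInterceptors(new SerilogEfCoreCommandInterceptor()). The rest of this article focuses on Entity Framework 6.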

SerilogCommandInterceptor

The DB interceptor looks like this. It uses Serilog as the logging framework to write and format the log file(s).


using Serilog;
using System;
using System.Configuration;
using System.Data.Common;
using System.Data.Entity.Infrastructure.Interception;

namespace BulkOperationsEntityFramework.Test
{

    /// <summary>
    /// Intercepts Entity Framework database commands and logs them using Serilog.
    /// 
    /// This interceptor captures and logs SQL command text for NonQuery, Reader, and Scalar operations.
    /// Logging is configured via Serilog and can be customized using <c>App.config</c> or <c>Web.config</c> settings.
    /// 
    /// <para>
    /// <b>Configuration via AppSettings:</b>
    /// </para>
    /// <list type="table">
    ///   <item>
    ///     <term>serilog:write-to:File.path</term>
    ///     <description>Path to the log file. Default: <c>databaselogs\log.txt</c></description>
    ///   </item>
    ///   <item>
    ///     <term>serilog:write-to:File.rollingInterval</term>
    ///     <description>Rolling interval for log files (e.g., <c>Day</c>, <c>Hour</c>). Default: <c>Day</c></description>
    ///   </item>
    ///   <item>
    ///     <term>serilog:minimum-level</term>
    ///     <description>Minimum log level (e.g., <c>Information</c>, <c>Warning</c>). Default: <c>Information</c></description>
    ///   </item>
    ///   <item>
    ///     <term>serilog:write-to:File.retainedFileCountLimit</term>
    ///     <description>Number of log files to retain. Default: <c>21</c></description>
    ///   </item>
    /// </list>
    /// 
    /// <para>
    /// <b>Example App.config override:</b>
    /// </para>
    /// <code language="xml">
    /// <appSettings>
    ///   <add key="serilog:write-to:File.path" value="C:\Logs\mydb.log" />
    ///   <add key="serilog:write-to:File.rollingInterval" value="Hour" />
    ///   <add key="serilog:minimum-level" value="Warning" />
    ///   <add key="serilog:write-to:File.retainedFileCountLimit" value="10" />
    /// </appSettings>
    /// </code>
    /// </summary>
    public class SerilogCommandInterceptor : IDbCommandInterceptor
    {

        private static bool _isInitialized = false;

        public SerilogCommandInterceptor()
        {
            if (_isInitialized)
            {
                return;
            }

            var logPath = ConfigurationManager.AppSettings["serilog:write-to:File.path"] ?? "databaselogs\\log.txt";
            var logIntervalRaw = ConfigurationManager.AppSettings["serilog:write-to:File.rollingInterval"];
            var logInterval = Enum.TryParse(logIntervalRaw, true, out RollingInterval interval) ? interval : RollingInterval.Day;
            var minLevelRaw = ConfigurationManager.AppSettings["serilog:minimum-level"];
            var logLevel = Enum.TryParse(minLevelRaw, true, out Serilog.Events.LogEventLevel level) ? level : Serilog.Events.LogEventLevel.Information;
            var retainedCountRaw = ConfigurationManager.AppSettings["serilog:write-to:File.retainedFileCountLimit"];
            var retainedCount = int.TryParse(retainedCountRaw, out int count) ? count : 21;

            //Set up Serilog logging for the database logging interceptor - set the minimum level (default Information) and
            //write to a file with rolling intervals. The file size limit is set to 500 MB per file;
            //in case the log file grows too large within a given interval, it will roll over to a new file with a running number suffixed to it.
            //The logs are stored in the "databaselogs" subfolder, or the path configured in the config file.
            //Logs are retained for a maximum of 21 rolled files (i.e. 21 days with daily rolling), or the configured retained file count.
            Log.Logger = new LoggerConfiguration()
                .MinimumLevel.Is(logLevel)
                .WriteTo.File(
                    logPath, 
                    rollingInterval: logInterval, 
                    rollOnFileSizeLimit: true,
                    fileSizeLimitBytes: 500 * 1000 * 1000,
                    outputTemplate: "[{Timestamp:yyyy-MM-dd HH:mm:ss} {Level:u3}] [SQL] {Message:lj}{NewLine}",
                    retainedFileCountLimit: retainedCount
                )
                .CreateLogger();

            _isInitialized = true;
        }

        public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<int> interceptionContext) { }

        public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<int> interceptionContext) =>
            Log.Information("{Tag} {Sql}", GetSqlTag(command.CommandText), CompactAndInterpolateSql(command));

        public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext) { }

        public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext) =>
            Log.Information("{Tag} {Sql}", GetSqlTag(command.CommandText), CompactAndInterpolateSql(command));

        public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<object> interceptionContext) { }

        public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<object> interceptionContext) =>
            Log.Information("{Tag} {Sql}", GetSqlTag(command.CommandText), CompactAndInterpolateSql(command));

        private string CompactAndInterpolateSql(DbCommand dbCommand)
        {
            string sql = InterpolateSql(dbCommand);
            return CompactSql(sql);
        }

        // Flatten the SQL to one line; replace every newline variant with a space so tokens are not concatenated
        private string CompactSql(string sql) =>
            sql.Replace(Environment.NewLine, " ").Replace("\n", " ").Replace("\r", " ").Trim();

        private string GetSqlTag(string sql)
        {
            if (sql.StartsWith("SELECT", StringComparison.OrdinalIgnoreCase)) return "[SELECT]";
            if (sql.StartsWith("INSERT", StringComparison.OrdinalIgnoreCase)) return "[INSERT]";
            if (sql.StartsWith("UPDATE", StringComparison.OrdinalIgnoreCase)) return "[UPDATE]";
            if (sql.StartsWith("DELETE", StringComparison.OrdinalIgnoreCase)) return "[DELETE]";
            return "[SQL]";
        }

        private string InterpolateSql(DbCommand dbCommand)
        {
            string sql = dbCommand.CommandText;
            // Note: simple string replacement suffices for logging, but parameter names where
            // one is a prefix of another (e.g. @p1 and @p10) can be replaced incorrectly
            foreach (DbParameter parameter in dbCommand.Parameters)
            {
                string value = FormatParameterValue(parameter.Value);
                sql = sql.Replace(parameter.ParameterName, value);
            }
            return sql;
        }

        private string FormatParameterValue(object value)
        {
            if (value == null || value == DBNull.Value)
            {
                return "NULL";
            }
            if (value is string s)
            {
                return $"'{s.Replace("'", "''")}'"; // Wrap strings in single quotes, escaping embedded single quotes
            }
            if (value is DateTime || value is Guid)
            {
                return $"'{value}'"; // Wrap DateTime and Guid in single quotes
            }
            if (value is bool b)
            {
                return b ? "1" : "0";
            }

            return value.ToString();
        }

    }
}


Adding a DB interceptor should preferably be done only once, for example in a static constructor, to ensure the interceptor is registered before the first query runs. Below is an example of a simple DB context that adds the interceptor shown above.


using BulkOperationsEntityFramework.Models;
using BulkOperationsEntityFramework.Test;
using System.ComponentModel.DataAnnotations.Schema;
using System.Data.Entity;
using System.Data.Entity.Infrastructure.Interception;

namespace BulkOperationsEntityFramework
{

    public class ApplicationDbContext : DbContext
    {

        static ApplicationDbContext()
        {
            DbInterception.Add(new SerilogCommandInterceptor());
        }

        public ApplicationDbContext() : base("name=App")
        {
        }

        public DbSet<User> Users { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            modelBuilder.Entity<User>().HasKey(u => u.Id);
            modelBuilder.Entity<User>().Property(u => u.Id).HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
        }

    }

}
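
The static constructor is also a natural place to make the registration conditional, as mentioned in the introduction, so that query logging can be switched on via config. A small sketch, assuming a hypothetical app setting named database:log-queries (not part of the repo):


using System.Configuration;
using System.Data.Entity.Infrastructure.Interception;

// Hypothetical toggle in App.config: <add key="database:log-queries" value="true" />
static ApplicationDbContext()
{
    if (bool.TryParse(ConfigurationManager.AppSettings["database:log-queries"], out bool logQueries) && logQueries)
    {
        DbInterception.Add(new SerilogCommandInterceptor());
    }
}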




Since Serilog is used, the logging is very flexible. For example, database queries can be logged with rolling intervals, such as per day, per hour or even per minute. A maximum size for the log files can also be set; when a log file reaches the size limit, the logging within the same time interval is split into multiple log files. Example app config settings are shown here:


<appSettings>
	<add key="serilog:write-to:File.path" value="logs\dblogv3.txt" />
	<add key="serilog:minimum-level" value="Information" />
	<add key="serilog:write-to:File.rollingInterval" value="Day" />
	<add key="serilog:file:retentionDays" value="21" />
</appSettings>
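
With the output template configured in the interceptor, each statement ends up as a single line in the log. An illustrative example of what a logged SELECT could look like (not actual output from the repo):

[2025-06-10 14:05:12 INF] [SQL] [SELECT] SELECT TOP (100) [Extent1].[Id] AS [Id], [Extent1].[Email] AS [Email] FROM [dbo].[Users] AS [Extent1]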


If you use a tool like TailBlazer, opening very large logs of many gigabytes is fairly fast, and you can define highlighting rules to discern between the different types of queries.
A screenshot of the TailBlazer highlighting rules used for the database log is shown below:



TailBlazer tool

The TailBlazer tool can be downloaded from GitHub here:

https://github.com/RolandPheasant/TailBlazer

Sunday, 1 June 2025

Using SqlBulkCopy with EntityFramework

This article tests out various ways of doing bulk inserts in Entity Framework. The code shown in this article uses .NET Framework 4.8
and Entity Framework 6.5.0. Support is better in Entity Framework Core, so the SqlBulkCopy code shown here must be adjusted a little to also work with Entity Framework Core. A GitHub repo has been added here:

https://github.com/toreaurstadboss/BulkOperationsEntityFramework

A benchmark has been run to test out the different approaches. Not surprisingly, SqlBulkCopy is the fastest and most economical way of performing the bulk inserts (it uses the least memory). The code for handling SqlBulkCopy is provided via the extension methods listed below:

BulkInsertExtensions.cs




using System.Collections.Generic;
using System.Data;
using System.Data.Entity;
using System.Data.SqlClient;
using System.Threading.Tasks;

namespace BulkOperationsEntityFramework.Lib.Extensions
{

    /// <summary>
    /// Provides extension methods for performing bulk insert operations using SqlBulkCopy.
    /// </summary>
    public static class BulkInsertExtensions
    {
        /// <summary>
        /// Performs a bulk insert of the specified entities into the database.
        /// </summary>
        /// <typeparam name="T">The type of the entity.</typeparam>
        /// <param name="context">The DbContext instance.</param>
        /// <param name="entities">The collection of entities to insert.</param>
        /// <param name="tableName">
        /// Optional: The name of the destination table. If null, the entity type name is used.
        /// </param>
        /// <param name="columnMappings">
        /// Optional: Custom column mappings (propertyName → columnName).
        /// 
        /// <example>
        /// Example 1: Rename columns
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "Name", "ProductName" },
        ///     { "Price", "UnitPrice" }
        /// };
        /// await context.BulkInsertAsync(products, "Products", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 2: Ignore properties using attributes
        /// <code>
        /// public class Customer
        /// {
        ///     public int Id { get; set; }
        ///     public string FullName { get; set; }
        /// 
        ///     [BulkIgnore]
        ///     public string TempNotes { get; set; }
        /// 
        ///     [NotMapped]
        ///     public string CalculatedField { get; set; }
        /// }
        /// 
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FullName", "CustomerName" }
        /// };
        /// await context.BulkInsertAsync(customers, "Customers", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 3: Snake_case mapping
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FirstName", "first_name" },
        ///     { "LastName", "last_name" },
        ///     { "Email", "email_address" }
        /// };
        /// await context.BulkInsertAsync(users, "user_accounts", mappings);
        /// </code>
        /// </example>
        /// </param>
        /// <param name="bulkCopyTimout">The bulk copy timeout in seconds. Default is set to 30.</param>
        public static void BulkInsert<T>(this DbContext context, IEnumerable<T> entities, string tableName = null,
            Dictionary<string, string> columnMappings = null, int bulkCopyTimeout = 30)
            where T : class
        {
            var dataTable = entities.ToBulkDataTable(columnMappings, out var finalMappings);
            BulkCopy(context, dataTable, tableName ?? typeof(T).Name, finalMappings, bulkCopyTimeout);
        }

        /// <summary>
        /// Asynchronously performs a bulk insert of the specified entities into the database.
        /// </summary>
        /// <typeparam name="T">The type of the entity.</typeparam>
        /// <param name="context">The DbContext instance.</param>
        /// <param name="entities">The collection of entities to insert.</param>
        /// <param name="tableName">
        /// Optional: The name of the destination table. If null, the entity type name is used.
        /// </param>
        /// <param name="columnMappings">
        /// Optional: Custom column mappings (propertyName → columnName).
        /// 
        /// <example>
        /// Example 1: Rename columns
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "Name", "ProductName" },
        ///     { "Price", "UnitPrice" }
        /// };
        /// await context.BulkInsertAsync(products, "Products", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 2: Ignore properties using attributes
        /// <code>
        /// public class Customer
        /// {
        ///     public int Id { get; set; }
        ///     public string FullName { get; set; }
        /// 
        ///     [BulkIgnore]
        ///     public string TempNotes { get; set; }
        /// 
        ///     [NotMapped]
        ///     public string CalculatedField { get; set; }
        /// }
        /// 
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FullName", "CustomerName" }
        /// };
        /// await context.BulkInsertAsync(customers, "Customers", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 3: Snake_case mapping
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FirstName", "first_name" },
        ///     { "LastName", "last_name" },
        ///     { "Email", "email_address" }
        /// };
        /// await context.BulkInsertAsync(users, "user_accounts", mappings);
        /// </code>
        /// </example>
        /// </param>
        /// <param name="bulkCopyTimout">The bulk copy timeout in seconds. Default is set to 30.</param>
        public static async Task BulkInsertAsync<T>(this DbContext context, IEnumerable<T> entities, string tableName = null,
            Dictionary<string, string> columnMappings = null, int bulkCopyTimout = 30)
            where T : class
        {
            var dataTable = entities.ToBulkDataTable(columnMappings, out var finalMappings);
            await BulkCopyAsync(context, dataTable, tableName ?? typeof(T).Name, finalMappings, bulkCopyTimeout);
        }

        private static void BulkCopy(DbContext context, DataTable table, string tableName,
            Dictionary<string, string> finalMappings, int bulkCopyTimeout = 30)
        {
            var connection = (SqlConnection)context.Database.Connection;
            var wasClosed = connection.State == ConnectionState.Closed;

            if (wasClosed)
                connection.Open();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = tableName;
                bulkCopy.BulkCopyTimeout = bulkCopyTimeout;

                foreach (var map in finalMappings)
                {
                    bulkCopy.ColumnMappings.Add(map.Key, map.Value);
                }
                bulkCopy.WriteToServer(table);
            }

            if (wasClosed)
                connection.Close(); // Ensure the connection is closed after the operation, if it was closed before
        }

        private static async Task BulkCopyAsync(DbContext context, DataTable table, string tableName, Dictionary<string, string> mappings,
            int bulkCopyTimeout = 30)
        {
            var connection = (SqlConnection)context.Database.Connection;
            var wasClosed = connection.State == ConnectionState.Closed;

            if (wasClosed)
                await connection.OpenAsync();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = tableName;
                bulkCopy.BulkCopyTimeout = bulkCopyTimeout;
                foreach (var map in mappings)
                {
                    bulkCopy.ColumnMappings.Add(map.Key, map.Value);
                }
                await bulkCopy.WriteToServerAsync(table);
            }

            if (wasClosed)
                connection.Close();  //Ensure the connection is closed after the operation, if it was closed before
        }
    }
}





As the code shows, we use the DbContext from Entity Framework and cast its Database.Connection property to SqlConnection to get the connection object. SqlBulkCopy is part of ADO.NET and relies upon column mappings. The code above will ignore properties marked with the BulkIgnore attribute or the NotMapped attribute.

BulkIgnoreAttribute.cs



using System;

namespace BulkOperationsEntityFramework.Lib.Attributes
{

    /// <summary>
    /// Indicates that a property should be ignored during bulk insert operations.
    /// </summary>
    [AttributeUsage(AttributeTargets.Property)]
    public class BulkIgnoreAttribute : Attribute
    {
    }

}
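
To tie the pieces together, here is a small usage sketch (the AuditEntry entity and table name are illustrative, not from the repo). The property marked with [BulkIgnore] is skipped by the bulk insert:


using BulkOperationsEntityFramework.Lib.Attributes;
using BulkOperationsEntityFramework.Lib.Extensions;

public class AuditEntry
{
    public int Id { get; set; }
    public string Message { get; set; }

    [BulkIgnore]
    public string TempNotes { get; set; } // Not written to the database by BulkInsertAsync
}

// Usage - inserts all entries in one round-trip:
// using (var context = new ApplicationDbContext())
// {
//     await context.BulkInsertAsync(entries, "AuditEntries");
// }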


The following helper class creates a DataTable from a collection of entities (an IEnumerable) and uses reflection to build the property-to-column mapping. Note that column mappings can also be provided in the BulkInsertExtensions methods shown above to override the property-to-column mapping, if desired. The default SqlBulkCopy timeout is set to 30 seconds (which is also the library's default timeout).

DataTableExtensions.cs


The following code creates a DataTable from a collection of objects (an IEnumerable). BulkInsertExtensions uses the specific method ToBulkDataTable, which skips properties marked with the [NotMapped] or [BulkIgnore] attributes. Both ToDataTable methods shown here allow column mappings to customize the property-to-column mapping.

using BulkOperationsEntityFramework.Lib.Attributes;
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

namespace BulkOperationsEntityFramework.Lib.Extensions
{

    public static class DataTableExtensions
    {

        /// <summary>
        /// Converts an IEnumerable of type T to a DataTable.
        /// </summary>
        /// <typeparam name="T">The type of the entities.</typeparam>
        /// <param name="data">The collection of entities to convert.</param>
        /// <param name="columnMappings">Optional property-to-column name mappings.</param>
        /// <returns>A DataTable with one row per entity.</returns>
        public static DataTable ToDataTable<T>(this IEnumerable<T> data, Dictionary<string, string> columnMappings = null)
        {
            var dataTable = new DataTable();
            var properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

            foreach (var prop in properties)
            {
                var columnName = columnMappings != null && columnMappings.ContainsKey(prop.Name) ? columnMappings[prop.Name] : prop.Name;
                dataTable.Columns.Add(columnName, Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);
            }

            foreach (var item in data)
            {
                var values = properties.Select(p => p.GetValue(item) ?? DBNull.Value).ToArray();
                dataTable.Rows.Add(values);
            }

            return dataTable;
        }

        /// <summary>
        /// Converts an IEnumerable of type T to a DataTable with specified column mappings. Tailored for bulk operations.
        /// </summary>
        /// <typeparam name="T">The type of the entities.</typeparam>
        /// <param name="entities">The collection of entities to convert.</param>
        /// <param name="columnMappings">Optional property-to-column name mappings.</param>
        /// <param name="finalMappings">The resolved property-to-column mappings, for use with SqlBulkCopy.</param>
        /// <returns>A DataTable with one row per entity, skipping properties marked [NotMapped] or [BulkIgnore].</returns>
        public static DataTable ToBulkDataTable<T>(this IEnumerable<T> entities, Dictionary<string, string> columnMappings, out Dictionary<string, string> finalMappings)
        {
            var dataTable = new DataTable();
            var properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
                .Where(p =>
                    !Attribute.IsDefined(p, typeof(System.ComponentModel.DataAnnotations.Schema.NotMappedAttribute)) &&
                    !Attribute.IsDefined(p, typeof(BulkIgnoreAttribute)))
                .ToArray();

            finalMappings = new Dictionary<string, string>();

            foreach (var prop in properties)
            {
                var columnName = columnMappings != null && columnMappings.ContainsKey(prop.Name)
                    ? columnMappings[prop.Name]
                    : prop.Name;

                dataTable.Columns.Add(columnName, Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);
                finalMappings[prop.Name] = columnName;
            }

            foreach (var entity in entities)
            {
                var values = properties.Select(p => p.GetValue(entity) ?? DBNull.Value).ToArray();
                dataTable.Rows.Add(values);
            }

            return dataTable;
        }

    }
}
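
A short usage sketch of ToBulkDataTable (illustrative values; the User entity is the one used elsewhere in this article):


using BulkOperationsEntityFramework.Lib.Extensions;
using BulkOperationsEntityFramework.Models;

var users = new[]
{
    new User { Email = "ann.berg@example.com", FirstName = "Ann", LastName = "Berg", PhoneNumber = "12345678" }
};

// finalMappings now holds the property-to-column mapping that is registered with SqlBulkCopy
var table = users.ToBulkDataTable(columnMappings: null, out var finalMappings);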




BulkInsertBenchmark

The following benchmark is available in the solution for comparing the different approaches to bulk inserts with Entity Framework. A minimal entry point for running the benchmark is sketched after the benchmark code below.
  • EF - add one and save in a loop. Not surprisingly, this performs the worst due to the many round-trips to the database.
  • EF - add one by one and save at the end. Better performance, since there is only one round-trip, though it will still perform poorly when adding very many items in one batch.
  • EF - AddRange and save at the end. Similar to the one above, one round-trip to the database.
  • Dapper - add as a batch and save. Slightly better speed than the previous two. One round-trip to the database.
  • SqlBulkCopy - clearly the fastest way to insert a batch of entities into the database.



using BenchmarkDotNet.Attributes;
using Bogus;
using BulkOperationsEntityFramework.Lib.Extensions;
using BulkOperationsEntityFramework.Models;
using Dapper;
using System.Configuration;
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;

namespace BulkOperationsEntityFramework.Benchmarks
{

    [MemoryDiagnoser]
    public class BulkInsertBenchmark
    {

        private static readonly Faker Faker = new Faker();

        [Params(100)]
        public int Size { get; set; }

        //First benchmark - Naive approach - add one entity at a time and save changes every time, with a round trip to the database for each save

        [Benchmark(Description = "EFAddOneAndSave - Add one user and then save. Then add another one. Results in many round-trips to database.")]
        public async Task EFAddOneAndSave()
        {
            using (var context = new ApplicationDbContext())
            {
                foreach (var user in GetUsers())
                {
                    context.Users.Add(user);
                    await context.SaveChangesAsync();
                }
            }
        }

        [Benchmark(Description = "EFAddOneByOneAndSave - Add one by one user, but save once after adding them. Results in one round-trip to database")]
        public async Task EFAddOneByOneAndSave()
        {
            using (var context = new ApplicationDbContext())
            {
                foreach (var user in GetUsers())
                {
                    context.Users.Add(user);

                }
                await context.SaveChangesAsync();
            }
        }

        [Benchmark(Description = "EFAddRange - Adds the users, then does the save. Results in one round-trip to database")]
        public async Task EFAddRange()
        {
            using (var context = new ApplicationDbContext())
            {
                var users = GetUsers();
                context.Users.AddRange(users);
                await context.SaveChangesAsync();
            }
        }

        [Benchmark(Description = "DapperInsertRange - Uses Dapper to insert the users. Results in one round-trip to database")]
        public async Task DapperInsertRange()
        {
            var connectionString = ConfigurationManager.ConnectionStrings["App"].ConnectionString;

            string sql = @"
                INSERT INTO Users (Email, FirstName, LastName, PhoneNumber)
                VALUES (@Email, @FirstName, @LastName, @PhoneNumber)
            ".Trim();

            using (var connection = new SqlConnection(connectionString))
            {
                var users = GetUsers().Select(u => new 
                {
                    u.Email,
                    u.FirstName,
                    u.LastName,
                    u.PhoneNumber
                }).ToArray();

                await connection.ExecuteAsync(sql, users);
            }
        }

        [Benchmark(Description = "SqlBulkCopy - Uses SqlBulkCopy to insert the users. Results in one round-trip to database")]
        public async Task SqlBulkCopy()
        {
            using (var context = new ApplicationDbContext())
            {
                await context.BulkInsertAsync(GetUsers(), "Users");
            }
        }



        private User[] GetUsers() =>
            Enumerable.Range(1, Size).Select(i => new User
            {
                Email = Faker.Internet.Email(),
                FirstName = Faker.Name.FirstName(),
                LastName = Faker.Name.LastName(),
                PhoneNumber = Faker.Phone.PhoneNumber()
            }).ToArray();

    }

}
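
To run the benchmark above, a minimal console entry point could look like the sketch below (the Program class is illustrative; BenchmarkDotNet requires a Release build to run benchmarks):


using BenchmarkDotNet.Running;
using BulkOperationsEntityFramework.Benchmarks;

namespace BulkOperationsEntityFramework
{

    internal static class Program
    {
        private static void Main()
        {
            // Runs all [Benchmark] methods in BulkInsertBenchmark and prints a summary table
            BenchmarkRunner.Run<BulkInsertBenchmark>();
        }
    }

}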


The following screenshot shows the benchmark results after running BenchmarkDotNet for the mentioned approaches and comparing the performance.

Results

Benchmark for a given batch size of 100


Sunday, 18 May 2025

Displaying weather data using Seaborn with Python

Using the library Seaborn, built on top of Matplotlib, displaying weather data is convenient. Using Anaconda and Jupyter Notebook, working with the data is user friendly. First off, let's look at a dataset containing some Norwegian weather data. The following dataset contains weather data from Norway from 2020-2021 for 55 meteorological stations (13.61 MB in size), available under the DbCL v1.0 license (meaning free to use, with an 'as-is' warranty) on the Kaggle.com website.

https://www.kaggle.com/datasets/annbengardt/noway-meteorological-data/data

First off, the following imports are done in the Jupyter Notebook. This is a free IDE, part of the Anaconda distribution, that is well suited for data visualization and runs in the browser.

NorwayMeteoDemo1.ipynb


import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

Next, using Pandas, Python's data analysis library, the data is prepared from the mentioned dataset, which is in CSV format. Columns are also added dynamically to the dataset. A moving 14-day average of the daily maximum air temperature is added using the rolling method with the window set to 14. A date column is also added; the dataset contains three int64 values day, month and year, which are combined to create the date column.

NorwayMeteoDemo1.ipynb


df = pd.read_csv("datasets/weather/NorwayMeteoDataCompleted.csv")
df['moving_max_air_temp_avg'] = df['max(air_temperature P1D)'].rolling(window = 14, min_periods = 1).mean()
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

Note the use of min_periods set to 1 for the moving average. Otherwise, the first 13 rows of the moving-average column will be NaN, since a full window of 14 values is required before a mean is computed. Next, choosing what data to display. The following data will be shown in the demo.
  • Station id: SN69100 (this is the Værnes - Trondheim Airport weather station, by the way)
  • Year 2020
The station ids can be looked up in this user's GitHub gist: https://gist.github.com/ofaltins/c1f0158f1766c8bd695b2c8d545c052c The following code sets up the filtered data and saves it into a variable.

NorwayMeteoDemo1.ipynb


filtered_df_2020 = df[
    (df['sourceId'] == 'SN69100')  &
    (df['year'] == 2020)
]

Next up, the data is displayed. The demo shows two line plots: first the daily maximum air temperature as a line plot using Seaborn, and in addition another line plot with the moving 14-day average of the daily maximum air temperature. A bar plot is also added below these two line plots. Note that the bar plot used here is bar and not barplot; barplot is available in Seaborn, while bar is available in Matplotlib.

NorwayMeteoDemo1.ipynb



sns.set_style('whitegrid') # set the style to 'whitegrid'

# Create a 2x1 subplot
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10), sharex=True)

# Daily max air temperature (column names can be passed directly when data= is given)
sns.lineplot(data = filtered_df_2020, x = 'date', y = 'max(air_temperature P1D)', label = 'Daily max air temperature (C)', linewidth = 1.5, color = 'pink', ax = ax1)

# 14-day moving average of the daily max air temperature
sns.lineplot(data = filtered_df_2020, x = 'date', y = 'moving_max_air_temp_avg', label = '14-day moving average daily max air temperature (C)', color = 'red', linewidth = 1.8, ax = ax1)

# Matplotlib bar plot for the daily precipitation, since Seaborn's barplot does not handle dates on the x-axis
ax2.bar(filtered_df_2020['date'], filtered_df_2020['sum(precipitation_amount P1D)'], color = 'blue', label = 'Daily sum precipitation (mm)')

ax1.set_title('Værnes - Weather data - 2020')

ax1.set_xlabel('Date of year')
ax1.set_ylabel('Daily max air temperature (C)')

ax2.set_xlabel('Date of year')
ax2.set_ylabel('Daily sum precipitation (mm)')



The code above produces a figure with a 2x1 subplot layout. The upper plot shows the daily maximum air temperature for 2020 combined with a moving 14-day average, acting as a smoothing or trending function that shows the general temperature shifts through the year. The lower plot shows a bar plot using Matplotlib, since Seaborn's bar plot does not handle dates on the x-axis (in our dataset, datetime64 is used). Seaborn and Matplotlib offer a ton of plotting functionality. Maybe it could also be of interest for .NET developers to use them more often? My previous article showed how it is possible to render images in the backend using Matplotlib and then display them in a Blazor server-side app, combining both Python and .NET. That demo used the Python.NET library for the Python interop with .NET. The screenshot shows the Jupyter Notebook IDE, part of the Anaconda distribution of Python tailored for data analysis and data science, displaying the plots described in the Python script above.