Sunday, 1 June 2025

Using SqlBulkCopy with EntityFramework

This article tests out variying methods of ways of doing bulk inserts in EntityFramework. The code shown in this article uses .NET Framework 4.8
and Entity Framework 6.5.0. Support is better in Entity Framework Core, so the code shown here for SqlBulkCopy must be adjusted a little bit to also work with EntityFramework Core. A Github repo has been added here:

https://github.com/toreaurstadboss/BulkOperationsEntityFramework

A benchmark has been run to test out the different approaches. Not suprisingly, SqlBulkCopy is the fastest and most economic way of performing the bulk inserts (uses the least memory). The code for handling SqlBulkCopy is provided via extension methods listed below:

BulkInsertExtensions.cs




using System.Collections.Generic;
using System.Data;
using System.Data.Entity;
using System.Data.SqlClient;
using System.Threading.Tasks;

namespace BulkOperationsEntityFramework.Lib.Extensions
{

    /// <summary>
    /// Provides extension methods for performing bulk insert operations using SqlBulkCopy.
    /// </summary>
    public static class BulkInsertExtensions
    {
        /// <summary>
        /// Performs a bulk insert of the specified entities into the database.
        /// </summary>
        /// <typeparam name="T">The type of the entity.</typeparam>
        /// <param name="context">The DbContext instance.</param>
        /// <param name="entities">The collection of entities to insert.</param>
        /// <param name="tableName">
        /// Optional: The name of the destination table. If null, the entity type name is used.
        /// </param>
        /// <param name="columnMappings">
        /// Optional: Custom column mappings (propertyName → columnName).
        /// 
        /// <example>
        /// Example 1: Rename columns
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "Name", "ProductName" },
        ///     { "Price", "UnitPrice" }
        /// };
        /// await context.BulkInsertAsync(products, "Products", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 2: Ignore properties using attributes
        /// <code>
        /// public class Customer
        /// {
        ///     public int Id { get; set; }
        ///     public string FullName { get; set; }
        /// 
        ///     [BulkIgnore]
        ///     public string TempNotes { get; set; }
        /// 
        ///     [NotMapped]
        ///     public string CalculatedField { get; set; }
        /// }
        /// 
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FullName", "CustomerName" }
        /// };
        /// await context.BulkInsertAsync(customers, "Customers", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 3: Snake_case mapping
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FirstName", "first_name" },
        ///     { "LastName", "last_name" },
        ///     { "Email", "email_address" }
        /// };
        /// await context.BulkInsertAsync(users, "user_accounts", mappings);
        /// </code>
        /// </example>
        /// </param>
        /// <param name="bulkCopyTimout">The bulk copy timeout in seconds. Default is set to 30.</param>
        public static void BulkInsert<T>(this DbContext context, IEnumerable<T> entities, string tableName = null,
            Dictionary<string, string> columnMappings = null, int bulkCopyTimeout = 30)
            where T : class
        {
            var dataTable = entities.ToBulkDataTable(columnMappings, out var finalMappings);
            BulkCopy(context, dataTable, tableName ?? typeof(T).Name, finalMappings, bulkCopyTimeout);
        }

        /// <summary>
        /// Asynchronously performs a bulk insert of the specified entities into the database.
        /// </summary>
        /// <typeparam name="T">The type of the entity.</typeparam>
        /// <param name="context">The DbContext instance.</param>
        /// <param name="entities">The collection of entities to insert.</param>
        /// <param name="tableName">
        /// Optional: The name of the destination table. If null, the entity type name is used.
        /// </param>
        /// <param name="columnMappings">
        /// Optional: Custom column mappings (propertyName → columnName).
        /// 
        /// <example>
        /// Example 1: Rename columns
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "Name", "ProductName" },
        ///     { "Price", "UnitPrice" }
        /// };
        /// await context.BulkInsertAsync(products, "Products", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 2: Ignore properties using attributes
        /// <code>
        /// public class Customer
        /// {
        ///     public int Id { get; set; }
        ///     public string FullName { get; set; }
        /// 
        ///     [BulkIgnore]
        ///     public string TempNotes { get; set; }
        /// 
        ///     [NotMapped]
        ///     public string CalculatedField { get; set; }
        /// }
        /// 
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FullName", "CustomerName" }
        /// };
        /// await context.BulkInsertAsync(customers, "Customers", mappings);
        /// </code>
        /// </example>
        /// 
        /// <example>
        /// Example 3: Snake_case mapping
        /// <code>
        /// var mappings = new Dictionary<string, string>
        /// {
        ///     { "FirstName", "first_name" },
        ///     { "LastName", "last_name" },
        ///     { "Email", "email_address" }
        /// };
        /// await context.BulkInsertAsync(users, "user_accounts", mappings);
        /// </code>
        /// </example>
        /// </param>
        /// <param name="bulkCopyTimout">The bulk copy timeout in seconds. Default is set to 30.</param>
        public static async Task BulkInsertAsync<T>(this DbContext context, IEnumerable<T> entities, string tableName = null,
            Dictionary<string, string> columnMappings = null, int bulkCopyTimout = 30)
            where T : class
        {
            var dataTable = entities.ToBulkDataTable(columnMappings, out var finalMappings);
            await BulkCopyAsync(context, dataTable, tableName ?? typeof(T).Name, finalMappings, 30);
        }

        private static void BulkCopy(DbContext context, DataTable table, string tableName,
            Dictionary<string, string> finalMappings, int bulkCopyTimeout = 30)
        {
            var connection = (SqlConnection)context.Database.Connection;
            var wasClosed = connection.State == ConnectionState.Closed;

            if (wasClosed)
                connection.Open();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = tableName;
                bulkCopy.BulkCopyTimeout = bulkCopyTimeout;

                foreach (var map in finalMappings)
                {
                    bulkCopy.ColumnMappings.Add(map.Key, map.Value);
                }
                bulkCopy.WriteToServer(table);
            }

            if (wasClosed)
                connection.Close(); // Ensure the connection is closed after the operation, if it was closed before
        }

        private static async Task BulkCopyAsync(DbContext context, DataTable table, string tableName, Dictionary<string, string> mappings,
            int bulkCopyTimeout = 30)
        {
            var connection = (SqlConnection)context.Database.Connection;
            var wasClosed = connection.State == ConnectionState.Closed;

            if (wasClosed)
                await connection.OpenAsync();

            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = tableName;
                bulkCopy.BulkCopyTimeout = bulkCopyTimeout;
                foreach (var map in mappings)
                {
                    bulkCopy.ColumnMappings.Add(map.Key, map.Value);
                }
                await bulkCopy.WriteToServerAsync(table);
            }

            if (wasClosed)
                connection.Close();  //Ensure the connection is closed after the operation, if it was closed before
        }
    }
}





As the code shows, we use the DbContext from EntityFramework and get the connection string from the Database.Connection.ConnectionString property. We also use the Database.Connection property and cast it to SqlConnection to get the connection object. As we see, SqlBulkCopy is part of ADO.Net and relies upon column mappings. The code above will ignore the properties marked by the BulkIgnore attribute and the NotMapped attribute

BulkIgnoreAttribute.cs



using System;

namespace BulkOperationsEntityFramework.Lib.Attributes
{

    /// 
    /// Indicates that a property should be ignored during bulk insert operations.
    /// 
    [AttributeUsage(AttributeTargets.Property)]
    public class BulkIgnoreAttribute : Attribute
    {
    }

}


The following helper class creates a DataTable for a collection of entities (an IEnumerable) and uses reflection to make such a mapping. Note that column mappings can also be provided in the BulkInsertExtensions methods shown above to override property to column mapping, if desired. Default sql bulk copy timeout is set to 30 seconds (it is also the default timeout)

DataTableExtensions.cs


The following code provides code for creating data table for a collection of objects (IEnumerable). The BulkInsertExtensions uses the specific method ToBulkDataTable. This data table skips properties that are marked with [NotMapped] or [BulkIgnore] attributes. Both ToDataTable methods shown here allows column mappings to customize the property to column mappings.

using BulkOperationsEntityFramework.Lib.Attributes;
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

namespace BulkOperationsEntityFramework.Lib.Extensions
{

    public static class DatatableExtensions
    {

        /// <summary>
        /// Converts an IEnumerable of type T to a DataTable.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="data"></param>
        /// <param name="columnMappings"></param>
        /// <returns></returns>
        public static DataTable ToDataTable<T>(this IEnumerable<T> data, Dictionary<string, string> columnMappings = null)
        {
            var dataTable = new DataTable();
            var properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

            foreach (var prop in properties)
            {
                var columnName = columnMappings != null && columnMappings.ContainsKey(prop.Name) ? columnMappings[prop.Name] : prop.Name;
                dataTable.Columns.Add(columnName, Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);
            }

            foreach (var item in data)
            {
                var values = properties.Select(p => p.GetValue(item) ?? DBNull.Value).ToArray();
                dataTable.Rows.Add(values);
            }

            return dataTable;
        }

        /// <summary>
        /// Converts an IEnumerable of type T to a DataTable with specified column mappings. Tailored for Bulk operations.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="entities"></param>
        /// <param name="columnMappings"></param>
        /// <param name="finalMappings"></param>
        /// <returns></returns>
        public static DataTable ToBulkDataTable<T>(this IEnumerable<T> entities, Dictionary<string, string> columnMappings, out Dictionary<string, string> finalMappings)
        {
            var dataTable = new DataTable();
            var properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
                .Where(p =>
                    !Attribute.IsDefined(p, typeof(System.ComponentModel.DataAnnotations.Schema.NotMappedAttribute)) &&
                    !Attribute.IsDefined(p, typeof(BulkIgnoreAttribute)))
                .ToArray();

            finalMappings = new Dictionary<string, string>();

            foreach (var prop in properties)
            {
                var columnName = columnMappings != null && columnMappings.ContainsKey(prop.Name)
                    ? columnMappings[prop.Name]
                    : prop.Name;

                dataTable.Columns.Add(columnName, Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);
                finalMappings[prop.Name] = columnName;
            }

            foreach (var entity in entities)
            {
                var values = properties.Select(p => p.GetValue(entity) ?? DBNull.Value).ToArray();
                dataTable.Rows.Add(values);
            }

            return dataTable;
        }

    }
}




BulkInsertBenchmark

The following benchmark is available in the solution for benchmarking the different approaches to bulk copy with EntityFramework.
  • EF - add one and save in a loop. Not suprisingly, performs the worst due to the many roundtrips to the database
  • EF - add one by one and save at the end. Better performance, since we have one roundtrip. Will handle poor cases where we try to add many items in a batch.
  • EF - addrange and save at the end. Similar to the one above, one roundtrip to database
  • Dapper - add as batch and save. Minor better speed than the previous two. One roundtrip to database.
  • SqlBulkCopy - clearly the fastest way to insert a batch of entities to the database



using BenchmarkDotNet.Attributes;
using Bogus;
using BulkOperationsEntityFramework.Lib.Extensions;
using BulkOperationsEntityFramework.Models;
using Dapper;
using System.Configuration;
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;

namespace BulkOperationsEntityFramework.Benchmarks
{

    [MemoryDiagnoser]
    public class BulkInsertBenchmark
    {

        private static readonly Faker Faker = new Faker();

        [Params(100)]
        public int Size { get; set; }

        //First benchmark - Naive approach - add one and one entity and save changes everytime with round trip to database

        [Benchmark(Description = "EFAddOneAndSave - Add one user and then save. Then add another one. Results in many round-trips to database.")]
        public async Task EFAddOneAndSave()
        {
            using (var context = new ApplicationDbContext())
            {
                foreach (var user in GetUsers())
                {
                    context.Users.Add(user);
                    await context.SaveChangesAsync();
                }
            }
        }

        [Benchmark(Description = "EFAddOneByOneAndSave - Add one by one user, but save once after adding them. Results in one round-trip to database")]
        public async Task EFAddOneByOneAndSave()
        {
            using (var context = new ApplicationDbContext())
            {
                foreach (var user in GetUsers())
                {
                    context.Users.Add(user);

                }
                await context.SaveChangesAsync();
            }
        }

        [Benchmark(Description = "EFAddRange - Adds the users, then does the save. Results in one round-trip to database")]
        public async Task EFAddRange()
        {
            using (var context = new ApplicationDbContext())
            {
                var users = GetUsers();
                context.Users.AddRange(users);
                await context.SaveChangesAsync();
            }
        }

        [Benchmark(Description = "DapperInsertRange - Uses Dapper to insert the users. Results in one round-trip to database")]
        public async Task DapperInsertRange()
        {
            var connectionString = ConfigurationManager.ConnectionStrings["App"].ConnectionString;

            string sql = @"
                INSERT INTO Users (Email, FirstName, LastName, PhoneNumber)
                VALUES (@Email, @FirstName, @LastName, @PhoneNumber)
            ".Trim();

            using (var connection = new SqlConnection(connectionString))
            {
                var users = GetUsers().Select(u => new 
                {
                    u.Email,
                    u.FirstName,
                    u.LastName,
                    u.PhoneNumber
                }).ToArray();

                await connection.ExecuteAsync(sql, users);
            }
        }

        [Benchmark(Description = "SqlBulkCopy - Uses SqlBulkCopy to insert the users. Results in one round-trip to database")]
        public async Task SqlBulkCopy()
        {
            using (var context = new ApplicationDbContext())
            {
                await context.BulkInsertAsync(GetUsers(), "Users");
            }
        }



        private User[] GetUsers() =>
            Enumerable.Range(1, Size).Select(i => new User
            {
                Email = Faker.Internet.Email(),
                FirstName = Faker.Name.FirstName(),
                LastName = Faker.Name.LastName(),
                PhoneNumber = Faker.Phone.PhoneNumber()
            }).ToArray();

    }

}


The following screenshots shows the benchmark results after running Benchmark.NET for the mentioned approaches and comparing the performance.

Results

Benchmark for a given batch size of 100

Benchmark for a given batch size of 100

Sunday, 18 May 2025

Displaying weather data using Seaborn with Python

Using the library Seaborn, built on top of MatPlotLib, displaying weather data is convenient. Using Anaconda and Jupyter Notebook, working on the data is user friendly. First off, let's look at a data set containing some Norwegian weather data. The following data set, contains weather data from Norway from 2020-2021 for 55 meterological stations. (13,61 MB in size), available on DbCL v1.0 license (meaning, free of use , 'as-is' warranty) on the Kaggle.com website.

https://www.kaggle.com/datasets/annbengardt/noway-meteorological-data/data

First off, the following imports are done in the Jupyter Notebook. This is a free IDE part of the Anaconda distribution that is prepared for data visualization, that runs in a browser.

NorwayMeteoDemo1.pynb


import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

Next off, using Pandas, Python's Data Analysis Library, the data is prepared from the mentioned dataset. The format is in CSV format. Also, columns are added dynamically to the dataset. A moving 14-days average of the daily maxmimum air temperature is added using the rolling method and setting window to 14. Also, a date column is added. Our dataset contains three int64 values day, month and year. We combine these to create a date column.

NorwayMeteoDemo1.pynb


df = pd.read_csv("datasets/weather/NorwayMeteoDataCompleted.csv")
df['moving_max_air_temp_avg'] = df['max(air_temperature P1D)'].rolling(window = 14, min_periods = 1).mean()
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

Note the use of min_periods set to 1 for the moving average. Or else, you will get NaN in the start of the data of your created moving average column and the way Python works, it will cause NaN for all the next periods too ! Next, choosing what data to display. The following data will be shown in the demo.
  • Station id : SN69100 (This is Værnes - Trondheim Airport weather station by the way,
  • Year 2020
The station ids can be looked up here on this user's Github GIST: https://gist.github.com/ofaltins/c1f0158f1766c8bd695b2c8d545c052c The following code set up the filtered data and saves it into a variable

NorwayMeteoDemo1.pynb


filtered_df_2020= df[ 
    (df['sourceId'] == 'SN69100')  &
    (df['year'] == 2020)
]

Next up, the data is then displayed. The demo will show two lineplots, first the maximum air temperature as a lineplot using Seaborn. In addition, another line plot with the moving 14 days average of the daily maximum air temperature. Also, a bar plot is added below these two line plots. Note that the bar plot used here is bar and not barplot. Barplot is available in Seaborn, while Bar is available in the MatPlotLib.

NorwayMeteoDemo1.pynb



sns.set_style('whitegrid') # set the style to 'whitegrid' 

# Create a 2x1 subplot

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10), sharex=True)

sns.lineplot(data = filtered_df_2020, x = filtered_df_2020['date'], y = filtered_df_2020['max(air_temperature P1D)'], label='Daily max air temperature (C)', linewidth = 1.5, color = 'pink', ax = ax1)

sns.lineplot(data = filtered_df_2020, x = filtered_df_2020['date'], y = filtered_df_2020['moving_max_air_temp_avg'], label='14-day moving average Daily max air temperature (C)', color = 'red', linewidth = 1.8, ax = ax1)

ax2.bar(filtered_df_2020['date'], filtered_df_2020['sum(precipitation_amount P1D)'], data = filtered_df_2020, color = 'blue', label = 'Daily sum precipitation (mm)')

ax1.set_title('Værnes - Weather data - 2020')

ax1.set_xlabel('Date of year')
ax1.set_ylabel('Daily max air temperature (C)')

ax2.set_xlabel('Date of year')
ax2.set_ylabel('Daily sum precipitation (mm)')



The code above shows the resulting figure consisting of a 2x1 subplots layout, the upper plot shows the 2020 daily maximum air temperature combined with a moving 14-days average as a smoothing or trending function to show the general temperature shifts every second week of the year in average. The lower plot shows a bar plot, using MatPlotLib, since Seaborn's bar plot does not handle dates as x-axis (in our dataset, datetime64 is used). Seaborn and MatPlotLib offers a ton of plotting functionality. Maybe it also could be of interest for .NET Developers to use it more often? My previous article showed how it is possible to render in the backend images using MatPlotLib and then display in a Blazor serverside app, combining both
Python and .NET. That demo used Python.Net library for the Python interop with .NET. The screenshot shows the Jupyter Notebook IDE, part of Anaconda distribution of Python tailored for data analysis and data science. It displays the plots described from the Python script above.

Monday, 5 May 2025

Using MatPlotLib from .NET

MatPlotLib is a powerful library for data visualization. It provides graphing for scientific computing. It can be used for doing both mathematical calculations and statistics. Together with additional libraries like NumPy or Numerical Python, it is clear that Python as a programming language and ecosystem provides a lot of powerful functionality that is also free to use. MatplotLib has a BSD license, which means it can be ued for personal, academic or commercial purposes without restrictions. This article will look at using MatplotLib from .NET. First off an image that displays the demo and example of using MatplotLib.

The source code shown in this article is available on Github here:

https://github.com/toreaurstadboss/SeabornBlazorVisualizer



Using MatPlotLib from .NET

First off, install Anaconda. Anaconda is a Python distribution that contains a large collection of data visualization libraries. A compatible version with the lastest version of Python.net Nuget library. The demo displayed here uses Anaconda version 2023.03.

Anaconda archived versions 2023.03 can be installed from here. Windows users can download the file: https://repo.anaconda.com/archive/Anaconda3-2023.03-1-Windows-x86_64.exe

https://repo.anaconda.com/archive/

Next up, install also Python 3.10 version. It will be used together with Anaconda. A 64-bit installer can be found here:

Python 3.10 installer (Windows 64-bits) The correct versions of NumPy and MatPlotLib can be checked against this list :

https://github.com/toreaurstadboss/SeabornBlazorVisualizer/blob/main/SeabornBlazorVisualizer/conda_list_loading_matplotlib_working_1st_May_2025.txt

Calculating the determinite integral of a function

The demo in this article show in the link at the top has got an appsettings.json file, you can adjust to your environment.

appsettings.json Application configuration file




{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "PythonConfig": {
    "PythonDllPath": "C:\\Python310\\Python310.dll",
    "PythonHome": "C:\\Programdata\\anaconda3",
    "PythonSitePackages":  "C:\\Programdata\\anaconda3\\lib\\site-packages",
    "PythonVersion": "3.10"
  },
  "AllowedHosts": "*"
}


Clone the source code and run the application. It is a Blazor server app. You can run it from VS 2022 for example. The following code shows how Python.net is set up to start using Python. Both Python 3.10 and Anaconda site libs are used here. The Python runtime and engine is set up using this helper class.

PythonInitializer.cs



using Microsoft.Extensions.Options;
using Python.Runtime;

namespace SeabornBlazorVisualizer.Data
{

    /// <summary>
    /// Helper class to initialize the Python runtime
    /// </summary>
    public static class PythonInitializer
    {

        private static bool runtime_initialized = false;

        /// <summary>
        /// Perform one-time initialization of Python runtime
        /// </summary>
        /// <param name="pythonConfig"></param>
        public static void InitializePythonRuntime(IOptions<PythonConfig> pythonConfig)
        {
            if (runtime_initialized)
                return;
            var config = pythonConfig.Value;

            // Set environment variables
            Environment.SetEnvironmentVariable("PYTHONHOME", config.PythonHome, EnvironmentVariableTarget.Process);
            Environment.SetEnvironmentVariable("PYTHONPATH", config.PythonSitePackages, EnvironmentVariableTarget.Process);
            Environment.SetEnvironmentVariable("PYTHONNET_PYDLL", config.PythonDllPath);
            Environment.SetEnvironmentVariable("PYTHONNET_PYVER", config.PythonVersion);

            PythonEngine.Initialize();

            PythonEngine.PythonHome = config.PythonHome ?? Environment.GetEnvironmentVariable("PYTHONHOME", EnvironmentVariableTarget.Process)!;
            PythonEngine.PythonPath = config.PythonDllPath ?? Environment.GetEnvironmentVariable("PYTHONNET_PYDLL", EnvironmentVariableTarget.Process)!;

            PythonEngine.BeginAllowThreads();
            AddSitePackagesToPythonPath(pythonConfig);
            runtime_initialized = true;
        }

        private static void AddSitePackagesToPythonPath(IOptions<PythonConfig> pythonConfig)
        {
            if (!runtime_initialized)
            {
                using (Py.GIL())
                {
                    dynamic sys = Py.Import("sys");
                    sys.path.append(pythonConfig.Value.PythonSitePackages);
                    Console.WriteLine(sys.path);

                    //add folders in solution this too with scripts
                    sys.path.append(@"Data/");
                }
            }
        }

    }
}



The following helper class sets up the site libraries we will use.

PythonHelper.cs



using Python.Runtime;

namespace SeabornBlazorVisualizer.Data
{

    /// <summary>
    /// Helper class to initialize the Python runtime
    /// </summary>
    public static class PythonHelper
    {

        /// <summary>
        /// Imports Python modules. Returned are the following modules:
        /// <para>np (numpy)</para>
        /// <para>os (OS module - standard library)</para>
        /// <para>scipy (scipy)</para>
        /// <para>mpl (matplotlib)</para>
        /// <para>plt (matplotlib.pyplot </para>
        /// </summary>
        /// <returns>Tuple of Python modules</returns>
        public static (dynamic np, dynamic os, dynamic scipy, dynamic mpl, dynamic plt) ImportPythonModules()
        {

            dynamic np = Py.Import("numpy");
            dynamic os = Py.Import("os");
            dynamic mpl = Py.Import("matplotlib");
            dynamic plt = Py.Import("matplotlib.pyplot");
            dynamic scipy = Py.Import("scipy");

            mpl.use("Agg");

            return (np, os, scipy, mpl, plt);
        }

    }
}



The demo is a Blazor server app. The following service will generate the plot of a determinite integral using MatPlotLib. The service saves the plot into a PNG file. This PNG file is saved into the folder wwwroot. The Blazor server app displays the image that was generated and saved.

MatPlotImageService.cs



using Microsoft.Extensions.Options;
using Python.Runtime;

namespace SeabornBlazorVisualizer.Data
{
    public class MatplotPlotImageService
    {

        private IOptions<PythonConfig>? _pythonConfig;

        private static readonly object _lock = new object();

        public MatplotPlotImageService(IOptions<PythonConfig> pythonConfig)
        {
            _pythonConfig = pythonConfig;
            PythonInitializer.InitializePythonRuntime(_pythonConfig);
        }

        public Task<string> GenerateDefiniteIntegral(string functionExpression, int lowerBound, int upperBound)
        {

            string? result = null;

            using (Py.GIL()) // Ensure thread safety for Python calls
            {
                dynamic np = Py.Import("numpy");
                dynamic plt = Py.Import("matplotlib.pyplot");

                dynamic patches = Py.Import("matplotlib.patches"); // Import patches module

                // Create a Python execution scope
                using (var scope = Py.CreateScope())
                {
                    // Define the function inside the scope
                    scope.Exec($@"
import numpy as np
def func(x):
    return {functionExpression}
");

                    // Retrieve function reference from scope
                    dynamic func = scope.Get("func");

                    // Define integration limits
                    double a = lowerBound, b = upperBound;

                    // Generate x-values
                    dynamic x = np.linspace(0, 10, 100); //generate evenly spaced values in range [0, 20], 100 values (per 0.1)
                    dynamic y = func.Invoke(x);

                    // Create plot figure
                    var fig = plt.figure();
                    var ax = fig.add_subplot(111);

                    // set title to function expression
                    plt.title(functionExpression);

                    ax.plot(x, y, "r", linewidth: 2);
                    ax.set_ylim(0, null);

                    // Select range for integral shading
                    dynamic ix = np.linspace(a, b, 100);
                    dynamic iy = func.Invoke(ix);

                    // **Fix: Separate x and y coordinates properly**
                    List<double> xCoords = new List<double> { a }; // Start at (a, 0)
                    List<double> yCoords = new List<double> { 0 };

                    int length = (int)np.size(ix);
                    for (int i = 0; i < length; i++)
                    {
                        xCoords.Add((double)ix[i]);
                        yCoords.Add((double)iy[i]);
                    }

                    xCoords.Add(b); // End at (b, 0)
                    yCoords.Add(0);

                    // Convert x and y lists to NumPy arrays
                    dynamic npVerts = np.column_stack(new object[] { np.array(xCoords), np.array(yCoords) });

                    // **Correctly Instantiate Polygon Using NumPy Array**
                    dynamic poly = patches.Polygon(npVerts, facecolor: "0.6", edgecolor: "0.2");
                    ax.add_patch(poly);

                    // Compute integral area
                    double area = np.trapezoid(iy, ix);
                    ax.text(0.5 * (a + b), 30, "$\\int_a^b f(x)\\mathrm{d}x$", ha: "center", fontsize: 20);
                    ax.text(0.5 * (a + b), 10, $"Area = {area:F2}", ha: "center", fontsize: 12);

                    plt.show();


                    result = SavePlot(plt, dpi: 150);
                }
            }
            return Task.FromResult(result);
        }

        public Task<string> GenerateHistogram(List<double> values, string title = "Provide Plot title", string xlabel = "Provide xlabel title", string ylabel = "Provide ylabel title")
        {
            string? result = null;
            using (Py.GIL()) //Python Global Interpreter Lock (GIL)
            {
                var (np, os, scipy, mpl, plt) = PythonHelper.ImportPythonModules();

                var distribution = np.array(values.ToArray());

                //// Ensure clearing the plot
                //plt.clf();

                var fig = plt.figure(); //create a new figure
                var ax1 = fig.add_subplot(1, 2, 1);
                var ax2 = fig.add_subplot(1, 2, 2);

                // Add style
                plt.style.use("ggplot");

                var counts_bins_patches = ax1.hist(distribution, edgecolor: "black");

                // Normalize counts to get colors 
                var counts = counts_bins_patches[0];
                var patches = counts_bins_patches[2];

                var norm_counts = counts / np.max(counts);

                int norm_counts_size = Convert.ToInt32(norm_counts.size.ToString());

                // Apply colors to patches based on frequency
                for (int i = 0; i < norm_counts_size; i++)
                {
                    plt.setp(patches[i], "facecolor", plt.cm.viridis(norm_counts[i])); //plt.cm is the colormap module in MatPlotlib. viridis creates color maps from normalized value 0 to 1 that is optimized for color-blind people.
                }

                // **** AX1 Histogram first - frequency counts ***** 

                ax1.set_title(title);
                ax1.set_xlabel(xlabel);
                ax1.set_ylabel(ylabel);

                string cwd = os.getcwd();

                // Calculate average and standard deviation
                var average = np.mean(distribution);
                var std_dev = np.std(distribution);
                var total_count = np.size(distribution);

                // Format average and standard deviation to two decimal places
                var average_formatted = np.round(average, 2);
                var std_dev_formatted = np.round(std_dev, 2);

                //Add legend with average and standard deviation
                ax1.legend(new string[] { $"Total count: {total_count}\n Average: {average_formatted} cm\nStd Dev: {std_dev_formatted} cm" }, framealpha: 0.5, fancybox: true);



                //***** AX2 : Set up ax2 = Percentage histogram next *******

                ax2.set_title("Percentage distribution");
                ax2.set_xlabel(xlabel);
                ax2.set_ylabel(ylabel);
                // Fix for CS1977: Cast the lambda expression to a delegate type
                ax2.yaxis.set_major_formatter((PyObject)plt.FuncFormatter(new Func<double, int, string>((y, _) => $"{y:P0}")));

                ax2.hist(distribution, edgecolor: "black", weights: np.ones(distribution.size) / distribution.size);

                // Format y-axis to show percentages
                ax2.yaxis.set_major_formatter(plt.FuncFormatter(new Func<double, int, string>((y, _) => $"{y:P0}")));

                // tight layout to prevent overlap 
                plt.tight_layout();

                // Show the plot with the two subplots at last (render to back buffer 'Agg', see method SavePlot for details)
                plt.show();

                result = SavePlot(plt, theme: "bmh", dpi: 150);
            }

            return Task.FromResult(result);
        }

        public Task<string> GeneratedCumulativeGraphFromValues(List<double> values)
        {
            string? result = null;
            using (Py.GIL()) //Python Global Interpreter Lock (GIL)
            {
                var (np, os, scipy, mpl, plt) = PythonHelper.ImportPythonModules();

                dynamic pythonValues = np.cumsum(np.array(values.ToArray()));

                // Ensure clearing the plot
                plt.clf();

                // Create a figure with increased size
                dynamic fig = plt.figure(figsize: new PyTuple(new PyObject[] { new PyFloat(6), new PyFloat(4) }));

                // Plot data
                plt.plot(values, color: "green");

                string cwd = os.getcwd();

                result = SavePlot(plt, theme: "ggplot", dpi: 200);

            }

            return Task.FromResult(result);
        }

        public Task<string> GenerateRandomizedCumulativeGraph()
        {
            string? result = null;
            using (Py.GIL()) //Python Global Interpreter Lock (GIL)
            {

                dynamic np = Py.Import("numpy");

                //TODO : Remove imports of pandas and scipy and datetime if they are not needed

                Py.Import("pandas");
                Py.Import("scipy");
                Py.Import("datetime");
                dynamic os = Py.Import("os");

                dynamic mpl = Py.Import("matplotlib");
                dynamic plt = Py.Import("matplotlib.pyplot");

                // Set dark theme
                plt.style.use("ggplot");

                mpl.use("Agg");


                // Generate data
                //dynamic x = np.arange(0, 10, 0.1);
                //dynamic y = np.multiply(2, x); // Use NumPy's multiply function

                dynamic values = np.cumsum(np.random.randn(1000, 1));


                // Ensure clearing the plot
                plt.clf();

                // Create a figure with increased size
                dynamic fig = plt.figure(figsize: new PyTuple(new PyObject[] { new PyFloat(6), new PyFloat(4) }));

                // Plot data
                plt.plot(values, color: "blue");

                string cwd = os.getcwd();

                result = SavePlot(plt, theme: "ggplot", dpi: 200);

            }

            return Task.FromResult(result);
        }

        /// <summary>
        /// Saves the plot to a PNG file with a unique name based on the current date and time
        /// </summary>
        /// <param name="plot">Plot, must be a PyPlot plot use Python.net Py.Import("matplotlib.pyplot")</param>
        /// <param name="theme"></param>
        /// <param name="dpi"></param>
        /// <returns></returns>
        public string? SavePlot(dynamic plt, string theme = "ggplot", int dpi = 200)
        {
            string? plotSavedImagePath = null;
            //using (Py.GIL()) //Python Global Interpreter Lock (GIL)
            //{
            dynamic os = Py.Import("os");
            dynamic mpl = Py.Import("matplotlib");
            // Set dark theme
            plt.style.use(theme);
            mpl.use("Agg"); //set up rendering of plot to back-buffer ('headless' mode)

            string cwd = os.getcwd();
            // Save plot to PNG file
            string imageToCreatePath = $@"GeneratedImages\{DateTime.Now.ToString("yyyyMMddHHmmss")}{Guid.NewGuid().ToString("N")}_plotimg.png";
            string imageToCreateWithFolderPath = $@"{cwd}\wwwroot\{imageToCreatePath}";
            plt.savefig(imageToCreateWithFolderPath, dpi: dpi); //save the plot to a file (use full path)
            plotSavedImagePath = imageToCreatePath;

            CleanupOldGeneratedImages(cwd);
            //}
            return plotSavedImagePath;
        }

        private static void CleanupOldGeneratedImages(string cwd)
        {
            lock (_lock)
            {

                Directory.GetFiles(cwd + @"\wwwroot\GeneratedImages", "*.png")
                 .OrderByDescending(File.GetLastWriteTime)
                 .Skip(10)
                 .ToList()
                 .ForEach(File.Delete);
            }
        }

}



The code above shows some additional examples of using MatPlotLib.
  • Histogram example
  • Line graph using cumulative sum by making use of NumPy or a helper method in .NET
These examples demonstrates also that MatPlotLib can be used for statistics, which today for .NET is mostly crunched with the help of Excel or EP Plus library for example. Since Python is considered as the home of data visualization with its vast ecosystem of data science libraries, this article and demos shows how you can get started with using this ecosystem from .NET. Note, using Python.net to create these plots in MatPlotLib is best prepared using Jupyter Notebook. When the plot displayed looks okay, it is time to integrate that Python script into .NET and C# using Python.Net library. Make note that there will be some challenges to get the Python code to work in C# of course. When passing in values to a function, sometimes you must use
for example NumPy to create compatible data types. Also note the usage of the Pystatic class here from Python.net , which offers the GIL Global Interpreter lock and a way to import Python modules.

https://jupyter.org/ A screenshot showing histogram in the demo is shown below. As we can see, MatPlotLib can be used from many different data visualizations and domains.