Tuesday, 15 April 2025

Adding plugins to Microsoft Semantic Kernel

With Microsoft Semantic Kernel, it is possible to consume AI services such as Azure AI and OpenAI with less code, as the framework simplifies and standardizes the consumption of these services. A GitHub repo with the code shown in this article is provided here:

https://github.com/toreaurstadboss/SemanticKernelPluginDemov4

The demo code is a Blazor Server app. It demonstrates how to use Microsoft Semantic Kernel with plugins. I have decided to use the Northwind database as the extra data source the plugin will expose. Via debugging and inspecting the output, I can see that the plugin is successfully called and used. Adding plugins is easy, and they provide additional data to the AI model. This is suitable for AI-powered solutions where you want to expose private data to the model. For example, when using OpenAI's GPT-4, providing a plugin makes it possible to specify which data are to be presented and displayed. It is a convenient way to provide a natural-language interface for data reporting, such as listing results from this plugin. A plugin exposes kernel functions, declared using attributes on its methods. Let's first look at the Program.cs file that wires up the Semantic Kernel for the Blazor Server demo app.

Program.cs



using Microsoft.EntityFrameworkCore;
using Microsoft.SemanticKernel;
using SemanticKernelPluginDemov4.Models;
using SemanticKernelPluginDemov4.Services;


namespace SemanticKernelPluginDemov4
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);

            // Add services to the container.
            builder.Services.AddRazorPages();
            builder.Services.AddServerSideBlazor();

            // Add DbContext
            builder.Services.AddDbContextFactory<NorthwindContext>(options =>
                options.UseSqlServer(builder.Configuration.GetConnectionString("DefaultConnection")));

            builder.Services.AddScoped<IOpenAIChatcompletionService, OpenAIChatcompletionService>();

            builder.Services.AddScoped<NorthwindSemanticKernelPlugin>();

            builder.Services.AddScoped(sp =>
            {
                var kernelBuilder = Kernel.CreateBuilder();
                kernelBuilder.AddOpenAIChatCompletion(modelId: builder.Configuration.GetSection("OpenAI").GetValue<string>("ModelId")!,
                    apiKey: builder.Configuration.GetSection("OpenAI").GetValue<string>("ApiKey")!);

                var kernel = kernelBuilder.Build();

                var dbContextFactory = sp.GetRequiredService<IDbContextFactory<NorthwindContext>>();
                var northwindSemanticKernelPlugin = new NorthwindSemanticKernelPlugin(dbContextFactory);
                kernel.ImportPluginFromObject(northwindSemanticKernelPlugin);

                return kernel;
            });

            var app = builder.Build();

            // Configure the HTTP request pipeline.
            if (!app.Environment.IsDevelopment())
            {
                app.UseExceptionHandler("/Error");
                // The default HSTS value is 30 days. You may want to change this for production scenarios, see https://aka.ms/aspnetcore-hsts.
                app.UseHsts();
            }

            app.UseHttpsRedirection();

            app.UseStaticFiles();

            app.UseRouting();

            app.MapBlazorHub();
            app.MapFallbackToPage("/_Host");

            app.Run();
        }
    }
}



In the code above, note the following:
  • The usage of IDbContextFactory for creating DbContext instances, injected into the plugin. Since this is a Blazor Server app, with a durable SignalR connection between client and server, this interface is needed to create DbContext instances on demand.
  • The method ImportPluginFromObject imports the plugin into the Semantic Kernel built here. Note that both the kernel and the plugin are registered as scoped services (a small alternative for the plugin instantiation is sketched below).
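Since the plugin is already registered as a scoped service, the kernel factory could also resolve it from the service provider instead of instantiating it with new. A minimal sketch, assuming equivalent behavior:

builder.Services.AddScoped(sp =>
{
    var kernelBuilder = Kernel.CreateBuilder();
    kernelBuilder.AddOpenAIChatCompletion(
        modelId: builder.Configuration.GetSection("OpenAI").GetValue<string>("ModelId")!,
        apiKey: builder.Configuration.GetSection("OpenAI").GetValue<string>("ApiKey")!);

    var kernel = kernelBuilder.Build();

    // Resolve the plugin from DI so its scoped registration is actually used
    kernel.ImportPluginFromObject(sp.GetRequiredService<NorthwindSemanticKernelPlugin>());

    return kernel;
});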
The plugin looks like this.

NorthwindSemanticKernelPlugin.cs



using Microsoft.EntityFrameworkCore;
using Microsoft.SemanticKernel;
using SemanticKernelPluginDemov4.Models;
using System.ComponentModel;

namespace SemanticKernelPluginDemov4.Services
{

    public class NorthwindSemanticKernelPlugin
    {
        private readonly IDbContextFactory<NorthwindContext> _northwindContext;

        public NorthwindSemanticKernelPlugin(IDbContextFactory<NorthwindContext> northwindContext)
        {
            _northwindContext = northwindContext;
        }

        [KernelFunction]
        [Description("When asked about the suppliers of Nortwind database, use this method to get all the suppliers. Inform that the data comes from the Semantic Kernel plugin called : NortwindSemanticKernelPlugin")]
        public async Task<List<string>> GetSuppliers()
        {
            using (var dbContext = _northwindContext.CreateDbContext())
            {
                return await dbContext.Suppliers.OrderBy(s => s.CompanyName).Select(s => "Kernel method 'NorthwindSemanticKernelPlugin:GetSuppliers' gave this: " + s.CompanyName).ToListAsync();
            }
        }

        [KernelFunction]
        [Description("When asked about the total sales of a given month in a year, use this method. In case asked for multiple months, call this method again multiple times, adjusting the month and year as provided. The month and year is to be in the range 1-12 for months and for year 1996-1998. Suggest for the user what the valid ranges are in case other values are provided.")]
        public async Task<decimal> GetTotalSalesInMontAndYear(int month, int year)
        {
            using (var dbContext = _northwindContext.CreateDbContext())
            {
                var sumOfOrders = await (from Order o in dbContext.Orders
                                         join OrderDetail od in dbContext.OrderDetails on o.OrderId equals od.OrderId
                                         where o.OrderDate.HasValue && o.OrderDate.Value.Month == month
                                                                    && o.OrderDate.Value.Year == year
                                         select (od.UnitPrice * od.Quantity) * (1 - (decimal)od.Discount)).SumAsync();

                return sumOfOrders;
            }
        }

    }
}


In the code above, note the attributes used. The KernelFunction attribute tells Semantic Kernel that this is a method it can invoke. The Description attribute instructs the AI LLM model how to use the method, how to provide parameter values (if any), and when the method is to be called.
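Since the plugin is imported into the kernel, its functions can also be invoked directly, without going through the chat model, which is handy for testing. A minimal sketch, assuming the plugin name defaults to the class name (as it does when importing without an explicit name):

// Sketch: invoking the imported plugin functions directly on the kernel
var suppliers = await kernel.InvokeAsync<List<string>>(
    "NorthwindSemanticKernelPlugin", "GetSuppliers");

var salesMarch1997 = await kernel.InvokeAsync<decimal>(
    "NorthwindSemanticKernelPlugin", "GetTotalSalesInMonthAndYear",
    new KernelArguments { ["month"] = 3, ["year"] = 1997 });

Let's look at the OpenAI service next.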

OpenAIChatcompletionService.cs



using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

namespace SemanticKernelPluginDemov4.Services
{

    public class OpenAIChatcompletionService : IOpenAIChatcompletionService
    {
        private readonly Kernel _kernel;

        private IChatCompletionService _chatCompletionService;

        public OpenAIChatcompletionService(Kernel kernel)
        {
            _kernel = kernel;
            _chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
        }

        public async IAsyncEnumerable<string?> RunQuery(string question)
        {
            var chatHistory = new ChatHistory();

            chatHistory.AddSystemMessage("You are a helpful assistant, answering only on questions about Northwind database. In case you got other questions, inform that you only can provide questions about the Northwind database. It is important that only the provided Northwind database functions added to the language model through plugin is used when answering the questions. If no answer is available, inform this.");

            chatHistory.AddUserMessage(question);

            await foreach (var chatUpdate in _chatCompletionService.GetStreamingChatMessageContentsAsync(chatHistory, CreateOpenAIExecutionSettings(), _kernel))
            {
                yield return chatUpdate.Content;
            }
        }

        private OpenAIPromptExecutionSettings? CreateOpenAIExecutionSettings()
        {
            return new OpenAIPromptExecutionSettings
            {
                ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
            };
        }

    }
}
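
The IOpenAIChatcompletionService interface itself is not listed here; judging from the implementation above, it is essentially just this (a sketch, the repo's version may differ):

namespace SemanticKernelPluginDemov4.Services
{
    // Inferred from the implementation above
    public interface IOpenAIChatcompletionService
    {
        IAsyncEnumerable<string?> RunQuery(string question);
    }
}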



In OpenAIChatcompletionService above, the kernel is injected into the service. The kernel was registered in Program.cs as a scoped service, so it is injected here. The kernel's GetRequiredService method is similar to the method with the same name on IServiceProvider used inside Program.cs. Note the use of ToolCallBehavior set to AutoInvokeKernelFunctions; this lets the model invoke the imported plugin functions automatically when it decides they are needed. Extending the AI-powered functionality with plugins requires little extra code with Microsoft Semantic Kernel.
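For completeness, a Blazor component could consume the streaming answer roughly like this (a sketch; the component and field names are assumptions, not taken from the repo):

// Sketch: consuming the streaming answer in a component's code-behind.
// _chatService is an injected IOpenAIChatcompletionService.
private string _answer = string.Empty;

private async Task AskAsync(string question)
{
    _answer = string.Empty;
    await foreach (var token in _chatService.RunQuery(question))
    {
        _answer += token;
        StateHasChanged(); // re-render so the answer streams into the UI token by token
    }
}

A screenshot of the demo is shown below.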

Monday, 31 March 2025

Generating Dall-e-3 images using Microsoft Semantic Kernel

In this demo, DALL-E 3 images are generated from a console app using Microsoft Semantic Kernel. Semantic Kernel is a library that offers plugins and connectors for different AI services. It is supported for multiple languages: C#, Java and Python. Its goal is to ease the consumption of AI services, build a shared infrastructure for these services, and offer a way to conceptualize and abstract their consumption. It can also be seen as a middleware for the services, offering a framework where consuming AI services becomes a more standardized process. A GitHub repo has been created with the code for this demo here:

Github repo for this demo
Dall-e-3 image generator with semantic kernel

The demo contains two steps: first building the semantic kernel itself, then the image generation. First off, the .csproj file references the latest NuGet package of Microsoft Semantic Kernel as of March 2025.
DalleImageGeneratorWithSemanticKernel.csproj


<Project Sdk="Microsoft.NET.Sdk"> 

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <NoWarn>$(NoWarn);CS8618,IDE0009,CA1051,CA1050,CA1707,CA1054,CA2007,VSTHRD111,CS1591,RCS1110,RCS1243,CA5394,SKEXP0001,SKEXP0010,SKEXP0020,SKEXP0040,SKEXP0050,SKEXP0060,SKEXP0070,SKEXP0101,SKEXP0110</NoWarn>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.SemanticKernel" Version="1.44.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="8.0.1" />
  </ItemGroup>

</Project>


Note that multiple warnings are suppressed via NoWarn, as Semantic Kernel is open for change in the future and thus flags multiple different warnings (the SKEXP codes mark experimental APIs). The image generation demo is set up like this in the class ImageGeneration. Note how the Kernel object is built up here. The builder offers many methods to add AI services; in this case we add an ITextToImageService. The model name used here is "dall-e-3".
ImageGeneration.cs


using DalleImageGeneratorWithSemanticKernel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;
using OpenAI.Images;
using System;
using System.Diagnostics;

namespace UseSemanticKernelFromNET;

public class ImageGeneration
{
    public async Task GenerateBasicImage(string modelName)
    {
        Kernel kernel = Kernel
            .CreateBuilder()
            .AddOpenAITextToImage(modelId:modelName, apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!).Build();

        ITextToImageService imageService = kernel.GetRequiredService<ITextToImageService>();

        Console.WriteLine("##### SEMANTIC KERNEL - IMAGE GENERATOR DALL-E-3 CONSOLE APP #####\n\n");


        string prompt =
           """
            In the humorous image, Vice President JD Vance and his wife are seen stepping out of their plane onto the icy runway of
            Thule Air Base. Just as they set foot on the frozen ground, a bunch of playful polar bears greet them enthusiastically, much like 
            overzealous fans welcoming celebrities. The surprised expressions on their faces are priceless as the couple finds 
            themselves being "chased" by these bundles of fur and excitement. JD Vance, with a mix of amusement and alarm, has one 
            shoe comically left behind in the snow, while his wife, holding onto her hat against the chilly wind, can't suppress a laugh.
            The scene is completed with members of the Air Base
            staff in the background, chuckling and capturing the moment on their phones, adding to the light-heartedness of the unexpected encounter.  
            The plane should carry the AirForce One Colors and read "United States of America". 
         """;

        Console.WriteLine($"\n ### STORY FOR THE IMAGE TO GENERATE WITH DALL-E-3 ### \n{prompt}\n\n");

        Console.WriteLine("\n\nStarting generation of dall-e-3 image...");

        var cts = new CancellationTokenSource();
        var cancellationToken = cts.Token;

        var rotationTask = Task.Run(() => ConsoleUtil.RotateDash(cancellationToken), cts.Token);

        var image = await imageService.GetOpenAIImageContentAsync(prompt,
            kernel: kernel,
            size: (1024, 1024), //for Dall-e-2 images, use: 256x256, 512x512, or 1024x1024. For dalle-3 images, use: 1024x1024, 1792x1024, 1024x1792. 
            style: "vivid",
            quality: "hd", //high
            responseFormat: "b64_json", // bytes
            cancellationToken: cancellationToken);       
        
        cts.Cancel(); //cancel to stop animating the waiting indicator

        var imageTmpFilePng = Path.ChangeExtension(Path.GetTempFileName(), "png");
        image?.FirstOrDefault()?.WriteToFile(imageTmpFilePng);

        Console.WriteLine($"Wrote image to location: {imageTmpFilePng}");

        Process.Start(new ProcessStartInfo
        {
            FileName = "explorer.exe",
            Arguments = imageTmpFilePng,
            UseShellExecute = true
        });

    }

}
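
The ConsoleUtil.RotateDash helper used above is just a small console spinner and not part of Semantic Kernel. A minimal sketch of such a helper (the implementation in the repo may differ):

using System;
using System.Threading;

namespace UseSemanticKernelFromNET;

public static class ConsoleUtil
{
    // Rotates a dash animation on the console until cancellation is requested
    public static void RotateDash(CancellationToken cancellationToken)
    {
        var frames = new[] { '-', '\\', '|', '/' };
        int i = 0;
        while (!cancellationToken.IsCancellationRequested)
        {
            Console.Write(frames[i++ % frames.Length]);
            Thread.Sleep(100);
            Console.Write('\b'); // move back so the next frame overwrites this one
        }
    }
}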


A helper extension method has been added for the OpenAI DALL-E 3 image creation. Please note that one should avoid adding too many extension methods on Semantic Kernel itself, as this defeats the purpose of a standardized way of using it. In this case, it is just a helper method to customize the generation of DALL-E 3 (and DALL-E 2) images from OpenAI using Semantic Kernel. The code is shown below.
TextToImageServiceExtensions.cs


using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Services;
using Microsoft.SemanticKernel.TextToImage;

namespace UseSemanticKernelFromNET;

public static class TextToImageServiceExtensions
{


    /// <summary>
    /// Generates OpenAI image content asynchronously based on the provided text input and settings.
    /// </summary>
    /// <param name="imageService">The image service used to generate the image content.</param>
    /// <param name="input">The text input used to generate the image.</param>
    /// <param name="kernel">An optional kernel instance for additional processing.</param>
    /// <param name="size">
    /// The desired size of the generated image. For DALL-E 2 images, use: 256x256, 512x512, or 1024x1024. 
    /// For DALL-E 3 images, use: 1024x1024, 1792x1024, or 1024x1792.
    /// </param>
    /// <param name="style">The style of the image. Must be "vivid" or "natural".</param>
    /// <param name="quality">The quality of the image. Must be "standard", "hd", or "high".</param>
    /// <param name="responseFormat">
    /// The format of the response. Must be one of the following: "url", "uri", "b64_json", or "bytes".
    /// </param>
    /// <param name="cancellationToken">A token to monitor for cancellation requests.</param>
    /// <returns>
    /// A task that represents the asynchronous operation. The task result contains a read-only list of 
    /// <see cref="ImageContent"/> objects representing the generated images.
    /// </returns>
    public static Task<IReadOnlyList<ImageContent>> GetOpenAIImageContentAsync(this ITextToImageService imageService,
        TextContent input,
        Kernel? kernel = null,
        (int width, int height) size = default((int, int)), // for Dall-e-2 images, use: 256x256, 512x512, or 1024x1024. For dalle-3 images, use: 1024x1024, 1792x1024, 1024x1792. 
        string style = "vivid",
        string quality = "hd",
        string responseFormat = "b64_json",        
        CancellationToken cancellationToken = default)
    {
        
        string? currentModelId = imageService.GetModelId();

        if (currentModelId != "dall-e-3" && currentModelId != "dall-e-2")
        {
            throw new NotSupportedException("This method is only supported for the DALL-E 2 and DALL-E 3 models.");
        }

        if (size.width == 0 || size.height == 0)
        {
            size = (1024, 1024); //defaulting here to (1024, 1024).
        }

        if (currentModelId == "dall-e-2"){
            var supportedSizes = new[]{
                (256, 256),
                (512, 512),
                (1024, 1024)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 2, the size must be one of: 256x256, 512x512, or 1024x1024.");
            }
        }
        else if (currentModelId == "dall-e-3")
        {
            var supportedSizes = new[]{
                (1024, 1024),
                (1792, 1024),
                (1024, 1792)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 3, the size must be one of: 256x256, 512x512, or 1024x1024.");
            }
        }

        return imageService.GetImageContentsAsync(
            input,
            new OpenAITextToImageExecutionSettings
                {
                    Size = size,
                    Style = style, //must be "vivid" or "natural"
                    Quality = quality, //must be "standard" or "hd" or "high"
                    ResponseFormat = responseFormat // url or uri or b64_json or bytes
                },
            kernel,
            cancellationToken);

    }
}


Screenshot of this demo, console app running:
The console app generates the DALL-E 3 image using the OpenAI service, saves it as a PNG file in a temporary location, and then opens the image using the Windows default image viewer application. Example image generated:

Saturday, 22 March 2025

Image classification using ML.NET Machine Learning

I have added a demo using ML.NET to GitHub. The demo is available in this repository:

https://github.com/toreaurstadboss/ImageClassificationMLNetBlazorDemo

A screenshot of the running application is shown below:

ML.NET is Microsoft's machine learning library. Combined with the tooling inside VS 2022, it is an easy way to run machine learning models locally on your CPU or GPU, or hosted in Azure cloud services. The ML.NET website is available here, with more information and documentation:

https://dotnet.microsoft.com/en-us/apps/ai/ml-dotnet

In the demo above I have trained the model to recognize either horses or moose. Both species are mammals and herbivores, and they are somewhat similar in appearance. I have trained the machine learning model in this demo with only ten images of each category, and then tested it with ten other images to check whether the model correctly recognizes a horse or a moose. Already with just ten training images per category it did not miss once; of course, a real-world machine learning model would be trained on tens of thousands of images to handle all the edge cases. ML.NET is very easy to run, and it can run locally on your own machine, using the CPU or GPU. The GPU must be CUDA compatible, which in practice means an NVIDIA card supported by CUDA 10.1. I have such a card on a laptop of mine and have tested it. The following links point to NVIDIA's download pages for the software needed, as of March 2025, to run the ML.NET image classification functionality on GPUs:

Download Cuda 10.1

Cuda 10.1 can be downloaded from here: https://developer.nvidia.com/cuda-10.1-download-archive-base

CuDnn 7.6.4

CuDnn can be downloaded from here: https://developer.nvidia.com/rdp/cudnn-archive

Getting started with image classification using ML.Net

It is easiest to use VS 2022 to add an ML.NET machine learning model. Inside VS 2022, right-click your project, choose Add, and then Machine Learning Model. In case you do not see this option, open the start menu and type in Visual Studio Installer. Hit the button Modify for your VS installation, choose Individual Components, and search for 'ml'. Select the ML.NET Model Builder component. There is also a component called ML.NET Model Builder 2022, which I also chose.
Choosing the scenario
Now, after adding the machine learning model, the first page asks for a scenario. I choose Image Classification here, under the Computer Vision scenario category.

Choosing the environment
Then I hit the button Local. In the next step, I select Local (CPU). Note that I have also tested an NVIDIA CUDA-compatible graphics card / GPU on another laptop; it worked great and should be preferred if you have a compatible GPU and have installed CUDA 10.1 and cuDNN 7.6.4 as shown in the links above.

Hit the button Next Step.
Choosing the Data
It is time to train the machine learning model with data! I have gathered ten sample images each of horses and moose. You point Model Builder to a folder with images, where each category of images is gathered in a subfolder of that folder, for example laid out as sketched below.
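For example, the training images folder could be laid out like this (hypothetical folder and file names):

TrainingImages/
    Horse/
        horse01.jpg
        horse02.jpg
        ...
    Moose/
        moose01.jpg
        moose02.jpg
        ...

The names of the subfolders become the classification labels.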

Next step is Train
Training the model
Here you can hit the button Train. When you have trained the model enough, you can hit the button Next step. Training the machine learning model will take some time, depending on whether you use the CPU or the GPU and on the number of input images. Usually it takes a few seconds, not many minutes, to churn through a small set of images as shown here, 20 images in total.

Loading up the image data and using the machine learning model

Note that ML.NET demands support for interactive server rendering of web apps; pure Blazor WASM apps are not supported. The following file shows how the Blazor server-side app is set up.

Program.cs

using ImageClassificationMLNetBlazorDemo.Components;

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.
builder.Services.AddRazorComponents()
    .AddInteractiveServerComponents();

var app = builder.Build();

// Configure the HTTP request pipeline.
if (!app.Environment.IsDevelopment())
{
    app.UseExceptionHandler("/Error", createScopeForErrors: true);
    // The default HSTS value is 30 days. You may want to change this for production scenarios, see https://aka.ms/aspnetcore-hsts.
    app.UseHsts();
}

app.UseHttpsRedirection();

app.UseStaticFiles();
app.UseAntiforgery();

app.MapRazorComponents<App>()
    .AddInteractiveServerRenderMode();

app.Run();


The InteractiveServer render mode is set up inside App.razor, on the HeadOutlet and Routes components.

App.razor


<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <base href="/" />
    <link rel="stylesheet" href="app.css" />
    <link rel="stylesheet" href="lib/bootstrap/css/bootstrap.min.css" />
    <link rel="stylesheet" href="ImageClassificationMLNetBlazorDemo.styles.css" />
    <link rel="icon" type="image/png" href="favicon.png" />

    <HeadOutlet @rendermode="InteractiveServer" />
</head>

<body>
    <Routes @rendermode="InteractiveServer" />
    <script src="_framework/blazor.web.js"></script>
</body>

</html>


The following code-behind of the razor component Home.razor in the demo repo shows how a file is uploaded using the InputFile control in Blazor server-side.

Home.razor.cs


@code {

    private string? _base64ImageSource = null;
    private string? _predictedLabel = "No classification";
    private IOrderedEnumerable<KeyValuePair<string, float>>? _predictedLabels = null;
    private int? _assessedPredictionQuality = null;
    private string? _errorMessage = null;

    private async Task LoadFileAsync(InputFileChangeEventArgs e)
    {
        try
        {
            ResetPrivateFields();

            if (e.File.Size <= 0 || e.File.Size >= 2 * 1024 * 1024)
            {
                _errorMessage = "Sorry, the uploaded image but be between 1 byte and 2 MB!";
                return;
            }

            byte[] imageBytes = await GetImageBytes(e.File);
            _base64ImageSource = GetBase64ImageSourceString(e.File.ContentType, imageBytes);

            PredictImageClassification(imageBytes);

        }
        catch (Exception err)
        {
            Console.WriteLine(err);
        }
    }

    private void ResetPrivateFields()
    {
        _base64ImageSource = null;
        _predictedLabel = null;
        _predictedLabels = null;
        _assessedPredictionQuality = null;
    }

    private int GetAssesPrediction()
    {
        int result = 1;
        if (_predictedLabel != null && _predictedLabels != null)
        {
            foreach (var label in _predictedLabels)
            {
                if (label.Key == _predictedLabel)
                {
                    result = label.Value switch
                    {
                        <= 0.50f => 1,
                        <= 0.70f => 2,
                        <= 0.80f => 3,
                        <= 0.85f => 4,
                        <= 0.90f => 5,
                        <= 1.0f => 6,
                        _ => 1 // default to the lowest dice score if we get some other value here
                    };
                }
            }
        }

        return result;
    }

    private void PredictImageClassification(byte[] imageBytes)
    {

        var input = new ModelInput
            {
                ImageSource = imageBytes
            };
        ModelOutput output = HorseOrMooseImageClassifier.Predict(input);
        _predictedLabel = output.PredictedLabel;

        _predictedLabels = HorseOrMooseImageClassifier.PredictAllLabels(input);

        _assessedPredictionQuality = GetAssesPrediction(); //check how good the prediction is, give a score from 1-6 (dice score!)

        StateHasChanged();
    }


    private async Task<byte[]> GetImageBytes(IBrowserFile file) 
    {
        using MemoryStream memoryStream = new();
        using var stream = file.OpenReadStream(2 * 1024 * 1024, CancellationToken.None);
        await stream.CopyToAsync(memoryStream);
        return memoryStream.ToArray();
    }

    private string GetBase64ImageSourceString(string contentType, byte[] bytes)
    {
        string preAmble = $"data:{contentType};base64,";
        return $"{preAmble}{(Convert.ToBase64String(bytes))}";
    }
}
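
The markup side of Home.razor is not shown here; a minimal sketch of how the InputFile control and the prediction results could be wired up (assumed markup, the repo's may differ):

<InputFile OnChange="LoadFileAsync" accept="image/*" />

@if (_errorMessage is not null)
{
    <p class="text-danger">@_errorMessage</p>
}
@if (_base64ImageSource is not null)
{
    <img src="@_base64ImageSource" alt="Uploaded image" width="300" />
    <p>Predicted label: @_predictedLabel (dice score: @_assessedPredictionQuality)</p>
}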


As the code above shows, using the machine learning model is quite convenient: we just call the Predict method to get the label the model decides is present in the loaded image. This is the image classification that the machine learning found. Note that the method PredictAllLabels gets the confidence scores of the different labels, shown in this demo. There is no limitation on the number of categories (labels) one could train a model to look for. A benefit of ML.NET is the option to run it on on-premise servers and get fairly good results from just a few sample images. But the more sample images you obtain for a label, the more precise the machine learning model will become. It is also possible to download a pre-trained model such as InceptionV3, which is compatible with the TensorFlow backend used here and supports up to 1000 categories. More information from Microsoft about using a pre-trained model such as InceptionV3 is available here:

https://learn.microsoft.com/en-us/dotnet/machine-learning/tutorials/image-classification
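
For reference, HorseOrMooseImageClassifier used in the code-behind above is the consumption class generated by ML.NET Model Builder. Judging only from how it is called in this demo, its public surface has roughly this shape (a shape-only sketch with stubbed bodies; the actual generated file also loads the trained model and creates the prediction engine):

// Shape-only sketch of the Model Builder generated consumption API
public class ModelInput
{
    public byte[] ImageSource { get; set; } = default!;
}

public class ModelOutput
{
    public string PredictedLabel { get; set; } = default!;
    public float[] Score { get; set; } = default!;
}

public static class HorseOrMooseImageClassifier
{
    // Returns the single most likely label for the input image
    public static ModelOutput Predict(ModelInput input) => throw new NotImplementedException();

    // Returns all labels ordered by descending confidence score
    public static IOrderedEnumerable<KeyValuePair<string, float>> PredictAllLabels(ModelInput input) => throw new NotImplementedException();
}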