Monday, 31 March 2025

Generating Dall-e-3 images using Microsoft Semantic Kernel

In this demo, Dall-e-3 images are generated from a console app using Microsoft Semantic Kernel. Semantic Kernel is a library that offers connectors and plugins for different AI services. It is available for multiple languages: C#, Java and Python. Its goal is to make it easier to consume AI services by providing a shared infrastructure for them and a way to conceptualize and abstract their use. It can also be seen as middleware for these services: a framework in which consuming AI services becomes a more standardized process. A GitHub repo with the code for this demo is available here:

Github repo for this demo
Dall-e-3 image generator with semantic kernel

The demo has two steps: first building the kernel itself, then generating the image. First off, the .csproj file has a package reference to the latest NuGet package of Microsoft Semantic Kernel as of March 2025.
DalleImageGeneratorWithSemanticKernel.csproj


<Project Sdk="Microsoft.NET.Sdk"> 

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <NoWarn>$(NoWarn);CS8618,IDE0009,CA1051,CA1050,CA1707,CA1054,CA2007,VSTHRD111,CS1591,RCS1110,RCS1243,CA5394,SKEXP0001,SKEXP0010,SKEXP0020,SKEXP0040,SKEXP0050,SKEXP0060,SKEXP0070,SKEXP0101,SKEXP0110</NoWarn>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.SemanticKernel" Version="1.44.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="8.0.1" />
  </ItemGroup>

</Project>


Note that multiple warnings are suppressed via NoWarn. Many Semantic Kernel APIs are still marked as experimental and open to change in the future, so the compiler flags them with SKEXP diagnostics. The image generation demo is set up in the class ImageGeneration. Note how the Kernel object is built here: Kernel.CreateBuilder() returns a builder that offers many methods for adding AI services. In this case we add an ITextToImageService. The modelName used here is "dall-e-3".
ImageGeneration.cs


using DalleImageGeneratorWithSemanticKernel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;
using OpenAI.Images;
using System;
using System.Diagnostics;

namespace UseSemanticKernelFromNET;

public class ImageGeneration
{
    public async Task GenerateBasicImage(string modelName)
    {
        Kernel kernel = Kernel
            .CreateBuilder()
            .AddOpenAITextToImage(modelId:modelName, apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!).Build();

        ITextToImageService imageService = kernel.GetRequiredService<ITextToImageService>();

        Console.WriteLine("##### SEMANTIC KERNEL - IMAGE GENERATOR DALL-E-3 CONSOLE APP #####\n\n");


        string prompt =
           """
            In the humorous image, Vice President JD Vance and his wife are seen stepping out of their plane onto the icy runway of
            Thule Air Base. Just as they set foot on the frozen ground, a bunch of playful polar bears greet them enthusiastically, much like 
            overzealous fans welcoming celebrities. The surprised expressions on their faces are priceless as the couple finds 
            themselves being "chased" by these bundles of fur and excitement. JD Vance, with a mix of amusement and alarm, has one 
            shoe comically left behind in the snow, while his wife, holding onto her hat against the chilly wind, can't suppress a laugh.
            The scene is completed with members of the Air Base
            staff in the background, chuckling and capturing the moment on their phones, adding to the light-heartedness of the unexpected encounter.  
            The plane should carry the AirForce One Colors and read "United States of America". 
         """;

        Console.WriteLine($"\n ### STORY FOR THE IMAGE TO GENERATE WITH DALL-E-3 ### \n{prompt}\n\n");

        Console.WriteLine("\n\nStarting generation of dall-e-3 image...");

        var cts = new CancellationTokenSource();
        var cancellationToken = cts.Token;

        var rotationTask = Task.Run(() => ConsoleUtil.RotateDash(cancellationToken), cts.Token);

        var image = await imageService.GetOpenAIImageContentAsync(prompt,
            kernel: kernel,
            size: (1024, 1024), // Dall-e-2: 256x256, 512x512 or 1024x1024. Dall-e-3: 1024x1024, 1792x1024 or 1024x1792.
            style: "vivid",
            quality: "hd", // or "high"
            responseFormat: "b64_json", // or "bytes"
            cancellationToken: cancellationToken);

        cts.Cancel(); // cancel to stop animating the waiting indicator

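        // Build a unique .png path in the temp folder (Path.GetTempFileName() would also leave an empty .tmp file behind)
        var imageTmpFilePng = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid():N}.png");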
        image?.FirstOrDefault()?.WriteToFile(imageTmpFilePng);

        Console.WriteLine($"Wrote image to location: {imageTmpFilePng}");

        Process.Start(new ProcessStartInfo
        {
            FileName = "explorer.exe",
            Arguments = imageTmpFilePng,
            UseShellExecute = true
        });

    }

}
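
The ConsoleUtil.RotateDash helper called above is not listed in this post. Below is a minimal sketch of what such a waiting indicator could look like, assuming it only needs the cancellation token. In the repo it presumably lives in the DalleImageGeneratorWithSemanticKernel namespace, which is left out here. The FrameAt helper, the frame characters and the 100 ms delay are hypothetical choices for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConsoleUtil
{
    // Characters that look like a rotating dash when drawn in sequence
    private static readonly char[] Frames = { '|', '/', '-', '\\' };

    // Hypothetical helper: returns the frame for a given step, wrapping around
    public static char FrameAt(int step) => Frames[step % Frames.Length];

    // Redraws the indicator on the current console line until cancellation is requested
    public static async Task RotateDash(CancellationToken cancellationToken)
    {
        int step = 0;
        while (!cancellationToken.IsCancellationRequested)
        {
            Console.Write($"\r{FrameAt(step)}");
            step = (step + 1) % Frames.Length;
            try
            {
                await Task.Delay(100, cancellationToken);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation arrived while waiting
            }
        }

        Console.Write("\r \r"); // erase the indicator
    }
}
```

Since the method returns a Task, it fits the Task.Run(() => ConsoleUtil.RotateDash(cancellationToken), cts.Token) call in ImageGeneration.cs.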


A helper extension method has been added for the OpenAI Dall-e-3 image creation. Note that one should not add too many extension methods on top of Semantic Kernel itself, as this defeats the purpose of a standardized way of using it. In this case, though, it is just a helper method to customize the generation of Dall-e-3 (and Dall-e-2) images from OpenAI using Semantic Kernel. The code is shown below.
TextToImageServiceExtensions.cs


using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Services;
using Microsoft.SemanticKernel.TextToImage;

namespace UseSemanticKernelFromNET;

public static class TextToImageServiceExtensions
{


    /// <summary>
    /// Generates OpenAI image content asynchronously based on the provided text input and settings.
    /// </summary>
    /// <param name="imageService">The image service used to generate the image content.</param>
    /// <param name="input">The text input used to generate the image.</param>
    /// <param name="kernel">An optional kernel instance for additional processing.</param>
    /// <param name="size">
    /// The desired size of the generated image. For DALL-E 2 images, use: 256x256, 512x512, or 1024x1024. 
    /// For DALL-E 3 images, use: 1024x1024, 1792x1024, or 1024x1792.
    /// </param>
    /// <param name="style">The style of the image. Must be "vivid" or "natural".</param>
    /// <param name="quality">The quality of the image. Must be "standard", "hd", or "high".</param>
    /// <param name="responseFormat">
    /// The format of the response. Must be one of the following: "url", "uri", "b64_json", or "bytes".
    /// </param>
    /// <param name="cancellationToken">A token to monitor for cancellation requests.</param>
    /// <returns>
    /// A task that represents the asynchronous operation. The task result contains a read-only list of 
    /// <see cref="ImageContent"/> objects representing the generated images.
    /// </returns>
    public static Task<IReadOnlyList<ImageContent>> GetOpenAIImageContentAsync(this ITextToImageService imageService,
        TextContent input,
        Kernel? kernel = null,
        (int width, int height) size = default((int, int)), // for Dall-e-2 images, use: 256x256, 512x512, or 1024x1024. For dalle-3 images, use: 1024x1024, 1792x1024, 1024x1792. 
        string style = "vivid",
        string quality = "hd",
        string responseFormat = "b64_json",        
        CancellationToken cancellationToken = default)
    {
        
        string? currentModelId = imageService.GetModelId();

        if (currentModelId != "dall-e-3" && currentModelId != "dall-e-2")
        {
            throw new NotSupportedException("This method is only supported for the DALL-E 2 and DALL-E 3 models.");
        }

        if (size.width == 0 || size.height == 0)
        {
            size = (1024, 1024); //defaulting here to (1024, 1024).
        }

        if (currentModelId == "dall-e-2"){
            var supportedSizes = new[]{
                (256, 256),
                (512, 512),
                (1024, 1024)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 2, the size must be one of: 256x256, 512x512, or 1024x1024.");
            }
        }
        else if (currentModelId == "dall-e-3")
        {
            var supportedSizes = new[]{
                (1024, 1024),
                (1792, 1024),
                (1024, 1792)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 3, the size must be one of: 1024x1024, 1792x1024, or 1024x1792.");
            }
        }

        return imageService.GetImageContentsAsync(
            input,
            new OpenAITextToImageExecutionSettings
                {
                    Size = size,
                    Style = style, //must be "vivid" or "natural"
                    Quality = quality, //must be "standard" or "hd" or "high"
                    ResponseFormat = responseFormat // url or uri or b64_json or bytes
                },
            kernel,
            cancellationToken);

    }
}
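
The size checks above can also be expressed as a lookup table, so that the allowed sizes and the error message cannot drift apart. Below is a minimal sketch that is independent of Semantic Kernel. The ImageSizeRules class and its members are hypothetical names, and the sizes are the ones listed in the comments of the extension method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ImageSizeRules
{
    // Allowed sizes per model id, taken from the comments in the extension method above
    private static readonly Dictionary<string, (int, int)[]> SupportedSizes = new()
    {
        ["dall-e-2"] = new[] { (256, 256), (512, 512), (1024, 1024) },
        ["dall-e-3"] = new[] { (1024, 1024), (1792, 1024), (1024, 1792) },
    };

    public static bool IsSupported(string modelId, (int Width, int Height) size) =>
        SupportedSizes.TryGetValue(modelId, out var sizes) && sizes.Contains(size);

    public static void EnsureSupported(string modelId, (int Width, int Height) size)
    {
        if (!SupportedSizes.TryGetValue(modelId, out var sizes))
        {
            throw new NotSupportedException($"Model '{modelId}' is not supported.");
        }

        if (!sizes.Contains(size))
        {
            // The message is built from the same table that is used for the check
            string allowed = string.Join(", ", sizes.Select(s => $"{s.Item1}x{s.Item2}"));
            throw new ArgumentException($"For {modelId}, the size must be one of: {allowed}.");
        }
    }
}
```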


Screenshot of this demo, console app running:
The console app generates the Dall-e-3 image using the OpenAI service, saves it as a PNG file in a temporary location and then opens it in the default Windows image viewer. Example image generated:

Saturday, 22 March 2025

Image classification using ML.NET Machine Learning

I added a demo using ML.NET to a GitHub repository. The demo is available here:

https://github.com/toreaurstadboss/ImageClassificationMLNetBlazorDemo

The screenshot below shows the application running:

ML.NET is Microsoft's machine learning library. Combined with tooling inside VS 2022, it offers an easy way to use machine learning models locally on your CPU or GPU, or hosted in Azure cloud services. More information and documentation are available on the ML.NET website:

https://dotnet.microsoft.com/en-us/apps/ai/ml-dotnet

In the demo above I have trained the model to recognize either horses or moose. Both species are mammals and herbivores and are somewhat similar in appearance. I trained the machine learning model in this demo with only ten images of each category, and then used ten other test images to check whether the model correctly recognizes a horse or a moose. Even with just ten images per category, it did not miss once. Of course, a real-world machine learning model would be trained on tens of thousands of images to handle all edge cases. ML.NET is very easy to run. It can be run locally on your own machine, using the CPU or GPU. The GPU must be CUDA compatible, which means you need an NVIDIA card (8-series). I have such a card in one of my laptops and have tested it. The following links point to NVIDIA's download pages for the software needed, as of March 2025, to run the ML.NET image classification functionality on GPUs:

Download CUDA 10.1

CUDA 10.1 can be downloaded from here: https://developer.nvidia.com/cuda-10.1-download-archive-base

cuDNN 7.6.4

cuDNN can be downloaded from here: https://developer.nvidia.com/rdp/cudnn-archive

Getting started with image classification using ML.Net

It is easiest to use VS 2022 to add an ML.NET machine learning model. Inside VS 2022, right-click your project and choose Add, then Machine Learning Model. In case you do not see this option, open the Start menu and type Visual Studio Installer. Hit the button Modify for your VS installation, choose Individual Components and search for 'ml'. Select ML.NET Model Builder. There is also a component called ML.NET Model Builder 2022; I selected that as well.
Choosing the scenario
After adding the Machine Learning model, the first page asks for a scenario. I chose Image Classification here, under the Computer Vision scenario category.

Choosing the environment
Then I hit the button Local. In the next step, I select Local (CPU). Note that I have also tested with an NVIDIA CUDA-compatible graphics card (GPU) on another laptop. It worked great and should be preferred if you have a compatible GPU and have installed CUDA 10.1 and cuDNN 7.6.4 from the links above.

Hit the button Next Step.
Choosing the Data
It is time to train the machine learning model with data! I have gathered ten sample images each of moose and horses. Point Model Builder to a folder where the images of each category are gathered in a separate subfolder.
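
Before selecting the folder, it can help to check that the layout is what you expect. Below is a minimal sketch that counts the images per subfolder, assuming each subfolder holds one category. The TrainingFolder class, the example folder names and the list of file extensions are hypothetical and not part of the demo repo.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TrainingFolder
{
    // File extensions counted as images in this sketch (an assumption, adjust as needed)
    private static readonly HashSet<string> ImageExtensions =
        new(StringComparer.OrdinalIgnoreCase) { ".jpg", ".jpeg", ".png" };

    // Expected layout (example):
    //   images/horse/*.jpg
    //   images/moose/*.jpg
    // Returns the number of image files found in each immediate subfolder, keyed by subfolder name
    public static SortedDictionary<string, int> CountImagesPerLabel(string rootFolder)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);

        foreach (string labelFolder in Directory.GetDirectories(rootFolder))
        {
            string label = Path.GetFileName(labelFolder);
            counts[label] = Directory.EnumerateFiles(labelFolder)
                .Count(file => ImageExtensions.Contains(Path.GetExtension(file)));
        }

        return counts;
    }
}
```

A category with far fewer images than the others, or a count of zero, is an indication that the folder is not organized as intended.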

The next step is Train.
Training the model
Here you can hit the button Train, and train again if needed. When the model has been trained enough, hit the button Next step. Training will take some time, depending on whether you use the CPU or GPU and on the number of input images. It usually takes seconds rather than minutes to churn through a small set of images like the one shown here, 20 images in total.

Loading up the image data and using the machine learning model

Note that ML.NET runs on the server here, so the web app needs interactive server-side rendering; pure Blazor WASM apps are not supported. The following file shows how the Blazor server-side app is set up.

Program.cs

using ImageClassificationMLNetBlazorDemo.Components;

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.
builder.Services.AddRazorComponents()
    .AddInteractiveServerComponents();

var app = builder.Build();

// Configure the HTTP request pipeline.
if (!app.Environment.IsDevelopment())
{
    app.UseExceptionHandler("/Error", createScopeForErrors: true);
    // The default HSTS value is 30 days. You may want to change this for production scenarios, see https://aka.ms/aspnetcore-hsts.
    app.UseHsts();
}

app.UseHttpsRedirection();

app.UseStaticFiles();
app.UseAntiforgery();

app.MapRazorComponents<App>()
    .AddInteractiveServerRenderMode();

app.Run();


The InteractiveServer render mode is set up inside App.razor, on the HeadOutlet and Routes components.

App.razor


<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <base href="/" />
    <link rel="stylesheet" href="app.css" />
    <link rel="stylesheet" href="lib/bootstrap/css/bootstrap.min.css" />
    <link rel="stylesheet" href="ImageClassificationMLNetBlazorDemo.styles.css" />
    <link rel="icon" type="image/png" href="favicon.png" />

    <HeadOutlet @rendermode="InteractiveServer" />
</head>

<body>
    <Routes @rendermode="InteractiveServer" />
    <script src="_framework/blazor.web.js"></script>
</body>

</html>


The following code-behind of the Razor component Home.razor in the demo repo shows how a file uploaded with the InputFile control is handled in Blazor server-side.
Home.razor.cs


@code {

    private string? _base64ImageSource = null;
    private string? _predictedLabel = "No classification";
    private IOrderedEnumerable<KeyValuePair<string, float>>? _predictedLabels = null;
    private int? _assessedPredictionQuality = null;
    private string? _errorMessage = null;

    private async Task LoadFileAsync(InputFileChangeEventArgs e)
    {
        try
        {
            ResetPrivateFields();

            if (e.File.Size <= 0 || e.File.Size >= 2 * 1024 * 1024)
            {
                _errorMessage = "Sorry, the uploaded image must be between 1 byte and 2 MB!";
                return;
            }

            byte[] imageBytes = await GetImageBytes(e.File);
            _base64ImageSource = GetBase64ImageSourceString(e.File.ContentType, imageBytes);

            PredictImageClassification(imageBytes);

        }
        catch (Exception err)
        {
            Console.WriteLine(err);
        }
    }

    private void ResetPrivateFields()
    {
        _base64ImageSource = null;
        _predictedLabel = null;
        _predictedLabels = null;
        _assessedPredictionQuality = null;
        _errorMessage = null; // clear any error message from a previous upload
    }

    private int GetAssesPrediction()
    {
        int result = 1;
        if (_predictedLabel != null && _predictedLabels != null)
        {
            foreach (var label in _predictedLabels)
            {
                if (label.Key == _predictedLabel)
                {
                    result = label.Value switch
                    {
                        <= 0.50f => 1,
                        <= 0.70f => 2,
                        <= 0.80f => 3,
                        <= 0.85f => 4,
                        <= 0.90f => 5,
                        <= 1.0f => 6,
                        _ => 1 // default to the lowest dice score if we get some other score here (e.g. above 1.0 or NaN)
                    };
                }
            }
        }

        return result;
    }

    private void PredictImageClassification(byte[] imageBytes)
    {

        var input = new ModelInput
            {
                ImageSource = imageBytes
            };
        ModelOutput output = HorseOrMooseImageClassifier.Predict(input);
        _predictedLabel = output.PredictedLabel;

        _predictedLabels = HorseOrMooseImageClassifier.PredictAllLabels(input);

        _assessedPredictionQuality = GetAssesPrediction(); //check how good the prediction is, give a score from 1-6 (dice score!)

        StateHasChanged();
    }


    private async Task<byte[]> GetImageBytes(IBrowserFile file) 
    {
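        using MemoryStream memoryStream = new();
        // Dispose the browser file stream when done; the max allowed size matches the 2 MB check in LoadFileAsync
        await using var stream = file.OpenReadStream(2 * 1024 * 1024, CancellationToken.None);
        await stream.CopyToAsync(memoryStream);
        return memoryStream.ToArray();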
    }

    private string GetBase64ImageSourceString(string contentType, byte[] bytes)
    {
        string preAmble = $"data:{contentType};base64,";
        return $"{preAmble}{(Convert.ToBase64String(bytes))}";
    }
}


As the code above shows, using the machine learning model is quite convenient. We call the method Predict to get the label that the model decides is present in the loaded image. This is the image classification that the machine learning model found. The method PredictAllLabels returns the confidence for each of the labels shown in this demo. There is no limit here on the number of categories (labels) that a model can be trained to look for. A benefit of ML.NET is the option to run it on on-premise servers and get fairly good results from just a few sample images. But the more sample images you obtain for a label, the more precise the machine learning model will become. It is also possible to download a pre-trained model such as InceptionV3, a TensorFlow-compatible model that supports up to 1000 categories. More information from Microsoft about using a pre-trained model such as InceptionV3 is available here:

https://learn.microsoft.com/en-us/dotnet/machine-learning/tutorials/image-classification
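
Going back to the PredictAllLabels call in Home.razor.cs: to illustrate the shape of the data it returns, here is a minimal sketch that pairs label names with scores and sorts them by confidence. It does not use ML.NET. The LabelScores class is a hypothetical name, and the code generated by Model Builder may well differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LabelScores
{
    // Pairs each label with its score and orders the pairs from highest to lowest score
    public static IOrderedEnumerable<KeyValuePair<string, float>> Sort(
        IReadOnlyList<string> labels, IReadOnlyList<float> scores)
    {
        if (labels.Count != scores.Count)
        {
            throw new ArgumentException("Expected one score per label.");
        }

        return labels
            .Select((label, index) => new KeyValuePair<string, float>(label, scores[index]))
            .OrderByDescending(pair => pair.Value);
    }
}
```

With the labels Horse and Moose and the made-up scores 0.12 and 0.88, the first entry of the result is Moose with 0.88, which is the kind of entry the GetAssesPrediction method looks up for its dice score.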

Tuesday, 18 March 2025

Converting a .NET DateTime to a DateTime2 in T-SQL

This article presents a handy T-SQL script that takes a .NET DateTime value stored as ticks in a numeric field and converts it into a SQL Server DateTime2 value. The conversion has SECOND precision. An assumption here is that any numeric value larger than 100000000000 contains a DateTime value. This is an acceptable assumption when you log data, as very large values usually indicate a DateTime value. You might of course want additional checks beyond what I show in the example script. Here is the T-SQL that shows how we can convert the .NET DateTime into a SQL DateTime.


-- Last updated: March 18, 2025
-- Synopsis: This script retrieves detailed change log information from the ObjectChanges, PropertyChanges, and ChangeSets tables.
-- It filters the results based on specific identifiers stored in a table variable, in this example Guids. 

-- In this example T-SQL, the library FrameLog is used to store the change log.

-- DateTime values are retrieved by looking at the number of ticks elapsed since DateTime.MinValue,
-- as the DateTime values are stored in SQL Server as this numeric value.


DECLARE @EXAMPLEGUIDS TABLE (ID NVARCHAR(36))

INSERT INTO @EXAMPLEGUIDS (Id)
VALUES
('1968126a-64c1-4d15-bf23-8cb8497dcaa9'), 
('3e11aad8-95df-4377-ad63-c2fec3d43034'),
('acbdd116-b6a5-4425-907b-f86cb55aeedd') --tip: define which form Guids to fetch from the ChangeLog database tables that 'FrameLog' uses. Each form Guid can be retrieved from the url showing the form in MRS in the browser

SELECT 
o.Id as ObjectChanges_Id, 
o.ObjectReference as ObjectReference, 
o.TypeName as ObjectChanges_TypeName, 
c.Id as Changeset_Id, 
c.[Timestamp] as Changeset_Timestamp,
c.Author_UserName as Changeset_AuthorName,
p.[Id] as PropertyChanges_Id,
p.[PropertyName],
p.[Value],
p.[ValueAsInt],
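CASE 
    -- Treat very large numeric values as .NET DateTime ticks (100-nanosecond units since 0001-01-01)
    WHEN p.Value IS NOT NULL 
         AND ISNUMERIC(p.[Value]) = 1 
         AND CAST(p.[Value] AS decimal) > 100000000000 
    THEN 
        -- DATEADD takes an int, and the number of seconds since year 1 does not fit in an int.
        -- Therefore add the whole minutes first and then the remaining seconds (0-59).
        DATEADD(SECOND, 
            CAST(CAST(p.[Value] AS decimal) / 10000000 AS BIGINT) % 60, 
            DATEADD(MINUTE, 
                CAST(CAST(p.[Value] AS decimal) / 10000000 / 60 AS BIGINT), 
                CAST('0001-01-01' AS datetime2)
            )
        )
    ELSE NULL
END AS ValueAsDate,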
o.ChangeType as ObjectChanges_ChangeTypeIfSet
FROM propertychanges p
LEFT OUTER JOIN ObjectChanges o on o.Id = p.ObjectChange_Id
LEFT OUTER JOIN ChangeSets c on o.ChangeSet_Id = c.Id
WHERE ObjectChange_Id in (
 SELECT ObjectChanges.Id
  FROM PropertyChanges
  LEFT OUTER JOIN ObjectChanges on ObjectChanges.Id = PropertyChanges.ObjectChange_Id
  LEFT OUTER JOIN ChangeSets on ObjectChanges.ChangeSet_Id = ChangeSets.Id
  WHERE ObjectChange_Id in (SELECT Id FROM ObjectChanges where ObjectReference IN (
   SELECT Id FROM @EXAMPLEGUIDS
   ))) --find out the Changeset where ObjectChange_Id equals the Id of ObjectChanges where ObjectReference equals one of the identifiers in @EXAMPLEGUIDS
ORDER BY ObjectReference, Changeset_Id DESC, Changeset_Timestamp DESC



The T-SQL is handy in case you come across DateTime values from .NET that are saved as ticks in a numeric column, and it shows how the conversion can be done.
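
The arithmetic in the script can be checked from the .NET side. Below is a minimal sketch that mirrors the T-SQL expression in C#: whole minutes since 0001-01-01 plus the remaining seconds. The TicksConversion class is a hypothetical name. The sketch only relies on DateTime.Ticks counting 100-nanosecond intervals since DateTime.MinValue, which gives 10,000,000 ticks per second.

```csharp
using System;

public static class TicksConversion
{
    // Converts .NET ticks to a DateTime with second precision, the same way the T-SQL CASE expression does
    public static DateTime FromTicksSecondPrecision(long ticks)
    {
        long totalSeconds = ticks / TimeSpan.TicksPerSecond; // integer division drops the sub-second part
        long minutes = totalSeconds / 60;                    // corresponds to DATEADD(MINUTE, ...)
        long seconds = totalSeconds % 60;                    // corresponds to DATEADD(SECOND, ...), always 0-59

        return DateTime.MinValue.AddMinutes(minutes).AddSeconds(seconds);
    }
}
```

For example, the ticks of new DateTime(2025, 3, 18, 12, 34, 56) convert back to 2025-03-18 12:34:56, and any fraction of a second in the input is dropped.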