Coding Grounds: Artificial Intelligence

Showing posts with label Artificial Intelligence. Show all posts

Tuesday, 15 April 2025

Adding plugins to use for Semantic Kernel

With Microsoft Semantic Kernel, it is possible to consume AI services such as Azure AI and OpenAI with less code, as this framework provides simplification and standardization for consuming these services. A repo on Github with the code shown in this article is provided here :

https://github.com/toreaurstadboss/SemanticKernelPluginDemov4

The demo code is a Blazor server app. It demonstrates how to use Microsoft Semantic Kernel with plugins. I have decided to provide the Northwind database as the extra data the plugin will use. Via debugging and seeing the output, I see that the plugin is successfully called and used. It is also easy to add plugins, which provides additional data to the AI model. This is suitable for providing AI powered solutions with private data that you want to provide to the AI model. For example, when using OpenAI Chat GPT-4, providing a plugin will make it possible to specify which data are to be presented and displayed. It is a convenient way to provide a natural language interface for doing data reporting such as listing up results from this plugin. The plugin can provide kernel functions, using attributes on the method. Let's first look at the Program.cs file for wiring up the semantic kernel for a Blazor Server demo app.

Program.cs



using Microsoft.EntityFrameworkCore;
using Microsoft.SemanticKernel;
using SemanticKernelPluginDemov4.Models;
using SemanticKernelPluginDemov4.Services;


namespace SemanticKernelPluginDemov4
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);

            // Add services to the container.
            builder.Services.AddRazorPages();
            builder.Services.AddServerSideBlazor();

            // Add DbContext
            builder.Services.AddDbContextFactory<NorthwindContext>(options =>
                options.UseSqlServer(builder.Configuration.GetConnectionString("DefaultConnection")));

            builder.Services.AddScoped<IOpenAIChatcompletionService, OpenAIChatcompletionService>();

            builder.Services.AddScoped<NorthwindSemanticKernelPlugin>();

            builder.Services.AddScoped(sp =>
            {
                var kernelBuilder = Kernel.CreateBuilder();
                kernelBuilder.AddOpenAIChatCompletion(modelId: builder.Configuration.GetSection("OpenAI").GetValue<string>("ModelId")!,
                    apiKey: builder.Configuration.GetSection("OpenAI").GetValue<string>("ApiKey")!);

                var kernel = kernelBuilder.Build();

                var dbContextFactory = sp.GetRequiredService<IDbContextFactory<NorthwindContext>>();
                var northwindSemanticKernelPlugin = new NorthwindSemanticKernelPlugin(dbContextFactory);
                kernel.ImportPluginFromObject(northwindSemanticKernelPlugin);

                return kernel;
            });

            var app = builder.Build();

            // Configure the HTTP request pipeline.
            if (!app.Environment.IsDevelopment())
            {
                app.UseExceptionHandler("/Error");
                // The default HSTS value is 30 days. You may want to change this for production scenarios, see https://aka.ms/aspnetcore-hsts.
                app.UseHsts();
            }

            app.UseHttpsRedirection();

            app.UseStaticFiles();

            app.UseRouting();

            app.MapBlazorHub();
            app.MapFallbackToPage("/_Host");

            app.Run();
        }
    }
}

In the code above, note the following:

The usage of IDbContextFactory for creating a db context, injected into the plugin. This is a Blazor server app, so this service is used to create db contet, since a Blazor server will have a durable connection between client and the server over Signal-R and there needs to use this interface to create dbcontext instances as needed
Using the method ImportPluginFromObject to import the plugin into the semantic kernel built here. Note that we register the kernel as a scoped service here. Also the plugin is registered as a scoped service here.

The plugin looks like this.

NorthwindSemanticKernelplugin.cs



using Microsoft.EntityFrameworkCore;
using Microsoft.SemanticKernel;
using SemanticKernelPluginDemov4.Models;
using System.ComponentModel;

namespace SemanticKernelPluginDemov4.Services
{

    public class NorthwindSemanticKernelPlugin
    {
        private readonly IDbContextFactory<NorthwindContext> _northwindContext;

        public NorthwindSemanticKernelPlugin(IDbContextFactory<NorthwindContext> northwindContext)
        {
            _northwindContext = northwindContext;
        }

        [KernelFunction]
        [Description("When asked about the suppliers of Nortwind database, use this method to get all the suppliers. Inform that the data comes from the Semantic Kernel plugin called : NortwindSemanticKernelPlugin")]
        public async Task<List<string>> GetSuppliers()
        {
            using (var dbContext = _northwindContext.CreateDbContext())
            {
                return await dbContext.Suppliers.OrderBy(s => s.CompanyName).Select(s => "Kernel method 'NorthwindSemanticKernelPlugin:GetSuppliers' gave this: " + s.CompanyName).ToListAsync();
            }
        }

        [KernelFunction]
        [Description("When asked about the total sales of a given month in a year, use this method. In case asked for multiple months, call this method again multiple times, adjusting the month and year as provided. The month and year is to be in the range 1-12 for months and for year 1996-1998. Suggest for the user what the valid ranges are in case other values are provided.")]
        public async Task<decimal> GetTotalSalesInMontAndYear(int month, int year)
        {
            using (var dbContext = _northwindContext.CreateDbContext())
            {
                var sumOfOrders = await (from Order o in dbContext.Orders
                             join OrderDetail od in dbContext.OrderDetails on o.OrderId equals od.OrderId
                             where o.OrderDate.HasValue && (o.OrderDate.Value.Month == month
                             && o.OrderDate.Value.Year == year) 
                             select (od.UnitPrice * od.Quantity) * (1 - (decimal)od.Discount)).SumAsync();

                return sumOfOrders;
            }
        }

    }
}

In the code above, note the attributes used. KernelFunction tells that this is a method the Semantic kernel can use. The description attribute instructs the AI LLM model how to use the method, how to provide parameter values if any and when the method is to be called. Let's look at the OpenAI service next.

OpenAIChatcompletionService.cs



using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

namespace SemanticKernelPluginDemov4.Services
{

    public class OpenAIChatcompletionService : IOpenAIChatcompletionService
    {
        private readonly Kernel _kernel;

        private IChatCompletionService _chatCompletionService;

        public OpenAIChatcompletionService(Kernel kernel)
        {
            _kernel = kernel;
            _chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
        }

        public async IAsyncEnumerable<string?> RunQuery(string question)
        {
            var chatHistory = new ChatHistory();

            chatHistory.AddSystemMessage("You are a helpful assistant, answering only on questions about Northwind database. In case you got other questions, inform that you only can provide questions about the Northwind database. It is important that only the provided Northwind database functions added to the language model through plugin is used when answering the questions. If no answer is available, inform this.");

            chatHistory.AddUserMessage(question);

            await foreach (var chatUpdate in _chatCompletionService.GetStreamingChatMessageContentsAsync(chatHistory, CreateOpenAIExecutionSettings(), _kernel))
            {
                yield return chatUpdate.Content;
            }
        }

        private OpenAIPromptExecutionSettings? CreateOpenAIExecutionSettings()
        {
            return new OpenAIPromptExecutionSettings
            {
                ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
            };
        }

    }
}

In the code above, the kernel is injected into this service. The kernel was registered in Program.cs as a scoped service, so it is injected here. The method GetRequiredService is similar to the method with same name of IServiceProvider used inside Program.cs. Note the use of ToolCallBehavior set to AutoInvokeKernelFunctions. Extending the AI powered functionality with plugins requires little extra code with Microsoft Semantic kernel. A screenshot of the demo is shown below.

Monday, 31 March 2025

Generating Dall-e-3 images using Microsoft Semantic Kernel

In this demo, Dall-e-3 images are generated from a console app using Microsoft Semantic Kernel. The semantic kernel is a library that offers different plugins for different AI services. It is supported for multiple languages, these are C#, Java and Python. Its goal is to ease the use of consuming AI services and building a shared infrastructure for these services and offer a way to conceptualize and abstract the consumption of these services. It can also be seen as a middleware for the services and offering a framework where consuming AI services becomes a more standardized process. A Github repo has been created with the code for this demo here:

Github repo for this demo

Dall-e-3 image generator with semantic kernel

The demo contains two steps, first building the semantic kernel itself and then the image generation. First off, the .csproj file has package references to the latest as of March 2025 nuget package of Microsoft Semantic Kernel.

DalleImageGeneratorWithSemanticKernel.csproj



<Project Sdk="Microsoft.NET.Sdk"> 

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <NoWarn>$(NoWarn);CS8618,IDE0009,CA1051,CA1050,CA1707,CA1054,CA2007,VSTHRD111,CS1591,RCS1110,RCS1243,CA5394,SKEXP0001,SKEXP0010,SKEXP0020,SKEXP0040,SKEXP0050,SKEXP0060,SKEXP0070,SKEXP0101,SKEXP0110</NoWarn>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.SemanticKernel" Version="1.44.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="8.0.1" />
  </ItemGroup>

</Project>

Note that multiple warnings are marked as no warning as semantic kernel is open for change in the future and thus flags multiple different warnings. The image generation demo is set up like this in the class ImageGeneration. Note how the Kernel object is built up here. It got a builder that offers many methods to add AI services. In this case we add an ITextToImageService. The modelName used here is "dall-e-3".

ImageGeneration.cs



using DalleImageGeneratorWithSemanticKernel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.TextToImage;
using OpenAI.Images;
using System;
using System.Diagnostics;

namespace UseSemanticKernelFromNET;

public class ImageGeneration
{
    public async Task GenerateBasicImage(string modelName)
    {
        Kernel kernel = Kernel
            .CreateBuilder()
            .AddOpenAITextToImage(modelId:modelName, apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!).Build();

        ITextToImageService imageService = kernel.GetRequiredService<ITextToImageService>();

        Console.WriteLine("##### SEMANTIC KERNEL - IMAGE GENERATOR DALL-E-3 CONSOLE APP #####\n\n");


        string prompt =
           """
            In the humorous image, Vice President JD Vance and his wife are seen stepping out of their plane onto the icy runway of
            Thule Air Base. Just as they set foot on the frozen ground, a bunch of playful polar bears greet them enthusiastically, much like 
            overzealous fans welcoming celebrities. The surprised expressions on their faces are priceless as the couple finds 
            themselves being "chased" by these bundles of fur and excitement. JD Vance, with a mix of amusement and alarm, has one 
            shoe comically left behind in the snow, while his wife, holding onto her hat against the chilly wind, can't suppress a laugh.
            The scene is completed with members of the Air Base
            staff in the background, chuckling and capturing the moment on their phones, adding to the light-heartedness of the unexpected encounter.  
            The plane should carry the AirForce One Colors and read "United States of America". 
         """;

        Console.WriteLine($"\n ### STORY FOR THE IMAGE TO GENERATE WITH DALL-E-3 ### \n{prompt}\n\n");

        Console.WriteLine("\n\nStarting generation of dall-e-3 image...");

        var cts = new CancellationTokenSource();
        var cancellationToken = cts.Token;

        var rotationTask = Task.Run(() => ConsoleUtil.RotateDash(cancellationToken), cts.Token);

        var image = await imageService.GetOpenAIImageContentAsync(prompt,
            kernel: kernel,
            size: (1024, 1024), //for Dall-e-2 images, use: 256x256, 512x512, or 1024x1024. For dalle-3 images, use: 1024x1024, 1792x1024, 1024x1792. 
            style: "vivid",
            quality: "hd", //high
            responseFormat: "b64_json", // bytes
            cancellationToken: cancellationToken);       
        
        cts.Cancel(); //cancel to stop animating the waiting indicator

        var imageTmpFilePng = Path.ChangeExtension(Path.GetTempFileName(), "png");
        image?.FirstOrDefault()?.WriteToFile(imageTmpFilePng);

        Console.WriteLine($"Wrote image to location: {imageTmpFilePng}");

        Process.Start(new ProcessStartInfo
        {
            FileName = "explorer.exe",
            Arguments = imageTmpFilePng,
            UseShellExecute = true
        });

    }

}

A helper extension method has been added for the Open AI Dall-e-3 image creation. Please note that one should stick to not too many extension methods of semantic kernel itself as this defeats the purpose of a standardized way of using the semantic kernel. But in this case, it is just a helper method to customize the generation of particularly dall-e-3 (and dall-e-2) images from Open AI using the Semantic kernel. The code is shown below

TextToImageServiceExtensions.cs



using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Services;
using Microsoft.SemanticKernel.TextToImage;

namespace UseSemanticKernelFromNET;

public static class TextToImageServiceExtensions
{


    /// <summary>
    /// Generates OpenAI image content asynchronously based on the provided text input and settings.
    /// </summary>
    /// <param name="imageService">The image service used to generate the image content.</param>
    /// <param name="input">The text input used to generate the image.</param>
    /// <param name="kernel">An optional kernel instance for additional processing.</param>
    /// <param name="size">
    /// The desired size of the generated image. For DALL-E 2 images, use: 256x256, 512x512, or 1024x1024. 
    /// For DALL-E 3 images, use: 1024x1024, 1792x1024, or 1024x1792.
    /// </param>
    /// <param name="style">The style of the image. Must be "vivid" or "natural".</param>
    /// <param name="quality">The quality of the image. Must be "standard", "hd", or "high".</param>
    /// <param name="responseFormat">
    /// The format of the response. Must be one of the following: "url", "uri", "b64_json", or "bytes".
    /// </param>
    /// <param name="cancellationToken">A token to monitor for cancellation requests.</param>
    /// <returns>
    /// A task that represents the asynchronous operation. The task result contains a read-only list of 
    /// <see cref="ImageContent"/> objects representing the generated images.
    /// </returns>
    public static Task<IReadOnlyList<ImageContent>> GetOpenAIImageContentAsync(this ITextToImageService imageService,
        TextContent input,
        Kernel? kernel = null,
        (int width, int height) size = default((int, int)), // for Dall-e-2 images, use: 256x256, 512x512, or 1024x1024. For dalle-3 images, use: 1024x1024, 1792x1024, 1024x1792. 
        string style = "vivid",
        string quality = "hd",
        string responseFormat = "b64_json",        
        CancellationToken cancellationToken = default)
    {
        
        string? currentModelId = imageService.GetModelId();

        if (currentModelId != "dall-e-3" && currentModelId != "dall-e-2")
        {
            throw new NotSupportedException("This method is only supported for the DALL-E 2 and DALL-E 3 models.");
        }

        if (size.width == 0 || size.height == 0)
        {
            size = (1024, 1024); //defaulting here to (1024, 1024).
        }

        if (currentModelId == "dall-e-2"){
            var supportedSizes = new[]{
                (256, 256),
                (512, 512),
                (1024, 1024)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 2, the size must be one of: 256x256, 512x512, or 1024x1024.");
            }
        }
        else if (currentModelId == "dall-e-3")
        {
            var supportedSizes = new[]{
                (1024, 1024),
                (1792, 1024),
                (1024, 1792)
            };
            if (!supportedSizes.Contains(size))
            {
                throw new ArgumentException("For DALL-E 3, the size must be one of: 256x256, 512x512, or 1024x1024.");
            }
        }

        return imageService.GetImageContentsAsync(
            input,
            new OpenAITextToImageExecutionSettings
                {
                    Size = size,
                    Style = style, //must be "vivid" or "natural"
                    Quality = quality, //must be "standard" or "hd" or "high"
                    ResponseFormat = responseFormat // url or uri or b64_json or bytes
                },
            kernel,
            cancellationToken);

    }
}

Screenshot of this demo, console app running:

The console app will generate the dall-e-3 image using OpenAI service for this and save the image as a PNG image and save it into file saved into a temporary location and then open this image using Windows default image viewer application. Example image generated :

Sunday, 8 December 2024

Extending Azure AI Search with data sources

This article will present both code and tips around getting Azure AI Search to utilize additional data sources. The article builds upon the previous article in the blog:

https://toreaurstad.blogspot.com/2024/12/azure-ai-openai-chat-gpt-4-client.html

This code will use Open AI Chat GPT-4 together with additional data source. I have tested this using Storage account in Azure which contains blobs with documents. First off, create Azure AI services if you do not have this yet.

Then create an Azure AI Search

Choose the location and the Pricing Tier. You can choose the Free (F) pricing tier to test out the Azure AI Search. The standard pricing tier comes in at about 250 USD per month, so a word of caution here as billing might incur if you do not choose the Free tier. Head over to the Azure AI Search service after it is crated and note inside the Overview the Url. Expand the Search management and choose the folowing menu options and fill out them in this order:

Data sources
Indexes
Indexers

There are several types of data sources you can add.

Azure Blog Storage
Azure Data Lake Storage Gen2
Azure Cosmos DB
Azure SQL Database
Azure Table Storage
Fabric OneLake files

Upload files to the blob container

I have tested out adding a data source using Azure Blob Storage. I had to create a new storage account and I believe Azure might have changed it over the years, so for best compability, add a brand new storage account. Then choose a blob container inside the Blob storage, then hit the Create button.
Head over to your Storage browser inside your storage account, then choose Blob container. You can add a Blob container and then after it is created, click the Upload button.
You can then upload multiple files into the blob container (it is like a folder, which saves your files as blobs).

Setting up the index

After the Blob storage (storage account) is added to the data source, choose the Indexes menu button inside Azure AI search. Click Add index.
After the index is added, choose the button Add field
Add a field name called : Edit.String of type Edm.String.
Click the checkbox for Retrievable and Searchable. Click the button Save

Setting up the indexer

Choose to add an Indexer via button Add indexer
Choose the Index you added
Choose the Data source you added
Select the indexed extensions and specify which file types to index. Probably you should select text based files here, such as .md and .markdown files and even some binary file type such as .pdf and .docx can be selected here
Data to extract: Choose Content and metadata

Source code for this article

The source code can be cloned from this Github repo:
br /> https://github.com/toreaurstadboss/OpenAIDemo.git

The code for this article is available in the branch:
feature/openai-search-documentsources To add the data source to our ChatClient instance, we do the following. Please note that this method will be changed in the Azure AI SDK in the future :



            ChatCompletionOptions? chatCompletionOptions = null;
            if (dataSources?.Any() == true)
            {
                chatCompletionOptions = new ChatCompletionOptions();

                foreach (var dataSource in dataSources!)
                {
#pragma warning disable AOAI001 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
                    chatCompletionOptions.AddDataSource(new AzureSearchChatDataSource()
                    {
                        Endpoint = new Uri(dataSource.endpoint),
                        IndexName = dataSource.indexname,
                        Authentication = DataSourceAuthentication.FromApiKey(dataSource.authentication)
                    });
#pragma warning restore AOAI001 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
                }

            }

The updated version of the extension class of OpenAI.Chat.ChatClient then looks like this: ChatClientExtensions.cs



using Azure.AI.OpenAI.Chat;
using OpenAI.Chat;
using System.ClientModel;
using System.Text;

namespace ToreAurstadIT.OpenAIDemo
{
    public static class ChatclientExtensions
    {

        /// <summary>
        /// Provides a stream result from the Chatclient service using AzureAI services.
        /// </summary>
        /// <param name="chatClient">ChatClient instance</param>
        /// <param name="message">The message to send and communicate to the ai-model</param>
        /// <returns>Streamed chat reply / result. Consume using 'await foreach'</returns>
        public static AsyncCollectionResult<StreamingChatCompletionUpdate> GetStreamedReplyAsync(this ChatClient chatClient, string message,
            (string endpoint, string indexname, string authentication)[]? dataSources = null)
        {
            ChatCompletionOptions? chatCompletionOptions = null;
            if (dataSources?.Any() == true)
            {
                chatCompletionOptions = new ChatCompletionOptions();

                foreach (var dataSource in dataSources!)
                {
#pragma warning disable AOAI001 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
                    chatCompletionOptions.AddDataSource(new AzureSearchChatDataSource()
                    {
                        Endpoint = new Uri(dataSource.endpoint),
                        IndexName = dataSource.indexname,
                        Authentication = DataSourceAuthentication.FromApiKey(dataSource.authentication)
                    });
#pragma warning restore AOAI001 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
                }

            }

            return chatClient.CompleteChatStreamingAsync(
                [new SystemChatMessage("You are an helpful, wonderful AI assistant"), new UserChatMessage(message)], chatCompletionOptions);
        }

        public static async Task<string> GetStreamedReplyStringAsync(this ChatClient chatClient, string message, (string endpoint, string indexname, string authentication)[]? dataSources = null, bool outputToConsole = false)
        {
            var sb = new StringBuilder();
            await foreach (var update in GetStreamedReplyAsync(chatClient, message, dataSources))
            {
                foreach (var textReply in update.ContentUpdate.Select(cu => cu.Text))
                {
                    sb.Append(textReply);
                    if (outputToConsole)
                    {
                        Console.Write(textReply);
                    }
                }
            }
            return sb.ToString();
        }

    }
}

The updated code for the demo app then looks like this, I chose to just use tuples here for the endpoint, index name and api key:

ChatpGptDemo.cs



using OpenAI.Chat;
using OpenAIDemo;
using System.Diagnostics;

namespace ToreAurstadIT.OpenAIDemo
{
    public class ChatGptDemo
    {

        public async Task<string?> RunChatGptQuery(ChatClient? chatClient, string msg)
        {
            if (chatClient == null)
            {
                Console.WriteLine("Sorry, the demo failed. The chatClient did not initialize propertly.");
                return null;
            }

            Console.WriteLine("Searching ... Please wait..");

            var stopWatch = Stopwatch.StartNew();

            var chatDataSources = new[]{
                (
                    SearchEndPoint: Environment.GetEnvironmentVariable("AZURE_SEARCH_AI_ENDPOINT", EnvironmentVariableTarget.User) ?? "N/A",
                    SearchIndexName: Environment.GetEnvironmentVariable("AZURE_SEARCH_AI_INDEXNAME", EnvironmentVariableTarget.User) ?? "N/A",
                    SearchApiKey: Environment.GetEnvironmentVariable("AZURE_SEARCH_AI_APIKEY", EnvironmentVariableTarget.User) ?? "N/A"
                )
            };

            string reply = "";

            try
            {

                reply = await chatClient.GetStreamedReplyStringAsync(msg, dataSources: chatDataSources, outputToConsole: true);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }

            Console.WriteLine($"The operation took: {stopWatch.ElapsedMilliseconds} ms");


            Console.WriteLine();

            return reply;
        }

    }
}

The code here expects that three user-specific environment variables exists. Please note that the API key can be found under the menu item Keys in Azure AI Search. There are two admin keys and multiple query keys. To distribute keys to other users, you of course share the API query key, not the admin key(s). The screenshot below shows the demo. It is a console application, it could be web application or other client :

Please note that the Free tier of Azure AI Search is rather slow and seems to only allow queryes at a certain interval, it will suffice to just test it out. To really test it out in for example an Intranet scenario, the standard tier Azure AI search service is recommended, at about 250 USD per month as noted.

Conclusions

Getting an Azure AI Chat service to work in intranet scenarios using a combination of Open AI Chat GPT-4 together with a custom collection of files that are indexed offers a nice combination of building up a knowledge base which you can query against. It is rather convenient way of building an on-premise solution for intranet AI chat service using Azure cloud services.

Coding Grounds