https://github.com/toreaurstadboss/Image.Analyze.Azure.Ai
A screenshot of the demo is shown below. The demo lets you upload a picture (supported formats in the demo are .jpeg, .jpg and .png, although the Azure AI Image Analysis service supports many other image formats). The demo shows a preview of the selected image and, to the right, the same image overlaid with bounding boxes for the objects detected in it. A list of tags extracted from the image is also shown. Raw data from the Azure Image Analysis service is shown in the text box area below the pictures, with the list of tags to the right. The demo is written with .NET MAUI Blazor and .NET 6.
Let us look at some code for making this demo.
ImageSaveService.cs
using Image.Analyze.Azure.Ai.Models;
using Microsoft.AspNetCore.Components.Forms;

namespace Ocr.Handwriting.Azure.AI.Services
{
    public class ImageSaveService : IImageSaveService
    {
        public async Task<ImageSaveModel> SaveImage(IBrowserFile browserFile)
        {
            // Read the uploaded file into a byte buffer. A single ReadAsync call is not
            // guaranteed to fill the buffer, so copy the whole stream into a MemoryStream.
            using var stream = browserFile.OpenReadStream(maxAllowedSize: 30 * 1024 * 1024);
            using var memoryStream = new MemoryStream();
            await stream.CopyToAsync(memoryStream);
            byte[] buffers = memoryStream.ToArray();

            string imageType = browserFile.ContentType;
            var basePath = FileSystem.Current.AppDataDirectory;
            var imageSaveModel = new ImageSaveModel
            {
                SavedFilePath = Path.Combine(basePath, $"{Guid.NewGuid().ToString("N")}-{browserFile.Name}"),
                PreviewImageUrl = $"data:{imageType};base64,{Convert.ToBase64String(buffers)}",
                FilePath = browserFile.Name,
                FileSize = buffers.Length / 1024, // size in kB
            };
            await File.WriteAllBytesAsync(imageSaveModel.SavedFilePath, buffers);
            return imageSaveModel;
        }
    }
}
//Interface defined inside IImageSaveService.cs shown below
using Image.Analyze.Azure.Ai.Models;
using Microsoft.AspNetCore.Components.Forms;

namespace Ocr.Handwriting.Azure.AI.Services
{
    public interface IImageSaveService
    {
        Task<ImageSaveModel> SaveImage(IBrowserFile browserFile);
    }
}
The ImageSaveService reads the uploaded IBrowserFile via OpenReadStream and converts the image bytes into a base-64 data URL, which allows us to preview the uploaded image. The code also saves the image to the app data directory that MAUI exposes via FileSystem.Current.AppDataDirectory.
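The preview URL can be bound directly in the Razor markup. Below is a minimal sketch of how Index.razor could wire this up; the element ids PreviewImage and PreviewImageBbox are the ones the client-side script shown later expects, while the rest of the markup is an assumption and the version in the repository may differ:

<InputFile OnChange="OnInputFile" accept=".jpeg,.jpg,.png" />
<button @onclick="Submit">Analyze image</button>

@if (!string.IsNullOrEmpty(Model.PreviewImageUrl))
{
    <img id="PreviewImage" src="@Model.PreviewImageUrl" alt="Preview of the selected image" />
    <canvas id="PreviewImageBbox"></canvas>
}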
Let's look at how to call the analysis service itself; it is actually quite straightforward.
ImageAnalyzerService.cs
using Azure;
using Azure.AI.Vision.Common;
using Azure.AI.Vision.ImageAnalysis;

namespace Image.Analyze.Azure.Ai.Lib
{
    public class ImageAnalyzerService : IImageAnalyzerService
    {
        public ImageAnalyzer CreateImageAnalyzer(string imageFile)
        {
            // The key and endpoint are read from environment variables - never hard-code these in source code.
            string key = Environment.GetEnvironmentVariable("AZURE_COGNITIVE_SERVICES_VISION_SECONDARY_KEY");
            string endpoint = Environment.GetEnvironmentVariable("AZURE_COGNITIVE_SERVICES_VISION_SECONDARY_ENDPOINT");
            var visionServiceOptions = new VisionServiceOptions(new Uri(endpoint), new AzureKeyCredential(key));
            using VisionSource visionSource = CreateVisionSource(imageFile);
            var analysisOptions = CreateImageAnalysisOptions();
            var analyzer = new ImageAnalyzer(visionServiceOptions, visionSource, analysisOptions);
            return analyzer;
        }

        private static VisionSource CreateVisionSource(string imageFile)
        {
            // Read the saved image file into a byte buffer and wrap it in an ImageSourceBuffer.
            using var stream = File.OpenRead(imageFile);
            byte[] imageBuffer;
            using (var memoryStream = new MemoryStream())
            {
                stream.CopyTo(memoryStream);
                imageBuffer = memoryStream.ToArray();
            }
            using var imageSourceBuffer = new ImageSourceBuffer();
            imageSourceBuffer.GetWriter().Write(imageBuffer);
            return VisionSource.FromImageSourceBuffer(imageSourceBuffer);
        }

        private static ImageAnalysisOptions CreateImageAnalysisOptions() => new ImageAnalysisOptions
        {
            Language = "en",
            GenderNeutralCaption = false,
            Features =
                ImageAnalysisFeature.CropSuggestions
                | ImageAnalysisFeature.Caption
                | ImageAnalysisFeature.DenseCaptions
                | ImageAnalysisFeature.Objects
                | ImageAnalysisFeature.People
                | ImageAnalysisFeature.Text
                | ImageAnalysisFeature.Tags
        };
    }
}
//interface shown below
public interface IImageAnalyzerService
{
    ImageAnalyzer CreateImageAnalyzer(string imageFile);
}
We retrieve the key and endpoint from environment variables and create an ImageAnalyzer. We create a VisionSource from the saved picture we uploaded, opening a stream to it with the File.OpenRead method from System.IO.
Since we saved the file in the app data folder of the .NET MAUI app, we can read this file.
We set up the image analysis options and the vision service options, and then return the image analyzer.
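Both the ImageSaveService and the ImageAnalyzerService are resolved through dependency injection in the Blazor page shown next, so they have to be registered at startup. A minimal sketch of how this could look in MauiProgram.cs follows; the exact registrations and service lifetimes in the repository may differ:

using Image.Analyze.Azure.Ai.Lib;
using Ocr.Handwriting.Azure.AI.Services;

public static class MauiProgram
{
    public static MauiApp CreateMauiApp()
    {
        var builder = MauiApp.CreateBuilder();
        builder.UseMauiApp<App>();

        // Enable the BlazorWebView that hosts the Razor components.
        builder.Services.AddMauiBlazorWebView();

        // Register the services that Index.razor.cs depends on.
        builder.Services.AddSingleton<IImageSaveService, ImageSaveService>();
        builder.Services.AddSingleton<IImageAnalyzerService, ImageAnalyzerService>();

        return builder.Build();
    }
}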
Let's look at the code-behind of the Index.razor file, which initializes the image analyzer and runs its AnalyzeAsync method.
Index.razor.cs
using Azure.AI.Vision.ImageAnalysis;
using Image.Analyze.Azure.Ai.Extensions;
using Image.Analyze.Azure.Ai.Models;
using Microsoft.AspNetCore.Components.Forms;
using Microsoft.JSInterop;
using System.Text;

namespace Image.Analyze.Azure.Ai.Pages
{
    partial class Index
    {
        private IndexModel Model = new();

        //https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/call-analyze-image-40?WT.mc_id=twitter&pivots=programming-language-csharp

        private string ImageInfo = string.Empty;

        // ImageAnalyzerService, ImageSaveService and JsRunTime are injected services (e.g. via @inject in Index.razor).

        private async Task Submit()
        {
            if (Model.PreviewImageUrl == null || Model.SavedFilePath == null)
            {
                await Application.Current.MainPage.DisplayAlert("MAUI Blazor Image Analyzer App", "You must select an image first before running Image Analysis. Supported formats are .jpeg, .jpg and .png", "Ok", "Cancel");
                return;
            }
            using var imageAnalyzer = ImageAnalyzerService.CreateImageAnalyzer(Model.SavedFilePath);
            ImageAnalysisResult analysisResult = await imageAnalyzer.AnalyzeAsync();
            if (analysisResult.Reason == ImageAnalysisResultReason.Analyzed)
            {
                Model.ImageAnalysisOutputText = analysisResult.OutputImageAnalysisResult();
                Model.Caption = $"{analysisResult.Caption.Content} Confidence: {analysisResult.Caption.Confidence.ToString("F2")}";
                Model.Tags = analysisResult.Tags.Select(t => $"{t.Name} (Confidence: {t.Confidence.ToString("F2")})").ToList();
                var jsonBboxes = analysisResult.GetBoundingBoxesJson();
                await JsRunTime.InvokeVoidAsync("LoadBoundingBoxes", jsonBboxes);
            }
            else
            {
                ImageInfo = $"The image analysis did not perform its analysis. Reason: {analysisResult.Reason}";
            }
            StateHasChanged(); //visual refresh here
        }

        private async Task CopyTextToClipboard()
        {
            await Clipboard.SetTextAsync(Model.ImageAnalysisOutputText);
            await Application.Current.MainPage.DisplayAlert("MAUI Blazor Image Analyzer App", $"The copied text was put into the clipboard. Character length: {Model.ImageAnalysisOutputText?.Length}", "Ok", "Cancel");
        }

        private async Task OnInputFile(InputFileChangeEventArgs args)
        {
            var imageSaveModel = await ImageSaveService.SaveImage(args.File);
            Model = new IndexModel(imageSaveModel);
            await Application.Current.MainPage.DisplayAlert("MAUI Blazor Image Analyzer App", $"Wrote file to location: {Model.SavedFilePath} Size is: {Model.FileSize} kB", "Ok", "Cancel");
        }
    }
}
In the code-behind above we have a submit handler called Submit. There we analyze the image and send the result both to the UI and to a client-side JavaScript method using IJSRuntime in .NET MAUI Blazor.
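The IndexModel view model used above is not listed in this post. Reconstructed from how it is used in the code-behind, it could look roughly like the sketch below; this is an assumption, so the version in the repository may differ in property types and names:

namespace Image.Analyze.Azure.Ai.Models
{
    public class IndexModel
    {
        public IndexModel()
        {
        }

        // Copies values from the saved image so the page can show the preview and file info.
        public IndexModel(ImageSaveModel imageSaveModel)
        {
            PreviewImageUrl = imageSaveModel.PreviewImageUrl;
            SavedFilePath = imageSaveModel.SavedFilePath;
            FileSize = imageSaveModel.FileSize;
        }

        public string? PreviewImageUrl { get; set; }
        public string? SavedFilePath { get; set; }
        public long? FileSize { get; set; }
        public string? Caption { get; set; }
        public string? ImageAnalysisOutputText { get; set; }
        public List<string> Tags { get; set; } = new();
    }
}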
Let's look at the two helper methods of ImageAnalysisResult next.
ImageAnalysisResultExtensions.cs
using Azure.AI.Vision.ImageAnalysis;
using System.Text;

namespace Image.Analyze.Azure.Ai.Extensions
{
    public static class ImageAnalysisResultExtensions
    {
        public static string GetBoundingBoxesJson(this ImageAnalysisResult result)
        {
            var sb = new StringBuilder();
            sb.AppendLine(@"[");
            int objectIndex = 0;
            foreach (var detectedObject in result.Objects)
            {
                sb.Append($"{{ \"Name\": \"{detectedObject.Name}\", \"Y\": {detectedObject.BoundingBox.Y}, \"X\": {detectedObject.BoundingBox.X}, \"Height\": {detectedObject.BoundingBox.Height}, \"Width\": {detectedObject.BoundingBox.Width}, \"Confidence\": \"{detectedObject.Confidence:0.0000}\" }}");
                objectIndex++;
                if (objectIndex < result.Objects?.Count)
                {
                    sb.Append($",{Environment.NewLine}"); // comma between objects, but not after the last one
                }
                else
                {
                    sb.Append($"{Environment.NewLine}");
                }
            }
            sb.AppendLine(@"]");
            return sb.ToString();
        }
        public static string OutputImageAnalysisResult(this ImageAnalysisResult result)
        {
            var sb = new StringBuilder();
            if (result.Reason == ImageAnalysisResultReason.Analyzed)
            {
                sb.AppendLine($" Image height = {result.ImageHeight}");
                sb.AppendLine($" Image width = {result.ImageWidth}");
                sb.AppendLine($" Model version = {result.ModelVersion}");
                if (result.Caption != null)
                {
                    sb.AppendLine(" Caption:");
                    sb.AppendLine($" \"{result.Caption.Content}\", Confidence {result.Caption.Confidence:0.0000}");
                }
                if (result.DenseCaptions != null)
                {
                    sb.AppendLine(" Dense Captions:");
                    foreach (var caption in result.DenseCaptions)
                    {
                        sb.AppendLine($" \"{caption.Content}\", Bounding box {caption.BoundingBox}, Confidence {caption.Confidence:0.0000}");
                    }
                }
                if (result.Objects != null)
                {
                    sb.AppendLine(" Objects:");
                    foreach (var detectedObject in result.Objects)
                    {
                        sb.AppendLine($" \"{detectedObject.Name}\", Bounding box {detectedObject.BoundingBox}, Confidence {detectedObject.Confidence:0.0000}");
                    }
                }
                if (result.Tags != null)
                {
                    sb.AppendLine($" Tags:");
                    foreach (var tag in result.Tags)
                    {
                        sb.AppendLine($" \"{tag.Name}\", Confidence {tag.Confidence:0.0000}");
                    }
                }
                if (result.People != null)
                {
                    sb.AppendLine($" People:");
                    foreach (var person in result.People)
                    {
                        sb.AppendLine($" Bounding box {person.BoundingBox}, Confidence {person.Confidence:0.0000}");
                    }
                }
                if (result.CropSuggestions != null)
                {
                    sb.AppendLine($" Crop Suggestions:");
                    foreach (var cropSuggestion in result.CropSuggestions)
                    {
                        sb.AppendLine($" Aspect ratio {cropSuggestion.AspectRatio}: "
                            + $"Crop suggestion {cropSuggestion.BoundingBox}");
                    }
                }
                if (result.Text != null)
                {
                    sb.AppendLine($" Text:");
                    foreach (var line in result.Text.Lines)
                    {
                        string pointsToString = "{" + string.Join(',', line.BoundingPolygon.Select(point => point.ToString())) + "}";
                        sb.AppendLine($" Line: '{line.Content}', Bounding polygon {pointsToString}");
                        foreach (var word in line.Words)
                        {
                            pointsToString = "{" + string.Join(',', word.BoundingPolygon.Select(point => point.ToString())) + "}";
                            sb.AppendLine($" Word: '{word.Content}', Bounding polygon {pointsToString}, Confidence {word.Confidence:0.0000}");
                        }
                    }
                }
                var resultDetails = ImageAnalysisResultDetails.FromResult(result);
                sb.AppendLine($" Result details:");
                sb.AppendLine($" Image ID = {resultDetails.ImageId}");
                sb.AppendLine($" Result ID = {resultDetails.ResultId}");
                sb.AppendLine($" Connection URL = {resultDetails.ConnectionUrl}");
                sb.AppendLine($" JSON result = {resultDetails.JsonResult}");
            }
            else
            {
                var errorDetails = ImageAnalysisErrorDetails.FromResult(result);
                sb.AppendLine(" Analysis failed.");
                sb.AppendLine($" Error reason : {errorDetails.Reason}");
                sb.AppendLine($" Error code : {errorDetails.ErrorCode}");
                sb.AppendLine($" Error message: {errorDetails.Message}");
            }
            return sb.ToString();
        }
    }
}
Finally, let's look at the client-side JavaScript function that we call with the bounding boxes JSON to draw the boxes. We use the HTML5 canvas element to show the picture together with the bounding boxes of the objects found in the image.
index.html
<script type="text/javascript">

    var colorPalette = ["red", "yellow", "blue", "green", "fuchsia", "moccasin", "purple", "magenta", "aliceblue", "lightyellow", "lightgreen"];

    function rescaleCanvas() {
        var img = document.getElementById('PreviewImage');
        var canvas = document.getElementById('PreviewImageBbox');
        canvas.width = img.width;
        canvas.height = img.height;
    }

    function getColor() {
        var colorIndex = parseInt(Math.random() * 10);
        var color = colorPalette[colorIndex];
        return color;
    }

    function LoadBoundingBoxes(objectDescriptions) {
        if (objectDescriptions == null || objectDescriptions == false) {
            alert('did not find any objects in image. returning from calling load bounding boxes : ' + objectDescriptions);
            return;
        }
        var objectDesc = JSON.parse(objectDescriptions);
        //alert('calling load bounding boxes, starting analysis on clientside js : ' + objectDescriptions);
        rescaleCanvas();
        var canvas = document.getElementById('PreviewImageBbox');
        var img = document.getElementById('PreviewImage');
        var ctx = canvas.getContext('2d');
        ctx.drawImage(img, img.width, img.height);
        ctx.font = "10px Verdana";
        // Write the name and confidence of every detected object.
        for (var i = 0; i < objectDesc.length; i++) {
            ctx.beginPath();
            ctx.strokeStyle = "black";
            ctx.lineWidth = 1;
            ctx.fillText(objectDesc[i].Name, objectDesc[i].X + objectDesc[i].Width / 2, objectDesc[i].Y + objectDesc[i].Height / 2);
            ctx.fillText("Confidence: " + objectDesc[i].Confidence, objectDesc[i].X + objectDesc[i].Width / 2, 10 + objectDesc[i].Y + objectDesc[i].Height / 2);
        }
        // Draw a semi-transparent, randomly colored rectangle for each detected object.
        for (var i = 0; i < objectDesc.length; i++) {
            ctx.fillStyle = getColor();
            ctx.globalAlpha = 0.2;
            ctx.fillRect(objectDesc[i].X, objectDesc[i].Y, objectDesc[i].Width, objectDesc[i].Height);
            ctx.lineWidth = 3;
            ctx.strokeStyle = "blue";
            ctx.rect(objectDesc[i].X, objectDesc[i].Y, objectDesc[i].Width, objectDesc[i].Height);
            ctx.fillStyle = "black";
            ctx.fillText("Color: " + getColor(), objectDesc[i].X + objectDesc[i].Width / 2, 20 + objectDesc[i].Y + objectDesc[i].Height / 2);
            ctx.stroke();
        }
        // Finally draw the image at the origin (globalAlpha is still 0.2 here, so the image is drawn semi-transparently over the boxes).
        ctx.drawImage(img, 0, 0);
        console.log('got these object descriptions:');
        console.log(objectDescriptions);
    }

</script>
The index.html file in wwwroot is where we usually put extra CSS and JS for MAUI Blazor apps and Blazor apps. I have chosen to put the script directly into the index.html file rather than in a separate .js file, but moving it out is an option if you want to tidy up a bit more.
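If you prefer a separate file, the script block could for example be saved as wwwroot/js/boundingBoxes.js (a hypothetical file name) and referenced from index.html like this:

<script src="js/boundingBoxes.js"></script>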
So there you have it: we can relatively easily find objects in images using the Azure image analysis service in Azure Cognitive Services, and we can get tags and captions for the image. In the demo the caption is shown above the loaded picture.
The Azure computer vision service is really good, since it has been trained on a massive data set and can recognize a lot of different objects for different usages.
As you can see in the source code, I keep the key and endpoint in environment variables that the code expects to exist. Never expose keys and endpoints in your source code.