Saturday, 16 November 2024

Url encoding base 64 strings in .NET 9

This article shows new functionality how to url encode base 64 strings in .NET 9. In .NET 8 you would do multiple steps to url encode base 64 strings like this: Program.cs



using System.Buffers.Text;
using System.Net;
using System.Text;
using System.Text.Encodings.Web;

byte[] data = Encoding.UTF8.GetBytes("Hello there, how yall doin");
var base64 = Convert.ToBase64String(data);
var base64UrlEncoded = WebUtility.UrlEncode(base64);

Console.WriteLine(base64UrlEncoded);


We here first convert the string to a bytes in a byte array and then we base 64 encode the byte array into a Base64 string. Finally we url encode the string into a URL safe string. Let's see how simple this is in .NET 9 : Program.cs (v2)



using System.Buffers.Text;
using System.Net;
using System.Text;
using System.Text.Encodings.Web;

byte[] data = Encoding.UTF8.GetBytes("Hello there, how yall doin");
var base64UrlEncodedInNet9 = Base64Url.EncodeToString(data);

Console.WriteLine(base64UrlEncodedInNet9);


If we use ImplicitUsings here in the .csproj file the code above just becomes :

byte[] data = Encoding.UTF8.GetBytes("Hello there, how yall doin");
var base64UrlEncodedInNet9 = Base64Url.EncodeToString(data);
Console.WriteLine(base64UrlEncodedInNet9);

This shows we can skip the intermediate step where we first convert the bytes into a base64-string and then into a Url safe string and instead do a base-64 encoding and then an url encoding in one go. This way is more optimized, it is also possible here to use ReadOnlySpan (that works for both .NET 8 and .NET 9). Putting together we get:

using System.Buffers.Text;
using System.Net;
using System.Text;
using System.Text.Encodings.Web;

ReadOnlySpan data = Encoding.UTF8.GetBytes("Hello there, how yall doin");
var base64 = Convert.ToBase64String(data);
var base64UrlEncoded = WebUtility.UrlEncode(base64);

var base64UrlEncodedInNet9 = Base64Url.EncodeToString(data);

Console.WriteLine(base64UrlEncoded);
Console.WriteLine(base64UrlEncodedInNet9);

The output is the following :

SGVsbG8gdGhlcmUsIGhvdyB5YWxsIGRvaW4%3D
SGVsbG8gdGhlcmUsIGhvdyB5YWxsIGRvaW4

As we can see, the .NET 9 Base64.UrlEncode skips the the padding characters, so beware of that.



Note that by omitting the padding, it is necessary to pad the base 64 url encoded string if you want to decode it. Consider this helpful extension method to add the necessary padding:


/// <summary>
/// Provides extension methods for Base64 encoding operations.
/// </summary>
public static class Base64Extensions
{
    /// <summary>
    /// Adds padding to a Base64 encoded string to ensure its length is a multiple of 4.
    /// </summary>
    /// <param name="base64">The Base64 encoded string without padding.</param>
    /// <param name="isUrlEncode">Set to true if this is URL encode, will add instead '%3D%' as padding at the end (0-2 such padding chars, same for '=').</param>
    /// <returns>The Base64 encoded string with padding added, or the original string if it is null or whitespace.</returns>
    public static string? AddPadding(this string base64, bool isUrlEncode = false)
    {
        string paddedBase64 = !string.IsNullOrWhiteSpace(base64) ? base64.PadRight(base64.Length + (4 - (base64.Length % 4)) % 4, '=') : base64;
        return !isUrlEncode ? paddedBase64 : paddedBase64?.Replace("=", "%3D");
    }    
}


We can now achieve the same output with this extension method :



using System.Buffers.Text;
using System.Net;
using System.Text;
using System.Text.Encodings.Web;

ReadOnlySpan data = Encoding.UTF8.GetBytes("Hello there, how yall doin");
var base64 = Convert.ToBase64String(data);
var base64UrlEncoded = WebUtility.UrlEncode(base64);

var base64UrlEncodedInNet9 = Base64Url.EncodeToString(data);

// Using the extension method to add padding
base64UrlEncodedInNet9 = base64UrlEncodedInNet9.AddPadding(isUrlEncode: true);

Console.WriteLine(base64UrlEncoded);
Console.WriteLine(base64UrlEncodedInNet9);


Finally, the output using the two different approaching pre .NET 9 and .NET 9 gives the same results:

SGVsbG8gdGhlcmUsIGhvdyB5YWxsIGRvaW4%3D
SGVsbG8gdGhlcmUsIGhvdyB5YWxsIGRvaW4%3D

Monday, 28 October 2024

Enumerating concurrent collections with snapshots in C#

In standard collections in C#, it is not allowed to alter collections you iterate upon using foreach for example, since it throws InvalidOperationException - Collection was modified; enumeration operation may not execute. Concurrent collections can be altered while being iterated. This is the default behavior, allow concurrent behavior while iterating - as locking the entire concurrent collection is costly. You can however enforce a consistent way of iterating the concurrent collection by making a snapshot of it. For concurrent dictionaries, we use the ToArray method.


	var capitals = new ConcurrentDictionary<string, string>{
		["Norway"] = "Oslo",
		["Denmark"] = "Copenhagen",
		["Sweden"] = "Stockholm",
		["Faroe Islands"] = "Torshamn",
		["Finland"] = "Helsinki",
		["Iceland"] = "Reykjavik"
	};

	//make a snapshot of the concurrent dictionary first 
	
	var capitalsSnapshot = capitals.ToArray();
	
	//do some modifications
	
	foreach (var capital in capitals){
		capitals[capital.Key] = capital.Value.ToUpper();
	}

	foreach (var capital in capitalsSnapshot)
	{
		Console.WriteLine($"The capital in {capital.Key} is {capital.Value}");
	}

This outputs:


The capital in Denmark is Copenhagen
The capital in Sweden is Stockholm
The capital in Faroe Islands is Torshamn
The capital in Norway is Oslo
The capital in Finland is Helsinki
The capital in Iceland is Reykjavik  



The snapshot of the concurrent collection was not modified by the modifications done. Let's look at the concurrent collection again and iterate upon it.


	foreach (var capital in capitals)
	{
		Console.WriteLine($"The capital in {capital.Key} is {capital.Value}");
	}

This outputs:


Enumerate capitals in concurrent array - just enumerating with ToArray() - elements can be changed while enumerating. Faster, but more unpredictable
The capital in Denmark is COPENHAGEN
The capital in Sweden is STOCKHOLM
The capital in Faroe Islands is TORSHAMN
The capital in Norway is OSLO
The capital in Finland is HELSINKI
The capital in Iceland is REYKJAVIK



As we can see, the concurrent dictionary has modified its contents and this shows that we can get modifications upon iterating collections. If you do want to get consistent results, using a snapshot should be desired. But note that this will lock the entire collection and involve costly operations of copying the contents. If you do do concurrent collection snapshots, keep the number of snapshots to a minimum and iterate upon these snapshots, preferable only doing one snapshot in one single place in the method for the specific concurrent dictionary.

Monday, 7 October 2024

Partition methods for collections in C#

This article will look at some partition methods for collections in C#, specifically List<T>, ConcurrentDictionary<TKey, TValue> and Dictionary<TKey, TValue>

Definition of partitioning: Partitioning consists of splitting up a collection {n1, n2, .. nk } into partitions of size P = C , where C is a positive constant integer. The last partition will consist of [0, C], the last C elements.

Example: A list of 100 elements will be partition by size 30, giving four partitions : 1: 0-29 2: 30-59 3: 60-89 4: 90-99

Note that partition 4 only got 9 elements.

Let's head over to some code.

The partition methods are the following :



using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class CollectionExtensions
{

    public static IEnumerable<IList<T&t;> Partition<T>(this IList<T> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Count / (double)size); i++)
        {
            yield return new List<T>(source.Skip(i * size).Take(size));
        }
    }

    public static IEnumerable<Dictionary<TKey, TValue&t;> Partition<TKey, TValue>(this IDictionary<TKey, TValue> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Keys.Count / (double)size); i++)
        {
            yield return new Dictionary<TKey, TValue>(source.Skip(i * size).Take(size));
        }
    }

    public static IEnumerable<ConcurrentDictionary<TKey, TValue> Partition<TKey, TValue>(this ConcurrentDictionary<TKey, TValue> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Keys.Count / (double)size); i++)
        {
            yield return new ConcurrentDictionary<TKey, TValue>(source.Skip(i * size).Take(size));
        }
    }

}




These three methods are very similar. An example usage is shown below. We partition a ConcurrentDictionary, for example one consisting of 200,000 key value pairs into partitions by size 50,000. This will produce a total of four partitions which are then processed at parallell.

Make note that even though you can partition a ConcurrentDictionary into multiple concurrent dictionaries consist after partitioning, the simpler approach code at the bottom of the method was quicker when I tested it out. There are a lot of pitfalls when it comes to concurrent programming.

The key takeaway from this article was how you can partition a collection into multiple partitions, this will enable you to do "Divide and Conquer" strategy when it comes to collections to partition labor among several threads in parallell.



	static int Enumerate(ConcurrentDictionary<int, int> dict)
	{
		//var stopWatch = Stopwatch.StartNew();

		var dicts = dict.Partition(dict.Count / 4).ToList();

		//Console.WriteLine(dicts.ElementAt(0).Count());
		//Console.WriteLine($"Partitioning took: {stopWatch.ElapsedMilliseconds} ms");

		int total = 0;

		Parallel.For(0, 4, (i) =>
		{
			int subTotal = 0;
			var curDict = dicts.ElementAt(i);
			//int count = curDict.Count;
			//Console.WriteLine($"Number in curDict : {count}");
			foreach (var item in curDict)
			{
				Interlocked.Add(ref subTotal, item.Value);
			}
			Interlocked.Add(ref total, subTotal);
		});

		return total;
        
        //Simpler approach :

		//int expectedTotal = dict.Count;

		//int total = 0;
		//Parallel.ForEach(dict, keyValPair =>
		//	 {
		//		 //int count = dict.Count;
		//		 Interlocked.Add(ref total, keyValPair.Value);
		//	 });
		//return total;
	}