Monday, 7 October 2024

Partition methods for collections in C#

This article will look at some partition methods for collections in C#, specifically List<T>, ConcurrentDictionary<TKey, TValue> and Dictionary<TKey, TValue>

Definition of partitioning: Partitioning consists of splitting up a collection {n1, n2, .. nk } into partitions of size P = C , where C is a positive constant integer. The last partition will consist of [0, C], the last C elements.

Example: A list of 100 elements will be partition by size 30, giving four partitions : 1: 0-29 2: 30-59 3: 60-89 4: 90-99

Note that partition 4 only got 9 elements.

Let's head over to some code.

The partition methods are the following :



using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class CollectionExtensions
{

    public static IEnumerable<IList<T&t;> Partition<T>(this IList<T> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Count / (double)size); i++)
        {
            yield return new List<T>(source.Skip(i * size).Take(size));
        }
    }

    public static IEnumerable<Dictionary<TKey, TValue&t;> Partition<TKey, TValue>(this IDictionary<TKey, TValue> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Keys.Count / (double)size); i++)
        {
            yield return new Dictionary<TKey, TValue>(source.Skip(i * size).Take(size));
        }
    }

    public static IEnumerable<ConcurrentDictionary<TKey, TValue> Partition<TKey, TValue>(this ConcurrentDictionary<TKey, TValue> source, int size)
    {
        for (int i = 0; i < Math.Ceiling(source.Keys.Count / (double)size); i++)
        {
            yield return new ConcurrentDictionary<TKey, TValue>(source.Skip(i * size).Take(size));
        }
    }

}




These three methods are very similar. An example usage is shown below. We partition a ConcurrentDictionary, for example one consisting of 200,000 key value pairs into partitions by size 50,000. This will produce a total of four partitions which are then processed at parallell.

Make note that even though you can partition a ConcurrentDictionary into multiple concurrent dictionaries consist after partitioning, the simpler approach code at the bottom of the method was quicker when I tested it out. There are a lot of pitfalls when it comes to concurrent programming.

The key takeaway from this article was how you can partition a collection into multiple partitions, this will enable you to do "Divide and Conquer" strategy when it comes to collections to partition labor among several threads in parallell.



	static int Enumerate(ConcurrentDictionary<int, int> dict)
	{
		//var stopWatch = Stopwatch.StartNew();

		var dicts = dict.Partition(dict.Count / 4).ToList();

		//Console.WriteLine(dicts.ElementAt(0).Count());
		//Console.WriteLine($"Partitioning took: {stopWatch.ElapsedMilliseconds} ms");

		int total = 0;

		Parallel.For(0, 4, (i) =>
		{
			int subTotal = 0;
			var curDict = dicts.ElementAt(i);
			//int count = curDict.Count;
			//Console.WriteLine($"Number in curDict : {count}");
			foreach (var item in curDict)
			{
				Interlocked.Add(ref subTotal, item.Value);
			}
			Interlocked.Add(ref total, subTotal);
		});

		return total;
        
        //Simpler approach :

		//int expectedTotal = dict.Count;

		//int total = 0;
		//Parallel.ForEach(dict, keyValPair =>
		//	 {
		//		 //int count = dict.Count;
		//		 Interlocked.Add(ref total, keyValPair.Value);
		//	 });
		//return total;
	}


Monday, 30 September 2024

Generic alternate lookup for Dictionary in .NET 9

Alternate lookup for Dictionary in .NET 9 demo

This repo contains code that shows how an alternate lookup of dictionaries can be implemented in .NET 9. A generic alternate equality comparer is also included. Alternate lookups of dictionaries allows you to take control how you can look up values in a dictionaries in a custom manner. Usually, we use a simple key for a dictionary, such as an int. In case you instead have keys that are complex objects such as class instances, having a custom way of defining alternate lookup gives more flexibility. In the generic equality comparer, a key expression is provided, where a member expression is expected. You can for example have a class Person where you could use a property Id of type Guid and use that key to look up values in a dictionary that uses Person as a key. The code below and sample code demonstrates how it can be used.

Now, would you use this in .NET ? You can utilize usage of Spans, allowing increased performance for dictionary lookups. Also you can use this technique to more collections, such as HashSet, ConcurrentDictionary, FrozenDictionary and FrozenSet. The generic alternate equality comparer looks like this :


using System.Linq.Expressions;
using LookupDictionaryOptimized;


namespace LookupDictionaryOptimized
{
    public class AlternateEqualityComparer<T, TKey> : IEqualityComparer<T>, IAlternateEqualityComparer<TKey, T>
        where T : new()
    {
        private readonly Expression<Func<T, TKey>> _keyAccessor;

        private TKey GetKey(T obj) => _keyAccessor.Compile().Invoke(obj);

        public AlternateEqualityComparer(Expression<Func<T, TKey>> keyAccessor)
        {
            _keyAccessor = keyAccessor;
        }

        public AlternateEqualityComparer<T, TKey> Instance
        {
            get
            {
                return new AlternateEqualityComparer<T, TKey>(_keyAccessor);
            }
        }

        T IAlternateEqualityComparer<TKey, T>.Create(TKey alternate)
        {
            //create a dummy default instance if the requested key is not contained in the dictionary
            return Activator.CreateInstance<T>();
        }

        public bool Equals(T? x, T? y)
        {
            if (x == null && y == null)
            {
                return true;
            }
            if ((x == null && y != null) || (x != null && y == null))
            {
                return false;
            }
            TKey xKey = GetKey(x!);
            TKey yKey = GetKey(y!);
            return xKey!.Equals(yKey);
        }

        public int GetHashCode(T obj) => GetKey(obj)?.GetHashCode() ?? default;

        public int GetHashCode(TKey alternate) => alternate?.GetHashCode() ?? default;

        public bool Equals(TKey alternate, T other)
        {
            if (alternate == null && other == null)
            {
                return true;
            }
            if ((alternate == null && other != null) || (alternate != null && other == null))
            {
                return false;
            }
            TKey otherKey = GetKey(other);
            return alternate!.Equals(otherKey);
        }
    }

}

The demo below shows how to use this. When instantiating the dictionary, it is possibe to set the IEqualityComparer. You can at the same time implement IAlternateEqualityComparer. The generic class above does this for you, and an instance of this comparer is passed into the dictionary as an argument upon creation. A lookup can then be stored into a variable
using the GetAlternateLookup method.

Note about this demo code below. We could expand and allow multiple members or any custom logic when defining alternate equality lookup. But the code below only expects one key property. To get more control of the alterate lookup, you must write an equality and alternate equality comparer manually, but much of the plumbing code could be defined in a generic manner.

For example, we could define a compound key such as a ReadonlySpan of char or a string where we combine the key properties we want to use. Such a generic alternate equality comparer could expect a params of key properties and then build a compound key. It is possible here to to use HashCode.Combine method for example. I might look in to such an implementation later on, for example demo how to use TWO properties for a lookup or even consider a Func<bool> method to define as the equality comparison method. But quickly , the gains of a such a generic mechanism might become counteractive opposed to just writing an equality comparer and alternate comparer manually.

The primary motivation of alternate dictionary lookup is actually performance, as the alternate lookup allows to make more use of Spans and avoid a lot of allocations and give improved performance.


    /// <summary>
    /// Based from inspiration of nDepend blog article : https://blog.ndepend.com/alternate-lookup-for-dictionary-and-hashset-in-net-9/
    /// </summary>
    public static class DemoAlternateLookupV2
    {
        public static void RunGenericDemo()
        {
            var paul = new Person("Paul", "Jones");
            var joey = new Person("Joey", "Green");
            var laura = new Person("Laura", "Bridges");

            var mrX = new Person("Mr", "X"); //this object is not added to the dictionary

            AlternateEqualityComparer<Person, Guid> personComparer = new AlternateEqualityComparer<Person, Guid>(m => m.Id);

            var dict = new Dictionary<Person, int>(personComparer.Instance)
            {
                { paul, 11 },
                { joey, 22 },
                { laura, 33 }
            };

            var lauraId = laura.Id;
            //Dictionary<Person, int>.AlternateLookup<Guid> lookup = dict.GetAlternateLookup<Guid>();  Easier : just use var on left hand side

            var lookup = dict.GetAlternateLookup<Guid>();
            int lookedUpPersonId = lookup[lauraId];

            Console.WriteLine($"Retrieved a Dictionary<Person,Guid> value via alternate lookup key: {lauraId}.\nThe looked up value is: {lookedUpPersonId}");
            lookedUpPersonId.Should().Be(33);
            Console.WriteLine($"Expected value retrieved. OK.");

            Console.WriteLine("Testing also to look for a person not contained in the dictionary");

            bool lookedUpNonExistingPersonFound = lookup.ContainsKey(mrX.Id);
            Console.WriteLine($"Retrieved a Dictionary<Person,Guid> value via alternate lookup key: {mrX.Id}.\nThe looked up value found : {lookedUpNonExistingPersonFound}");

        }

    }

The generic alternate equality comparer requires a public parameterless constructor. Also, the provided keyExpression for the key - the property of the class which will serve as the alternate lookup. The Person class looks like this :



 namespace LookupDictionaryOptimized
{
    public class Person
    {

        public Person(string firstName, string lastName)
        {
            FirstName = firstName;
            LastName = lastName;
        }

        public Person()
        {
            FirstName = string.Empty;
            LastName = string.Empty;
            Id = Guid.Empty;
        }

        public string FirstName { get; set; }
        public string LastName { get; set; }
        public Guid Id { get; set; } = Guid.NewGuid();
    }
}

Output below:



Retrieved a Dictionary<Person,Guid> value via alternate lookup key: 5b2b1d28-c024-4b76-8cdd-2717c42dc7f8.
The looked up value is: 33
Expected value retrieved. OK.
Testing also to look for a person not contained in the dictionary
Retrieved a Dictionary<Person,Guid> value via alternate lookup key: 6ae6f259-14a6-4960-889b-15f33aab4ec0.
The looked up value found : False
Hit the any key to continue..

More about alternate lookups can be read in this nDepend blog article: https://blog.ndepend.com/alternate-lookup-for-dictionary-and-hashset-in-net-9/

Monday, 23 September 2024

Looking up gMSA accounts in Active Directory

A short article with tips how you can find gMSA accounts in Active Directory or AD. First off, import the module ActiveDirectory. Then you can run the following snippet to find some gMsa accounts of which you know part of the name of.


Import-Module ActiveDirectory
Get-ADObject -Filter {ObjectClass -eq "msDS-GroupManagedServiceAccount"  -and Name -Like '*SomeGmsa*' }  -Properties DistinguishedName,  SamAccountName | Select DistinguishedName, SamAccountName 


This yields the results:


DistinguishedName                                                   SamAccountName  
-----------------                                                   --------------  
CN=gMSA1DVSomeGmsa,CN=Managed Service Accounts,DC=someacme,DC=org        MSA1DVSomeAcmeP$    
CN=gMSA1_gMSA1DGmsaPT,CN=Managed Service Accounts,DC=someacme,DC=org     MSA1gMSA1DVSomeAcme$
CN=gMSA1_DVSomeGmsaPT,CN=Managed Service Accounts,DC=someacme,DC=org     MSA1DVSomeAcmePT$   

You can search for gMSA users in AD like this:


Import-Module ActiveDirectory 

Get-ADServiceAccount -Filter "Name -like '*SomeGmsa*'"


This should yield a list of matching gMSA users with given name :

You can also ask for all properties of Gmsa users using -Properties with * :



Import-Module ActiveDirectory
Get-ADObject -Filter {ObjectClass -eq "msDS-GroupManagedServiceAccount"  -and Name -Like '*SomeGmsa*' }  -Properties *