Coding Grounds

Monday, 15 July 2024

Caching pure functions using Memoize in C#

This article presents a technique for caching pure functions in C# called memoization. It is programmatic caching of a pure method or function, that is, one that always returns the same or an equivalent result for a given input. Memoization helps scalability: if the method takes a long time to run, caching lets us avoid spending the resources again and return the answer quicker. If your method has side effects, or does not yield the same or an equivalent result (cosmetic changes ignored) for a given set of parameters, it should not be memoized. But if it does, here is how you can do it. Note that memoization is a general technique from functional programming and is used in many languages, for example JavaScript, where the Underscore.js library provides a memoize function. First off, let's define some POCOs to test the memoize function. We will use a small sample set of movies, with their actors and additional information, from the fabulous year 1997.

MovieStore.cs



public class MovieStore {
    public string GetActorsByMovieTitle(string movieTitle)
    {
        Console.WriteLine($"Retrieving actors for movie with title {movieTitle} at: {DateTime.Now}");
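        List<Movie> movies1997 = System.Text.Json.JsonSerializer.Deserialize<List<Movie>>(movies1997json);
        // Case-insensitive title match; an unknown title yields an empty string instead of throwing
        Movie movie = movies1997
        	.FirstOrDefault(m => string.Equals(m.name, movieTitle, StringComparison.OrdinalIgnoreCase));
        string actors = string.Join(",", movie?.actors ?? new List<string>());
        return actors;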
    }   
    
    string movies1997json = """
[
{
  "name": "The Lost World: Jurassic Park",
  "year": 1997,
  "runtime": 129,
  "categories": [
    "adventure",
    "action",
    "sci-fi"
  ],
  "releasedate": "1997-05-23",
  "director": "Steven Spielberg",
  "writer": [
    "Michael Crichton",
    "David Koepp"
  ],
  "actors": [
    "Jeff Goldblum",
    "Julianne Moore",
    "Pete Postlethwaite"
  ],
  "storyline": "Four years after the failure of Jurassic Park on Isla Nublar, John Hammond reveals to Ian Malcolm that there was another island (\"Site B\") on which dinosaurs were bred before being transported to Isla Nublar. Left alone since the disaster, the dinosaurs have flourished, and Hammond is anxious that the world see them in their \"natural\" environment before they are exploited."
},
{
  "name": "The Fifth Element",
  "year": 1997,
  "runtime": 127,
  "categories": [
    "action",
    "adventure",
    "sci-fi"
  ],
  "releasedate": "1997-05-09",
  "director": "Luc Besson",
  "writer": [
    "Luc Besson",
    "Robert Mark Kamen"
  ],
  "actors": [
    "Bruce Willis",
    "Milla Jovovich",
    "Gary Oldman",
    "Chris Tucker",
    "Ian Holm",
    "Luke Perry",
    "Brion James",
    "Tommy Lister",
    "Lee Evans",
    "Charlie Creed-Miles",
    "John Neville",
    "John Bluthal",
    "Mathieu Kassovitz",
    "Christpher Fairbank"
  ],
  "storyline": "In the colorful future, a cab driver unwittingly becomes the central figure in the search for a legendary cosmic weapon to keep Evil and Mr. Zorg at bay."
} ,
{
  "name": "Starship Troopers",
  "year": 1997,
  "runtime": 129,
  "categories": [
    "action",
    "adventure",
    "sci-fi",
    "thriller"
  ],
  "releasedate": "1997-11-07",
  "director": "Paul Verhoeven",
  "writer": [
    "Edward Neumeier",
    "Robert A. Heinlein"
  ],
  "actors": [
    "Casper Van Dien",
    "Dina Meyer",
    "Denise Richards",
    "Jake Busey",
    "Neil Patrick Harris",
    "Clancy Brown",
    "Seth Gilliam",
    "Patrick Muldoon",
    "Michael Ironside"
  ],
  "storyline": "In the distant future, the Earth is at war with a race of giant alien insects. Little is known about the Bugs except that they are intent on the eradication of all human life. But there was a time before the war... A Mobile Infantry travels to distant alien planets to take the war to the Bugs. They are a ruthless enemy with only one mission: Survival of their species no matter what the cost..."
}
]
""";
}

Movie.cs



public class Movie
{
    public string name { get; set; }
    public int year { get; set; }
    public int runtime { get; set; }
    public List<string> categories { get; set; }
    public string releasedate { get; set; }
    public string director { get; set; }
    public List<string> writer { get; set; }
    public List<string> actors { get; set; }
    public string storyline { get; set; }
}

Let's suppose the method GetActorsByMovieTitle is called many times or takes a long time to compute. We want to cache it, that is, to memoize it. Memoize caches the results in a simple, short-term manner; to persist memoized results for a long duration, we would use some other caching service, such as a database or a Redis cache. The cache works across sequential calls within the scope that holds the memoized function; if that scope is, for example, a singleton, the results stay cached in memory for as long as the singleton lives. The memoization itself is shown below.

FunctionalExtensions.cs



public static class FunctionalExtensions
{
	// Every call to Memoize creates its own dictionary, which is captured by the returned func
	public static Func<T1, TOut> Memoize<T1, TOut>(this Func<T1, TOut> @this, Func<T1, string> keyGenerator)
	{
		var dict = new Dictionary<string, TOut>();
		return x =>
		{
			string key = keyGenerator(x);
			if (!dict.ContainsKey(key))
			{
				dict.Add(key, @this(x));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, TOut> Memoize<T1, T2, TOut>(this Func<T1, T2, TOut> @this, Func<T1, T2, string> keyGenerator)
	{
		var dict = new Dictionary<string, TOut>();
		return (x, y) =>
		{
			string key = keyGenerator(x, y);
			if (!dict.ContainsKey(key))
			{
				dict.Add(key, @this(x, y));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, T3, TOut> Memoize<T1, T2, T3, TOut>(this Func<T1, T2, T3, TOut> @this, Func<T1, T2, T3, string> keyGenerator)
	{
		var dict = new Dictionary<string, TOut>();
		return (x, y, z) =>
		{
			string key = keyGenerator(x, y, z);
			if (!dict.ContainsKey(key))
			{
				dict.Add(key, @this(x, y, z));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, T3, T4, TOut> Memoize<T1, T2, T3, T4, TOut>(this Func<T1, T2, T3, T4, TOut> @this, Func<T1, T2, T3, T4, string> keyGenerator)
	{
		var dict = new Dictionary<string, TOut>();
		return (x, y, z, w) =>
		{
			string key = keyGenerator(x, y, z, w);
			if (!dict.ContainsKey(key))
			{
				dict.Add(key, @this(x, y, z, w));
			}
			return dict[key];
		};
	}
}

As we see above, each Memoize overload uses a dictionary. A new dictionary is created on every call to Memoize and captured by the func that is returned, so each memoized function gets its own cache. We must also supply a keyGenerator function that specifies how to build a unique key for each result from the given parameters. Note that each overload returns a func that returns TOut and accepts the same parameters as the func passed in: T1, or T1,T2, or T1,T2,T3, or T1,T2,T3,T4. Expanding the methods above to, for example, 16 parameters would be fairly easy, since the code shows the pattern for adding more parameters. I believe you should avoid methods with more than 7 parameters, but the pattern should be clear. Okay, next up is an example of how we can use this memoize function in the main method.

Program.cs



void Main()
{
    var movieStore = new MovieStore();
    
    //string actors = movieStore.GetActorsByMovieTitle("Starship troopers");
    //actors.Dump("Starship Troopers - Actors");
    //
    //Demo of memoized function
    
    var GetActorsByMovieTitle = ((string movieTitle) => movieStore.GetActorsByMovieTitle(movieTitle));
    var GetActorsByMovieTitleM = GetActorsByMovieTitle.Memoize(x => x);
    
    var starShipTroopersActors1 = GetActorsByMovieTitleM("Starship troopers");
    starShipTroopersActors1.Dump("Starship troopers - Call to method #1 time");
    var starShipTroopersActors2 = GetActorsByMovieTitleM("Starship troopers");
    starShipTroopersActors2.Dump("Starship troopers - Call to method #2 time");
    var starShipTroopersActors3 = GetActorsByMovieTitleM("Starship troopers");
    starShipTroopersActors3.Dump("Starship troopers - Call to method #3 time");
}

Note that in the test case above we pass in one parameter T1 of type string, the movie title, and we first declare a func variable using a lambda. We have to do the memoization in two declarations here, and we use the convention of suffixing the memoized function with 'M' for 'Memoize'.

Program.cs



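void Main()
{
    var movieStore = new MovieStore();
    // Declaration 1: a typed lambda wrapping the instance method
    var GetActorsByMovieTitle = ((string movieTitle) => movieStore.GetActorsByMovieTitle(movieTitle));
    // Declaration 2: the memoized version, keyed on the movie title itself
    var GetActorsByMovieTitleM = GetActorsByMovieTitle.Memoize(x => x);
}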
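
The overloads with more parameters are used the same way; the key generator just has to combine all arguments into one unambiguous key. Below is a minimal sketch of that idea. The names MemoizeSketch, TwoParameterDemo, Describe and Calls are made up for this illustration and are not part of the code above, and the two-parameter Memoize is repeated only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeSketch
{
    // Same idea as the two-parameter Memoize overload, repeated so the sketch is self-contained
    public static Func<T1, T2, TOut> Memoize<T1, T2, TOut>(this Func<T1, T2, TOut> @this, Func<T1, T2, string> keyGenerator)
    {
        var dict = new Dictionary<string, TOut>();
        return (x, y) =>
        {
            string key = keyGenerator(x, y);
            if (!dict.TryGetValue(key, out TOut result))
            {
                result = @this(x, y);
                dict.Add(key, result);
            }
            return result;
        };
    }
}

public static class TwoParameterDemo
{
    // Hypothetical counter, only here to show how often the wrapped function really runs
    public static int Calls;

    // Hypothetical pure function taking two parameters
    public static string Describe(string title, int year)
    {
        Calls++;
        return $"{title} ({year})";
    }

    public static int Run()
    {
        Calls = 0;
        Func<string, int, string> describe = Describe;

        // Lower-casing makes titles that differ only in casing share one cache entry.
        // The separator keeps ("ab", 12) and ("ab1", 2) from producing the same key.
        var describeM = describe.Memoize((title, year) => $"{title.ToLowerInvariant()}|{year}");

        describeM("Starship Troopers", 1997); // computed
        describeM("starship troopers", 1997); // same key, served from the cache
        describeM("Starship Troopers", 1998); // different key, computed
        return Calls;                         // 2
    }
}
```

Note that the second call gets back the string produced by the first call, with the casing of the first title. That is the kind of "equivalent result" the key generator decides on, so it should only merge inputs whose results you really consider interchangeable.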

A Console.WriteLine has been added to the memoized method to check how many times the method is actually called and how many times the cached result is returned instead. A run in LINQPad 7 is shown in the screenshot below, showing that the output is cached correctly. Note that if we wanted a thread-safe implementation, we could use ConcurrentDictionary instead, for example. The following methods show how we can do this. We exchanged Dictionary with ConcurrentDictionary and exchanged Add with the TryAdd method of ConcurrentDictionary. Keep in mind that ContainsKey followed by TryAdd is not one atomic step, so two threads that miss the cache at the same time can both run the wrapped function; only the first result is stored, which is harmless for a pure function.

Program.cs



	public static Func<T1, TOut> MemoizeV2<T1, TOut>(this Func<T1, TOut> @this, Func<T1, string> keyGenerator)
	{
		var dict = new ConcurrentDictionary<string, TOut>();
		return x =>
		{
			string key = keyGenerator(x);
			if (!dict.ContainsKey(key))
			{
				dict.TryAdd(key, @this(x));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, TOut> MemoizeV2<T1, T2, TOut>(this Func<T1, T2, TOut> @this, Func<T1, T2, string> keyGenerator)
	{
		var dict = new ConcurrentDictionary<string, TOut>();
		return (x, y) =>
		{
			string key = keyGenerator(x, y);
			if (!dict.ContainsKey(key))
			{
				dict.TryAdd(key, @this(x, y));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, T3, TOut> MemoizeV2<T1, T2, T3, TOut>(this Func<T1, T2, T3, TOut> @this, Func<T1, T2, T3, string> keyGenerator)
	{
		var dict = new ConcurrentDictionary<string, TOut>();
		return (x, y, z) =>
		{
			string key = keyGenerator(x, y, z);
			if (!dict.ContainsKey(key))
			{
				dict.TryAdd(key, @this(x, y, z));
			}
			return dict[key];
		};
	}
	public static Func<T1, T2, T3, T4, TOut> MemoizeV2<T1, T2, T3, T4, TOut>(this Func<T1, T2, T3, T4, TOut> @this, Func<T1, T2, T3, T4, string> keyGenerator)
	{
		var dict = new ConcurrentDictionary<string, TOut>();
		return (x, y, z, w) =>
		{
			string key = keyGenerator(x, y, z, w);
			if (!dict.ContainsKey(key))
			{
				dict.TryAdd(key, @this(x, y, z, w));
			}
			return dict[key];
		};
	}
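
If the wrapped function is expensive enough that it should run at most once per key even under contention, one common approach is to store Lazy values in the ConcurrentDictionary and use GetOrAdd. The following is a sketch of that variation, not the code used in this article; MemoizeLazy, MemoizeLazySketch and MemoizeLazyDemo are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class MemoizeLazySketch
{
    // Hypothetical variant of MemoizeV2 for one parameter
    public static Func<T1, TOut> MemoizeLazy<T1, TOut>(this Func<T1, TOut> @this, Func<T1, string> keyGenerator)
    {
        var dict = new ConcurrentDictionary<string, Lazy<TOut>>();
        return x =>
        {
            // GetOrAdd may create several Lazy wrappers under contention, but only one is stored
            // and handed out, and ExecutionAndPublication lets only one thread run its factory.
            Lazy<TOut> lazy = dict.GetOrAdd(
                keyGenerator(x),
                _ => new Lazy<TOut>(() => @this(x), LazyThreadSafetyMode.ExecutionAndPublication));
            return lazy.Value;
        };
    }
}

public static class MemoizeLazyDemo
{
    public static int Run()
    {
        int calls = 0;
        Func<string, int> slowLength = s =>
        {
            Interlocked.Increment(ref calls); // count real invocations in a thread-safe way
            Thread.Sleep(50);                 // simulate expensive work
            return s.Length;
        };

        var slowLengthM = slowLength.MemoizeLazy(s => s);

        // Eight parallel calls with the same key
        Parallel.For(0, 8, _ => { slowLengthM("Starship Troopers"); });
        return calls; // 1
    }
}
```

The price is one extra Lazy allocation per key; for cheap pure functions the simpler ContainsKey and TryAdd version above is usually good enough.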

Hopefully, memoize, or the process of memoization, is clearer now. It is a call-based caching technique, preferably used for pure functions or methods that give the same or an equivalent result for a given set of input parameters: we memoize the function or method and cache its results. When used inside, for example, a singleton, we can cache for a longer time in memory and achieve performance boosts. You could of course do the same using a static variable, but the memoize technique is more general purpose and is a pattern used in many programming languages. F# generally has much better support for functional programming than C#, but actually lacks built-in memoization. Other languages do support memoization out of the box, such as Python and Lisp. The following screenshot shows a run of the memoization above; I used ConcurrentDictionary when I tested.

Tuesday, 2 July 2024

Maybe Monad in C# - Guarding against nulls

This article takes a closer look at the Maybe monad in C#. It is used to guard against nulls and is one of the best-known monads in functional programming. The Maybe monad is also called Option or Optional in many places. The screenshot below shows the allowed state transitions.

First off, records will be used since they support immutability out of the box.

Maybe.cs



public abstract record Maybe<T>();
public record Nothing<T> : Maybe<T>;
public record UnhandledNothing<T> : Nothing<T>;
public record Something<T>(T Value) : Maybe<T>;
public record Error<T>(Exception CapturedError) : Maybe<T>;
public record UnhandledError<T>(Exception CapturedError) : Error<T>(CapturedError);

As we see, the base record Maybe of T is abstract and can be one of several subtypes. Nothing means there is no value contained inside the 'container', or monad. Something of T means a value is inside the 'container'. This value can be null, but the container wrapping the value is not null. Hence, by sticking to monads such as Maybe, we avoid null issues in the code that lives outside the container and accesses it. The Maybe container is no magic bullet, but it makes it easier to avoid null issues in the parts of your code where it is used. Let's next look at extension methods for the Maybe types defined in the records above.

MaybeExtensions.cs



public static class MaybeExtensions
{

	/// <summary>ToMaybe operates as a Return function in functional programming, lifting a normal value into a monadic value</summary>
	public static Maybe<T> ToMaybe<T>(this T @this)
	{		
		if (!EqualityComparer<T>.Default.Equals(@this, default))
		{
			return new Something<T>(@this);
		}
		else if (@this != null && Nullable.GetUnderlyingType(@this.GetType()) == null && @this.GetType().IsPrimitive){
			// Non-nullable primitive types holding their default value are considered to have content and become Something<T>, for example int value 0 or bool value false
			return new Something<T>(@this);
		}		
		return new Nothing<T>();
	}
	
	/// <summary>TryGetValue is similar to the Reduce method in functional programming, but also signals with a boolean flag whether the retrieval of the value was successful</summary>
	public static bool TryGetValue<T>(this Maybe<T> @this, out T value){
	 	value = @this switch {
			Something<T> s => s.Value,
			_ => default(T)			
		};
		return @this is Something<T>;
	}
	
	/// <summary>Call method Bind first to get correct behavior</summary>
	public static Maybe<T> OnSomething<T>(this Maybe<T> @this, Action<T> actionOnSomething){
		if (@this is Something<T> s){
			actionOnSomething(s.Value);
		}		
		return @this;
	}
	
	/// <summary>Call method Bind first to get correct behavior</summary>
	public static Maybe<T> OnNothing<T>(this Maybe<T> @this, Action actionOnNothing){
		if (@this is UnhandledNothing<T>){
			actionOnNothing();
			return new Nothing<T>(); //switch from UnhandledNothing<T> to Nothing<T>
		}
		return @this;
	}
	
	/// <summary>Call method Bind first to get correct behavior</summary>
	public static Maybe<T> OnError<T>(this Maybe<T> @this, Action<Exception> actionOnError){
		if (@this is UnhandledError<T> e){
			actionOnError(e.CapturedError);
			return new Error<T>(e.CapturedError); //switch from UnhandledError<T> to Error<T>
		}
		return @this;
	}
		
	/// <summary>Bind is similar to Map in functional programming, it applies a function to update and allow switching from TIn to TOut data types, possibly different types</summary>
	public static Maybe<TOut> Bind<TIn, TOut>(this Maybe<TIn> @this, Func<TIn, TOut> f){
		try
		{
			Maybe<TOut> updatedMaybe = @this switch {
			    null => new Error<TOut>(new Exception("Object input is null")),
				Something<TIn> s when !EqualityComparer<TIn>.Default.Equals(s.Value, default) => new Something<TOut>(f(s.Value)),
				Something<TIn> s when @this.GetType().GetGenericArguments().First().IsPrimitive && Nullable.GetUnderlyingType(@this.GetType()) == null => new Something<TOut>(f(s.Value)),
				Something<TIn> _ => new UnhandledNothing<TOut>(),
				UnhandledNothing<TIn> _ => new UnhandledNothing<TOut>(),
				Nothing<TIn> _ => new Nothing<TOut>(),
				UnhandledError<TIn> u => new UnhandledError<TOut>(u.CapturedError),
				Error<TIn> e => new Error<TOut>(e.CapturedError),
				_ => new Error<TOut>(new Exception($"Got a subtype of Maybe<T>, which is not supported: {@this?.GetType().Name}"))				
			};
			return updatedMaybe;			
		}
		catch (Exception ex)
		{			
			return new UnhandledError<TOut>(ex);
		}		
	}
		
}

The extension methods above handle the basics for the Maybe of T monad.

We transform a value of T into a Maybe of T using the method ToMaybe; this method is called Return in many places. I think the name ToMaybe is more intuitive for most C# developers.
The method TryGetValue is called Reduce in many places. It extracts the value of the Maybe of T if one is available and returns a boolean telling whether the Maybe of T actually contains a value. If the Maybe of T is not of type Something of T, it does not hold a value, so this flag is useful to check after retrieving the value.
The method Bind is sometimes called Map. It allows updating the value inside the Maybe of T and also performs the transitions between subtypes of Maybe of T, using pattern matching in C#. Bind both maps the value and controls the overall flow of state by choosing the subtype of Maybe of T, as shown in the patterns inside Bind. (In stricter functional terminology, a method taking a function from TIn to TOut, like this one, is a Map, while a Bind takes a function that itself returns a Maybe of TOut.)
The methods OnError, OnNothing and OnSomething are callbacks for running logic when the Maybe of T is one of these subtypes. Make sure you call the method Bind first.

If you know of more methods a Maybe of T monad should support, please let me know. A more detailed example of a Maybe of T monad can be seen in the implementation below; note that it is called Option of T there.

https://github.com/nlkl/Optional/blob/master/src/Optional/Option_Maybe.cs
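
One method that Maybe implementations often add is a bind in the classic sense, which chains a function that itself returns a Maybe, so that results do not nest as Maybe of Maybe of T. Below is a minimal sketch of what that could look like. It is not part of the code in this article: the name FlatMap, the helpers ParseInt and Half, and the class MaybeFlatMapSketch are assumptions for the illustration, and only two of the record subtypes are repeated to keep the sketch self-contained.

```csharp
using System;

// Trimmed copies of the records from Maybe.cs, just enough for this sketch
public abstract record Maybe<T>();
public record Nothing<T> : Maybe<T>;
public record Something<T>(T Value) : Maybe<T>;

public static class MaybeFlatMapSketch
{
    // Hypothetical addition: f decides itself whether the next step has a value
    public static Maybe<TOut> FlatMap<TIn, TOut>(this Maybe<TIn> @this, Func<TIn, Maybe<TOut>> f) =>
        @this switch
        {
            Something<TIn> s when s.Value is not null => f(s.Value),
            _ => new Nothing<TOut>()
        };

    // Hypothetical helper: parsing can fail, so it returns a Maybe instead of throwing
    public static Maybe<int> ParseInt(string text) =>
        int.TryParse(text, out int number) ? new Something<int>(number) : new Nothing<int>();

    // Hypothetical helper: only even numbers can be halved without a remainder
    public static Maybe<int> Half(int number) =>
        number % 2 == 0 ? new Something<int>(number / 2) : new Nothing<int>();
}
```

With these helpers, MaybeFlatMapSketch.ParseInt("42").FlatMap(n => MaybeFlatMapSketch.Half(n)) gives a Something of int holding 21, while the inputs "7" and "abc" both end up as Nothing of int, because one of the steps had no value to pass on.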

The benefit of the code in this article is that it is shorter and that it uses records in C#. Since it is shorter, it should be easier to adapt to your needs. Demo code is next. Consider this record:

Car.cs



public record Car(string Model, string Make, bool? IsFourWheelDrive = null, string? Color = null);

Program.cs



void Main()
{
	var something = new Something<string>("hullo");
	something = something with { Value = null };
	
	var somethingnull = EqualityComparer<string>.Default.Equals(something.Value, default);
	
	int? nullableNumber = 1;
	Maybe<int?> maybe = nullableNumber.ToMaybe();

	var volvo = new Car("240 GL", "Volvo");

	bool isValueRetrieved = volvo.ToMaybe()
		.Bind(car => car with { IsFourWheelDrive = false })
		.Bind(car => car with { Color = "Blue" })
		.TryGetValue(out Car updatedVolvo);

	new { updatedVolvo, isValueRetrieved }.Dump("Updated Volvo");
	
	Maybe<string> noCar = new Something<string>(null)
							.Bind(x => x)
							.OnNothing(() => Console.WriteLine("The Car is nothing (null)!"));
	noCar.Dump("No car");
		
	maybe.Dump("Maybe");
	
	something.Dump();
	
}

The output from LINQPad shows the result of running the demo code. Note that since a record Car was used in the demo code, the Bind calls had to use a 'with' expression, which leaves the original record untouched and creates a new record with the necessary adjustments. Also, it is important to call Bind before the handlers OnNothing, OnError and OnSomething for them to work properly, since the transition rules inside Bind must run first to determine the correct subtype of Maybe of T. Many implementations of Option of T, or Maybe of T, also implement operators for equality comparison of the value inside the Maybe of T if there is a defined value there (i.e. Something of T). The mentioned example on GitHub goes into detail on this and adds several methods for equality comparisons. There are also Maybe of T implementations out there that support async methods such as BindAsync. Many of the building blocks of functional programming in C# are not standardized or built into the framework, so working out these building blocks yourself and tailoring them to your code's needs is probably a good solution, if you are not going for a library such as LanguageExt.

https://github.com/louthy/language-ext

Finally, notice that in the screenshot below, the variable something in the demo code has a Value of null. To get the proper subtype of Maybe of T, we should also call the Bind method. Instead of:


  something.Dump();

do this:


  something.Bind(x => x).Dump();

We get the correct subtype of Maybe of T, UnhandledNothing of T.


And if we also handle the UnhandledNothing of T, we get Nothing of T.

something.Bind(x => x).OnNothing(() => Console.WriteLine("something is null")).Dump();
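
The error states follow the same pattern as the nothing states. The sketch below walks through that transition under some assumptions: Bind is trimmed down to the Something and error cases, the records are repeated so the block compiles on its own, and the class name MaybeErrorSketch and the method Demo are made up for the illustration.

```csharp
using System;

// Trimmed copies of the records from Maybe.cs
public abstract record Maybe<T>();
public record Something<T>(T Value) : Maybe<T>;
public record Error<T>(Exception CapturedError) : Maybe<T>;
public record UnhandledError<T>(Exception CapturedError) : Error<T>(CapturedError);

public static class MaybeErrorSketch
{
    // Trimmed Bind: the null and nothing cases of the full version are left out here
    public static Maybe<TOut> Bind<TIn, TOut>(this Maybe<TIn> @this, Func<TIn, TOut> f)
    {
        try
        {
            Maybe<TOut> updatedMaybe = @this switch
            {
                Something<TIn> s => new Something<TOut>(f(s.Value)),
                UnhandledError<TIn> u => new UnhandledError<TOut>(u.CapturedError),
                Error<TIn> e => new Error<TOut>(e.CapturedError),
                _ => new Error<TOut>(new Exception("Unsupported subtype of Maybe<T>"))
            };
            return updatedMaybe;
        }
        catch (Exception ex)
        {
            // An exception thrown by f is captured instead of escaping the chain
            return new UnhandledError<TOut>(ex);
        }
    }

    // Same logic as OnError in MaybeExtensions.cs
    public static Maybe<T> OnError<T>(this Maybe<T> @this, Action<Exception> actionOnError)
    {
        if (@this is UnhandledError<T> e)
        {
            actionOnError(e.CapturedError);
            return new Error<T>(e.CapturedError); // switch from UnhandledError<T> to Error<T>
        }
        return @this;
    }

    public static Maybe<int> Demo()
    {
        Maybe<string> input = new Something<string>("not a number");
        return input
            .Bind(text => int.Parse(text)) // throws FormatException, captured as UnhandledError<int>
            .Bind(number => number * 2)    // f is not run, the unhandled error is passed along
            .OnError(ex => Console.WriteLine($"Handled: {ex.GetType().Name}")); // prints "Handled: FormatException"
    }
}
```

Demo returns an Error of int, not an UnhandledError of int, because OnError has seen the exception. Calling OnError a second time on the result would not invoke the callback again.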

Sunday, 30 June 2024

Fork combinator revisited - supporting multiple part functions in C#

This article shows an example of a Fork combinator, or 'monad', that lets you specify multiple part functions, which are run in sequence on the same input, and a join function that operates on all the part results. First off, we define a simple Map function that maps a value to another value, possibly of another type. Then we define the Fork combinator. The code below is very simple and short; it uses LINQ's Select method to collect the part results. LINQ is functional too, so this is how we build up functional building blocks in C#: using LINQ, Func and generics (and pattern matching and more).

Combinators.cs



public static class Combinators {
	
	public static TOut Map<TIn, TOut>(this TIn @this, Func<TIn, TOut> f) => f(@this);
	
	public static TOut Fork<TIn, TMiddle, TOut>(this TIn @this, Func<IEnumerable<TMiddle>, TOut> joinFunc,
		params Func<TIn, TMiddle>[] partFuncs) => partFuncs.Select(pf => pf(@this)).Map(joinFunc);
	
}

Let's look at a simple demo of how to use this:

Program.cs



void Main()
{
	string hello = "hhhhhheeeeeeeelllllllllllooooo";

	// Each part function counts one letter; the join function sums the part results
	int sumOfLettersToLookFor = hello.Fork(results => (int)results.Sum(),
				x => (double)x.Count(l => l == 'h'),
				x => (double)x.Count(l => l == 'e'),
				x => (double)x.Count(l => l == 'l'),
				x => (double)x.Count(l => l == 'o'));

	sumOfLettersToLookFor.Dump();
}
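
The join function does not have to treat all part results alike. As a further sketch, an average can be expressed as a sum part and a count part that are joined by division. ForkDemo and Average are hypothetical names for this illustration, and the Combinators class is repeated only so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Combinators
{
    public static TOut Map<TIn, TOut>(this TIn @this, Func<TIn, TOut> f) => f(@this);

    public static TOut Fork<TIn, TMiddle, TOut>(this TIn @this, Func<IEnumerable<TMiddle>, TOut> joinFunc,
        params Func<TIn, TMiddle>[] partFuncs) => partFuncs.Select(pf => pf(@this)).Map(joinFunc);
}

public static class ForkDemo
{
    // Hypothetical example: the join function relies on the part results arriving
    // in the same order as the part functions were passed in (sum first, then count)
    public static double Average(int[] numbers) =>
        numbers.Fork(
            parts =>
            {
                double[] p = parts.ToArray();
                return p[0] / p[1];
            },
            xs => (double)xs.Sum(),
            xs => (double)xs.Length);
}
```

For example, ForkDemo.Average(new[] { 2, 4, 9 }) divides the sum 15 by the count 3 and returns 5. Since all part functions must share the same TMiddle type, both parts are cast to double here, just like in the demo above.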

Functional programming has many of these monads that are very short and allow you to do combinations that would be lengthy and stateful in the procedural / object-oriented way, but are elegant and short in the functional world. Finally, a screenshot from LINQPad 7 showing that the code above works (with a reference to Jerry Seinfeld to the right, for those who know their Seinfeld episodes):