Sunday, 24 July 2022

Generic repository pattern for Azure Cosmos DB

I have looked into Azure Cosmos DB to learn a bit about this schemaless database in Azure cloud. It is a powerful 'document database' which saves and loads data inside 'containers' in databases in Azure. You can work against this database strongly typed in C# by for example creating a repository pattern. The code I have made is in a class library which you can clone from here:
 
 git clone https://github.com/toreaurstadboss/AzureCosmosDbRepositoryLib.git
 

To get started with Azure Cosmos DB, you must create a user first that is against Azure Cosmos DB. This will be your 'db user' in the cloud of course. When you start up Azure Cosmos DB, select the data explorer tab to view your data, where you can enter manual queries and look at data (and manipulate it). Note that there are already more official packages for repository pattern .NET SDK available by David Pine, which you should consider using as shown here: https://docs.microsoft.com/en-us/events/azure-cosmos-db-azure-cosmos-db-conf/a-deep-dive-into-the-cosmos-db-repository-pattern-dotnet-sdk However, I have also published and pushed a simple GitHub repo I have created here which might be easier to get started with and understand. My goal anyways was to have a learning experience myself with testing out Azure Cosmos DB. The Github repo is available here: https://github.com/toreaurstadboss/AzureCosmosDbRepositoryLib The methods of the repository is listed inside IRepository :
 
  using AzureCosmosDbRepositoryLib.Contracts;
using Microsoft.Azure.Cosmos;
using System.Linq.Expressions;

namespace AzureCosmosDbRepositoryLib;


/// <summary>
/// Repository pattern for Azure Cosmos DB
/// </summary>
public interface IRepository<T> where T : IStorableEntity
{

    /// <summary>
    /// Adds an item to container in DB. 
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="item"></param>
    /// <returns></returns>
    Task<ISingleResult<T>?> Add(T item);

    /// <summary>
    /// Retrieves an item to container in DB. Param <paramref name="partitionKey"/> and param <paramref name="id"/> should be provided.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="id"></param>
    /// <returns></returns>
    Task<ISingleResult<T>?> Get(IdWithPartitionKey id);

    /// <summary>
    /// Searches for a matching items by predicate (where condition) given in <paramref name="searchRequest"/>.
    /// </summary>
    /// <param name="searchRequest"></param>
    /// <returns></returns>
    Task<ICollectionResult<T>?> Find(ISearchRequest<T> searchRequest);

    /// <summary>
    /// Searches for a matching items by predicate (where condition) given in <paramref name="searchRequest"/>.
    /// </summary>
    /// <param name="searchRequest"></param>
    /// <returns></returns>
    Task<ISingleResult<T>?> FindOne(ISearchRequest<T> searchRequest);

    /// <summary>
    /// Removes an item from container in DB. Param <paramref name="partitionKey"/> and param <paramref name="id"/> should be provided.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="partitionKey"></param>
    /// <param name="id"></param>
    /// <returns></returns>
    Task<ISingleResult<T>?> Remove(IdWithPartitionKey id);

    /// <summary>
    /// Removes items from container in DB. Param <paramref name="ids"/> must be provided.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="partitionKey"></param>
    /// <param name="id"></param>
    /// <returns></returns>
    Task<ICollectionResult<T>?> RemoveRange(List<IdWithPartitionKey> ids);

    /// <summary>
    /// Adds a set of items to container in DB. A shared partitionkey is used and the items are added inside a transaction as a single operation.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="items"></param>
    /// <param name="partitionKey"></param>
    /// <returns></returns>
    Task<ICollectionResult<T>?> AddRange(IDictionary<PartitionKey, T> items);

    /// <summary>
    /// Adds or updates items via 'Upsert' method in container in DB. 
    /// </summary>
    /// <param name="item"></param>
    /// <returns></returns>
    Task<ICollectionResult<T>?> AddOrUpdateRange(IDictionary<PartitionKey, T> items);


    /// <summary>
    /// Adds or updates an item via 'Upsert' method in container in DB. 
    /// </summary>
    /// <param name="item"></param>
    /// <returns></returns>
    Task<ISingleResult<T>?> AddOrUpdate(T item);

    /// <summary>
    /// Retrieves results paginated of page size. Looks at all items of type <typeparamref name="T"/> in the container. Send in a null value for continuationToken in first request and then use subsequent returned continuation tokens to 'sweep through' the paged data divided by <paramref name="pageSize"/>.
    /// </summary>
    /// <param name="pageSize"></param>
    /// <param name="continuationToken"></param>
    /// <param name="sortDescending">If true, sorting descending (sorting via LastUpdate property so newest items shows first)</param>
    /// <returns></returns>
    Task<IPaginatedResult<T>?> GetAllPaginated(int pageSize, string? continuationToken = null, bool sortDescending = false, Expression<Func<T, object>>[]? sortByMembers = null);

    /// <summary>
    /// On demand method exposed from exposing this respository on demands. Frees up resources such as CosmosClient object inside.
    /// </summary>
    void Dispose();

    /// <summary>
    /// Returns name of database in Azure Cosmos DB
    /// </summary>
    /// <returns></returns>
    string? GetDatabaseName();

    /// <summary>
    /// Returns Container id inside database in Azure Cosmos DB
    /// </summary>
    /// <returns></returns>
    string? GetContainerId(); 

}
  
 
 
Let's first look at retrieving the paginated results using a 'continuation token' which is how you do paging inside Azure Cosmos DB where this token is a 'bookmark'.
 
 
  public async Task<IPaginatedResult<T>?> GetAllPaginated(int pageSize, string? continuationToken = null, bool sortDescending = false,
        Expression<Func<T, object>>[]? sortByMembers = null)
    {
        string sortByMemberNames = sortByMembers == null ? "c.LastUpdate" :
            string.Join(",", sortByMembers.Select(x => "c." + x.GetMemberName()).ToArray()); 
        var query = new QueryDefinition($"SELECT * FROM c ORDER BY {sortByMemberNames} {(sortDescending ? "DESC" : "ASC")}".Trim()); //default query - will filter to type T via 'ItemQueryIterator<T>' 
        var queryRequestOptions = new QueryRequestOptions
        {
            MaxItemCount = pageSize
        };
        var queryResultSetIterator = _container.GetItemQueryIterator<T>(query, requestOptions: queryRequestOptions,
            continuationToken: continuationToken);
        var result = queryResultSetIterator.HasMoreResults ? await queryResultSetIterator.ReadNextAsync() : null;
        if (result == null)
            return null!;

        var sourceContinuationToken = result.ContinuationToken;
        var paginatedResult = new PaginatedResult<T>(sourceContinuationToken, result.Resource);
        return paginatedResult;

    }
 
 
We sort by LastUpdate member default and we send in the pagesize, sorting ascending default and allowing to specify sorting members. A helper method to get the name of the property expressions to use as sorting member is also used here. We get a query item iterator from the 'container' and then read the found items which is in the Resource property, all asynchronously. Note that we return the continutation token here in the end and we initially send in null as the continuation token. Each call to getting a new page will get a new continuation token so we can browse through the data in pages. When the continuation token is null, we have come to the end of the data. PaginatedResult looks like this:
 
 
namespace AzureCosmosDbRepositoryLib.Contracts
{

    public interface IPaginatedResult<T>
    {
        public IList<T> Items { get; }
        public string? ContinuationToken { get; set; }
    }

    public class PaginatedResult<T> : IPaginatedResult<T>
    {
        public PaginatedResult(string continuationToken, IEnumerable<T> items)
        {
            Items = new List<T>();
            if (items != null)
            {
                foreach (var item in items)
                {
                    Items.Add(item);
                }
            }
            ContinuationToken = continuationToken;
        }
        public IList<T> Items { get; private set; }
        public string? ContinuationToken { get; set; }
    }
}

 
Another thing in the lib is the contract IStorableEntity, which is a generic interface of type T, which defined a Id property - note the usage of JsonProperty attribute. Also, we set up a partition key for the item here.
 
 
using Microsoft.Azure.Cosmos;
using Newtonsoft.Json;

namespace AzureCosmosDbRepositoryLib.Contracts
{
    public interface IStorableEntity
    {
        [JsonProperty("id")]
        string Id { get; set; }

        PartitionKey? PartitionKey { get; }

        DateTime? LastUpdate { get; set; }
    }
} 
 
 
It is important to both have set the id and partitionkey when you save, update and delete items in container in Azure Cosmos DB so it works as expected. There are other methods in this repo as seen in the IRepository interface. The repository class will take care of creating the database and container in Azure Cosmos DB if required. Note also that it is important in intranet scenarios to set up Gateway connection mode. This is done default and the reason why this is done is because of firewall issues.
 
 
  private void InitializeDatabaseAndContainer(CosmosClientOptions? clientOptions, ThroughputProperties? throughputPropertiesForDatabase, bool defaultToUsingGateway)
    {
        _client = clientOptions == null ?
            defaultToUsingGateway ?
            new CosmosClient(_connectionString, new CosmosClientOptions
            {
                ConnectionMode = ConnectionMode.Gateway //this is the connection mode that works best in intranet-environments and should be considered as best compatible approach to avoid firewall issues
            }) :
            new CosmosClient(_connectionString) :
            new CosmosClient(_connectionString, _cosmosClientOptions);

        //Run initialization 
        if (throughputPropertiesForDatabase == null)
        {
            _database = Task.Run(async () => await _client.CreateDatabaseIfNotExistsAsync(_databaseName)).Result; //create the database if not existing (will go for default options regarding scaling)
        }
        else
        {
            _database = Task.Run(async () => await _client.CreateDatabaseIfNotExistsAsync(_databaseName, throughputPropertiesForDatabase)).Result; //create the database if not existing - specify specific through put options
        }

        // The container we will create.  
        _container = Task.Run(async () => await _database.CreateContainerIfNotExistsAsync(_containerId, _partitionKeyPath)).Result;
    } 
 
 
Another example using another iterator than the item query generator is the linq query generator. This is used inside the Find method :
 
 public async Task<ICollectionResult<T>?> Find(ISearchRequest<T>? searchRequest)
    {
        if (searchRequest?.Filter == null)
            return await Task.FromResult<ICollectionResult<T>?>(null);
        var linqQueryable = _container.GetItemLinqQueryable<T>();
        var stopWatch = Stopwatch.StartNew();
        try
        {
            using var feedIterator = linqQueryable.Where(searchRequest.Filter).ToFeedIterator();
            while (feedIterator.HasMoreResults)
            {
                var items = await feedIterator.ReadNextAsync();
                var result = BuildSearchResultCollection(items.Resource);
                result.ExecutionTimeInMs = stopWatch.ElapsedMilliseconds;
                return result;
            }
        }
        catch (Exception err)
        {
            return await Task.FromResult(BuildSearchResultCollection(err));
        }
        return await Task.FromResult<ICollectionResult<T>?>(null);
    } 
 
 
 
 
using System.Linq.Expressions;
namespace AzureCosmosDbRepositoryLib.Contracts
{
    public interface ISearchRequest<T>
    {
        Expression<Func<T, bool>>? Filter { get; set; }
    }

    public class SearchRequest<T> : ISearchRequest<T>
    {
        public Expression<Func<T, bool>>? Filter { get; set; }
    }
} 
 
Note that this lib is using Microsoft.Azure.Cosmos of version 3.29. There are differences between the major version obviously, so the methods shown here applies to Azure Cosmos DB 3.x. This is the only nuget package this lib requires. You should consider the SDK that David Pine created, but if you want to create a repository pattern against Azure Cosmos DB - then maybe you find my repository pattern a starting point and you can borrow some code from it. One final note - I had troubles doing a batch insert in the lib for a type of T using transactions in Azure Cosmos DB, that this seems to require a common partition key - it ended up in a colission. So the AddRange method in the lib is not batched with one partition but done sequentially looping through the items for now.. Other than that - the lib should work for some core usages in ordinary scenarios. The lib should log errors a bit better too, so the lib is primarily for demonstration usages and showing essential CRUD operations in Azure Cosmos DB. Note that the connection string should be saved into a appsettings.json file for example or in dev environments consider using dotnet user secrets as I have done so we do not expose secrets to source control. The connection string is shown under the 'Keys' tab in the Azure cosmos. You will look for 'primary connetion string' here as this is how you connect to your database and container(s), where data resides. Use the 'data explorer' tab to work with the data.

1 comment: