Monday, 7 June 2021

Json serialization using Utf8JsonReaderSerializer in .net core

.NET 5 and .net core contains a lot of new methods for Json functionality in the System.Text.Json namespace. I created a helper class for reading a file using Utf8JsonReaderSerializer and this just outputs the json to a formatted json string. With optimizations, the serialization could be done even faster. For now, I need to use a conversion between StringBuilder toString to remove last commas of arrays and properties of objects as the Utf8JsonReaderSerializer is sequential, forward-only as mentioned in the API page at: https://docs.microsoft.com/en-us/dotnet/api/system.text.json.utf8jsonreader?view=net-5.0
This is the helper method I came up with to read a file and take the way via Utf8JsonReaderSerializer:
 

using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Json;

namespace SystemTextJsonTestRun
{
    public static class Utf8JsonReaderSerializer
    {

        public static string ReadFile(string filePath)
        {         
            if (!File.Exists(filePath))
            {
                throw new FileNotFoundException(filePath);
            }

            var jsonBytes = File.ReadAllBytes(filePath);
            var jsonSpan = jsonBytes.AsSpan();
            var json = new Utf8JsonReader(jsonSpan);
            var sb = new StringBuilder();

            while (json.Read())
            {
                if (json.TokenType == JsonTokenType.StartObject)
                {
                    sb.Append(Environment.NewLine);
                }
                else if (json.TokenType == JsonTokenType.EndObject)
                {
                    //remove last comma added 

                    sb.RemoveLast(",");

                    sb.Append(Environment.NewLine);
                }

                if (json.CurrentDepth > 0)
                {
                    for (int i = 0; i < json.CurrentDepth; i++)
                    {
                        sb.Append(" "); //space indentation
                    }
                }

                sb.Append(GetTokenRepresentation(json));


                if (json.TokenType == JsonTokenType.EndObject || json.TokenType == JsonTokenType.EndArray)
                {
                    sb.AppendLine();
                }

                if (new[] { JsonTokenType.String, JsonTokenType.Number, JsonTokenType.Null, JsonTokenType.False,
                JsonTokenType.Number, JsonTokenType.None, JsonTokenType.True }.Contains(json.TokenType))
                {
                    sb.AppendLine(",");
                }

            }

            //remove last comma for EndObject 

            sb.RemoveLast(",");

            return sb.ToString(); 


        }


        private static string GetTokenRepresentation(Utf8JsonReader json) =>
          json.TokenType switch
          {
              JsonTokenType.StartObject => $"{{{Environment.NewLine}",
              JsonTokenType.EndObject => "},",
              JsonTokenType.StartArray => $"[{Environment.NewLine}",
              JsonTokenType.EndArray => $"]",
              JsonTokenType.PropertyName => $"\"{json.GetString()}\":",
              JsonTokenType.Comment => json.GetString(),
              JsonTokenType.String => $"\"{json.GetString()}\"",
              JsonTokenType.Number => GetNumberToString(json),
              JsonTokenType.True => json.GetBoolean().ToString().ToLower(),
              JsonTokenType.False => json.GetBoolean().ToString().ToLower(),
              JsonTokenType.Null => string.Empty,
              _ => "Unknown Json token type"
          };

        //TODO: Use the Try methods of the Utf8JsonReader more than trying and failing here 

        private static string GetNumberToString(Utf8JsonReader json)
        {
            try
            {
                if (int.TryParse(json.GetInt32().ToString(), out var res))
                    return res.ToString();
            }
            catch
            {
                try
                {
                    if (float.TryParse(json.GetSingle().ToString(), out var resFloat))
                        return resFloat.ToString();
                }
                catch
                {
                    try
                    {
                        if (decimal.TryParse(json.GetDouble().ToString(), out var resDes))
                            return resDes.ToString();
                    }
                    catch
                    {
                        return "?";
                    }
                }
            }
            return $"?"; //fallback to a string if not possible to deduce the type
        }

    }
}

  

The json file I tested the code with inputted came out again as this string:

{
 "courseName": "Build Your Own Application Framework",
 "language": "C#",
 "author":
 {
  "firstName":  "Matt",
  "lastName":  "Honeycutt"

 },
 "publishedAt": "2012-03-13T12:30:00.000Z",
 "publishedYear": 2014,
 "isActive": true,
 "isRetired": false,
 "tags": [
  "aspnet",
  "C#",
  "dotnet"
 ]

}

This code validates against Json Lint also: https://jsonlint.com Now why even bother parsing a Json file just to output the file again to a json string? Well, first of all, we use a very fast parser Utf8JsonReader from .NET and we can for example do various processing along the forward-only sequential processing and formatting indented the file. Utf8JsonReader will also validate the json document strictly to the Json specification - RFC 8259. Hence, we can get validation for free here to by catching any errors and returning true or false in method that scans this file by adding a method for this looking at the json.Read() method (if it returns false) or catching JsonException if a node of the json document does not validate. Also, a low level analysis of the Utf8JsonReader let's you see which different tokens of the json document structure .NET provides. We could transform the document or add specific formatting and so on by altering the code displayed here. To run the code test with a sample json document like this:

   class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Utf8JsonReader sample");

            string json = Utf8JsonReaderSerializer.ReadFile("sample.json");
            string tempFile = Path.ChangeExtension(Path.GetTempFileName(), "json"); 
            File.WriteAllText(tempFile, json);
            Console.WriteLine($"Json file read and processed result in location: {tempFile}");
            Console.WriteLine($"Json file contents: {Environment.NewLine}{json}");

        
        }

I have added the code for this here: https://github.com/toreaurstadboss/Utf8DataJsonReaderTest/

No comments:

Post a Comment