Sunday, 21 December 2025

Finding out file extension from byte inspection

Consider a byte array stored in a column of a database table. How can we identify the file extension of the byte array by inspecting the byte array itself?
Note that byte arrays can be stored in many places, for example inside files.
The extension of a file can be discovered by inspecting the file header, that is, the first bytes of the byte array, usually the first tens or hundreds of bytes. Some extensions have multiple file headers. The approach described here is a best effort to identify the byte contents of a database column.

Let's use PowerShell to inspect a file on disk, a sample JPEG file (.jpg). Let's run the following little script:

format-hex .\Stavkyrkje_Røldal.jpg | Select-Object -First 16

The first few bytes are FF D8 FF.

I have added a sample GitHub repo with utility code to check well-known file types for their file extensions.
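The same check can be done in C# directly on a byte array. The following is a minimal sketch and not code from the repository: the names JpegCheck and LooksLikeJpeg are made up for this example, and it only tests the three leading bytes FF D8 FF mentioned above.

```csharp
using System;

// Sample data: the first bytes of a JPEG file, and a buffer that starts with "%PDF".
byte[] jpegBytes = { 0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10 };
byte[] pdfBytes = { 0x25, 0x50, 0x44, 0x46 };

Console.WriteLine(JpegCheck.LooksLikeJpeg(jpegBytes)); // True
Console.WriteLine(JpegCheck.LooksLikeJpeg(pdfBytes));  // False

// Hypothetical helper, named for this example only.
public static class JpegCheck
{
    // Returns true when the buffer starts with the JPEG magic bytes FF D8 FF.
    public static bool LooksLikeJpeg(byte[] data) =>
        data.Length >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF;
}
```

A three-byte check like this only says that the data starts like a JPEG file. It says nothing about whether the rest of the content is valid.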

https://github.com/toreaurstadboss/FileHeaderUtil

The following screenshot shows the application in use. It found that a byte array appears to be a PDF file by looking at the file header and file trailer. A good match was found:


In fact, it is a very good match, since both the header and the trailer fully agree. Note that the trailing 0A bytes at the end of the file are treated as padding and ignored by this util. See the method NormalizeHex presented further below.

Using Gary Kessler's assembled lists of known file headers and trailers for well-known file types

The util class below shows the helper methods that inspect a byte array and evaluate its file header and file trailer against a list of known headers and trailers.

It is based on a compilation of known file headers and file trailers, known as "Magic Numbers", compiled by Gary Kessler over the years. In all, 600+ known file types are checked to classify the matching file extension. Please note that there are cases where multiple file header and file trailer signatures match the given byte array. The matches are sorted by number of matching bytes. The assembled list is very helpful. Thanks, Gary!

Using the file header and possibly also the last bytes of a byte array, the file trailer, we can classify the file type of the byte array. Recognizing the file type also implies the file extension.

Of course, if byte arrays can be uploaded from a public site, for example, it is still possible to inject malicious bytes. Even so, detecting the kind of file is useful for security policies, for deciding whether the bytes should be handled by an external application, and for telling the end user what kind of file they have provided a path to.

The curated list of file headers is based upon the list of signatures gathered by Gary Kessler and published on his website here (no license is stated for that file; it is treated here as public, since it is published on his website without a license notice):

https://www.garykessler.net/library/file_sigs.html

This list contains about 650 file types and should cover most of the well-known formats, including formats that are not used much anymore. If you want to augment the list, check other sources such as Wikipedia for information about a given file extension's file header and/or file trailer, the so-called "magic number".

The curated list was updated 3rd June 2023 and contains most well-known file types.

The program uses the file signatures (JSON format) to identify the file type of a byte array. Usually this is judged by looking at the first few bytes of the file (the so-called "magic numbers"). Sometimes the file signature also includes bytes from the end of the file (the "trailer").


FileSignatureUtil.cs



using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

namespace FileHeaderUtil;


public static class FileSignatureUtil
{

    static FileSignature[] _fileSignatures = [];

    static FileSignatureUtil()
    {
        string json = File.ReadAllText("file_Sigs.json");
        var fileSignaturesRoot = System.Text.Json.JsonSerializer.Deserialize<FileSignatureRootElement>(json, new System.Text.Json.JsonSerializerOptions
        {
            PropertyNameCaseInsensitive = true
        });
        _fileSignatures = fileSignaturesRoot?.FileSigs?.ToArray()!;
    }

    /// <summary>
    /// Scans the specified file and returns a list of file signatures that match the file's header and, if applicable,
    /// file's trailer.
    /// </summary>
    /// <remarks>Only file signatures with a defined header are considered for matching. Trailer matching is
    /// performed if both the file and the signature define a trailer. A header and trailer of 64 bytes are evaluated to also
    /// detect file types / extensions with longer headers and trailers.</remarks>
    /// <param name="targetFile">The path to the file to be analyzed. Cannot be null or empty.</param>
    /// <param name="byteCount">The number of bytes to read from the file for signature matching. Defaults to 64.</param>
    /// <param name="offset">The byte offset at which to begin reading the file for signature matching. Defaults to 0.</param>
    /// <param name="origin">Specifies the reference point used to obtain the offset. Defaults to <see cref="SeekOrigin.Begin"/>.</param>
    /// <returns>A list of <see cref="FileSignature"/> objects that match the file's header and trailer. The list is empty if no
    /// signatures match.</returns>

    public static List<FileSignature> GetMatchingFileSignatures(string targetFile, int byteCount = 64, int offset = 0, SeekOrigin origin = SeekOrigin.Begin)
    {
        static string NormalizeHex(string? hex, bool trimPadding)
        {
            if (string.IsNullOrWhiteSpace(hex))
            {
                return string.Empty;
            }           

            var parts = hex.Replace("-", " ").Split(new[] { ' ', }, StringSplitOptions.RemoveEmptyEntries)
                           .Select(h => h.ToUpperInvariant())
                           .ToList();

            if (trimPadding)
            {
                while (parts.Count > 0 && (parts.Last() == "0A" || parts.Last() == "0D" || parts.Last() == "00"))
                {
                    parts.RemoveAt(parts.Count - 1);
                }
            }

            return string.Join(" ", parts);
        }

        var matches = new List<(FileSignature Sig, int Score)>();

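        // Pass the byteCount and offset parameters through instead of silently ignoring them
        string fileHeader = NormalizeHex(FileUtil.ShowHeader(targetFile, byteCount, offset), trimPadding: false);
        string fileTrailer = NormalizeHex(FileUtil.ShowTrailer(targetFile, byteCount), trimPadding: true);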

        foreach (var signature in _fileSignatures)
        {
            if (string.IsNullOrWhiteSpace(signature?.HeaderHex) || signature.HeaderHex == "(NULL)")
                continue;

            string sigHeader = NormalizeHex(signature.HeaderHex, trimPadding: false);
            string sigTrailer = NormalizeHex(signature.TrailerHex, trimPadding: true);

            if (!fileHeader.StartsWith(sigHeader, StringComparison.OrdinalIgnoreCase))
                continue;

            // Trailer check if defined
            if (!string.IsNullOrWhiteSpace(sigTrailer) && sigTrailer != "(NULL)")
            {
                if (!fileTrailer.EndsWith(sigTrailer, StringComparison.OrdinalIgnoreCase))
                    continue;
            }

            // Compute match score (# of matching bytes in header and trailer of file)
            int headerScore = CountMatchingPrefix(fileHeader, sigHeader);
            int trailerScore = CountMatchingSuffix(fileTrailer, sigTrailer);
            int scoreMeasuredAsMatchingByteCount = headerScore + trailerScore;
            signature.MatchingBytesCount = scoreMeasuredAsMatchingByteCount;
            signature.MatchingTrailerBytesCount = trailerScore;
            signature.MatchingHeaderBytesCount = headerScore;
            matches.Add((signature, scoreMeasuredAsMatchingByteCount));
        }

        return matches.OrderByDescending(m => m.Score).Select(m => m.Sig).ToList();
    }

    // Helpers
    private static int CountMatchingPrefix(string source, string pattern)
    {
        var srcParts = source.Split(' ');
        var patParts = pattern.Split(' ');
        int count = 0;
        for (int i = 0; i < Math.Min(srcParts.Length, patParts.Length); i++)
        {
            if (srcParts[i].Equals(patParts[i], StringComparison.OrdinalIgnoreCase))
                count++;
            else break;
        }
        return count;
    }

    private static int CountMatchingSuffix(string source, string pattern)
    {
        if (string.IsNullOrWhiteSpace(pattern)) return 0;
        var srcParts = source.Split(' ');
        var patParts = pattern.Split(' ');
        int count = 0;
        for (int i = 0; i < Math.Min(srcParts.Length, patParts.Length); i++)
        {
            if (srcParts[srcParts.Length - 1 - i].Equals(patParts[patParts.Length - 1 - i], StringComparison.OrdinalIgnoreCase))
                count++;
            else break;
        }
        return count;
    }

}





As we can see in the source code of NormalizeHex, trailing padding bytes are removed, since in some cases byte arrays (files or byte columns in databases, for example) are padded with certain bytes. In addition, the hex values are upper-cased and '-' is replaced by a space ' '.
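To see the normalization in isolation, here is a small standalone sketch. It mirrors the logic of NormalizeHex shown above, but the class name HexText is made up for this example, and the sample bytes are an assumed PDF trailer ("%%EOF") followed by two 0A bytes.

```csharp
using System;
using System.Linq;

// BitConverter.ToString yields dash-separated upper-case hex.
string raw = BitConverter.ToString(new byte[] { 0x25, 0x25, 0x45, 0x4F, 0x46, 0x0A, 0x0A });

Console.WriteLine(raw);                                               // 25-25-45-4F-46-0A-0A
Console.WriteLine(HexText.Normalize(raw, trimPadding: true));         // 25 25 45 4F 46
Console.WriteLine(HexText.Normalize("ff d8-ff", trimPadding: false)); // FF D8 FF

// Hypothetical standalone copy of the normalization step, for illustration only.
public static class HexText
{
    public static string Normalize(string hex, bool trimPadding)
    {
        if (string.IsNullOrWhiteSpace(hex))
        {
            return string.Empty;
        }

        // Treat '-' and ' ' as separators and upper-case every hex pair.
        var parts = hex.Replace("-", " ")
                       .Split(' ', StringSplitOptions.RemoveEmptyEntries)
                       .Select(h => h.ToUpperInvariant())
                       .ToList();

        if (trimPadding)
        {
            // Drop trailing 0A, 0D and 00 bytes, which are treated as padding.
            while (parts.Count > 0 &&
                   (parts[parts.Count - 1] == "0A" || parts[parts.Count - 1] == "0D" || parts[parts.Count - 1] == "00"))
            {
                parts.RemoveAt(parts.Count - 1);
            }
        }

        return string.Join(" ", parts);
    }
}
```

With the padding removed, an EndsWith comparison against a signature trailer such as 25 25 45 4F 46 can succeed even when the file ends with extra line feeds.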

In the example below, a PDF file is scanned with the console app and the PDF file header and trailer are recognized. In this case, we also peel off trailing bytes at the end, as the specific PDF file ended with pad bytes, more specifically 0A.

FileUtil.cs

The util class here is used to load a file header or file trailer, usually a small byte array. By default 64 bytes are evaluated, which should cover the file headers and file trailers of most file types. In fact, most file types have a file header or file trailer of only 8 bytes or fewer.
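For a byte array that is already in memory, such as a value read from a database column, the header and trailer can be taken directly from the array without going through a file. The following is a hedged sketch under that assumption. The class ByteArrayEdges and its methods are hypothetical and not part of the repository, and the sample buffer is a made-up PDF-like string.

```csharp
using System;
using System.Text;

// Assumed sample: a tiny PDF-like buffer ending with "%%EOF" and a line feed.
byte[] blob = Encoding.ASCII.GetBytes("%PDF-1.4\n...\n%%EOF\n");

Console.WriteLine(ByteArrayEdges.HeaderHex(blob, 4));  // 25-50-44-46
Console.WriteLine(ByteArrayEdges.TrailerHex(blob, 6)); // 25-25-45-4F-46-0A

// Hypothetical helper, named for this example only.
public static class ByteArrayEdges
{
    // Hex of the first byteCount bytes (fewer if the array is shorter).
    public static string HeaderHex(byte[] data, int byteCount = 64)
    {
        int count = Math.Clamp(byteCount, 0, data.Length);
        return count == 0 ? string.Empty : BitConverter.ToString(data, 0, count);
    }

    // Hex of the last byteCount bytes (fewer if the array is shorter).
    public static string TrailerHex(byte[] data, int byteCount = 64)
    {
        int count = Math.Clamp(byteCount, 0, data.Length);
        return count == 0 ? string.Empty : BitConverter.ToString(data, data.Length - count, count);
    }
}
```

The output uses the same dash-separated format as BitConverter.ToString in the file-based helpers, so it could be normalized and compared in the same way.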


namespace FileHeaderUtil
{

    /// <summary>
    /// Helper class for file operations
    /// </summary>
    public static class FileUtil
    {

        /// <summary>
        /// Prints the file header HEX representation
        /// </summary>
        /// <param name="filePath"></param>
        /// <param name="byteCount">Read the first n bytes. Defaults to 64 bytes.</param>
        /// <returns></returns>
        public static string? ShowHeader(string filePath, int byteCount = 64, int offset = 0)
        {
            if (!File.Exists(filePath))
            {
                throw new FileNotFoundException(filePath);
            }

            byte[] header = ReadBytes(filePath, byteCount, offset, SeekOrigin.Begin);
            if (header == null)
            {
                return null;
            }
            return BitConverter.ToString(header);
        }

        /// <summary>
        /// Prints the file trailer HEX representation
        /// </summary>
        /// <param name="filePath"></param>
        /// <param name="byteCount">Read the last n bytes. Defaults to 64 bytes.</param>
        /// <returns></returns>
        public static string? ShowTrailer(string filePath, int byteCount = 64, int offset = 0)
        {
            if (!File.Exists(filePath))
            {
                throw new FileNotFoundException(filePath);
            }

            byte[] header = ReadBytes(filePath, byteCount, offset, SeekOrigin.End);
            if (header == null)
            {
                return null;
            }
            return BitConverter.ToString(header);
        }

        /// <summary>
        /// Reads the n bytes of a byte array. Either from the start or the end of the byte array.
        /// </summary>
        /// <param name="filePath">File path of target file to read the bytes</param>
        /// <param name="byteCount">The number of bytes to read</param>
        /// <param name="offset">Offset - number of bytes</param>
        /// <param name="origin">Origin to seek from. Can be either SeekOrigin.Begin, SeekOrigin.Current or SeekOrigin.End</param>
        /// <returns></returns>
        private static byte[] ReadBytes(string filePath, int byteCount, int offset = 0, SeekOrigin origin = SeekOrigin.Begin)
        {
            if (!File.Exists(filePath))
            {
                throw new FileNotFoundException(filePath);
            }

            if (byteCount < 1)
            {
                return Array.Empty<byte>();
            }
            byte[] buffer = new byte[byteCount];
            using var fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
            if (origin == SeekOrigin.Begin && offset > 0)
            {
                fileStream.Seek(offset, origin);
            }
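            else if (origin == SeekOrigin.End)
            {
                // Clamp to the file length so that files shorter than offset + byteCount do not seek before the start
                long distanceFromEnd = Math.Min(fileStream.Length, (long)Math.Abs(offset) + byteCount);
                fileStream.Seek(-distanceFromEnd, origin);
            }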
            else
            {
                //origin must be Current - offset is expected from the current position, just like SeekOrigin.Begin
                fileStream.Seek(offset, origin); 
            }

            int bytesRead = fileStream.Read(buffer, 0, byteCount); 
            if (bytesRead < byteCount)
            {
                Array.Resize(ref buffer, bytesRead);
            }
            return buffer;
        }        
        
    }

}


This console app considers at most three matching file headers/trailers when multiple signatures match the byte array of a file. To adjust this, change the Take parameter in Program.cs. Matches are ordered by the number of matching bytes.
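The ordering and cut-off can be sketched as follows. This is not the actual Program.cs, which is not shown here. The candidate labels and scores are made-up sample data, and MatchRanking.TopMatches is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical candidates: (label, number of matching header + trailer bytes).
var candidates = new List<(string Label, int MatchingBytes)>
{
    ("B", 4), ("A", 9), ("D", 1), ("C", 4),
};

foreach (var (label, score) in MatchRanking.TopMatches(candidates, maxResults: 3))
{
    Console.WriteLine($"{label}: {score} matching bytes");
}
// Prints A (9), then B (4) and C (4); D is cut off by Take(3).

// Hypothetical helper, named for this example only.
public static class MatchRanking
{
    // Highest score first; OrderByDescending is stable, so ties keep their input order.
    public static List<(string Label, int MatchingBytes)> TopMatches(
        IEnumerable<(string Label, int MatchingBytes)> candidates, int maxResults = 3) =>
        candidates.OrderByDescending(c => c.MatchingBytes)
                  .Take(maxResults)
                  .ToList();
}
```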

Tuesday, 9 December 2025

Enable font ligatures in Visual Studio 2026

Font ligatures can make code easier to read inside an IDE.

What are font ligatures

Font ligatures are special glyphs that combine multiple characters into a single, elegant symbol. For example, =>, ===, or != can appear as smooth connected symbols instead of separate characters. They don't change your code; they only change how it is displayed, making it more readable and visually appealing.

  • 🎨 Aesthetic boost – Makes your code look clean and modern without changing functionality.
  • 👁️ Better readability – Reduces visual clutter, making code easier on the eyes.
  • 🔍 Clearer syntax – Turns multi-character operators like => or === into neat symbols for quick recognition.
  • Faster comprehension – Helps spot patterns and logic flow at a glance.


To enable font ligatures in Visual Studio 2026, you actually have to resort to running a PowerShell script or similar to alter the registry a bit.

EnableFontLigatures.ps1 | PowerShell





# Enable Font Ligatures for Visual Studio 2026 (18.x)
$basePath = "HKCU:\Software\Microsoft\VisualStudio"
$targetPrefixes = @("18.0_")
foreach ($prefix in $targetPrefixes) {
    $vsKeys = Get-ChildItem -Path $basePath | Where-Object { $_.PSChildName -like "$prefix*" }
    if ($vsKeys.Count -eq 0) {
        Write-Host "No keys found for prefix $prefix. Open Visual Studio and change Fonts & Colors once, then rerun."
    } else {
        foreach ($key in $vsKeys) {
            $fontColorsPath = Join-Path $key.PSPath "FontAndColors\Text Editor"
            
            # Create the path if missing
            if (-not (Test-Path $fontColorsPath)) {
                Write-Host "Creating missing path: $fontColorsPath"
                New-Item -Path $fontColorsPath -Force | Out-Null
            }
            # Set EnableFontLigatures to 1
            Set-ItemProperty -Path $fontColorsPath -Name "EnableFontLigatures" -Value 1 -Type DWord
            Write-Host "Ligatures enabled for: $fontColorsPath"
        }
    }
}


The following screenshot shows two ligature symbols. Note the special symbols for => ('goes to') and != ('not equals'), each combined into one elegant symbol that is easier to read.

Monday, 24 November 2025

Exploring Extension Blocks and Constants in C# 14

Extension blocks - Extension properties

Extension blocks, and with them an important new feature, extension properties, are available in C# 14, which ships with .NET 10.

The static class that contains extension blocks must be non-generic. An extension block can declare its own type parameters, for example extension<T>(IEnumerable<T> source), but the example below uses one extension block per concrete numeric type.

Consider this example of some well-known constants from entry-level Calculus using extension properties.


using System.Numerics;

namespace Csharp14NewFeatures
{
    /// <summary>
    /// Provides well-known mathematical constants for any numeric type using generic math.
    /// </summary>
    /// <typeparam name="T">A numeric type implementing INumber<T> (e.g., double, decimal, float).</typeparam>
    public static class MathConstants<T> where T : INumber<T>
    {
        /// <summary>π (Pi), ratio of a circle's circumference to its diameter.</summary>
        public static T Pi => T.CreateChecked(Math.PI);

        /// <summary>τ (Tau), equal to 2π. Represents one full turn in radians.</summary>
        public static T Tau => T.CreateChecked(2 * Math.PI);

        /// <summary>e (Euler's number), base of the natural logarithm.</summary>
        public static T E => T.CreateChecked(Math.E);

        /// <summary>φ (Phi), the golden ratio (1 + √5) / 2.</summary>
        public static T Phi => T.CreateChecked((1 + Math.Sqrt(5)) / 2);

        /// <summary>√2, square root of 2. Appears in geometry and trigonometry.</summary>
        public static T Sqrt2 => T.CreateChecked(Math.Sqrt(2));

        /// <summary>√3, square root of 3. Common in triangle geometry.</summary>
        public static T Sqrt3 => T.CreateChecked(Math.Sqrt(3));

        /// <summary>ln(2), natural logarithm of 2.</summary>
        public static T Ln2 => T.CreateChecked(Math.Log(2));

        /// <summary>ln(10), natural logarithm of 10.</summary>
        public static T Ln10 => T.CreateChecked(Math.Log(10));

        /// <summary>Degrees-to-radians conversion factor (π / 180).</summary>
        public static T Deg2Rad => T.CreateChecked(Math.PI / 180.0);

        /// <summary>Radians-to-degrees conversion factor (180 / π).</summary>
        public static T Rad2Deg => T.CreateChecked(180.0 / Math.PI);
    }

    /// <summary>
    /// Extension blocks exposing math constants as properties for common numeric types.
    /// </summary>
    public static class MathExtensions
    {
        extension(double source)
        {
            /// <inheritdoc cref="MathConstants{T}.Pi"/>
            public double Pi => MathConstants<double>.Pi;
            public double Tau => MathConstants<double>.Tau;
            public double E => MathConstants<double>.E;
            public double Phi => MathConstants<double>.Phi;
            public double Sqrt2 => MathConstants<double>.Sqrt2;
            public double Sqrt3 => MathConstants<double>.Sqrt3;
            public double Ln2 => MathConstants<double>.Ln2;
            public double Ln10 => MathConstants<double>.Ln10;
            public double Deg2Rad => MathConstants<double>.Deg2Rad;
            public double Rad2Deg => MathConstants<double>.Rad2Deg;
        }

        extension(decimal source)
        {
            public decimal Pi => MathConstants<decimal>.Pi;
            public decimal Tau => MathConstants<decimal>.Tau;
            public decimal E => MathConstants<decimal>.E;
            public decimal Phi => MathConstants<decimal>.Phi;
            public decimal Sqrt2 => MathConstants<decimal>.Sqrt2;
            public decimal Sqrt3 => MathConstants<decimal>.Sqrt3;
            public decimal Ln2 => MathConstants<decimal>.Ln2;
            public decimal Ln10 => MathConstants<decimal>.Ln10;
            public decimal Deg2Rad => MathConstants<decimal>.Deg2Rad;
            public decimal Rad2Deg => MathConstants<decimal>.Rad2Deg;
        }

        extension(float source)
        {
            public float Pi => MathConstants<float>.Pi;
            public float Tau => MathConstants<float>.Tau;
            public float E => MathConstants<float>.E;
            public float Phi => MathConstants<float>.Phi;
            public float Sqrt2 => MathConstants<float>.Sqrt2;
            public float Sqrt3 => MathConstants<float>.Sqrt3;
            public float Ln2 => MathConstants<float>.Ln2;
            public float Ln10 => MathConstants<float>.Ln10;
            public float Deg2Rad => MathConstants<float>.Deg2Rad;
            public float Rad2Deg => MathConstants<float>.Rad2Deg;
        }
    }
}

We define one extension block per numeric type here.

If we move over to extension methods, we still must use a non-generic class. However, we can use, for example, generic math, shown below. This allows us to reuse code across multiple types, supporting INumber<T> in this case.


namespace Csharp14NewFeatures
{
    using System;
    using System.Numerics;

    /// <summary>
    /// Provides generic mathematical constants via extension methods for numeric types.
    /// </summary>
    public static class MathConstantExtensions
    {
        public static T GetPi<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.PI);

        public static T GetTau<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(2 * Math.PI);

        public static T GetEuler<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.E);

        public static T GetPhi<T>(this T _) where T : INumber<T> =>
            T.CreateChecked((1 + Math.Sqrt(5)) / 2);

        public static T GetSqrt2<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.Sqrt(2));

        public static T GetSqrt3<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.Sqrt(3));

        public static T GetLn2<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.Log(2));

        public static T GetLn10<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.Log(10));

        public static T GetDeg2Rad<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(Math.PI / 180.0);

        public static T GetRad2Deg<T>(this T _) where T : INumber<T> =>
            T.CreateChecked(180.0 / Math.PI);
    }
}

Example usage of the code above:


  #region Extension members using block syntax - Math

  // Extension properties
  double radians = double.Pi / 3.0; // Pi/3 radians = 60 degrees (Pi radians = 180 degrees)
  double degrees = radians * radians.Rad2Deg; // Using the extension property Rad2Deg

  Console.WriteLine($"Radians: {radians:F6}"); // outputs 1.04719..
  Console.WriteLine($"Degrees: {degrees:F6}"); // outputs 60

  // Using extension methods

  double radiansV2 = 1.0.GetPi() / 3.0;
  double degreesV2 = radiansV2 * 1.0.GetRad2Deg();

  Console.WriteLine($"Radians: {radiansV2:F6}");
  Console.WriteLine($"Degrees: {degreesV2:F6}");

  #endregion

Output of the code usage above:


Radians: 1,047198
Degrees: 60,000000
Radians: 1,047198
Degrees: 60,000000

To sum up, extension blocks in C# 14 can be used together with generic math. In the example above each extension block targets one concrete type, which in some cases results in lengthier code than the generic extension methods. Note that both extension blocks and extension methods must be defined inside a non-generic static class.