Inferring Abstract Implementations for JSON Serialization

A while ago, I was working on a project that had to publish a JSON specification for external developers, and that JSON also had to serialize to and deserialize from an equivalent C# object graph.

One frustration I ran into during development was seeing the “$type” field scattered across the output wherever JSON.NET had to record which implementation of an abstract class it was writing (with TypeNameHandling set to TypeNameHandling.Auto).
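For reference, a configuration along these lines is what produces the “$type” entries. This is only a minimal sketch, assuming the Newtonsoft.Json package is referenced; the Shape, Circle, Drawing and TypeNameDemo names are hypothetical stand-ins and not part of the original project.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical types, used only for this illustration.
public abstract class Shape { public string Name { get; set; } }
public class Circle : Shape { public double Radius { get; set; } }
public class Drawing { public Shape Outline { get; set; } }

public static class TypeNameDemo
{
    public static string Serialize()
    {
        var settings = new JsonSerializerSettings
        {
            // Auto writes "$type" only where the runtime type differs from the declared type.
            TypeNameHandling = TypeNameHandling.Auto
        };
        var drawing = new Drawing { Outline = new Circle { Name = "c", Radius = 1.5 } };
        return JsonConvert.SerializeObject(drawing, Formatting.Indented, settings);
    }

    public static void Main()
    {
        // The "Outline" object in the output carries a "$type" entry naming Circle and its assembly.
        Console.WriteLine(Serialize());
    }
}
```

Because Outline is declared as Shape but holds a Circle, the serializer has to write the type name to be able to read the object back later.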

Instead, I wrote a small JsonConverter class that infers the concrete type: it compares the property names in the JSON object against the members of each derived candidate for an abstract class and picks the closest match.

That is, say we have a few classes that look like this:

public abstract class AbstractBaseClass
{
    public string SomeString { get; set; }
    public AbstractBaseClass[] Children { get; set; }
}

public class ImplementationOne : AbstractBaseClass
{
    public bool AreWeCoding { get; set; }
}

public class ImplementationTwo : AbstractBaseClass
{
    public int SomeNumber { get; set; }
}

public class ImplementationThree<TOne, TTwo> : AbstractBaseClass
{
    public TOne ValueOne { get; set; }
    public TTwo ValueTwo { get; set; }
}

And an instance structure that looks like this:

AbstractBaseClass structure = new ImplementationOne()
{
    AreWeCoding = true,
    SomeString = "This is implementation one.",
    Children = new AbstractBaseClass[]
    {
        new ImplementationTwo()
        {
            SomeNumber = 42,
            SomeString = "This is implementation two."
        },
        new ImplementationOne()
        {
            AreWeCoding = false,
            SomeString = "This is also implementation one."
        },
        new ImplementationThree<char, double>()
        {
            ValueOne = 'a',
            ValueTwo = 3.1337,
            SomeString = "This is implementation three."
        }
    }
};

A couple of the challenges we face in doing this (without JSON.NET’s “$type” information) are:

  • How do we instantiate the root of the object tree when it is declared as an abstract type?
  • How do we handle generics (and/or how do we handle them on implementations of abstract types)?
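To make the first challenge concrete, here is a small sketch of what happens when JSON.NET is asked for an abstract type with no “$type” field and no converter. It assumes the Newtonsoft.Json package; Animal, Dog and AbstractDemo are hypothetical names chosen for the illustration.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical types, used only for this illustration.
public abstract class Animal { public string Name { get; set; } }
public class Dog : Animal { public bool Barks { get; set; } }

public static class AbstractDemo
{
    // Returns true when deserializing into the abstract type fails.
    public static bool FailsWithoutTypeInfo()
    {
        const string json = "{ \"Name\": \"Rex\", \"Barks\": true }";
        try
        {
            JsonConvert.DeserializeObject<Animal>(json);
            return false;
        }
        catch (JsonSerializationException)
        {
            // JSON.NET cannot instantiate an abstract class without type metadata or a converter.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FailsWithoutTypeInfo());
    }
}
```

Nothing in the JSON says "this is a Dog", so something has to work that out from the properties alone.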

We’ll come back to the code and explanation in a moment, but first, here is how the result looks with and without the inference converter:

Without inference converter (863B):

{
    "AreWeCoding": true,
    "SomeString": "This is implementation one.",
    "Children": [
        {
            "$type": "ModuleConfigTest.ImplementationTwo, ModuleConfigTest",
            "SomeNumber": 42,
            "SomeString": "This is implementation two.",
            "Children": null
        },
        {
            "$type": "ModuleConfigTest.ImplementationOne, ModuleConfigTest",
            "AreWeCoding": false,
            "SomeString": "This is also implementation one.",
            "Children": null
        },
        {
            "$type": "ModuleConfigTest.ImplementationThree`2[[System.Char, mscorlib],[System.Double, mscorlib]], ModuleConfigTest",
            "ValueOne": "a",
            "ValueTwo": 3.1337,
            "SomeString": "This is implementation three.",
            "Children": null
        }
    ]
}

With inference converter (622B):

{
    "AreWeCoding": true,
    "SomeString": "This is implementation one.",
    "Children": [
        {
            "SomeNumber": 42,
            "SomeString": "This is implementation two.",
            "Children": null
        },
        {
            "AreWeCoding": false,
            "SomeString": "This is also implementation one.",
            "Children": null
        },
        {
            "T$": "System.Char,System.Double",
            "ValueOne": "a",
            "ValueTwo": 3.1337,
            "SomeString": "This is implementation three.",
            "Children": null
        }
    ]
}
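Output like the second listing could be produced, and read back, with a setup along these lines. This is a sketch rather than the project's actual wiring: it assumes the sample classes shown earlier and the InheritanceJsonConverter class listed further down are compiled into the same project, and ConverterUsage is a hypothetical name.

```csharp
using System;
using Newtonsoft.Json;

public static class ConverterUsage
{
    // Serializes the structure with the inference converter, then reads it back.
    public static AbstractBaseClass RoundTrip(AbstractBaseClass structure)
    {
        var settings = new JsonSerializerSettings();
        // No TypeNameHandling here: the converter replaces the "$type" metadata.
        settings.Converters.Add(new InheritanceJsonConverter());

        string json = JsonConvert.SerializeObject(structure, Formatting.Indented, settings);
        Console.WriteLine(json);

        // The declared type is abstract, so the converter picks the concrete type by member match.
        return JsonConvert.DeserializeObject<AbstractBaseClass>(json, settings);
    }
}
```

The same settings object is used in both directions so that the “T$” entries written for generics are understood when reading.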

Using the inference converter on our example brought the size of the resulting JSON down by 241B (from 863B to 622B), just over a quarter of the original size. This technique was first written for serializing exceptionally large object graphs (100,000 or more instances, counting children), so it pays off most with larger, more complex objects. The limitations I have run into so far are:

  • When serializing generics, each type argument is written with its full name (namespace included) but without its assembly, so if a type argument lives in an assembly that Type.GetType does not search by default, the code will need to be modified to include the assembly name.
  • If multiple derivations of the same abstract class have identical serialized fingerprints (the same fields/properties in both), their scores are identical, the match is ambiguous, and the first Type of the group will be returned.
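One possible way around the first limitation, sketched here under the assumption that the assembly holding the type is already loaded, is to fall back to searching every loaded assembly when Type.GetType finds nothing. ResolveType and TypeResolution are hypothetical names; the helper would stand in for the Type.GetType call in ReadJson.

```csharp
using System;
using System.Linq;

public static class TypeResolution
{
    // Hypothetical helper: tries Type.GetType first, then every loaded assembly.
    // Returns null when no loaded assembly defines a type with that full name.
    public static Type ResolveType(string fullName)
    {
        return Type.GetType(fullName)
            ?? AppDomain.CurrentDomain.GetAssemblies()
                   .Select(assembly => assembly.GetType(fullName))
                   .FirstOrDefault(type => type != null);
    }

    public static void Main()
    {
        Console.WriteLine(ResolveType("System.Char"));
        Console.WriteLine(ResolveType("System.Linq.Enumerable"));
    }
}
```

If two loaded assemblies define the same full name, this returns whichever is found first, so it trades one ambiguity for another; writing assembly-qualified names into “T$” would be the stricter fix.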

Enough chit-chat… To the code!
If you just scrolled down here for a quick copy/paste, please be aware of the limitations listed above.

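using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Picks a concrete type for abstract, interface and generic targets by comparing member names.
public class InheritanceJsonConverter : JsonConverter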
{
    private const string JsonGenericTypeKey = "T$";
    private Type[] _typesLoaded;

    public override bool CanConvert(Type objectType)
    {
        return objectType.IsAbstract || objectType.IsInterface || objectType.IsGenericType;
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        // A null token has no properties to inspect, so pass it through.
        if (reader.TokenType == JsonToken.Null)
            return null;

        JObject jObject = JObject.Load(reader);
        // The "T$" property, when present, lists the generic type arguments.
        JProperty genericProperty = jObject.Property(JsonGenericTypeKey);
        var targetType = IdentifyDerivedType(objectType, jObject, genericProperty != null);

        object target;
        if (targetType.IsGenericType)
        {
            if (genericProperty == null)
                throw new JsonSerializationException(string.Format("Generics definition property not found ({0})", JsonGenericTypeKey));
            var genericTypeNames = genericProperty.Value.ToObject<string>().Split(',');

            // Resolve each type argument by name; throwOnError gives a clear failure for unknown names.
            var genericTypes = new List<Type>();
            for (int i = 0; i < targetType.GetGenericArguments().Length; i++)
            {
                genericTypes.Add(Type.GetType(genericTypeNames[i], true));
            }
            target = Activator.CreateInstance(targetType.MakeGenericType(genericTypes.ToArray()));
        }
        else
        {
            target = Activator.CreateInstance(targetType);
        }

        // Fill the new instance; nested abstract members come back through this converter.
        serializer.Populate(jObject.CreateReader(), target);

        return target;
    }

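    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        // JToken.FromObject uses a default serializer, so this converter is not re-entered for the value itself.
        JToken t = JToken.FromObject(value);

        if (t.Type != JTokenType.Object)
        {
            // Arrays and primitives (for example a List<T>) are written unchanged.
            t.WriteTo(writer);
        }
        else
        {
            // Record the generic type arguments by full name under the "T$" key.
            JObject o = (JObject) t;
            o.AddFirst(new JProperty(JsonGenericTypeKey, string.Join(",", value.GetType().GenericTypeArguments.Select(a => a.FullName))));
            o.WriteTo(writer);
        }
    }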

    private Type IdentifyDerivedType(Type baseType, JObject jObject, bool isGeneric)
    {
        // Cache every type from the loaded assemblies on first use.
        if (_typesLoaded == null)
            _typesLoaded = AppDomain.CurrentDomain.GetAssemblies().SelectMany(a => a.DefinedTypes.Select(d => d.AsType())).ToArray();

        // Concrete types assignable to the requested base; generic candidates only when the JSON carries "T$".
        var candidates = _typesLoaded.Where(t => baseType.IsAssignableFrom(t) && !t.IsAbstract && t.IsGenericType == isGeneric);

        // OrderByDescending is a stable sort, so the first candidate wins a tie.
        var best = candidates.OrderByDescending(candidate => ScoreCandidate(candidate, jObject)).FirstOrDefault();
        if (best == null)
            throw new JsonSerializationException(string.Format("No concrete type found for {0}", baseType.FullName));
        return best;
    }

    private static double ScoreCandidate(Type candidate, JObject jObject)
    {
        // Collect the member names of the candidate type: instance properties plus public fields.
        var propertiesAndFields = candidate.GetProperties(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.FlattenHierarchy).Select(p => p.Name).Union(candidate.GetFields().Select(f => f.Name)).ToList();
        var jsonNames = jObject.Properties().Select(p => p.Name).ToList();
        // Guard both denominators; an empty JSON object would otherwise yield NaN.
        if (!propertiesAndFields.Any() || !jsonNames.Any())
            return 0d;

        // Fraction of JSON properties that exist on the candidate.
        int countClass = jsonNames.Count(name => propertiesAndFields.Contains(name));
        var matchClassToJObject = countClass / (double)jsonNames.Count;

        // Fraction of candidate members that appear in the JSON.
        int countJObject = propertiesAndFields.Count(name => jsonNames.Contains(name));
        var matchJObjectToClass = countJObject / (double)propertiesAndFields.Count;

        return matchJObjectToClass + matchClassToJObject;
    }
}
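To see how the scoring plays out, here is a small worked sketch of the same arithmetic as ScoreCandidate, applied to plain lists of names instead of reflection and JObject. ScoreSketch and its members are hypothetical, and the name lists are typed in by hand from the sample classes above.

```csharp
using System;
using System.Linq;

public static class ScoreSketch
{
    // Same two ratios as ScoreCandidate, computed on plain name arrays.
    public static double Score(string[] memberNames, string[] jsonNames)
    {
        if (memberNames.Length == 0 || jsonNames.Length == 0)
            return 0d;

        // Share of JSON properties that exist on the candidate type.
        double jsonCovered = jsonNames.Count(name => memberNames.Contains(name)) / (double)jsonNames.Length;
        // Share of candidate members that appear in the JSON.
        double membersCovered = memberNames.Count(name => jsonNames.Contains(name)) / (double)memberNames.Length;
        return jsonCovered + membersCovered;
    }

    public static void Main()
    {
        // Property names of the first child object in the sample JSON.
        var json = new[] { "SomeNumber", "SomeString", "Children" };
        var implementationOne = new[] { "AreWeCoding", "SomeString", "Children" };
        var implementationTwo = new[] { "SomeNumber", "SomeString", "Children" };

        // ImplementationTwo covers all three names in both directions: 3/3 + 3/3.
        Console.WriteLine(Score(implementationTwo, json));
        // ImplementationOne only shares two of three names each way: 2/3 + 2/3.
        Console.WriteLine(Score(implementationOne, json));
    }
}
```

A perfect match scores 2, and any type with exactly the same member names as ImplementationTwo would also score 2, which is the ambiguity described in the limitations.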
