Dynamic JSON deserialization of complex polymorphic data models

December 16, 2020 Dariusz Wawer

Introduction

In modern Java, serializing simple objects to JSON and deserializing them back is simple, quick and efficient. Several libraries do the job well; Jackson and Gson in particular are very widely used.

In our project we faced a challenge with JSON deserialization. We were consumers of predefined polymorphic JSON data entities. The model was given to us as-is, with little to no possibility of extension. Initially we had a UML model, but nowadays even the Java data classes are defined externally and we obtain them as a library.

We had to receive the data, preferably in a polymorphic fashion, because we wanted it to be handled cleanly by our internal logic.

Polymorphic JSON deserialization in Java

The simplest approach to polymorphic data is: do not use polymorphism at all. A single data class can contain all the fields of every object type that may occur in the same place. That way all such objects are instantiated as the same type and all data has the same structure. There are often a lot of nulls, though.

All-in-one "polymorphism" has significant disadvantages. First of all, the distinction between particular types of objects still exists, but only in the code that handles the data. All validators, converters and business logic entry points have to contain many conditional statements that determine the real object type. The more complex the model, the more complex the logic.
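A minimal sketch of what this looks like in practice. The AnimalRecord class, its fields and the describe method are hypothetical names chosen for illustration; they are not taken from our model. Note how the consumer has to branch on a type marker and guard against null fields that belong to another kind of object.

```java
// Hypothetical flat data class: one type carries the fields of every animal kind.
class AnimalRecord {
    final String type;          // "cat" or "dog"; the only hint about the real kind
    final String name;
    final Boolean indoor;       // meaningful for cats only, null otherwise
    final String favouriteToy;  // meaningful for dogs only, null otherwise

    AnimalRecord(String type, String name, Boolean indoor, String favouriteToy) {
        this.type = type;
        this.name = name;
        this.indoor = indoor;
        this.favouriteToy = favouriteToy;
    }
}

class AnimalDescriber {
    // Every validator, converter or business method has to repeat this branching.
    static String describe(AnimalRecord animal) {
        if ("cat".equals(animal.type)) {
            String kind = Boolean.TRUE.equals(animal.indoor) ? "an indoor" : "an outdoor";
            return animal.name + " is " + kind + " cat";
        } else if ("dog".equals(animal.type)) {
            String toy = animal.favouriteToy != null ? animal.favouriteToy : "nothing";
            return animal.name + " is a dog that plays with " + toy;
        }
        throw new IllegalArgumentException("Unknown animal type: " + animal.type);
    }
}
```

With real subclasses, each branch would become a method on its own type and the null checks would disappear.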

Both Jackson and Gson provide ways to handle polymorphic deserialization. Jackson’s annotation-based configuration determines the object type through a field value. The value may either be a concrete class name or an indirect value that points to a particular class through configuration.
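As an illustration, the sketch below configures Jackson to pick a subclass from a logical name stored in a field. The Animal, Cat and Dog classes, the "type" property name and the JacksonAnnotationDemo helper are assumptions made for this example. The annotations and ObjectMapper come from Jackson itself, so jackson-databind has to be on the classpath.

```java
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.UncheckedIOException;

// The "type" field of the incoming JSON selects the concrete subclass.
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = Cat.class, name = "cat"),
    @JsonSubTypes.Type(value = Dog.class, name = "dog")
})
abstract class Animal {
    public String name;
}

class Cat extends Animal {
    public boolean indoor;
}

class Dog extends Animal {
    public String favouriteToy;
}

class JacksonAnnotationDemo {
    // Reads a single animal; Jackson resolves the subclass from the annotations above.
    static Animal read(String json) {
        try {
            return new ObjectMapper().readValue(json, Animal.class);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading {"type":"cat","name":"Tom","indoor":true} through JacksonAnnotationDemo.read yields a Cat instance. Note that the annotations sit directly on the Animal class.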

Gson, on the other hand, provides the TypeAdapter abstraction. It can inspect incoming JSON at runtime and, based on any criteria, inject an instance of any class into the deserialized object or collection.

Both implementations have advantages and drawbacks. Jackson’s annotation-based configuration model is easy to understand and implement. However, it is bound to the model (a 1-1 relation) and it requires you to have control over the model classes: you actually have to add the annotations to the data classes. If the model comes from an outside source, adding the required annotations is not convenient. Updates to such an external model will also be time-consuming.

Gson’s dynamic approach is versatile. It fully controls the deserialized objects and their contents. It is also separate from the model itself: while it does depend on the model (it has to!), it can be implemented for an external set of classes. Its main disadvantage is that it requires a lot of code. Each polymorphic type requires its own adapter, which has to be registered in a GsonBuilder factory object. Only then can you create a deserializer object based on that configuration. Finally, you can actually perform the JSON deserialization.
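The sketch below walks through these steps for a hypothetical Animal/Cat/Dog hierarchy. For brevity it uses Gson’s JsonDeserializer, which is registered through the same registerTypeAdapter call, rather than a full streaming TypeAdapter. The class names and the rule that a cat is recognized by an "indoor" field are assumptions made for this example, and gson has to be on the classpath.

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParseException;
import java.lang.reflect.Type;

// Plain data classes: no annotations, nothing Gson-specific.
abstract class Animal { String name; }
class Cat extends Animal { boolean indoor; }
class Dog extends Animal { String favouriteToy; }

// Step 1: one adapter per polymorphic type, deciding the concrete class at runtime.
class AnimalDeserializer implements JsonDeserializer<Animal> {
    @Override
    public Animal deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context)
            throws JsonParseException {
        JsonObject object = json.getAsJsonObject();
        if (object.has("indoor")) {
            return context.deserialize(object, Cat.class);
        }
        if (object.has("favouriteToy")) {
            return context.deserialize(object, Dog.class);
        }
        throw new JsonParseException("Cannot determine animal type: " + json);
    }
}

class GsonAdapterDemo {
    static Animal read(String json) {
        // Step 2: register the adapter. Step 3: build the Gson instance.
        Gson gson = new GsonBuilder()
                .registerTypeAdapter(Animal.class, new AnimalDeserializer())
                .create();
        // Step 4: perform the deserialization.
        return gson.fromJson(json, Animal.class);
    }
}
```

Even for two subclasses the adapter and its registration outweigh the model itself, and every further polymorphic base type needs the same ceremony.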

Our goals

We formulated the following requirements for our deserialization code:

Data objects have to be polymorphic. We rejected the previously mentioned all-fields-in-one-class approach.

It has to be separate from the model. We did not want to modify the data classes; we wanted to leave them as POJOs.

It has to be short and easy to read, so that its configuration (and reconfiguration) is quick and simple.

The deserializer has to be separated from its configuration. It’s quite obvious, but let’s state it anyway. We wanted to have the deserialization rules defined outside the technical deserializer implementation.

As a nice-to-have, it should allow for multiple, separate rules for the same object. We noticed that some low-level, detail data objects were reused across multiple high-level objects from separate domains. To make sure that no conflicts and no unwanted code overlap occur, we decided that we needed a separate deserializer/configuration for each high-level object.

Lastly, performance was not an issue. Even if the deserialization were 10x slower than pure Jackson/Gson, it would not be a problem. There was simply not enough incoming data.

The decision

Our application already used the Jackson deserialization library, but for simple cases only. We did not want to have two different JSON libraries in one app.

Because Gson’s TypeAdapters matched 3 of 4 of the requirements stated above, we considered switching the deserialization library from Jackson to Gson. It would have required a significant amount of work. Jackson’s annotations would have to be converted to Gson’s annotations, and some internal deserialization binding would also have to change. It was possible, but it would take time and could cause regressions. And of course we would still have to write an abstraction of our own, especially for a simpler and faster rule definition that would dynamically create the TypeAdapters for us. Without it, configuring the JSON deserialization would be a very tedious task.

Using Jackson for the purpose of polymorphic deserialization seemed like it would require much more work. It would, however, not introduce any changes to existing code and hence it was deemed safer.

So we decided to build our abstraction on Jackson, as we could afford the additional work, but we did not want to face the risks.

JSON deserialization implementation

The code was implemented more than a year ago and is running successfully in production. A few thousand users use it daily. Not a lot, but enough to know the code runs fine.

We recently converted our solution to an open-source library, with the code at github.com/Pretius/pretius-jddl and jars available as a Maven/Gradle dependency (check the newest version at mvnrepository).

The deserialization rules abstraction consists of three main interfaces: DeserializationRule, DeserializationCriterion and DeserializationAction.

For each JSON node, our implementation tests the DeserializationCriterion of each DeserializationRule and, if the predicate matches, calls the DeserializationAction to perform the JSON deserialization. The library provides a set of basic rules by default, so deserializing simple POJOs requires no configuration.
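The rule-matching idea can be sketched with the standard library alone. This is not the pretius-jddl API: the Rule and RuleBasedReader classes are hypothetical, a parsed JSON object is represented as a plain Map, and the sketch assumes that the first matching rule wins.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical rule: a criterion tested on the node plus an action that builds the object.
final class Rule {
    final Predicate<Map<String, Object>> criterion;
    final Function<Map<String, Object>, Object> action;

    Rule(Predicate<Map<String, Object>> criterion, Function<Map<String, Object>, Object> action) {
        this.criterion = criterion;
        this.action = action;
    }
}

final class RuleBasedReader {
    private final List<Rule> rules;

    RuleBasedReader(List<Rule> rules) {
        this.rules = rules;
    }

    // Tests the rules in order and runs the action of the first one that matches.
    Object read(Map<String, Object> node) {
        for (Rule rule : rules) {
            if (rule.criterion.test(node)) {
                return rule.action.apply(node);
            }
        }
        throw new IllegalArgumentException("No rule matches node: " + node);
    }
}

final class Cat {
    final String name;
    final boolean indoor;

    Cat(String name, boolean indoor) {
        this.name = name;
        this.indoor = indoor;
    }
}

final class Dog {
    final String name;
    final String favouriteToy;

    Dog(String name, String favouriteToy) {
        this.name = name;
        this.favouriteToy = favouriteToy;
    }
}

final class AnimalRules {
    // The configuration lives apart from both the reader and the data classes.
    static RuleBasedReader reader() {
        return new RuleBasedReader(List.of(
                new Rule(node -> node.containsKey("indoor"),
                        node -> new Cat((String) node.get("name"), (Boolean) node.get("indoor"))),
                new Rule(node -> node.containsKey("favouriteToy"),
                        node -> new Dog((String) node.get("name"), (String) node.get("favouriteToy")))));
    }
}
```

Because the rules are ordinary values, several independent readers can be built for the same data classes, one per high-level object.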

Custom rules can be created using factory methods for rules, criteria and actions, or by writing custom classes that implement the given interfaces. A simple configuration for the clichéd Animal/Cat/Dog abstraction may look like this:

Or, by using the predefined visitor-pattern implementation:

Note that static imports would significantly shorten the code above, but they were not used in order to keep the code clearer.

Using pretius-jddl in your project

First of all, use your favourite tool to include the library in your application. Get the latest version information from search.maven.org. For Maven it will look like this:

Whenever you need to prepare custom object handling with pretius-jddl there are three steps you have to take:

  • determine the conditions you want to test for in your data
  • determine the actions you want to perform with the data
  • prepare configuration for your deserializer(s)

The first two steps may just involve checking the available methods in DeserializationRuleFactory, DeserializationCriterionFactory and DeserializationActionFactory. It is quite possible that the methods you need are already present there. If not, you will have to implement your own DeserializationCriterion or DeserializationAction.

Preparing the configuration is entirely business-driven. Once you have the required conditions and actions, you can build the rules however you wish. You may want to use the DeserializerConfigurer or DeserializerConfigurerAdapter to organize your rules. Preferably use names relating to business objects, but any other distinguisher you choose is fine.

Future

First of all, there are many more factory methods for rules, criteria and actions to be implemented. Initially the library provides only a few, but we expect their number to grow as the library’s user base grows. Feel free to submit your proposal on GitHub as a ticket, or simply implement the method you want and submit a pull request. Check out the library repository at github/pretius-jddl.

The deserialization abstraction in pretius-jddl does not strongly depend on the chosen technical implementation. Right now it uses Jackson, but it is possible to extend jddl to use Gson internally. We will consider doing that if it is a requested feature.

We look forward to interacting with the programming community. As this is our first public Java library, we would love to hear your feedback. You can reach us on Facebook or Twitter.

Clean coding, everyone!
