As it goes with all software, initially you think you have it all figured out and it’s all clear, but as soon as you start writing it you realise a couple of things are not as straightforward as you would have liked. The implementation of the ObjectSavingFramework is no exception, even though it was designed to be a very simple application.
The first version only had to have 3 different concepts:
- Objects: which contain other elements (potentially other objects).
- Values: which just indicate a certain value.
- References: pointers to objects.
References are important to avoid infinite nesting and object duplication. An object that contains another object has to decide whether that object is private to it or shared. If it is private, the whole object needs to be written as a child element; if it is shared, a reference is required. It is, however, this reference concept that caused most of the problems.
I wanted to use the address of an object as the reference; the object itself would then write its address as an id. This all works perfectly when writing the data, but reconstructing the objects from the saved data turned out to be impossible. The cause is a big gap in the responsibilities of the objects. The framework, which contains the data, does not know how to create any of the objects, so it cannot be held responsible for creating them. This means it cannot resolve a reference to the actual object. At the same time, the object itself cannot resolve the address either, because all it gets from the framework is an address. Moreover, the object should not even know that what it gets back is an address, because of the abstraction I put in place. If the saved address were still valid, we could just use some C++ magic to cast it to a real object, but there is no guarantee that a newly created object ends up at the address that was saved.
I did not find any solution for this, except to push the responsibility completely to the application. The concept of a reference no longer exists explicitly in the framework; instead, the application should write special fields that it can use to recover the objects. There is a hidden advantage to this. For the references to work, I had to write the id (chosen to be the address) of each object, but how? In XML this didn’t cause any problems, as I could use an attribute for that: all values are elements anyway, so a clash is impossible. But in other formats, such as JSON or SQL, this separation does not exist. The only option there would be to reserve certain keywords. This is annoying as it restricts the freedom of the application, but what’s even worse is that you could lose the freedom to easily swap between formats, as one mapper may have chosen a different keyword than another.
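To make the application-side approach concrete, here is a minimal sketch of what "special fields" could look like. It is only an illustration under assumptions: `Author`, `Book`, `Fields`, `saveBook` and `loadBook` are hypothetical names, and the string map merely stands in for whatever the framework actually stores. The application picks its own id instead of the memory address and resolves it itself when loading.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical application types; none of these names come from the framework.
struct Author {
    int id;            // application-chosen id, stable across save and load
    std::string name;
};

struct Book {
    std::string title;
    const Author* author;  // shared object, so only a reference is saved
};

// Flat string fields, standing in for whatever the framework stores.
using Fields = std::map<std::string, std::string>;

// Saving writes the application's own id, not the address of the author.
Fields saveBook(const Book& book) {
    assert(book.author != nullptr);
    return {{"title", book.title},
            {"authorId", std::to_string(book.author->id)}};
}

// Loading resolves the id against authors the application has already rebuilt.
// The framework is not involved in this lookup at all.
Book loadBook(const Fields& fields,
              const std::map<int, const Author*>& authorsById) {
    const int id = std::stoi(fields.at("authorId"));
    return Book{fields.at("title"), authorsById.at(id)};
}
```

Because `authorId` is an ordinary field chosen by the application, no keyword has to be reserved by any mapper. The price is that the application has to rebuild the shared objects first and keep the lookup table itself.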
I did discover a couple of other problems with my current design. The API is not very helpful, as it is very limited and leaves all the work to the application itself. Moreover, I have started to doubt the need for the intermediate format altogether. The whole point of this intermediate format was to avoid needing a mapper from every source format to every target format, but in this case all the source formats are the same, as they are all C++ objects. Such an intermediate layer could be useful if we wanted to extend the framework to other languages, but this is not happening anytime soon. At the moment the intermediate layer is simply a complete duplicate of the actual C++ object. To fix this, I will come up with a different API/design that works with events between objects and mappers directly. I do, however, still want to keep the type of mapper hidden from the object.
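One possible shape for such an event-based API is sketched below. This is a guess, not the final design: the `Mapper` interface, its three event methods, `XmlLikeMapper` and `Point` are all hypothetical. The object pushes events straight to an abstract mapper, so there is no intermediate tree and the object never learns which format is behind the interface.

```cpp
#include <sstream>
#include <string>

// Hypothetical interface: objects only ever see this abstract type.
class Mapper {
public:
    virtual ~Mapper() = default;
    virtual void beginObject(const std::string& name) = 0;
    virtual void value(const std::string& name, const std::string& text) = 0;
    virtual void endObject(const std::string& name) = 0;
};

// One possible concrete mapper, emitting XML-like text.
// No escaping is done, so this is for illustration only.
class XmlLikeMapper : public Mapper {
public:
    void beginObject(const std::string& name) override {
        out_ << '<' << name << '>';
    }
    void value(const std::string& name, const std::string& text) override {
        out_ << '<' << name << '>' << text << "</" << name << '>';
    }
    void endObject(const std::string& name) override {
        out_ << "</" << name << '>';
    }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

// The object emits events directly; it depends on Mapper, not on a format.
struct Point {
    int x;
    int y;

    void save(Mapper& mapper) const {
        mapper.beginObject("point");
        mapper.value("x", std::to_string(x));
        mapper.value("y", std::to_string(y));
        mapper.endObject("point");
    }
};
```

Calling `Point{1, 2}.save(mapper)` on an `XmlLikeMapper` leaves `<point><x>1</x><y>2</y></point>` in `mapper.str()`. Swapping in a different `Mapper` subclass would change the output format without touching `Point`.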
I also discovered that the way to save and reconstruct the objects is pretty annoying. On the one side you are very limited by the supported types (currently only string), which forces you to do a lot of conversions manually. The lack of out-of-the-box support for lists is difficult as well: since you can only get a value if you know the name of its field, you have to give each element of the list a suffix (or prefix) that depends on the index. On top of that, you also have to save the size of the list to prevent the framework from crashing because you asked for a non-existing field.
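The list workaround described above can be sketched as follows. The names are assumptions: `Fields` is a plain string map standing in for the framework's storage, and the `_size` and `_<index>` suffixes are just one convention the application could pick.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// String-only fields, mirroring the "currently only string" limitation.
using Fields = std::map<std::string, std::string>;

// Writes the size first, then one field per element with an index suffix.
void saveList(Fields& fields, const std::string& name,
              const std::vector<int>& items) {
    fields[name + "_size"] = std::to_string(items.size());
    for (std::size_t i = 0; i < items.size(); ++i) {
        fields[name + "_" + std::to_string(i)] = std::to_string(items[i]);
    }
}

// Reads the size first so that only existing fields are ever requested.
std::vector<int> loadList(const Fields& fields, const std::string& name) {
    const std::size_t size = std::stoul(fields.at(name + "_size"));
    std::vector<int> items;
    items.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        items.push_back(std::stoi(fields.at(name + "_" + std::to_string(i))));
    }
    return items;
}
```

Every element needs a manual conversion in both directions, and the size field is pure bookkeeping. That is exactly the kind of work the framework should be doing for the application.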
On the other side you have the overhead of ‘closing’ objects, both during saving and reconstructing. During saving this is required because we need to know whether a new object is a child or a sibling; during reconstructing we need to know when we have to go back to the parent. For safety reasons, I even force the application to provide the name of the object when closing it, to avoid having a child close the parent object and potentially ‘hacking’ the tree.
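The name check on closing can be illustrated with a small stack of open objects. This is a sketch under assumptions, not the framework's code: `TreeWriter`, `openObject`, `closeObject` and `depth` are hypothetical names, and throwing an exception is just one way to reject a mismatched close.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

class TreeWriter {
public:
    // Opening an object makes it the innermost one; later writes are its children.
    void openObject(const std::string& name) { open_.push_back(name); }

    // Closing requires the name of the innermost open object, so a child
    // cannot close its parent by accident or on purpose.
    void closeObject(const std::string& name) {
        if (open_.empty() || open_.back() != name) {
            throw std::logic_error("closeObject(\"" + name +
                                   "\") does not match the innermost open object");
        }
        open_.pop_back();
    }

    // Number of objects that are currently open.
    std::size_t depth() const { return open_.size(); }

private:
    std::vector<std::string> open_;
};
```

With `parent` and `child` both open, `closeObject("parent")` throws and leaves the tree untouched; only `closeObject("child")` is accepted at that point. The check does not help when a child happens to have the same name as its parent.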
The current idea would be to return a map with key-value pairs, which allows an easier lookup, but perhaps just supporting lists would be sufficient as well. It is a nice concept that you need to know the name of the field to get the value; it really makes the reconstruction of the object very clear and straightforward. Another idea would be to flatten the hierarchy and use a separator in the key name to identify the different levels. This way it would be easier to find all elements belonging to a certain object.
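A rough sketch of the flattened variant follows, assuming a `.` separator and a sorted `std::map` as storage; `FlatFields` and `fieldsOf` are hypothetical names. Because the map keeps its keys in sorted order, all keys that share a prefix sit next to each other, so collecting the fields of one object is a single range scan.

```cpp
#include <map>
#include <string>

// Flattened storage: the key encodes the path, e.g. "car.engine.power".
using FlatFields = std::map<std::string, std::string>;

// Returns every field below `objectPath`, with the object's prefix stripped.
FlatFields fieldsOf(const FlatFields& all, const std::string& objectPath,
                    char separator = '.') {
    const std::string prefix = objectPath + separator;
    FlatFields result;
    // lower_bound jumps to the first key that is not less than the prefix;
    // keys sharing the prefix are contiguous from there on.
    for (auto it = all.lower_bound(prefix);
         it != all.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        result.emplace(it->first.substr(prefix.size()), it->second);
    }
    return result;
}
```

Given the keys `car.name`, `car.engine.power` and `carpet.color`, `fieldsOf(all, "car")` returns `name` and `engine.power` but not the carpet entry, because the separator is part of the prefix. The obvious weakness is that field names may then no longer contain the separator themselves.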
The current implementation also contains some casts in the mappers because of the specific treatment of object nodes: these nodes can contain child nodes, but this is not part of the general node object. A missed chance for some clean inheritance, and something I will definitely fix in the next version.
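One common way to get rid of such casts is the visitor pattern, sketched below. This is an assumption about how a next version could look, not a description of the current code: `Node`, `ValueNode`, `ObjectNode`, `NodeVisitor` and `TextMapper` are hypothetical names. Each node type dispatches to its own `visit` overload, so the mapper reaches the children of an object node without any `dynamic_cast`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

class ValueNode;
class ObjectNode;

// A mapper implements one overload per node type.
class NodeVisitor {
public:
    virtual ~NodeVisitor() = default;
    virtual void visit(const ValueNode& node) = 0;
    virtual void visit(const ObjectNode& node) = 0;
};

class Node {
public:
    explicit Node(std::string name) : name_(std::move(name)) {}
    virtual ~Node() = default;
    const std::string& name() const { return name_; }
    virtual void accept(NodeVisitor& visitor) const = 0;

private:
    std::string name_;
};

class ValueNode : public Node {
public:
    ValueNode(std::string name, std::string value)
        : Node(std::move(name)), value_(std::move(value)) {}
    const std::string& value() const { return value_; }
    void accept(NodeVisitor& visitor) const override { visitor.visit(*this); }

private:
    std::string value_;
};

// Only object nodes own children; the base class stays free of that detail.
class ObjectNode : public Node {
public:
    using Node::Node;
    void add(std::unique_ptr<Node> child) { children_.push_back(std::move(child)); }
    const std::vector<std::unique_ptr<Node>>& children() const { return children_; }
    void accept(NodeVisitor& visitor) const override { visitor.visit(*this); }

private:
    std::vector<std::unique_ptr<Node>> children_;
};

// Example mapper producing a compact text form, with no casts anywhere.
class TextMapper : public NodeVisitor {
public:
    void visit(const ValueNode& node) override {
        out_ += node.name() + "=" + node.value() + ";";
    }
    void visit(const ObjectNode& node) override {
        out_ += node.name() + "{";
        for (const auto& child : node.children()) {
            child->accept(*this);
        }
        out_ += "}";
    }
    const std::string& str() const { return out_; }

private:
    std::string out_;
};
```

The trade-off is that adding a new node type means touching every mapper, which seems acceptable while there are only two kinds of nodes.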