More Efficient Serialization with Regard to Size on Disk

Mar 15, 2014 at 4:47 AM
I've been using Sterling for some time now and I'm largely happy with it.

I serialize an ObservableCollection of MyObject with about 2000 instances of MyObject. MyObject contains about 50 properties.

The size on disk is quite large compared to competing OODBs. Most of this seems to be because the binary serializer repeats those 50 property names 2000 times.

Have I made a mistake? Is there some way to configure this more intelligently?

I've been considering protobuf-net for a network-based implementation that passes this same collection of objects, where size becomes even more critical. Is it possible to use protobuf-net as the serializer in Sterling?
Coordinator
Mar 16, 2014 at 2:10 PM
Sure. Sterling had some plans to come up with a dictionary-type index for this. The property names are serialized because early versions had compatibility issues when a property was added or removed, so it was easier just to include them until we could come up with drivers that do smarter indexing and compression on the properties. I have been unable to allocate time to creating this version, but I am very open to patches and suggestions if you know someone in the community who is eager to contribute.
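
To make the dictionary idea concrete, here is a rough sketch in Python rather than Sterling's own code. It is not Sterling's format or API: the function names and the byte layout are assumptions made up for illustration, and values are limited to 32-bit integers to keep it short. Each property name is written once in a header, and every record refers to names by index. Each record carries its own property count, so records with added or removed properties still round-trip.

```python
import struct


def _write_str(out: bytearray, text: str) -> None:
    """Append a UTF-8 string prefixed with its 2-byte length."""
    data = text.encode("utf-8")
    out += struct.pack("<H", len(data)) + data


def encode_with_names(records):
    """Naive layout (hypothetical): every record repeats every property name."""
    out = bytearray()
    for rec in records:
        for name, value in rec.items():
            _write_str(out, name)
            out += struct.pack("<i", value)
    return bytes(out)


def encode_with_name_table(records):
    """Dictionary layout (hypothetical): names stored once, referenced by index."""
    names = []   # index -> property name
    index = {}   # property name -> index
    body = bytearray()
    for rec in records:
        body += struct.pack("<H", len(rec))  # property count for this record
        for name, value in rec.items():
            if name not in index:
                index[name] = len(names)
                names.append(name)
            body += struct.pack("<Hi", index[name], value)
    header = bytearray(struct.pack("<H", len(names)))
    for name in names:
        _write_str(header, name)
    return bytes(header + body)


def decode_with_name_table(blob):
    """Rebuild the list of dicts written by encode_with_name_table."""
    pos = 0
    (name_count,) = struct.unpack_from("<H", blob, pos)
    pos += 2
    names = []
    for _ in range(name_count):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        names.append(blob[pos:pos + length].decode("utf-8"))
        pos += length
    records = []
    while pos < len(blob):
        (prop_count,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        rec = {}
        for _ in range(prop_count):
            idx, value = struct.unpack_from("<Hi", blob, pos)
            pos += 6  # 2-byte name index + 4-byte value
            rec[names[idx]] = value
        records.append(rec)
    return records
```

Comparing len(encode_with_names(records)) with len(encode_with_name_table(records)) for 2000 records of 50 properties each shows how much of the naive output is just repeated names.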

One thing I made sure to do with Sterling is abstract the persistence from the serialization. In other words, there are drivers that manage actually writing to disk. It is here you can choose to build your own driver if you like. For example, you could take the existing isolated storage driver, make a custom one from it, and override the way it serializes to optimize the behavior. That lets you handle more fine-tuned scenarios without deviating from the centralized engine.
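
As a rough illustration of that separation, here is a minimal Python sketch. It does not use Sterling's actual driver classes: FileDriver, CompactFileDriver, and their methods are hypothetical names, and JSON plus zlib stand in for whatever serializer you would really plug in. The base driver decides where the bytes go, and the custom driver overrides only how objects become bytes.

```python
import json
import zlib
from pathlib import Path


class FileDriver:
    """Hypothetical base driver: owns file locations and a default serializer."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def serialize(self, obj) -> bytes:
        # Default format: plain JSON, which repeats property names per record.
        return json.dumps(obj).encode("utf-8")

    def deserialize(self, data: bytes):
        return json.loads(data.decode("utf-8"))

    def save(self, key: str, obj) -> None:
        (self.root / f"{key}.bin").write_bytes(self.serialize(obj))

    def load(self, key: str):
        return self.deserialize((self.root / f"{key}.bin").read_bytes())


class CompactFileDriver(FileDriver):
    """Custom driver: same persistence, but the serialization step is overridden."""

    def serialize(self, obj) -> bytes:
        # Compress the default output; repeated names compress very well.
        return zlib.compress(super().serialize(obj))

    def deserialize(self, data: bytes):
        return super().deserialize(zlib.decompress(data))
```

Code that calls save and load does not change when you switch drivers, which is the point of keeping the serialization override inside the driver.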