My recent attempt at a substrate led to the following discoveries regarding serialization and persistence.
To follow along, note that I am writing it in Java/Kotlin and targeting the JVM. The language inside the substrate is a simple functional language, currently Clojure. (Long live embeddable languages!)
The input events have no more complexity than a tree. Input cannot be a lattice (non-trivially), because I/O or perception is “flat”. Just think about your mode of perception. Your perception is a space, which can be divided into regions. Two disjoint regions of your perception space are independent. There is no `let ... in` in perception. There is also no pointer.
The input is also immutable and stored on disk. So, I would like to serialize the data in a compact way.
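To make the "tree, no pointers" shape concrete, here is a minimal Java sketch of a pre-order encoding for an immutable event tree. The names `TreeEvent` and `TreeCodec` are hypothetical and not part of the substrate, and fixed-width ints are used only for brevity; a genuinely compact format would probably use varints or a schema-driven encoding such as Protobuf.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical input event: an immutable tree node with a label and children.
record TreeEvent(String label, List<TreeEvent> children) {
    TreeEvent {
        children = List.copyOf(children); // defensive copy keeps the tree immutable
    }
}

final class TreeCodec {
    // Pre-order encoding: label, child count, then each child in turn.
    // No ids or references are needed because no node is ever shared.
    static void write(TreeEvent e, DataOutputStream out) throws IOException {
        out.writeUTF(e.label());
        out.writeInt(e.children().size());
        for (TreeEvent child : e.children()) {
            write(child, out);
        }
    }

    static TreeEvent read(DataInputStream in) throws IOException {
        String label = in.readUTF();
        int n = in.readInt();
        List<TreeEvent> children = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            children.add(read(in));
        }
        return new TreeEvent(label, children);
    }

    static byte[] toBytes(TreeEvent e) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            write(e, out);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }

    static TreeEvent fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        TreeEvent e = new TreeEvent("frame",
                List.of(new TreeEvent("left", List.of()), new TreeEvent("right", List.of())));
        System.out.println(fromBytes(toBytes(e)).equals(e)); // prints true
    }
}
```

The decoder never has to resolve a reference, which is exactly what stops being true once the data contains sharing or cycles.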
But then, according to my plan, the user also needs to maintain some state folded from the input. I initially used Protobuf `Message.Builder` (which is a mutable, typed message before serialization) for all the state. For a while I thought this style of mutability was necessary. Then I realized the following:
- State requires more complexity than a tree. It can contain a loopy graph. Every off-the-shelf solution, including Protobuf, serializes this kind of data as a tree with references inlined, which makes the data structure wrong.
- A similar limitation was noted on LoCal page 58. LoCal can only serialize lattices, not loopy graphs. It must have used some equivalent of a `let ... in` construct.
- If you want to store a loopy graph in a tree, you have to store the adjacency map. The authors of LoCal did not take note of this. We in fact have a capable design!
- State does not necessarily use the same data schema as input. This is the whole point of using `fold` over `reduce`, and is also the reason why I decided to leave the state serialization to the user.
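The adjacency-map idea from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the substrate's actual code: `StateNode` and `GraphCodec` are hypothetical names, and node ids are assumed to be unique. The point is that the map of id to successor ids is itself tree-shaped, so it can go through any tree serializer, while the cycles are only re-created when the edges are wired back up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory state node; `next` may point back at an earlier node, forming a cycle.
final class StateNode {
    final String id;
    final List<StateNode> next = new ArrayList<>();

    StateNode(String id) {
        this.id = id;
    }
}

final class GraphCodec {
    // Flatten everything reachable from `root` into id -> list of successor ids.
    static Map<String, List<String>> toAdjacency(StateNode root) {
        Map<String, List<String>> adj = new LinkedHashMap<>();
        Deque<StateNode> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            StateNode n = todo.pop();
            if (adj.containsKey(n.id)) {
                continue; // already visited; this check is what terminates on cycles
            }
            List<String> ids = new ArrayList<>();
            for (StateNode m : n.next) {
                ids.add(m.id);
                todo.push(m);
            }
            adj.put(n.id, ids);
        }
        return adj;
    }

    // Rebuild in two passes: create every node first, then wire the edges.
    static Map<String, StateNode> fromAdjacency(Map<String, List<String>> adj) {
        Map<String, StateNode> nodes = new LinkedHashMap<>();
        for (String id : adj.keySet()) {
            nodes.put(id, new StateNode(id));
        }
        for (Map.Entry<String, List<String>> entry : adj.entrySet()) {
            StateNode from = nodes.get(entry.getKey());
            for (String to : entry.getValue()) {
                from.next.add(nodes.get(to));
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        StateNode a = new StateNode("a");
        StateNode b = new StateNode("b");
        a.next.add(b);
        b.next.add(a); // a -> b -> a is a loop
        System.out.println(toAdjacency(a)); // prints {a=[b], b=[a]}
    }
}
```

A serializer that inlines references would either recurse forever on `a -> b -> a` or duplicate nodes; the two-pass rebuild avoids both because identity lives in the ids rather than in the nesting.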
This approach adheres to the “parse, don’t validate” principle pretty much everywhere, giving freedom to higher-level code.
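For readers less familiar with the distinction, the `fold` mentioned earlier can be sketched in Java like this, assuming it means a left fold with an explicit initial accumulator whose type is independent of the event type (as with Kotlin's `fold`, in contrast to its `reduce`, where the accumulator starts as the first element). `Folding`, `fold`, and `countLabels` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

final class Folding {
    // Left fold: the state type S is chosen by the caller and need not match the event type E.
    static <E, S> S fold(Iterable<E> events, S initial, BiFunction<S, E, S> step) {
        S state = initial;
        for (E e : events) {
            state = step.apply(state, e);
        }
        return state;
    }

    // Example user-defined state: events are strings, state is a count per label.
    static Map<String, Integer> countLabels(Iterable<String> events) {
        return fold(events, Map.<String, Integer>of(), (state, label) -> {
            Map<String, Integer> next = new HashMap<>(state); // copy, so earlier states stay untouched
            next.merge(label, 1, Integer::sum);
            return next;
        });
    }

    public static void main(String[] args) {
        System.out.println(countLabels(List.of("open", "close", "open")).get("open")); // prints 2
    }
}
```

Because `S` is entirely the caller's choice, the substrate has no way to know how to persist it, which is consistent with leaving state serialization to the user.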
In summary, loopy graph > lattice > tree in terms of complexity and serialization cost. LoCal can do a lattice. My substrate allows trees in input, and loopy graphs in state.