Hi sph! Your questions here seem almost exactly like those I have about OO. Some comments:
Data cannot have “bugs”. Data is a static representation of facts. It is truthful by nature.
I think… I wouldn’t quite go so far as to say that “data is truthful”, but I also think I understand what you’re saying.
I’d say rather that data is finite and static and self-contained by nature. “Code” (functions, algorithms, objects), on the other hand, creates or infers structures/spaces that are infinite (potentially uncountably infinite, in nasty and paradoxical ways) when run; can potentially self-modify even when just “read” (yikes!!!); and also has lots of dependencies, including on specific hardware. It is much, much harder to make code safely transportable between different types of computing machinery than it is to transport data (in the form of sequences of symbols) that can live for decades or centuries.
Clay tablets from Sumeria and rock inscriptions from Egypt are “data” that we can still process today, given a fairly thin layer of symbolic encoding (a character set like Unicode, say). Interpreting that data can be much harder. But we can at least uncover them, dust them, photograph them, sketch them, digitize them… because they are “dumb” physical objects that exist in one simple form and don’t change when we “read” them.
But what if our records from ancient societies, instead of clay tablets and scrolls, were all “objects” - complex calculating machines like the Antikythera mechanism, broken after centuries of disrepair - or worse, centuries-old iPads - and we had to successfully “run” them without error in order to access the data inside? I think the task of archeologists would be immensely harder.
And that’s just the task of “transmitting information across time”. I think transmitting information between dissimilar computing platforms is similarly complex, and we tend to run into the same kinds of problems. Including needing to be an “archeologist reassembling the Antikythera Mechanism” just to find out what, say, Smalltalk-72 software was like to run on a Xerox Alto. And that’s for very well documented systems for which we have working hardware and the designers are still living. Successfully transporting code across systems gets much harder from there, and that’s why OS and platform lock-in is a thing.
Here we define a Pen as a structure, and create a set of methods (functions that have an implicit self argument) that are able to operate on it. A user is always able to create new “verbs”, i.e. to expand the model through which we interface with data. But the user is also free from information hiding, so they might create a better implementation of Pen, which will be 100% compatible with other Pens, which are nothing more than a struct. Additionally, Pen is easily serializable and can be transmitted over the wire because it’s nothing else than a bag of bytes.
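(For anyone reading along without your original snippet in front of them, here’s roughly the shape I picture, as a minimal C sketch; the fields and verbs are my own invention, not your actual Pen:)

```c
#include <stdint.h>
#include <string.h>

/* A Pen is just a plain record: no hidden state, no vtable. */
typedef struct {
    int32_t  x, y;      /* current position */
    uint32_t color;     /* RGBA */
    uint8_t  is_down;   /* pen touching the canvas? */
} Pen;

/* "Methods" are ordinary functions whose first argument plays the role
   of self. Anyone can add new verbs without touching the struct. */
void pen_move_to(Pen *self, int32_t x, int32_t y) {
    self->x = x;
    self->y = y;
}

void pen_set_color(Pen *self, uint32_t rgba) {
    self->color = rgba;
}

/* Because Pen is only a bag of bytes, "serialization" can be as simple
   as copying it into a buffer (ignoring endianness and padding concerns
   for the sake of the sketch). */
size_t pen_serialize(const Pen *self, uint8_t *buf) {
    memcpy(buf, self, sizeof *self);
    return sizeof *self;
}
```

The point being that nothing here is hidden: any other function, written by anyone, can read or rewrite those fields, and the whole thing can be copied around or shipped as bytes.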
Yes, this idea has some merit to it I think. It’s abandoning the ideal of “encapsulation” and instead reducing an “object” to something a lot simpler: “a memory protected namespace”. (I assume the implementation of “struct” here would prevent C- or Forth-style random access to memory via arbitrary integer pointers - that’s not nothing!)
Javascript pretty much gives us this. Javascript is very large now, and I wish it were a lot smaller. (Lua is almost that smaller Javascript, but has some nasty edges to it that give me pause).
The syntax I’m exploring here is OO without inheritance and encapsulation.
Alan Kay I think would agree that OO should not have inheritance, but he would disagree about encapsulation. But I’m on the verge of thinking, as you do, that encapsulation itself is also a problem and not a solution when we’re dealing with large distributed systems. It’s all well and good to say “just send messages between computers”, but to actually send messages on the actually-existing Internet, we have to send them not as Turing-complete computers but as dumb, strictly-defined data formats. And we’re also finding that data is even better if it’s not just dumb but also immutable (it parallelises well and it lets us save and restore histories, and we can save a lot of unnecessary computing power if we let our transmission packets simply be our storage records and vice-versa).
So if we were to make local desktop computing more like the actually-existing Internet of 2024 (the good/working parts, that is) and not a theoretical 1960s idea of an Internet yet-to-be-built… we should probably think more in terms of immutable data records with clearly defined, system-independent formats, and less in terms of “little virtual self-modifying computers which can do literally anything they want with messages” - much less “send an infinite regress of little virtual self-modifying computers between other virtual self-modifying computers until it’s all a completely unpredictable Turing-complete mess”.
I dunno. I’m very torn. Encapsulation and information-hiding are sometimes helpful: they might even be required, in small and carefully controlled doses, for security. But in many distributed contexts, I feel like unlimited encapsulation can be really, really unhelpful and even extremely dangerous.
At the very least, encapsulation has to be accompanied by a clearly defined, system-managed serialization/deserialization protocol so that our local systems can examine remote objects (up to and including all the things we do today when we “compile source code”) in order to decide whether they’re safe and trustworthy or not.
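Concretely, I imagine something like this (a toy C sketch, every name and field invented for illustration): the object is obliged to reduce itself to a plain record with a spelled-out, platform-independent byte layout, which the receiving system can examine before it ever runs any foreign code.

```c
#include <stdint.h>
#include <stddef.h>

/* A message is an immutable record with a declared, platform-independent
   wire layout -- not a live object that interprets what it receives. */
typedef struct {
    uint32_t sender_id;
    uint32_t timestamp;     /* seconds since some agreed epoch */
    uint16_t payload_len;
    const uint8_t *payload;
} Message;

/* The wire format is spelled out byte by byte (little-endian here), so
   any system can read it back without sharing our compiler, our struct
   padding, or our object model. */
size_t message_encode(const Message *m, uint8_t *out) {
    size_t i = 0;
    for (int b = 0; b < 4; b++) out[i++] = (m->sender_id   >> (8 * b)) & 0xFF;
    for (int b = 0; b < 4; b++) out[i++] = (m->timestamp   >> (8 * b)) & 0xFF;
    for (int b = 0; b < 2; b++) out[i++] = (m->payload_len >> (8 * b)) & 0xFF;
    for (size_t p = 0; p < m->payload_len; p++) out[i++] = m->payload[p];
    return i;
}
```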
I keep thinking: what if our fundamental data type was “an immutable record, accompanied by a function pointer (very similar to an object constructor), where the core system/language kernel makes a very strong assertion EITHER that the associated function has BEEN run on this record and returned True, OR that it can be logically proven that the function IF run WOULD return True”? How far would this get us? This would give us similar data type/shape guarantees to objects, but without either inheritance or encapsulation. It would also be very much like a type system, except that the type system is just a function (though perhaps doing the “IF this were run THEN…” inference would require a type system).
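In very rough C, with every name invented, it might look like the sketch below (and the interesting half, a kernel that can prove the predicate would hold without actually running it, is exactly the part I’m hand-waving past):

```c
#include <stdbool.h>
#include <stddef.h>

/* A validator is just a predicate over a record's bytes. */
typedef bool (*validator_fn)(const void *record, size_t len);

/* What the kernel hands back: the record plus the assertion that its
   validator has held -- checked by running it, or (ideally) by proof. */
typedef struct {
    const void  *record;       /* immutable payload */
    size_t       len;
    validator_fn validated_by; /* the predicate the kernel vouches for */
} SealedRecord;

/* Sketch of the kernel-side "seal" operation: here we simply run the
   validator; a smarter kernel might instead prove the result statically. */
bool kernel_seal(const void *record, size_t len,
                 validator_fn check, SealedRecord *out) {
    if (!check(record, len))
        return false;          /* the kernel refuses to vouch for it */
    out->record = record;
    out->len = len;
    out->validated_by = check;
    return true;
}
```

A consumer that receives a SealedRecord gets the same shape guarantee an object constructor would give, but the record stays inert data and the predicate stays an ordinary, inspectable function.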