This is a provocative thought, one that's very core to the Object Oriented philosophy, but also one that I feel is deeply wrong. If it is right, then it's right in a very nontrivial way, because it implies a lot of things that seem to have been contradicted by our experience so far.
(Edit: I see that Stephen Kell’s article “Reversing abstractions” is making much the same point I make below, where he talks about “existential” vs “universal” abstractions. Yes, the Internet worked because its design, especially around TCP/IP, is “universal” in Kell’s sense, ie it defines concrete specifics of an implementation. And yes, this way of thinking seems currently alien to computer science and especially the philosophy of programming language design. And this mismatch seems very interesting.)
Surely the essence of data formats is and must be that they travel between systems. (If data doesn’t need to travel, it doesn’t need to have a defined format.)
And right there is the problem: if data in a defined format travels between systems (and it does), then which system should implement the “procedures” that operate on it? Surely all of them need to? Then how can all of those procedures (each very different, because written for different machine types, OSes, and languages) possibly be defined in “the same place” as the data is defined? The two systems might not even be on the same continent - or in the same time period. The data might have been written centuries ago for a machine long dead, and now we’re trying to make some sense of the fragmentary pieces available to us, often with the surrounding context lost, and we’re trying to do things its writers never intended, to solve problems they couldn’t imagine. Much of the world’s cultural and scientific knowledge is this kind of data. The computing world we will increasingly face as the American-led Cloud centralises and then collapses will also be like this.
And indeed, Kay’s recommendation is not how we built the Internet! We defined data formats very strictly, in one well-known place (the IETF RFCs), along with concretely specified protocols for how sequences and interactions of messages in those formats should work - down to exactly which bits must be set in which byte for a message to be valid… but left the procedures to be followed by machines when they processed those data formats and implemented those protocols entirely up to each individual system builder.
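That division of labour can be made concrete: the spec pins down every byte on the wire, and each system writes its own encoder and decoder in whatever language it likes. Here is a minimal Python sketch, using a made-up message format in the RFC style (not any real RFC): one version byte that must equal 1, one flags byte, and a big-endian payload length.

```python
import struct

# Hypothetical RFC-style wire format (illustrative only, not a real RFC):
#   byte 0:    version (must be 1)
#   byte 1:    flags
#   bytes 2-3: payload length, big-endian ("network byte order")
# followed by the payload bytes themselves.
HEADER = struct.Struct("!BBH")

def encode(flags: int, payload: bytes) -> bytes:
    """Produce a valid message, exactly as the (made-up) spec dictates."""
    return HEADER.pack(1, flags, len(payload)) + payload

def decode(message: bytes):
    """Reject anything that violates the spec; the *procedure* is ours alone."""
    version, flags, length = HEADER.unpack(message[:HEADER.size])
    if version != 1:
        raise ValueError("unknown version")
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return flags, payload

wire = encode(0x02, b"hello")
assert decode(wire) == (0x02, b"hello")
```

Two implementations written decades apart, in different languages, interoperate so long as both honour the byte layout - nothing about the decoding procedure itself ever travels.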
The only way I can understand Kay’s idea as making any possible kind of sense - in a world where data must travel between systems and into the future - is for transmittable objects that seek to combine data and methods to travel as some kind of centrally specified, rigidly defined, universally understood, and never-changed bytecode that runs on a universal runtime. And we don’t really have that. We’ve at times come close to wanting and almost getting that, with Java and .NET, and with JavaScript and WASM, but we’ve always pulled back from trying to specify a universal runtime as a decentralised protocol. Maybe with Squeak and Croquet? The story has always been “the runtime and its object model should stay local to the machine/process, and only data and not objects should be transmitted”. And with Java and JavaScript, at least, the VM itself updates so rapidly, multiple times a year, that it can’t possibly be a survivable archival format. I imagine WASM is probably also churning as fast as web browsers do.
This kind of statement from Kay (he’s said similar things many times) is why I talked a few months ago about wanting a “universal runtime”. Because it seems to be what is required, as an absolute minimum, for anything close to Kay’s vision to even begin to occur.
But I’m still not even sure if this vision (if I’ve even understood it correctly) would be a good thing if we did get it.
This, however:
A program should contain its source code, the binary of the compiler and the source code of the compiler.
It should be able to self-replicate itself, change its syntax and evolve independently from any other piece of software.
I agree. I feel like a program should be made out of very small parts that can each be changed independently. This feeling is why I tend to dislike the process of compilation and how it separates software artifacts into “source” vs “object” forms, where one form is editable by humans and the other is runnable by one very specific kind of machine (but not any others). It seems like it ought to be possible for software to exist in one form that is sort of just “human-editable object code” - otherwise, it becomes increasingly hard to change anything. (Because any change to a compiled program can only occur at the level of a “compilation unit” and its built “artifacts”, which can be extremely large - these days, compilation routinely requires spinning up Internet cloud infrastructure, which may require permission from the cybersecurity teams of a hostile corporation or government.)
Very simple languages like Forth and Lisp seem to have this dynamically human-editable property, mostly, sorta. Usually by allowing complete programs to be as short as single words or short sequences of words, by making the “compiler” so small that it can be ignored in a naive implementation, and certainly by allowing “compilation” to be a thing that happens to small, dynamically generated pieces of code at runtime.
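As an analogy (Python rather than Forth or Lisp, and only a rough one), the same property shows up wherever a language exposes its compiler as an ordinary runtime facility: a small, dynamically generated piece of source is compiled and run on the spot, with no separate “object” artifact ever leaving the process.

```python
# A tiny demonstration of runtime compilation: source text is generated,
# compiled, and executed inside a running program, rather than being fed
# to an ahead-of-time compiler as a monolithic compilation unit.
def make_adder_source(n: int) -> str:
    # Generate the source for a one-function "program".
    return f"def add_{n}(x):\n    return x + {n}\n"

namespace = {}
code = compile(make_adder_source(5), "<generated>", "exec")  # compile at runtime
exec(code, namespace)                                        # install into a namespace
assert namespace["add_5"](10) == 15
```

The unit of change here is a single short definition, not a whole build artifact - which is the property the paragraph above is gesturing at.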
Maybe this is the same as what Alan Kay wants, I don’t know. It is possible that I too want two separate and contradictory things. If it’s impossible for us to have a universal archival-quality runtime (needed in order to transmit objects whose methods would be defined as binary artifacts of that runtime), then we need to understand what we can feasibly have in a world where data needs to be transmitted between very different types of machines.
The simplest answer still seems to be what the 1970s Internet chose, “Just transmit data and define protocols, don’t even try to transmit procedures”. But at some point we still need to catch up with the 1990s Web which decided that to get anything done on the client side, it also needed to transmit at least the source code of a simple scripting language. (However, not a particularly extensible one - a Lisp would have been 1000x better, and then CSS and HTML could just have been S-expressions, syntax nesting issues would have evaporated, Markdown wouldn’t need to exist, the whole browser runtime could have been much tinier, etc, etc.)
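To make that counterfactual a bit more concrete, here is a toy Python sketch of markup-as-S-expressions, with nested lists standing in for the S-expressions. Tag nesting is just data-structure nesting, so a tiny walker replaces an entire parser grammar (attributes, styling, and escaping are all omitted - this is only a sketch of the shape of the idea).

```python
# Render S-expression-style markup (nested lists) to HTML-like output.
# A string is text; a list is (tag, *children).
def render(node) -> str:
    if isinstance(node, str):
        return node
    tag, *children = node
    inner = "".join(render(child) for child in children)
    return f"<{tag}>{inner}</{tag}>"

doc = ["html", ["body", ["h1", "Hello"], ["p", "A ", ["em", "tiny"], " page."]]]
assert render(doc) == (
    "<html><body><h1>Hello</h1><p>A <em>tiny</em> page.</p></body></html>"
)
```

In a Lisp-shaped web, the document, the stylesheet, and the scripts could all have been expressions in this one uniform notation, read by the same tiny reader.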