Resistance to total software failure, a requirement for malleability.

Apostolis · 17 June 2026 13:26

Have you ever worked with a Smalltalk system only for the system to crash and be unrecoverable? This is a total system failure, meaning that all your data and changes are lost.

Thus, even though the system is malleable, it is pretty scary to make any changes. The common pattern is to load a fresh system and reload any changes you have saved outside of the image.

Organisms are resistant to total failure under mutation, and they have been an important influence on my thinking.

https://www.nature.com/articles/d41586-026-01883-0

What mechanisms can we find that protect users from change in their code?

natecull · 18 June 2026 04:11

I remember working with Jade, a Windows-based software environment heavily inspired by Smalltalk, about 20 years ago. All its objects were persistent (or could be selected as persistent or runtime-only) making it both a programming language and a database. It stored all its objects in a database called the “schema”, which I assume is much like a Smalltalk image, and used Smalltalk-like browsers/inspectors to do all the code editing. Though you could export and import the class method code (not the object instance data, sadly) into text files for safekeeping.

Exporting your code was necessary because the schema would reliably crash at random times, making your database garbage and your complete system unusable. At that point all you could do was create a fresh schema and import your exported code, and of course you’d lost all your “persistent” data.

That experience gave me a bit of a sour view of the whole Smalltalk experience. It’s not okay for me for a runtime to crash ever (unless the physical machine it’s on has a hardware fault), but here it was, a supposedly enterprise-ready production system, crashing every day.

Jade’s still around today ( https://www.jadeplatform.com ) and the UI feels very much like it did back then. Not sure if they’ve fixed the crashing problem.

What mechanisms can we find that protect users from change in their code?

I’ve always thought that an object-based system ought to start with a fundamental mechanism of an object representing a changeset or a snapshot. Ie, a base environment plus a set of changes (additions, deletions, or mutations); if you query it for something not in the changeset, it passes the query to its attached environment object. You could make it read-only or read-write, because read-only objects have massive superpowers that it’s worth using. Then, you could just stack up an arbitrary number of those objects to be “semi-permeable membranes” at whatever level of scale you want. You’d use them to represent imports/exports, backups, diffs, commits, transactions, views, applications, volumes/drives, containers, virtual machines, software defined networks, private clouds… etc. You could even create one of them to represent “an arbitrary chunk of the entire Internet, or at least the slice of it that your application cares about”, and bundle it with your app so it can work when the Internet servers it was built for eventually go offline.

The resistance-to-failure part is that at every level, every snapshot could just be reverted if any of the changes “inside” it break (reverting being especially easy if you have read-only snapshots). You could have everything from small transactions (interactive undo/redo) to sandboxes to entire coordinated versioned releases of large networks represented by the same snapshot concept.
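A minimal Python sketch of the stacked-changeset idea described above, under stated assumptions: the names `Layer`, `freeze`, and `revert` are hypothetical, the environment is a flat key/value mapping, and "reverting" simply means dropping the top layer and going back to its parent. It is an illustration of the lookup-and-fall-through behaviour, not a design for a real system.

```python
_DELETED = object()  # tombstone: marks a key as removed in this layer


class Layer:
    """A changeset stacked on top of a parent environment (hypothetical name)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.changes = {}
        self.read_only = False

    def get(self, key):
        # Look in this changeset first, then pass the query down the stack.
        layer = self
        while layer is not None:
            if key in layer.changes:
                value = layer.changes[key]
                if value is _DELETED:
                    raise KeyError(key)
                return value
            layer = layer.parent
        raise KeyError(key)

    def set(self, key, value):
        self._check_writable()
        self.changes[key] = value

    def delete(self, key):
        self._check_writable()
        self.changes[key] = _DELETED

    def freeze(self):
        """Turn this layer into a read-only snapshot."""
        self.read_only = True
        return self

    def revert(self):
        """Discard this layer's changes by handing back the untouched parent."""
        return self.parent

    def _check_writable(self):
        if self.read_only:
            raise PermissionError("layer is read-only")


# A frozen base snapshot with a risky experiment stacked on top of it.
base = Layer()
base.set("editor", "v1")
base.set("compiler", "v1")
base.freeze()

experiment = Layer(parent=base)
experiment.set("editor", "v2-broken")
experiment.delete("compiler")

# The experiment went wrong: drop it and continue from the base.
env = experiment.revert()
```

Because the base is frozen, nothing the experiment does can damage it, which is what makes the revert trivial. A real system would also need nested data, merging, and persistence, none of which is attempted here.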

Sadly, nobody ever defined this abstraction or built it, so we just keep reinventing it over and over, badly and lossily. But I’d love to someday see it happen.

If we wanted to have a go at making this today, we could probably start with JSON as a base. Not that JSON is a great data model, but it’s everywhere now and everyone knows it. Dictionaries with string or integer keys.

Imagine if you could scale a single JSON data structure to the size of the Internet? By allocating a server to each key. Then scale it down so that your local LAN is one subkey, your local computer is a subkey of that, your current login session is a subkey of your computer, every running app is a subkey, every variable binding in a function call is a subkey… like Plan 9, but as JSON rather than a filesystem. Preferably not automatically routing every access call through untrustworthy Silicon Valley hyperscalers. I feel something like that would be really useful. It would be an actual “cyberspace”, with data locality, snapshotting and change management built in as core required concepts.

khinsen · 18 June 2026 19:00

I did crash Smalltalk images to the point of non-recovery, other than by starting from a fresh image and loading the code from scratch. However, I wouldn’t call this “scary”. It is a pretty rare event, unless you do it intentionally.

In modern Smalltalk practice, as opposed to the early Smalltalk-80 days, image files are considered ephemeral, being kept for at most a few days. Many developers start from a fresh image every morning. Source code is stored in Git repositories, and is loaded automatically from a startup script. So from a purely pragmatic point of view, it’s not an issue.

Theoretically speaking, it is indeed a big issue, and it’s easy to imagine scenarios where such a behavior is unacceptable. So the question “can we do better?” is clearly relevant.

The comparison to organisms is interesting but I don’t agree about mutation resistance. Many mutations simply kill a cell. In multicellular organisms, this kind of failure happens at reproduction, which likewise often fails.

But there is also a problem with the comparison. For a computing system, you care about the individual. In biology, you care about the species. The biology equivalent to your dream Smalltalk system would be an organism that could survive changes to its genome.

Back to Smalltalk: I doubt you can give the user the power to change everything and at the same time prevent crashes. It comes down to something like Gödel’s theorem: whatever formalism you use to guarantee the absence of failures will also prohibit some valid and potentially useful code.

Apostolis · 18 June 2026 19:14

The Nature article I referenced explains one way that this is done.

In another one, which I can’t find now, it said that proteins can maintain their functionality under multiple mutations, though it didn’t explain the mechanism behind it.

neauoire · 19 June 2026 01:56

Baker wrote about this multiple times, but in regards to objects, he made this point, which I think is very interesting:

Simula [Dahl66] started a revolution in computer languages which is not yet finished. It proposed the equivalence “programs = physical objects with behavior”, but forgot to throw away the previous equivalence. Smalltalk [Goldberg83] picked up the ball, but the mathematical expressionists then converted the metaphor of physical objects into polymorphic mathematical type systems–i.e., C++ [Stroustrup86]. The elegance of the mechanical metaphor was thus buried underneath mathematical mysticism.

Linear logic [Girard87] can be viewed as the latest attempt to bring back the physical object metaphor, but stripped of its polymorphic pretensions. For the first time in 50 years of computer science, a metaphor of programming has been proposed that most people can relate to–objects have true identity, and objects are conserved. As in the real world, an object cannot be copied or destroyed without first filling out a lot of forms, but on the other hand, the transmission of objects is relatively painless. An object is localized in space, and can move from place to place. (Only computer-literate people must be told that the transmission of these objects does not create copies.) Linear logic finally makes precise the high school notion that a function is like a box into which one puts argument values and receives result values, and that a truly “functional” box does not remember its previous arguments or results.

The mathematical expression metaphor must be sacrificed to make way for this linear/conservative object-oriented metaphor, which–as we have seen–is not a great loss, as computer programs today deal with non-mathematical objects most of the time. [footnote 1] The few remaining mathematical expressions–e.g., the quadratic formula–can be computed, but require a more process-oriented description. In such a description, we must first make copies of b and a, since they are both used twice, and then we can compute the result. The requirement for making explicit copies of b and a is obvious to any 12-year-old, but computer scientists have spent 40 years eliminating this one trivial task while greatly increasing the costs for everything else.

A linear logic language achieves its elegance through a starkly simple rule–a bound variable name can be “used” only once. Thus, variable reads are destructive and hence variables are “read-once”. Any attempt to use a name twice (or not at all) is flagged by the compiler. A use of the variable name as an argument in a function call means that the object referred to by the name has been given to the function. Unless the function returns the object as one of its results, the object is gone, and cannot be referenced by the caller. Therefore, many operations on linear objects have the policy of returning them when they are done–e.g., the length function returns the length of a list as well as the list itself. A function like “+”, which accepts two values and returns their sum, can be thought of as “consuming” its argument values and constructing a result value. Not only is this metaphor appealing, but it can be very efficient in practise–e.g., in multiple-precision arithmetic, the storage utilized by the arguments to “+” can be reused to construct the result.

The requirement for making explicit copies of b and a is obvious to any 12-year-old, but computer scientists have spent 40 years eliminating this one trivial task while greatly increasing the costs for everything else.

A linearity doctrine for objects could lead to reversibility, if every operation is a state transformation that can be undone. It’s actually one of the points I find most interesting about malleable systems, personally.

khinsen · 19 June 2026 05:46

What the Nature article describes is one of the many error correction and compensation mechanisms in organisms. We have those in technical systems as well, but not so much in software. Possibly because we still stick to the belief that software can, in principle, be made error-free by repeated analysis, inspection, and fixing. We should know by now that this simply doesn’t happen in practice.

Following the ideas of nature would mean robustness via redundancies. Maybe we will get there. It requires moving away from efficiency and performance as design priorities.

khinsen · 19 June 2026 05:48

I have heard people discuss linear logic for many years, but so far I haven’t seen a real-life software system built on its principles. Do you know of any? What I wonder is whether linear logic is just too alien to be tried out, or if people tried and failed for some interesting reason.

Marcel · 19 June 2026 13:14

I’m wondering: Is it an intrinsic property of malleable systems that you can break everything or is that just an accidental property?

As a Smalltalk user, I can understand the desire and benefits of self-hosting very well: Smalltalk is bootstrapped into a self-supporting image and from then on, everything happens inside the image. If you make improvements to the Compiler class or to text editing or any other part of the system, the benefits apply to all code.

However, by breaking your text editor / your compiler / whatever, you can very easily get yourself into a broken state. The root cause is that after building better tools and abstractions, we “pull up the ladder” behind ourselves.

Could we perhaps force the computing/tool bootstrapping pipeline to stay alive and be maintained? I imagine something like a Jupyter Notebook: An environment with a sequence of code blocks where code can only affect the parts that come after. You could still improve the development experience itself – build better compilers, structured editors, new interactions and tools – but you could only use them after you defined them.

In such an environment, you can still improve the understanding of the bootstrapping pipeline (with additional documentation, interactive visualizations, traces, etc.) but you can’t replace your bootstrapping pipeline with tools that depend on the bootstrapping.

The most critical benefit is that you know what code can’t do: It can’t break existing tools. When working on some code, you know that your editor won’t break.

Additionally, I think being forced to serialize your code (and domain concepts!) into a linear acyclic sequence could help understandability, though I’m not sure. But I know that self-hosted systems like Squeak/Smalltalk tend to connect concepts when convenient instead of enforcing a strict layering (for example, why does the Object class have a method called “hasModelYellowButtonMenuItems” again…?). A linear, non-recursive sequence of small code chunks that only refer to previous code means you can read everything from beginning to end to deeply understand the entire system.
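To make the "code can only affect what comes after" rule concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the `Pipeline` class, the convention that a cell is a function returning a dict of new definitions, and the choice to reject redefinitions outright. It only shows the forward-only constraint, not a notebook or a Smalltalk-like environment.

```python
from types import MappingProxyType


class Pipeline:
    """A forward-only sequence of cells (hypothetical sketch)."""

    def __init__(self):
        self._env = {}
        self.log = []  # names of the cells that ran, in order

    def run(self, name, cell):
        # The cell only gets a read-only view of what earlier cells defined.
        view = MappingProxyType(self._env)
        new_defs = cell(view)
        clash = set(new_defs) & set(self._env)
        if clash:
            # Later code may add tools but may not replace existing ones.
            raise ValueError(f"cell {name!r} tries to redefine {sorted(clash)}")
        self._env.update(new_defs)
        self.log.append(name)
        return self

    def lookup(self, key):
        return self._env[key]


pipe = Pipeline()

# An early tool...
pipe.run("trim", lambda env: {"trim": lambda text: text.strip()})

# ...and a later tool built on top of it. The early tool stays as it was.
pipe.run("shout", lambda env: {"shout": lambda text: env["trim"](text).upper()})

print(pipe.lookup("shout")("  hello  "))  # prints "HELLO"
```

The guarantee here is shallow: names cannot be rebound, but a mutable object stored by an earlier cell could still be mutated by a later one. A stricter version would need immutable values or copies.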

akkartik · 19 June 2026 13:36

Yes, I think it requires changing people rather than changing software. The tool should include a curriculum.

But a curriculum is a heck of a lot of work, and building it takes a very different set of skills.

neauoire · 19 June 2026 14:49

The Movable Feast Machine and robustness-first computing are the equivalent in software. David Ackley has been exploring this for years, and his livestreams on the topic are excellent to watch.

I have heard people discuss linear logic for many years, but so far I haven’t seen a real-life software system built on its principles. Do you know of any? What I wonder is whether linear logic is just too alien to be tried out, or if people tried and failed for some interesting reason.

I’m sure you’ve tried linear types yourself in one language or another, or written some Forth, or Cat, which was an entirely linear language at one point, or maybe you’ve tried writing some interaction nets. Languages that have linearity almost always provide escape hatches (DUP, POP) as well.

Vine is an excellent language with a linear doctrine built in, and it’s very usable; I’ve linked to it in the Malleable Computing article above. Typically people like to mark regions of a program as linear (-o in Prolog) and have escape hatches (with !). A purely linear system makes it difficult to duplicate or destroy nested objects.
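As a rough illustration of "read-once" values with explicit escape hatches, here is a Python sketch. Python has no linear types, so this is only a runtime approximation: the `Linear` class and the `dup`, `drop`, and `add` helpers are hypothetical names, a second use is caught when it happens rather than by a compiler, and a value that is never used at all goes unnoticed.

```python
class Consumed(Exception):
    """Raised when a read-once value is used a second time."""


class Linear:
    """Runtime-checked read-once cell (hypothetical; not a real linear type)."""

    def __init__(self, value):
        self._value = value
        self._live = True

    def take(self):
        # Reading is destructive: the cell is empty afterwards.
        if not self._live:
            raise Consumed("value was already used")
        self._live = False
        value, self._value = self._value, None
        return value


def dup(cell):
    """Escape hatch: consume one cell, hand back two fresh ones."""
    value = cell.take()
    return Linear(value), Linear(value)


def drop(cell):
    """Escape hatch: consume a cell and deliberately forget its value."""
    cell.take()


def add(a, b):
    """Consume both arguments and construct a result cell."""
    return Linear(a.take() + b.take())


x = Linear(3)
x1, x2 = dup(x)      # x is used twice below, so it must be copied explicitly
total = add(x1, x2)
result = total.take()
print(result)        # prints 6
```

Note that `dup` here shares the underlying Python object rather than deep-copying it, which is harmless for numbers but would not be a true copy for mutable or nested data. That gap is one face of the difficulty with nested objects mentioned above.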

I work with multiset rewriting a lot, and this is how I categorize the possible state transformations:

A fraction is Linear if every resource involved is consumed exactly once and produced exactly once. The resource exists in both the numerator and denominator, and its exponent on both sides is equal. No resource is created, and no resource is destroyed. It is a state transition. Example: apple/dollar
A fraction is Affine if it is allowed to throw data away. A resource exists in the denominator, but its exponent in the numerator is smaller. You are consuming a resource and intentionally forgetting it without replacing it. Example: []/trash
A fraction is Relevant if it requires a resource to exist, but duplicates it rather than consuming it. The resource’s exponent in the numerator is greater than its exponent in the denominator. The resource is acting as a catalyst. Example: cell^2/cell
A fraction is Unrestricted if a resource can be copied or ignored at will. Example: x^y/[]
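For readers unfamiliar with multiset rewriting, the following Python sketch shows what applying one such fraction to a state could look like. It assumes the reading "denominator is consumed, numerator is produced", represents the state as a `collections.Counter`, and uses a hypothetical helper `apply_rule`; it is not the notation or tooling used in the post.

```python
from collections import Counter


def apply_rule(state, consumed, produced):
    """Apply one rewrite 'produced/consumed' to a multiset.

    Returns the new multiset, or None if the state does not hold
    enough of every resource in the denominator.
    """
    have = Counter(state)
    need = Counter(consumed)
    if any(have[resource] < count for resource, count in need.items()):
        return None
    result = Counter(have)
    result.subtract(need)             # take the denominator out of the state
    result.update(Counter(produced))  # put the numerator into the state
    return +result                    # unary plus drops zero counts


wallet = Counter({"dollar": 2})

# apple/dollar: one dollar is consumed, one apple is produced.
after_trade = apply_rule(wallet, consumed={"dollar": 1}, produced={"apple": 1})

# []/trash: the resource is consumed and nothing replaces it.
after_cleanup = apply_rule(Counter({"trash": 1}), consumed={"trash": 1}, produced={})

# cell^2/cell: one cell is required, two come out.
after_division = apply_rule(Counter({"cell": 1}), consumed={"cell": 1}, produced={"cell": 2})

# The rule cannot fire when the denominator is not available.
no_money = apply_rule(Counter(), consumed={"dollar": 1}, produced={"apple": 1})
```

Comparing the `consumed` and `produced` counts per resource is what the four categories above describe; this sketch only applies a rule and does not classify it.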

Looking at Baker’s Linear Lisp, you can feel the catlang running the show behind the scenes:

(defun fact (n f)
  (if-zerop n (progn f (1+ n))
    (let* ((n n-prime (dup n)) (f f-prime (dup f)))
      (* n (funcall f-prime (1- n-prime) f)))))

Lastly, and only somewhat related, I came across this excellent line in Koffman’s Knots paper yesterday. It reads like someone realizing that the logical foundations of early computer history, which do not account for resource usage, are missing something, without quite being able to put a finger on it.

“Boolean logic is obtained in notation by ignoring the existence of intermediate states.”

khinsen · 21 June 2026 08:36

Thanks for all those examples! Not sure I’d call any of them “real-life” but that’s obviously a matter of definition. Forth is definitely real-life enough for me, but not 100% linear. Baker’s Linear Lisp article is a very good explanation, but I wonder if anyone has ever used Linear Lisp in practice.

I don’t see the relation between linear logic and Dave Ackley’s work, but then I am only superficially familiar with the latter. I’ll put it on my reading list again!

Apostolis · 21 June 2026 15:51

Good points!

In addition, we haven’t really considered programming and science as a historical process. We don’t have the tools to do so.

In science, we have research papers that reference each other. After a while, some concepts are clear, and a book is written that omits all the historical information about the inception of the concepts.

Historicity is ignored, but this is the way we program.
We don’t know everything at the beginning. But the remaining artifact does not record the social process.

Does git store that information? Or the issue list / pull request list? I am not sure.

natecull · 21 June 2026 23:51

I remember about 20 years ago (hanging out on the original Ward’s Wiki - the Portland Pattern Repository, where there was a lot of interesting programming language theory talk) reading about some guy from the IBM mainframe world who had built his own “dataflow” based system. One of that system’s main properties was that objects always had “ownership” which they handed off to one process or another. So you couldn’t copy, only “give”. Apparently it made for a very reliable system. That seems to be a similar property to this.

And of course stack languages are very much linear in their access mechanisms, at least to the stack.

I do wonder though how this linear-access property can work with heap storage. Because I don’t think we have any “pure” stack languages, we always combine a stack with a heap (or a completely unmanaged memory space? Like Uxn does.) We generally want pointers between objects when we can do it, rather than multiple copies of data, because it both saves space and makes for fast access…

… except that both the Internet and multiple layers of cache really don’t like pointers very much. And relational databases are all about values rather than pointers.

So we seem to have a weird sandwich at the moment of:

actually-existing CPUs seem to use register files rather than stacks for some reason, probably because it uses more silicon
stack is nice and fast and local and cheap to build and is quite easy to think about and has linear access properties, which match real-world hardware (except for all the physical CPUs we’ve built with massive register files that hate stacks)
heap is fast and local but not always especially linear, so we have to think a lot about how we manage it - and out-of-bounds access and use-after-free are hungry ghosts which stalk our entire Internet infrastructure and cost us billions in ransomware each year
but our actually-existing heap, even in a single machine, is now a whole sub-sandwich of L1, L2, L3 cache, RAM, virtual RAM on SSDs, maybe also slow magnetic drives for poor people running old machines, and the big problem is now how to keep data local and avoid cache misses
beyond one process’s memory space, we have to either talk to a database, or marshal values on and off the wire (copying rather than pointing) to talk to another process, or send to an Internet service, so we’re back to somewhat linear access (but not entirely - it’s not possible to know whether an Internet file or app server has actually deleted its copy of data - though lower-level TCP/IP packet routing is very much linear)
do we want something like a Web server where, every time you make an HTTP GET request, it deletes the page? Because that’s what a linear access model would look like if we scaled it to the Internet, right?
But perhaps we need to be thinking carefully about the difference between read-only vs read-write objects, which maybe need very different access models each? A read-only object can be content-addressable, like by a hash, and downloaded/stored/cached locally. A read-write object, now we have to deal with write contention and locking and access permissions and a linear access model makes a LOT of sense there.
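The read-only half of that distinction can be sketched in a few lines of Python. This is an assumption-laden illustration: `ContentStore` is a hypothetical in-memory class, SHA-256 is just one possible choice of hash, and nothing here addresses the read-write side (contention, locking, permissions).

```python
import hashlib


class ContentStore:
    """In-memory content-addressed store for immutable blobs (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, so storing the same
        # bytes twice is harmless and always yields the same key.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Anyone holding a copy can check it against its address, which is
        # why such objects can be cached or mirrored freely.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
page_v1 = store.put(b"<h1>Hello</h1>")
page_v2 = store.put(b"<h1>Hello again</h1>")

# A "change" never overwrites anything: it produces a new address.
same_again = store.put(b"<h1>Hello</h1>")
```

Since an address can never point at different content later, copies of a read-only object do not need to be tracked or invalidated, which is the property that lets them sidestep the linear-access question.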

Can we simplify this sandwich at all? Make every resource in a system use the same kind of linear access model? Or is that not physically realistic, and we just have to have multiple conflicting resource-access models and constantly juggle their conflicts and sharp edges in our heads?

khinsen · 22 June 2026 06:15

Side note: the pattern language wiki, the first wiki ever, is still around, though in read-only mode: https://wiki.c2.com/

Marcel · 22 June 2026 10:05

True. The “History as a First-class Citizen” thread (see post #61 by khinsen) also discusses this tension between having a small, curated, thoughtfully distilled artefact, and having a rough, raw, immutable stream of imperfect events, capturing everything about the process.

In my view, deeply understanding how and why a tool evolved a certain way cannot be solved with technology alone. Just capturing all the data (git commits? pull request comments?) is technologically possible, but I think the valuable part is reflecting on the process and what forces drove you to make certain decisions. I believe this is not something that can be done up front: Pull request comments of design decisions make sense in the moment, but you are always smarter afterwards. People learn, their opinions evolve, and dumb decisions are only obvious in hindsight.

I resonate most with the idea of building small, cozy artefacts, but explicitly reflecting on the process and documenting important historical decisions. This act of reflection and understanding what drives a project is something active, first-class, and not something that can be captured automatically.

khinsen · 22 June 2026 10:47

Exactly. I explain this to my colleagues in academia by an analogy with scientific research. While you do research, you write down everything that you think could matter in a lab notebook. In the end, you publish a journal article, which is a story but not necessarily the true story of discovery, and certainly not the story with all the details. You can publish your lab notebook in addition for added transparency, but that doesn’t replace the journal article which is carefully written for others, not for yourself or your team mates.

Apostolis · 22 June 2026 13:06

In all cases, it is important to have a narrative of the process instead of just collecting data.

We could have stories in lab notes, research papers, and books, with all those narratives interlinked.

We need both the narrative and the artifact: the software stored with its history.

When we update a class or an object in Smalltalk, we keep only one narrative (the latest one) and one software instance, whereas it would be very useful to be able to access the whole history inside the environment.

One of the reasons, as discussed, is to be able to recover from a crash.

khinsen · 22 June 2026 15:20

Smalltalk implementations have traditionally kept a log of changes to the code (i.e. the classes). Squeak and its descendants still do. But it is intended for crash recovery, not storytelling. There is nothing but the changes themselves in historical order. No grouping into commits. No commit messages.
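The general shape of such a log can be sketched in Python as follows. This is emphatically not the format Squeak or any other Smalltalk uses: the `ChangeLog` class, the JSON-lines encoding, and the field names are all assumptions chosen to show the idea of an append-only record that is replayed in order after a crash.

```python
import json


class ChangeLog:
    """Append-only log of method definitions (hypothetical format)."""

    def __init__(self):
        self.lines = []  # stands in for an append-only file on disk

    def record(self, class_name, selector, source):
        entry = {"class": class_name, "selector": selector, "source": source}
        self.lines.append(json.dumps(entry))


def replay(lines):
    """Rebuild the code state by re-applying every change in historical order."""
    classes = {}
    for line in lines:
        entry = json.loads(line)
        # Later entries for the same method overwrite earlier ones.
        classes.setdefault(entry["class"], {})[entry["selector"]] = entry["source"]
    return classes


log = ChangeLog()
log.record("Point", "x", "^ x")
log.record("Point", "y", "^ y")
log.record("Point", "x", "^ x ifNil: [0]")  # a later edit of the same method

# After a crash, a fresh image plus the log yields the latest code again.
recovered = replay(log.lines)
```

As in the description above, the log holds nothing but the changes themselves: there is no grouping and no message explaining why a change was made, so it supports recovery but tells no story.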

This reminds me of the days when people discussed Git vs. Mercurial, before GitHub made Git the only choice for nearly everyone. “Rewriting history” was a big part of the debate, with Mercurial saying “history is fact-based and inalterable” and Git saying “history is the story we tell about the code”. As far as I know, nobody ever considered that the existence of multiple branches would have made it straightforward to store both the commit history and the editable story.