Malleable System Metrics

I saw someone ask the following question, and it hasn’t left my mind since. I’d be curious to know what people on here consider when thinking about this same question — it was originally asked about languages, but applies just as well to malleable systems. It goes:

how one goes about determining the “size” of a language, and how the size of a language then relates (or doesn’t) to things like the size of the compiler code base or the number of tests.

Some things to consider:

  • Is the size of the EBNF grammar a good way to measure the size of a language?
  • The number of productions or the number of words in the grammar?
  • What about the number of node types of your AST?
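As a concrete sketch of the first two bullets, here is a toy production counter over a small hypothetical EBNF grammar (the grammar and the counting rules are my own illustration, not a standard):

```python
import re

# A toy EBNF grammar for a hypothetical arithmetic mini-language.
GRAMMAR = """
expr   = term , { ("+" | "-") , term } ;
term   = factor , { ("*" | "/") , factor } ;
factor = number | "(" , expr , ")" ;
number = digit , { digit } ;
digit  = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
"""

def grammar_metrics(ebnf: str) -> dict:
    """Two crude size measures: number of productions (lines containing
    a '=' rule) and total token count of the grammar text."""
    rules = [line for line in ebnf.strip().splitlines() if "=" in line]
    tokens = re.findall(r"\w+|\S", ebnf)
    return {"productions": len(rules), "tokens": len(tokens)}

print(grammar_metrics(GRAMMAR))
```

Both numbers are sensitive to how the grammar is written (factored vs. inlined rules), which is exactly why the question of which one to use is interesting.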

What other metrics do people think are useful or interesting?


Lately I’ve been thinking about the idea of a “deal with a demon”. If you’re ever faced with a demon trapped within a pentacle, one of the safest things to ask for is a proof to the Riemann Hypothesis. Ideas close to math have fewer side effects that someone can legalistically find loopholes in to screw you over. But ask for immortality and receive pain forever, ask for a loved one back and receive a zombie, etc., etc. It’s hard to capture a desire in ironclad language because as embodied beings we have so many desires.

On this spectrum programming languages are not very math-like at all, so any metric depends on what you care about, and you’ll repeatedly discover that you care about things you forgot to mention. EBNF and AST only capture syntax, not semantics. Attempting to capture semantics will raise questions of, “in terms of what machine?” For example, if you have a Turing Machine, a good metric would be the size of an interpreter for the language. But then you’ll realize that you also want programs to run in a blink of eternity, which raises questions of translator complexity. Etc., etc.


Languages (human, programming, …) being open-ended, a first question to ask is what the boundary of the language is. Is the Python standard library part of the Python language, or is it an add-on?

You could go for a minimal kernel in which everything else could be defined, but then you end up with practically useless subsets. Case in point: the famous Smalltalk syntax on a postcard. Smalltalk syntax is useless without a minimal object system to get started, and even a minimal object system is a lot more complex than the syntax on the postcard.


Other examples like the postcard, I guess, would be the Lisp self-evaluator, which requires all sorts of unaccounted-for scaffolding (GC, etc…), and here I was about to say the Meta II compiler, which is itself a string-parsing machine that generates bytecode. So I got to thinking: could the metric of a language be the smallest subset of itself that can compile itself?
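The Lisp self-evaluator point can be made concrete with a sketch. This is a minimal expression evaluator (my own toy, not any particular Lisp), and the point is what it *doesn’t* account for: it silently borrows parsing, data structures, closures, and garbage collection from the host:

```python
# Minimal Lisp-style evaluator. Everything it leans on -- Python lists,
# dicts, closures, and GC -- is exactly the "unaccounted-for scaffolding"
# that a postcard-sized self-evaluator hides.

def evaluate(expr, env):
    if isinstance(expr, str):           # variable reference
        return env[expr]
    if not isinstance(expr, list):      # literal (number, bool, ...)
        return expr
    op, *args = expr
    if op == "quote":
        return args[0]
    if op == "if":
        cond, then, alt = args
        return evaluate(then if evaluate(cond, env) else alt, env)
    if op == "lambda":
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    fn = evaluate(op, env)              # application
    return fn(*[evaluate(a, env) for a in args])

env = {"+": lambda a, b: a + b}
print(evaluate(["+", 1, ["if", True, 2, 3]], env))  # → 3
```

Counting only these ~20 lines as “the language” undercounts it badly, which is the boundary problem again.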

In the case of the Python stdlib, I’d say that the metric for the Python language probably should not include the whole of the stdlib, but only the parts necessary to compile itself, or evaluate itself. Would the boundary of that circular autonomy, the ability to generate itself, be a good metric?


Yes, I think so. You’re in good company here with Niklaus Wirth. His performance metric with Pascal and Oberon was always the time to compile the system using itself. Since the compiler is compiled by itself, this reflects both compilation time and the quality of the compiler’s output.
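A self-compile timing harness is easy to sketch. The command below is a stand-in: a real Wirth-style measurement would invoke the compiler on its own source, but to keep the sketch runnable it byte-compiles a tiny file with the Python interpreter running the script:

```python
import subprocess
import sys
import tempfile
import time

def self_compile_time(cmd):
    """Time one run of a build command; returns elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# Stand-in demo: byte-compile a tiny file with the interpreter we're running.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("print('hello')\n")
elapsed = self_compile_time([sys.executable, "-m", "py_compile", f.name])
print(f"{elapsed:.3f}s")
```

The appeal of the metric is that it is end-to-end: any bloat in the language shows up either as more source to compile or as a slower compiled compiler.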

I did something similar with Mu as well. Simplest language to compile when all you have is x86/SubX.

One property both these examples share: they don’t fit on a postcard. Invariably the simplest language that tries to account for all dependencies ends up looking intimidating to human eyes.


It’s teetering close to the edge of the tarpit for sure.

I can’t imagine this metric being in any way an index of how good a language is — maybe an index of resilience, or bootstrappability. But to escape the tar sand, you’ve got to spill over a few postcard lengths.


“Smallest unit that can build itself” is a useful metric in the sense that it’s hard to cheat on, intentionally or by neglect. But it is comparable between languages/systems only if the bedrock platform it’s built for is the same. The size of the compiled version would then be something akin to a Kolmogorov complexity.