Fearless extensibility: Capabilities and effects

jryans · 1 November 2024 16:33

Here’s my submission!

natecull · 1 November 2024 21:40

Effects are a (relatively) newer programming language concept that allows tracking user-defined side effects as types (e.g. IO, memory access, exceptions) and also supports effect handlers to take some action when these effects occur. They’ve been percolating in experimental languages (Koka, Effekt, Unison) for a while now, and are starting to appear in more established ones (Scala, OCaml).

Ah! So that’s what people mean today when they say “effects”: not just “side effects” but “side effects as types”. Interesting.

khinsen · 21 November 2024 14:27

I finally got around to (1) read your contribution and then (2) learn enough about effects and effect handlers to make sense of it.

What isn’t quite clear to me is how your proposed solution actually solves your example problem of the color picker. You consider a language in which function calls are implemented as effects. Your extension would then presumably implement an effect handler for function calls, and somehow narrow this down to the specific function it wants to modify. But as you say, the extension needs to re-use parts of the implementation of a host function. To do that, it’s not enough to intercept function calls, you need to inspect the code of the function.

khinsen · 21 November 2024 14:29

As for your question about similar mechanism in dynamic languages, my first idea is the Common Lisp condition system. My impression is that it does everything that effect handlers do (assuming that I understood them correctly). But there is no type checking involved, so all your safety features are lost.

jryans · 21 November 2024 17:37

Thanks for doing that work, it’s great to hear your thoughts! I should consider writing another version that folds in more of relevant context so there’s less burden on the reader to run off and learn a bunch more things first…

Hmm, yes, for some reason I was overly focused on just calls in post… I suppose I was assuming you could achieve quite powerful extensions by “manipulating” just calls and their return values (via these newfangled call effects and handlers), but of course (as you say) that’s not enough in general.

You might want some fragment of a function’s behaviour without calling the whole function (since it might have other parts you don’t want). So then it starts to feel like you need to be able to extract part of a function and run only that, but of course it will depend on various bits of encapsulated program state and intermediate computations that would have happened the original function it’s being lifted from… The approach in the post was already feeling a bit complex, and this ratchets us up yet another level… I do feel like there may be a decent enough way to achieve it, but it’s too fuzzy of a thought for now…

Ah yeah, several facilities from Common Lisp do seem related. There’s conditions, as you’ve mentioned. The aspect-oriented bits of CLOS (with before / after / around methods) also achieve some parts of this story. But yeah, all of that would be untyped.

Type safety feels important to making this approach actually provide some guarantees. It can provide extension-time (as in, the moment you install / apply / add the extension) assurance that the current versions of the host and extension are compatible at the boundaries where they meet. It can also highlight mismatches that might be introduced by future versions on either side, which is very helpful for maintenance over time.

Anyway, as you’ve correctly pointed out, the details are still quite fuzzy in the post. I would like to try to construct a prototype system with just enough bits to prove out these ideas with some examples. I think that’s probably the best path to make the argument more convincing. The trap I keep falling into at the moment is the “prototype” system I imagine in my head keeps growing until it has so many features and layers until it becomes too much… I need to sit down and very strictly identify only the things that are truly required to prove out the idea.

khinsen · 22 November 2024 07:33

What would really help is a solution sketch for your initial example. Pseudocode is fine, if there’s no language that has the features you need. But it should look credibly implementable.

natecull · 23 November 2024 02:43

Hmm.

What would it mean to take a “fragment of a function”? Wouldn’t that just be the same thing as dividing your function up into smaller functions and publishing those smaller functions instead?

I get that the problem of writing extensions is that often the host program author hasn’t divided their functions or objects up in to small enough pieces.

In the Interactive Fiction community of the 1990s through 2010s, this was a big problem: there were custom-built languages and libraries with complicated world-modelling features, but every world model was never quite what any given game author wanted. For a while there was an attempt to use OOP to add “hooks” - a bit inspired by “aspects”, I think, with on-before and on-after functions, to override the default behavior. But this never really worked right, particularly when following a hierarchical class model.

One answer, used in Inform 7, was to abandon OOP almost entirely, and move the world-modelling into runtime-reflectable “rules” which could be dynamically turned on or off and overridden by the author’s code. This is maybe similar to what you’re proposing. The rules model did allow for quite a lot of granularity and reflectivity.

This of course only works because in IF everything runs in a small local VM and it doesn’t matter that the game code has root access to the entire “OS” of the VM. Whether this approach could ever be safely scaled up across the Internet - or even to the local desktop - is one reason why I felt nervous about it. (All the IF VMs now run in web pages as Javascript, and Javascript has the same security problem.)

Another answer was to just treat the libraries as source code, and have each project heavily modify the libraries themselves. That also doesn’t scale well across the Internet, but it does have the advantage of simplicity.

khinsen · 24 November 2024 10:00

From the point of view of extensibility (and malleability), rewriting systems are interesting indeed. In my digital scientific notation, which is a pure rewriting system, you can trivially extend any piece of code by including it in your own. However, that works only because it’s about pure computation without any effects. No I/O, in particular. For my purposes that’s just fine, but there are perhaps not so many use cases for pure computation.

Apostolis · 25 November 2024 09:24

Don’t you perform a computation with your rewriting system? By giving it a term and rules and letting it do the computation? And then viewing the result and giving another term ad infinitum?

I believe you do. What is missing is the formalization of that process.

khinsen · 25 November 2024 09:56

Not sure I understand the question. Yes, I do perform computations in my rewriting system, and I show their results, embedded into the document that contains the rules and the initial term (and more). But there are no effects outside of that document, so the document is a kind of sandbox.

Apostolis · 25 November 2024 12:27

You are the effect. If the document was not readable by anyone, then it would be a true sandbox.
There is no formalization of the effect in the language and the document , but the effect exists.

It is provided by the substrate, glamorous toolkit. But we do not model glamorous toolkit.

khinsen · 25 November 2024 15:04

Ah, I see, you are wondering about effects in the beyond-computational environment. Which exist, of course, otherwise there would be no point in doing the computation.

But if we interpret “fearless extensibility” from this wider point of view, we will have to examine all processes that have our computations as inputs. Meaning that we have to worry about our software being used for planning wars, for example.

Apostolis · 26 November 2024 10:36

My point is that the formalization / language presupposes a specific séparation of a system from its environment.

In this case, I think that your system is pure computation but if we include the environment of the document, then it has input output, it simply isn’t formalized in your language.
A document can have many streams of data as input , or input fields or responses to a server (if it is a web page).

I am not an expert on the effect system, but similarly , it makes the decision to consider the outside world as the environment and treat it differently than what is inside the computer.

Another way is to treat it the same way but to abstract it. For example, the existence of a world object where you send all outputs and receive all inputs.

khinsen · 26 November 2024 12:46

Indeed, and if we cut up the world object into smaller pieces, we get a big part of the capability model.

What I don’t quite understand is what the “effects” crowd cares most about. As @jryans suggested, I looked at the Web site of the Effekt language. It describes effects more as a control structure, a type-checkable version of Common Lisp conditions, rather than as a mechanism for dealing with effects a program has on the outside world. Unfortunately, that’s about all I know about current work in this field.

Apostolis · 27 November 2024 18:01

This is a good introduction :

People who have programmed in haskell with monads and monad transformers might understand better why they help. They replace monad / monad transformers.

jryans · 27 November 2024 20:30

Martin Odersky recently gave an ICFP keynote talk on “Capabilities for control”, which focuses on effects, capabilities, and other related topics. That may also be useful intro to this space.

jryans · 27 November 2024 21:55

First of all, thanks for this feedback!

Roughly speaking, yes, it’s the same as dividing up functions, but as you say, the extension author might want a function to be divided up in such a way that doesn’t currently exist. In particular, they might need non-contiguous regions of a function. It would be a bit tricky to specify exactly which parts you want, especially in a manner that continues to works as software evolves.

natecull:

In the Interactive Fiction community of the 1990s through 2010s, this was a big problem: there were custom-built languages and libraries with complicated world-modelling features, but every world model was never quite what any given game author wanted. For a while there was an attempt to use OOP to add “hooks” - a bit inspired by “aspects”, I think, with on-before and on-after functions, to override the default behavior. But this never really worked right, particularly when following a hierarchical class model.

One answer, used in Inform 7, was to abandon OOP almost entirely, and move the world-modelling into runtime-reflectable “rules” which could be dynamically turned on or off and overridden by the author’s code. This is maybe similar to what you’re proposing. The rules model did allow for quite a lot of granularity and reflectivity.

What a coincidence that you mention IF and Inform 7…! I was just recently exploring Inform 7 again last week. Anyway, yes, it does feel like the model of extensibility and overriding I have in mind would fit in a bit more naturally with that kind of rule-focused system. (My Inform 7 knowledge is currently limited, so I might easily be wrong…)

Yes, you’d certainly have to be very careful with the security implications. I don’t have the full picture yet, but the rough version in my head at the moment would be something like the following… Rather than restricting extension and modification in the name of security, address security through other means:

use memory-safe languages
add run-time assertions for key security invariants
use types to check what you can at compile time
apply sandboxing to isolate bits of code (e.g. module-level or similar) so that extension access to resources can be controlled and reviewed

With various security defences like those in place, my hope is that it would be okay to allow more extensibility than we see today.

On a semi-related tangent, I do think it’s quite powerful to have a typed AST of both the host and extension programs on hand. This would be one route to enabling extension-time checking of the bits where extension and host interact. Scala’s TASTy format is one example of this. Scala uses it to enable quite a few important features like separate compilation, typed metaprogramming, language server abilities for editors, program analysis and optimisation, etc.

Thinking about fragments of behaviour smaller than single functions is definitely a powerful, complex ability that would take a good deal of experimentation to get right, and I don’t have all the answers now. In the ideal case, it wouldn’t be necessary to break things down at such a fine-grained level, but sometimes it’s all you have, especially if host platform is not interested in providing the precise mix of behaviours you need.

natecull · 28 November 2024 09:40

Hi Jryans

I’m glad you’ve also discovered Inform 7. It’s a very strange language in many ways, and I don’t like a lot of what it does, but it does so many things so very differently from everything else that I find it a very useful test-case for theories about language design. It breaks most modern assumptions immediately.

For an actual language for programming IF, I think I much prefer Linus Akesson’s Dialog (Dialog) which is essentially Prolog but with some odd quirks that are worth thinking about (for example, every predicate call comes with an implicit ‘cut’ unless you explicitly ask for multiple results: this both simplifies the mental model and works really well for specific-special-case-with-default-fallback kind of definitions, which come up a lot in IF programming). But in terms of metaprogramming or runtime reflectivity, it doesn’t really have any - it’s an oldschool C-style compiler that relies entirely on importing source files, and having the source code to all the libraries that you’re using, and being very careful to import them in a strict order.

In particular, they might need non-contiguous regions of a function. It would be a bit tricky to specify exactly which parts you want, especially in a manner that continues to works as software evolves.

I guess I’m still trying to understand what you can possibly mean by “regions of a function”. Pieces of an object I can grasp, because an object is a structure with named members. But pieces of a function don’t generally have names. A function is pretty much just a tree of literals, function definitions and function applications, isn’t it? There’s not room for very much else in there.

Are you thinking of a piece of code importing a function from an external library and saying “hi, I’d like entries 1.2.3, 4.5 and 7.9 from the big tree of function-calls-with-arguments that defines you, in that very specific order, and please don’t ever be refactored to call those in a different order”? Or “I’d like private variable X in your internal environment (which only exists at some point during a call), and please don’t ever be refactored so that private variable X is renamed private variable Y”?

If you absolutely do need access to someone else’s code at a level below “function”, wouldn’t it be much simpler to just copy-paste that code into a file or object that you control, so you know it won’t change out from underneath you?

I guess there’s the Emacs Lisp approach of using a lot of global variables, or dynamic rather than lexical binding, so that you can change the definition of a variable/symbol before you call a function, that way sorta-kinda getting access to its internal environment. There’s probably a good argument that some level of dynamic environment is needed to get full extensibility (and I feel that John Shutt’s vaus are maybe a safe-ish way of doing this: macro-like functions which take the current runtime environment as an argument. Although this still delegates a lot of trust – far too much trust – to every macro call, much more than function calls).

A problem in the IF world that has made me wonder a lot about variable scoping and environments, is this:

Suppose I have an adventure game with magic spells, as in Zork / Enchanter etc. And the result of a spell is defined by a rule. And I also have NPCs in the game with basic character AI which would like to do goal-planning, involving predicting the effects of what casting a spell will do. And I would like to not write my spells twice: once in “code” instructing them to change the live game, and again in “data” describing for NPC AIs how they might change the live game if run.

I would then like spells to be defined in such a way that they don’t immediately alter the game environment, but rather they return a data structure describing the effects that they will have on the game. This could end up quite tricky: even if a spell is defined by a function, that function can’t necessarily query anything in the current “live” environment, but rather would need to have an environment passed to it, from which it can compute a new environment, or a set of changes to an environment. So that then the AI of an NPC can check that list of changes, work out what the result of casting a spell might be, and use that to figure out when it needs to use it. Which is all fine, except that that means that now absolutely everything in the game world has to be defined not by ordinary objects or functions, but rather by things that can be put into such a speculative “environment” and passed around.

And having environments, and changes to environments, being fully dynamic, runtime-reflectable data structures, seems to make name-binding of variables behave in non-intuitive ways compared to how we normally think, with today’s programming languages.

When I last looked at this problem, which was a few years ago now, I found myself constantly frustrated by the limits of programming languages: doing these two steps of dynamic indirection always felt awkward and wrong, but it feels like something that ought to be very simple. And it felt like something that could yield interesting language-design insights if solved in the simplest possible form. I probably should fire up Dialog again to see if I can model the problem.

jryans · 28 November 2024 14:34

Thanks for mentioning Dialog, I’ve also recently stumbled on that as well… I’m far too new to these IF languages to offer any commentary just yet, but hopefully one day…! I definitely agree Inform 7 is quite different than many other languages… For now, I’m trying to reserve judgement and absorb what I can from it, as it may unlock new ways of thinking.

natecull:

I guess I’m still trying to understand what you can possibly mean by “regions of a function”. Pieces of an object I can grasp, because an object is a structure with named members. But pieces of a function don’t generally have names. A function is pretty much just a tree of literals, function definitions and function applications, isn’t it? There’s not room for very much else in there.

Are you thinking of a piece of code importing a function from an external library and saying “hi, I’d like entries 1.2.3, 4.5 and 7.9 from the big tree of function-calls-with-arguments that defines you, in that very specific order, and please don’t ever be refactored to call those in a different order”? Or “I’d like private variable X in your internal environment (which only exists at some point during a call), and please don’t ever be refactored so that private variable X is renamed private variable Y”?

If you absolutely do need access to someone else’s code at a level below “function”, wouldn’t it be much simpler to just copy-paste that code into a file or object that you control, so you know it won’t change out from underneath you?

Sure, I haven’t been very clear thus far… I suppose I’ve mostly been thinking in an imperative language context. Extending the snippet from the submission, a host platform may do something like:

def pickColor() {
    val colorName = colorPicker.choose()
    showNotification("color picked")
    val colorValue = colorName match {
        case "red"   => "#ff0000"
        case "green" => "#00ff00"
        case "blue"  => "#0000ff"
        // many more entries...
    }
    palette.add(colorValue)
}

An imagined extension might want to execute some but not all of this, e.g. it may want to invoke the color picker UI and convert the color name to a value, but it would prefer that to happen silently without a notification and without changing the palette.

Obviously I’ve crafted this in quite a particular way for the sake of an example, but hopefully it suggests some of what I’m thinking. One might hope this host function was already broken up into several functions, but that’s not always the case. Yes, you could copy the color name to value conversion into the extension, but perhaps new color names are added, and now you’re out of sync with the host. It would be better to have only one copy of that around. Copying the behaviour that interacts with the host’s color picker UI may not be practical depending on the language’s scoping / namespace / accessibility details.

I would guess that with other language families (functional, logic, etc.), you’re less likely to have bits of behaviour “strewn about” a large function, so it’s mostly an imperative language concern.

It does seem challenging to ensure different modules are all cooperating nicely in fully dynamic systems such as this. My current feeling is that having both type safety together with reflection and metaprogramming (that can still be type checked…!) is a good potential path here. Scala (somewhat uniquely…?) seems to achieve this, so that’s why it’s on my mind lately.

natecull · 29 November 2024 10:07

It does seem challenging to ensure different modules are all cooperating nicely in fully dynamic systems such as this.

The situation I was thinking about with “spells” is not quite the same as your “color picker” example, but it’s possible it might be connected at some point.

One of the ideas was that I would be building something like an always-online MUD or virtual world. (This came from using a MUD, and being dissatisfied with how frustrating it felt to develop in it). Such a system would need all the rules which describe its virtual world, to be editable at runtime by the players - while the world was running, and and in tiny pieces - meaning that notions of “compilation” and “typing” and even “scope” became quite problematic to define. And you couldn’t use a mainstream language like C++ for the players to code in. An “object-oriented” approach would seem to make sense in that environment, as indeed many MUDs of the 1990s were. Lua or Javascript would have been obvious choices, but both were quite painful to write in. And OOP failed so dramatically for IF that the trend moved away from objects and towards rule-based systems. And I was trying to figure out if there were any general rules suggesting how knowledge-based systems, which don’t really have a notion of locality or message-passing like objects do, could work with an always-on, never-fully-recompiled system. And if possible, that this system wouldn’t be hamstrung to a tiny VM built in the 1970s for text adventures, but could be something I could develop broader desktop applications on.

I was also thinking about the idea of a language that is user-facing needing to be maximally expressive, right down to the syntactic level - so that it can describe user-level (ie domain-level) concerns as directly as possible, in terms as close to the domain as possible. While still preserving the locally-based, always-on parts of the object-oriented vision.

I didn’t manage to figure any of this out, and I still can’t quite define the elusive spark that I was trying to figure out. But it was related to the idea that in knowledge representation (for example, the definitions part of an Inform 7 game file - but also in personal data bases, and almost everything we have on our desktops), what we store mostly aren’t functions (instructions on how to compute a value) but rather properties of entities. We’re not saying “do this, then that”, nor are we saying “X is the computed value of Y”. Rather, we’re saying something simpler and more fundamental: “X has property Y”. And then, some very small bootstrapping kernel (perhaps term-rewriting with lambdas or combinators, perhaps Robinson resolution with logic variables and backtracking, or perhaps something different entirely) should allow us to build computation out of that, as and when we need to, but without forcing pure declarations into the shape of computations if they don’t fit. There seems to be a very important difference there, but trying to trace that difference out leads one first into the crumbling ruins of logic programming, and then beyond that, the howling paradox-haunted wastelands of logic itself. Which is a very scenic country, in a 1920s German Expressionism kind of way, but doesn’t have much to say about how to build pleasant user-facing languages.

The big problem I have is that everything we work with in programming is just too darned complicated. The core essentials of a language - if it’s to be a knowledge-representation language, that you can use as a user interface, and build other languages on top of - have to be tiny, clear and simple. Even small, powerful ideas like “variable binding” start to look too big. The user wants to say something (containing variables) like “You can cast Frotz on any Carried X that is Frotzable; X will then have the Lit property” and have the NPC AI understand from this that “in order to get a Lit Lamp, since the Lamp is both Lightable and Frotzable, one possible route is to acquire the matches and some kerosene and first Fill and then Light Lamp; another is to first memorise Frotz from the spellbook, then cast Frotz on the Lamp”. A lambda function doesn’t seem quite the right way of representing the definition of the Frotz spell. Or how Filling and Lighting and Carrying work, either. A Prolog rule comes very close to capturing what we want - and that’s when one starts to look too closely at the true nature of Prolog variables, and madness sets in.