Concatenative Event Streams and Reactive Programming

The term “Functional-Reactive Programming”, as a concept of computation, gets thrown around a lot today, and, a bit like with Object-Oriented Programming, every person with an implementation of it seems to have their own definition.

Wikipedia: Functional reactive programming

Basically, the idea is that everything in a software system (an application inside a web browser, usually, but if it’s a valid model of computation then theoretically it could be an entire application, OS, or network of computers) is a “pure function from input to output”. Pure functional languages usually struggle with I/O, because it involves change. The various approaches loosely described as “FRP” try to remove this element of change by pushing the change itself into a “framework”, while the components which handle this change remain “pure”.

It’s easy to glimpse the loose spirit of FRP by looking at a graphical window. There’s a bunch of components. Events come in (mouse clicks, keyboard presses, updates from the data source). Events go out (screen rewrites, changes of state of the components, changes to the data source). Naively, you’d think “this window is basically just a big function from a bunch of input events to a bunch of output events; why can’t I just write it as a function? A big pipe where data goes in and data comes out? Wouldn’t that be simpler than all this object stuff of setup and teardown, callbacks and promises, invocation ceremonies, component registries, mockups and tests and such?”

An even more naive, blurry look at a window would make you think further: “well, this window is literally just a view of some data. A view is a function. Why can’t this window just be… a function call?”

Someone coming from the command-line world, especially Unix (or PowerShell on Windows), might also think “All my text commands are pipes. Why isn’t my window a pipe?” And especially if they’ve used Tcl/Tk, they might ask “there’s a scripting language where windows literally are pipes, why is the whole world not like this?”

And presumably some of this thinking seeped into the huge Internet companies writing Web frontends, and became things like React, Redux, Vue, etc.

I’ve been watching the FRP scene from a distance for the last 20 years or so, and remain confused about how exactly the core paradigm is supposed to work. My feeling is that while the edges can be as complicated as you like, the core mechanism of “how computation happens” ought to be fairly simple and graspable. In Lisp, it’s “lists are rewritten into other lists”. In Forth, it’s “words put stuff on the stack”. In Smalltalk, it’s “objects send method selectors and arguments to objects and get replies”. In Fortran, Algol and all the little Algol children, it’s “I do this then I do that, then I go somewhere else and do what it says there”. 28 years after it was first introduced, FRP remains … not quite that clear. Something updates, or changes, or doesn’t change, or gets “reduced” or doesn’t, but what, and how?

One idea that was buzzing in my head around 20 years ago - under the influence of Joy and Factor at the time, and the revival of Forth - was this:

“FRP systems are really event-based programming. A program then looks pretty much like a pure function from a stream of input events to a stream of output events. A graph of components, yes, but essentially a tree starting from the machine input. Each component in that graph/tree could then be expressed as a function which takes an input event and replaces it with (zero, one or many) output events, plus an updated function to catch the next input event in the stream.”
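Something like this toy Python sketch of the idea (names invented, not tied to any existing framework): each component is a function that takes one input event and returns its output events plus the next version of itself.

```python
# Toy sketch only: "component = function from one input event to
# (output events, next version of the component)". All names invented.

from typing import Callable, Iterable, List, Tuple

Event = str  # stand-in for whatever an event really is
# A component consumes one event and returns (outputs, next component).
Component = Callable[[Event], Tuple[List[Event], "Component"]]

def counter(count: int = 0) -> Component:
    """Emit a running count of 'click' events, ignoring everything else."""
    def step(event: Event) -> Tuple[List[Event], Component]:
        if event == "click":
            return [f"count={count + 1}"], counter(count + 1)
        return [], counter(count)  # no output, state unchanged
    return step

def run(component: Component, events: Iterable[Event]) -> List[Event]:
    """Fold a component over a whole input stream, collecting its outputs."""
    outputs: List[Event] = []
    for event in events:
        out, component = component(event)  # the function replaces itself
        outputs.extend(out)
    return outputs

print(run(counter(), ["click", "move", "click"]))  # ['count=1', 'count=2']
```

The important move is just that the component returns its own replacement, so all the “state change” lives in that hand-off rather than inside the component.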

I felt like this concept ought to be pretty simple and easy to grasp. Maybe too simple - it probably handwaves away all of the important detail. But I wanted to see it implemented.

I never did end up implementing it. Maybe it’s time to try.

20 years ago, I imagined the language and the evaluator as being a prefix-concatenative language (a super-rare breed, and one that hasn’t been worked on in the years since). That seemed important at the time as a way of capturing the intuition that a software system might be able to be represented just as an event stream with functions inside it, with functions plus arguments rewriting themselves to “output plus next version of the function”. But a postfix language would probably have a simpler evaluator and might still be able to capture some of the essence of the idea.

Essentially, functions (or Joy-like combinators, or Forth-like words) would be coroutines: they could produce one value, then yield control to the next function up the pipe, then if reactivated, would recurse on themselves (updating some state, just like a “fold” or “reduce” function). I assume that this is essentially how modern web FRP frameworks like React work, since they use the term “reduce”. But React is a library over the whole object-based complexity of Javascript, the DOM and the browser. This would be something much simpler.
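Python’s generators are a cheap way to prototype that shape (just an illustration of the coroutine-as-fold idea, not a claim about how React is actually implemented):

```python
# Illustration only: a coroutine that folds over an event stream, keeping its
# accumulated state between resumptions and yielding one output per event.

def running_total():
    """Receive numbers via .send(); yield the running sum after each one."""
    total = 0
    value = yield           # prime: pause until the first event arrives
    while True:
        total += value      # the "fold"/"reduce" step: update internal state
        value = yield total # emit an output, then pause for the next event

stream = running_total()
next(stream)                # advance to the first yield
for event in [3, 4, 10]:
    print(stream.send(event))   # 3, then 7, then 17
```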

I feel like this must have been explored, and yet I don’t see where it has been explored.

I now have a vague sketch in my head of a modified two-stack postfix machine that feels like it ought to be able to naturally do coroutines over infinite streams of events. It would use a layered stack and a “yield” opcode so that calling between coroutines could be fairly efficient and also safe. I hope. I’ll try to build it to demonstrate what I mean. Whether this approach could actually help in reducing event-processing complexity, I’m not sure.

Edit: It turns out that the Wikipedia article on “Reactive Programming” is much more useful than the article on “Functional Reactive Programming”. I guess FRP lost its F in the last few years?

I’m not at all sure that the two-stack concatenative coroutine machine is enough for doing concatenative streams. But the idea so far is:

We have the usual Work and Return stacks. We augment each of these with either a type tag or a second stack, so that each of the Work and Return stacks can store either a data item or a code address. (Data items can be addresses of code, but by “code address” I mean particularly an address which represents code-being-executed.)

Normally, the Return stack stores code addresses (of the next program counter after a Call opcode) and the Work stack stores data items. But the Return stack can also store temporary data items (which is normal in Forth-like stack machines)… and the Work stack can also store temporary code addresses (which is not normal). A code address on the Work stack represents a paused coroutine.

We get a paused coroutine by using the “yield” opcode, which leaves the top data item on the top of the Work stack, but puts the code address of a resume address under it. A code address on either stack acts as a stack floor, preventing normal access to data items under it, except via the Call/Return and Yield/Resume mechanisms.
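A very rough sketch of the data representation I have in mind (everything here is hypothetical: the cell layout, the opcode names, the idea of a “floor”):

```python
# Hypothetical sketch of tagged stack cells: DATA cells are ordinary values,
# a CODE cell is a paused coroutine's resume address and acts as a "floor".

from dataclasses import dataclass

DATA, CODE = "data", "code"

@dataclass
class Cell:
    tag: str    # DATA or CODE
    value: int  # a value, or an instruction address when tag == CODE

class Stack:
    def __init__(self):
        self.cells: list[Cell] = []

    def push_data(self, value: int):
        self.cells.append(Cell(DATA, value))

    def push_code(self, addr: int):
        self.cells.append(Cell(CODE, addr))  # becomes a floor for normal pops

    def pop_data(self):
        """Pop a data item, refusing to cross a CODE floor."""
        if self.cells and self.cells[-1].tag == DATA:
            return self.cells.pop().value
        return None  # underflow: the machine must Resume or read input

def op_yield(work: Stack, resume_addr: int, value: int):
    """YIELD: hand `value` up to the consumer, with the producer's resume
    address tucked underneath it as the new floor."""
    work.push_code(resume_addr)
    work.push_data(value)
```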

When an opcode executes that hits a Work stack underflow condition (i.e. there’s a code address on the Work stack and not enough data items above it), the program counter goes on the Return stack, and the code address on the Work stack is Resumed to generate more data.

(There’d be a little bit more complexity for the Yield and Resume states: we’d probably want to move any data items on the Return stack to the Work stack on Yield, and any data items on the Work stack to the Return stack on Resume.)

If the Work stack is ever actually empty of data, then a system “input” opcode executes that waits for input, or reads from a buffer, and puts one data item on the stack. Similarly, if the Return stack is ever empty, a system “output” opcode emits one data item from the Work stack (resuming any yielded continuations needed to do that).
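And a toy, runnable model of the underflow and empty-stack rules, with Python callables standing in for code addresses (again, invented shapes, not a finished design):

```python
# Toy model of the rules above. "Code addresses" are modelled as Python
# callables; cells are ("data", value) or ("code", producer). Names invented.

work = []                   # the Work stack
pending_input = [7, 8, 9]   # stands in for the machine's input stream

def read_input():
    """The system 'input' opcode: only runs when the Work stack is empty."""
    return pending_input.pop(0)

def counter_from(n):
    """A paused producer: when resumed, it pushes one value and pauses again."""
    def resume():
        work.append(("code", counter_from(n + 1)))  # re-pause under the value
        work.append(("data", n))                    # the yielded value on top
    return resume

def pop_data():
    """Apply the rules until a data item is available, then pop it."""
    while not (work and work[-1][0] == "data"):
        if work and work[-1][0] == "code":
            work.pop()[1]()                         # underflow onto a floor: Resume
        else:
            work.append(("data", read_input()))     # empty: block on system input
    return work.pop()[1]

work.append(("code", counter_from(100)))  # install a paused producer
print(pop_data())   # 100 (the producer was resumed to make data)
print(pop_data())   # 101 (resumed again)
work.clear()
print(pop_data())   # 7   (stack was empty, so the input opcode ran instead)
```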

I would also pair this with some form of cons memory and a garbage collector, so there’s no raw RAM access.

These primitives might not be all that’s needed - we might also need to manipulate a yielded coroutine in some ways - but hopefully what this might give us is a stack machine that’s strict about memory and stack safety (data items can’t be accidentally or maliciously executed) but flexible enough to handle realtime streams of events without separate I/O instructions.

Thanks, lots of interesting thoughts here.

This was a distinction I was trying to get at in my post at The Data-First movement - #7 by Bosmon but I realise I tend to write posts that are far too long with too many different points in them so I should try to condense!

  • “Reactive programming” is an older, more fundamental and at the same time more diffuse discipline than FRP and has somewhat been eclipsed by the discussion bubble around the latter
  • The cornerstone of FRP was originally the elegant type system around continuous-time behaviours, but this has more recently been muddied by various kinds of discrete-time FRP, which I think should properly be considered varieties of regular RP
  • On the one hand, whilst it is fun, I don’t consider that the continuous-time formulation adds anything of primary value to typical end-user programs; on the other hand, it distracts from issues which I see as crucial, such as the potential for glitches:
  • Note that the RP article has a substantial section on glitches, whereas the FRP one has none. A competent implementor of an FRP system (as suggested in the 2012 Bainomugisha et al. Survey on Reactive Programming) will tend to make sure that their system is glitch-free, but interestingly will not talk much about this (a small sketch of what a glitch looks like follows this list).
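To make the glitch idea concrete, here is a tiny hedged sketch (plain Python, no particular framework): with naive eager propagation, a derived value can momentarily be computed from a mixture of stale and fresh inputs.

```python
# Toy illustration of a reactive "glitch": with naive push-based propagation,
# a derived value is momentarily computed from a mix of stale and fresh inputs.

a = 1
b = a + 1            # b is derived from a
observed = []

def recompute_c():
    # c = a + b; in a glitch-free system this always equals 2*a + 1
    observed.append(a + b)

def set_a(new_value):
    global a, b
    a = new_value
    recompute_c()    # naive propagation: c updates before b has caught up
    b = a + 1
    recompute_c()    # consistent again

set_a(2)
print(observed)      # [4, 5] -- the 4 is the glitch; only 5 should be visible
```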

To me, the discipline of malleable systems is about making intelligible, visible systems with tractable and stable behaviour, and with as few hidden abstractions (sources of “divergence”) as possible. So whilst I appreciate the elegance of stack-based concatenative languages, I tend to worry that there might be a requirement for the user of the system to think too much about what is “on the stack” without it necessarily being a relevant part of the visibly executing system. What do you think about this?


That sounds like a good definition!

Yes, stack machines quite possibly are a side path with a lot of misfeatures. The only reason they’re interesting to me is that they seem easy to understand and to implement as a fundamental primitive (though not necessarily always easy to use or reason about). I feel I can grasp roughly how a stack machine works. I can’t wrap my head around how register-based CPUs or VMs work, and that means there’s a whole opaque layer of compiler magic between me and the lowest layers of a modern machine. I don’t much like that feeling of helplessness, and all the rest of our current API and platform distress (up to and including “Silicon Valley” as the ever less trustworthy “social API of the planetary Cloud”) seems to build on and extend that feeling.

But of course even stack machines have RAM as well, which doesn’t fit the stack metaphor. And stacks probably don’t play nicely with memory discipline (a malware Forth word that snuck into a system can immediately rewrite both the data and control stacks, which is already handing it the “game over” goal of most Windows malware). And most of the time we really do want something like objects (or at least hash tables) with named variables/slots, because the act of assigning something both a “name” and a “place” seems to be a large part of what programming is. The machine doesn’t care about names, and compilers want to erase them to improve efficiency - but the humans using and maintaining a system and trying to understand it very much do care about names and about the metaphor of “place”, and don’t want either of them erased.

I’m also really not sure if the “everything is literally just event streams” idea here is useful or not. I was hoping years ago that it might help as a bedrock abstraction layer, one that users might not use directly but that compilers, etc., could target. But I’m not sure if it’s really the same thing as Reactive Programming at all, or can be mapped one-to-one onto it. A good bedrock RP VM abstraction seems like it would automatically handle both the idea of “caching a function application and rerunning it when one of its source values changes” and the idea of “sometimes not caching a function application and just running it, because it’s faster and simpler”.
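As a hedged sketch of that dual behaviour (names invented, not any real framework’s API): a cell can either memoise its result and recompute only when marked dirty, or skip caching entirely and just call the function every time.

```python
# Sketch of the two modes: a cached cell that reruns its function only when a
# source changes, versus simply calling the function each time. Names invented.

class CachedCell:
    def __init__(self, fn, sources):
        self.fn = fn
        self.sources = sources    # thunks producing this cell's input values
        self.value = None
        self.dirty = True         # nothing computed yet

    def invalidate(self):
        self.dirty = True         # call this whenever a source value changes

    def get(self):
        if self.dirty:            # rerun only if a source changed since last time
            self.value = self.fn(*[source() for source in self.sources])
            self.dirty = False
        return self.value

state = {"x": 2}
cell = CachedCell(lambda x: x * 10, [lambda: state["x"]])
print(cell.get())     # 20 (computed)
print(cell.get())     # 20 (cached; fn not rerun)
state["x"] = 3
cell.invalidate()     # a real system would track this dependency automatically
print(cell.get())     # 30 (recomputed because a source changed)
```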

The thought was that if we have a very simple data-stream-based abstraction, then we could implement the important “caching” part (including name lookup, which may involve writing some kind of hashing function, or many of them: potentially an infinite rabbit hole of complexity) on top of that.

However, I generally feel that the fewer lossy one-way “compilation”-like processes there are, and the more direct manipulation of long-lived data structures there is, the happier we’ll be. The trick is to find suitably small and stable such structures; the OOP world doesn’t seem to have settled on either “small” or “stable” yet. Javascript objects are too large to be such building blocks if we have to implement all of Javascript. Java and .NET, likewise. Lua objects, maybe, but I’m not totally convinced.