Spaces of Data

natecull · 30 July 2024 10:10

So here’s a half-baked idea which has haunted me since around 2006. I don’t have much hope that I will ever get it to fully baked, but I want to put it out there in case there’s something which can be salvaged.

I was thinking at that time about the overlap between the Haskell idea of “curried functions” (ie functions of one argument) and the object-oriented idea of “sending a message”, and it seemed to me that given the appropriate language, you could pretty much get both at once. Ie:

foo bar baz

can mean both/either “send message foo to object bar, then send message baz to the result” or “apply function foo to the value ‘bar’, then apply the resulting function to the value ‘baz’”.

So far so good. There isn’t, yet, a language that quite combines the curried-function idea of Haskell with the message-sending of Smalltalk, and does it without a compiler and a type system, but it seems like it shouldn’t be too hard to whip one up.

But the next idea was: okay, so then also the operation of “send message to object” or “apply function to value” also looks like the idea of “traverse a graph or node structure”. Ie:

“from a root, go ‘foo’, then go ‘bar’, then go ‘baz’”

which is much the same as a filesystem or object path, as in file “foo/bar/baz”, or “foo.bar.baz” .

Still not too strange. Just like Unix and also like JSON. So imagine that we have this one big massive unified “data-space”, which is a huge/infinite set of nodes (imaginary/virtual) which we traverse by either “following labelled arrows on the graph” or “moving into named subdirectories” or “traversing JSON-style object keys” or “sending messages to objects”, or “applying functions to strings/atoms/keywords”. All just the same operation: message-send/apply-eval/move-viewpoint.

The next idea is that, obviously, some of these “nodes” are computed, it’s not just a finite data tree structure. That requires coming up with a language/syntax/semantics… something in the way of Haskell or Lisp or Smalltalk… a bit trickier, but not that hard, right? Just a small matter of definitions…

(figuring out which small definitions would be needed took me nearly 20 years and I got lost multiple times, and remain lost. It turns out there are an almost infinite number of ways to construct a Turing-complete language, so you can’t do an exhaustive search. But a lot of those infinite number of ways don’t work.)

The next idea (and these ideas were all entwined, coming at me at once) was a bit harder. This data structure would be immutable and pure-functional, of course, so we need a way to model input/output and dynamic data. “Functional reactive programming” and React hadn’t quite hit yet, but “dataflow” was in the air, so I figured: yes, something like that. This tree/graph of data nodes would be linked by pure functions between each nodes, but some of those nodes could change over time. Ie, they’d be what the reactive paradigm calls “signals” or suchlike. You’d “attach” to them by referencing them, that would add you to a subscribe list, so you’d still be a pure function of that value, but each time the value changed, you’d be reevaluated. So we’d need a way to cache and incrementally compute all these so that you weren’t re-evaluating the whole tree every clock tick. Oh, and potentially store a bit of state at each node, so some primitive like “access last value” or Elm’s “fold over time”. Oh, and make sure that everything was synchronised. And find a way to somehow “freeze a value” if you actually wanted just its value at a certain point in time (triggered how?) And make sure to garbage collect references whenever you move out of scope. And add them in again if you moved back into scope, but not lose any states of the signal. Simple, right? A week’s work at most.

(I got completely stuck here, bouncing hard off the internals of Elm and React even when those were built. Just couldn’t get my head into what they were doing and how.)

But that wasn’t the end of it. The next part of the idea was that, well of course the data in this big tree/graph wouldn’t just be numbers, it would be database type of stuff, or logical assertions like in Prolog, Datalog or RDF. Ie it would be something like a “conceptual map” that could represent arbitrary relationships between things.

(I was thinking of Inform 7 here, and how it does adventure game modelling: an I7 file starts out as what look like a bunch of logical assertions in faux-English. “The house is north of the park.” This implies that “the park is south of the house”. I figured, just encode this as something like (house north-of park). But then if you looked for, or “moved your view to”, the space point labelled (park south-of), you’d find a (computed) value there called “house”. Obviously, you’d like to use this for not just adventure games but for personal note-taking, databases, defining type or class relationships in programming languages, etc…)

So each node could conceivably be an infinite multidimensional table of recursive arbitrary sequences of lists and atoms…

So just implement Prolog over Lisp, of course, and clean up its semantics, a weekend’s work… well, since nodes/predicates/functions could be so big, you’d obviously need to just do something really simple, like make them lazy lists or something… another day’s work…

(got stuck for nearly 20 years there, again)

And next, “concatenative languages” like Forth (and Joy and Factor) were having a moment in 2006, and Forth words operating in a pipeline looked very much like “send string of messages to an object” and also “Function operating on string of keywords”, so I thought, yeah, it’ll probably make it simpler to just whip up a concatenative language for this, then I won’t have to worry about how variable semantics work…

(20 years lost there)

Anyway, at the end of this, you’d have something like a completely virtual/imaginary/ hyperlinked “cyberspace” of computed data that you could “cruise through”, by creating “views” (instantiations) of it, and it would be functional, but also object-oriented, but also a Prolog-like logical database where you could “infer nodes” by making a query. And you could model networks or windowed desktops or arbitrary communicating systems with it. It wouldn’t be quite the same as OOP because messages would flow only in one direction between nodes, but there would be a little bit of state at each node. And it would all be super-simple and self-evident once you saw it.

Well, that was the thought in 2006. It was a nice thought. It is very blurry around the edges and isn’t anywhere near coming into reality. Except that I feel like “actors” are basically one glimpse of “nodes”, and the dominance of React in the Web frontend space (yet its complete absence of influence anywhere else in computing) is another glimpse of how nodes might operate. Dialog took my initial inspiration, Inform 7, and did exactly what I wanted to see in terms of “doing that but in Prolog” - but still very closely bound to the Z-machine and not usable as a general-purpose programming language. Kanren gives some glimpse as to how to encode Prolog-like searches into pure functions, although a frustratingly awkward and incomplete glimpse. Forth (Dusk, Retro, Uxntal) in 2024 is giving concatenative semantics another moment like Factor did in 2006, but Forth itself with its open RAM model isn’t quite suitable for this node-based virtual reactive logical cruisable data-base-space idea. SUSN is the most I’ve been able to salvage so far just in terms of “a recursive space of personal text-based data” and it’s not quite even the right shape let alone having any computing capability.

I wonder if there is anything to it.

alanz · 30 July 2024 17:34

Just throwing in a half-baked thought.

What about propagators? In one view you just end up with a graph of static related data, but the attached processes can flow an update through if anything changes.

natecull · 31 July 2024 01:40

I’m not sure exactly what a propagator is defined as, and in which paradigm, but perhaps? There are plenty of mostly-similar reactive/dataflow systems out now.

The idea was basically that all nodes would be “views” of (a pure function applied to) the data stream they were subscribed to. The function itself would also be a data stream, so if the definition of the function changed (ie due to a software update) then the view would also recompute / re-cache.

That is, every function call-site would be a join point of two potentially changing “signals”: the function, and the argument (or argument list).

So there would be two different kinds of relative “movement” in this graph, which could be interpreted either as the values in the node/function graph changing, or the graph itself “moving” (being applied to, as a giant function) a new abstract set of coordinates (parameters):

the movement of the user’s “viewpoint” (roughly like “changing directory” in a file system, or “clicking” in a file/object browser), which, seen another way, would just be changing the “parameters of the view function”. This would also be the same as “the node graph itself changing”, because you’d “build up” or “tear down” elements of the graph as a result of “moving” (changing your parameters/coordinates).
the flow of updated data between the nodes of the graph itself, as the external signals/streams that it was referencing changed. But this “internal flow of data” could also be seen the other way, as just the whole graph of viewpoints, itself, “moving through the abstract/immutable data space”.

This idea probably works out exactly isomorphic to Objects or Actors. Except that there seemed to be a one-way publish-subscribe flow of data - more like the React approach - rather than the two-way flow that we usually have with Objects, where a “downstream” consumer of an object can send a message back “upstream” to it and change it for all future invocations.

Weirdly, even though I was fascinated by this core dataflow-graph idea, I found I couldn’t get interested in actually-existing React, Redux and friends! Due to laziness on my part, no doubt. But they all just seemed too complex, too ad-hoc, and too linked to the web browser environment, and I didn’t want something that was a giant complex library that ran in a browser, but which was simple enough that I could understand and implement it myself.

Spoiler: I have not managed to implement a dataflow graph myself. I keep blocking on "but how does a pure function of a signal get to ‘mutate’ itself so that later versions of itself remember a decision made? I guess we can just by fiat declare that any function can keep one state variable around, write to it, and reference the last version. But I thought there must surely be a more principled mathematical formalism for this, and so far I haven’t found one.

This state variable wouldn’t be quite the same as a Scheme “closure”, where invocations of a function can mutate an environment. It might not either quite be the same as “become” in the Actors model. The state here would be attached to an invocation (instantiation / call site) of a function, not to the function itself, and would only potentially mutate when one of its “upstream references” (function or argument) changed. That is, the state attached to a “function call” would be a pure function of all of the previous values of the argument of the function - but it could change as those values change, allowing it to “reach back in time” and reference a past value. So you could have a view which keeps a “history” of its previous inputs - but not of its consumers.

How actually “sending messages” or “commands” would work, I still don’t really know. But it would all be pure-functional, so I guess you’d have to construct a data record containing the command (parameter) you wanted to “send”, then construct a new view of the result of applying that command to the “receiver”. Or, in a network, you’d have something like a local router which would subscribe to all the local hosts on its segment, receive their pure-functional current status which might include “messages to send”, bunch those up, loop them back to “receivers”, etc. There are subtleties here, and exactly how much “better” it is than the old Objects way I don’t know. Yet, React et al do seem to be heading in this direction, and databases have done it forever: bunching up state into atomic “transactions” and then projecting forward a new view of the dataset based on accumulating those transactions. It seems to really rely on Messages being very first-class entities - which Alan Kay would approve of, and Smalltalk does - but which not all OOP languages allow.

Ie, the sort of “message sending” I would envisage, in a network based on this idea, would be much more like creating “feeds” in, say, Scuttlebutt or NOSTR - or in Content-Centric Networking (Content centric networking - Wikipedia) - than sending packets in TCP/IP. You’d create immutable messages, keep them in your local message store (spoiler: another big blocker here is “but how do I delete? because I do need to actually delete secrets and nasty things”) , and then the network would “observe” your feed as you publish it (via a hierachy of routers/proxies) rather than you specifically “sending” it. And same for database updates. Would be more like Git too: you’d create “patches”, transactions, posts, etc, they’d all stay local forever, the network and any subscribers would pick them up as they observe them, then if you wanted, you could subscribe to other nodes and observe what they had.

Feels like there’s still something there. But, many sharp edges left to be worked out. Including the “no but seriously, how delete?” one.

One of the other sharp edges is: fast recomputation would heavily rely on “deltas”, ie differences between one (potentially huge) dataset and another. And you’d want to store those deltas just like normal data, because among other things, they’d be your Undo and your Backups and your Diffs and your Patches and your Transactions and your Posts. Like, imagine that your desktop is one function: it’s gonna be a function over your entire filesystem, plus all remote networks. That’s terabytes to petabytes of data you have a function over. You are NOT going to want to send updates of any of that as a whole thing. So we would need, pervasively, at all levels, a simple calculus of “changes to filesystem-sized datasets”. Which turns out to be surprisingly tricky, and very few people seem to be working on it. I mean you’d think for, say, JSON objects, we’d long since have worked out a way of specifying a JSON object which is a “set of changes to all nodes of another JSON object”, including adds and deletions, and including unambiguous differentials between “empty object”, “null”, “undefined” and “no literally I DELETED the value, there’s nothing there, not even an undefined”, and then, “no, I DELETED the value in the previous delta set which said DELETE, that delete has now gone”. An unambiguous standard for this would be a minimal requirement (not that JSON itself is a good format for basing a system like this one; it’s not.)

Google Wave looked like it might be a small step in this direction - aaaaand it went before they could even launch it. Elm’s “record” type seemed similar: but they hand-crafted their own language with their own syntax and a compiler and a typesystem and a gorilla and a jungle. I just wanted to know “how do you canonically encode a delta Record as JSON, cos I wanna use that everywhere?” but that didn’t seem to be at all interesting to the Elm people. And now Elm seems to have become unfashionable and will presumably vanish like the rest, with its compiler and gorilla and jungle and all.

alanz · 31 July 2024 17:10

I am referring to the kinds of networks described in Propagation Networks: A Flexible and Expressive Substrate for Computation Alexey Radul.

There was also a recent write-up about using them in a web environment, Functional reactive user interfaces with propagators.

And while I am at it, your musings remind me of ZigZag

natecull · 1 August 2024 02:02

Ah right, those. I do remember glancing at that 2009 thesis years ago. I didn’t find it particularly helpful then, but I can be very dense when reading theory-heavy papers. That 2024 post looks much more interesting though!

Ah, frick: From that post, I see even Elm dropped FRP in 2016! And the whole POINT of Elm was FRP!

See, that’s why I lost interest in it. The Web frontend sub-industry can be so frustratingly fashion-driven, chasing after shiny nonsense (of interest to trillion-dollar companies for the purpose of creating dependencies on their central backend services), and not getting down and solving the actual problems that user-programmers need solving. (Because web frontend people don’t even really believe that such things as user-programmers exist.)

The reason why FRP is so appealing to me (on paper, at least) is that it allows for writing interactive programs declaratively. With FRP, I can simply describe the relationships between the various time-varying components, and the system wires it all together for me.

Yep, that’s where my head’s at too.

The whole “propagators” idea needs some thinking about: I’m sure this 2009 thesis is very, very clever, they always are, but, it still seems much more complicated than it needs to be, or at least much more complicated than “literally just functions, but recomputed in realtime” that I had in my head. The idea of propagators accumulating “information about” values rather than just the values themselves, seems a bit like types/constraints, and also logic variables in Prolog, which is all potentially interesting, but… can’t we just get basic function evaluation working first, and leave the really clever meta stuff to the next pass? And if possible, just implement all that as macros, or preferably, fexprs/vaus?

The dense wall of macro-heavy Scheme in the examples is also not endearing this approach to me. Surely, function application can be simpler than this? Like, “lambda calculus” simple. Whatever this model is, it feels like it hasn’t been digested down to its core essence yet.

The talk of “glitches”, yeah, that’s some of the tricky timing stuff in FRP that scares me and that I don’t have a good intuition about. And the question about loops, and where you put them. In the back of my head, I feel like carefully managed feedback loops inside each function evaluation are a key part of the answer. But I can’t get the back of my head to explain itself clearly to the front of my head.

And while I am at it, your musings remind me of ZigZag

Well, Xanadu and Ted Nelson’s “hypermedia” vision as a whole rather than just specifically the ZigZag phase of Xanadu, but yes, very much so!

A lot of stuff about Xanadu, that had been previously hidden behind patent walls, was appearing in programmer forums in 2006, and “data spaces” was my attempt at trying to reinterpret from first principles (mostly functional principles) what I thought Nelson might have been trying to do.

Trying to get “the front of my head” around what that “simple” FRP model might be. There are several levels of time-complexity of data:

Simple immutable data literal (like a cons cell)
A function over simple immutable data cells, or over other functions, etc, all the way back to an immutable environment. ie Lambda calculus, SKI calculus, or the immutable, pure-functional core of Lisp/Scheme.
A data cell which can change over time (ie, a system “input port”), or a cell which has been computed from a function over such. We need a notion of time-varying “signal” at this point. I want something very simple, so I think discrete steps are fine. We don’t imo need all the fancy time-curve machinery of original FRP, that’s way overkill.
A function over one or more signals - a “view”. We need a scheduler that can organize the automatic reevaluation of views, and probably a caching mechanism. I say probably, because we don’t always want to cache (it might be faster, and certainly will use less RAM, to just recalculate whenever we’re notified that our input signal has changed), and also, we might be able to derive caching from…
A way of “looping back” part of the result of each view - a “reducer” I guess – what Elm called “temporal fold” or “foldt”, but “reduce” is the more common term these days for fold. Perhaps just one special magic function that takes a pair of (state0, function) and returns a function that, externally, just takes time-changing signals (input → output) but internally, takes (stateN, input) → (stateN+1, output), and handles the loopbacking mechanics of “state”. And reducers would be our equivalent of a “process”: they would spin up when invoked by a view, and would be garbage collected when all views over their values vanish (how exactly, is perhaps the trick). Perhaps reducers could only be called from other reducers? Because I imagine they probably would have to be persisted in a reducer’s state to avoid going out of scope, being garbage-collected, and restarted from a new initial state? Perhaps, though, the global system “environment” is already a reducer, and so it can hold references to system-wide reducers, meaning most views can just be simple functions over that environment. But if they create their own reducers, then they’ll probably need to create and persist them in their own state.
Since “reducer” is React/Redux jargon, presumably React has already been there? But if so, why is React so darn more complex than just this one idea?

sph · 1 August 2024 11:03

Ah @natecull, I feel we are touching upon the same topics in our research. I will share some of my thoughts:

natecull:

Still not too strange. Just like Unix and also like JSON. So imagine that we have this one big massive unified “data-space”, which is a huge/infinite set of nodes (imaginary/virtual) which we traverse by either “following labelled arrows on the graph” or “moving into named subdirectories” or “traversing JSON-style object keys” or “sending messages to objects”, or “applying functions to strings/atoms/keywords”. All just the same operation: message-send/apply-eval/move-viewpoint.

The next idea is that, obviously, some of these “nodes” are computed, it’s not just a finite data tree structure. That requires coming up with a language/syntax/semantics… something in the way of Haskell or Lisp or Smalltalk… a bit trickier, but not that hard, right? Just a small matter of definitions…

While thinking of my object-oriented platform, it started to become apparent that you can model the entire system as a tree of objects. At the root there’s the Kernel, which spawns a few accessory objects of its own (Drivers, Processes). Processes contain their own objects as well, and are effectively “mounted” into the Kernel tree.

Likewise, you program this system by taking an object and sending messages to it. If the entire scope/environment is an object, resolving a name is nothing more than sending a message to the scope.

So the system becomes a tree of objects, programs are basically their own tree “mounted” into the kernel tree when they are started, and the entire thing is quite similar to the UNIX filesystem model, but instead of files and directories, it’s all objects.

Another nice thing, is that system security becomes trivial: an object can only message other objects it knows about. If you don’t want your TextEditor object to talk to the Internet, don’t give it access (i.e. do not mount it in its namespace) to the “NetworkConnection” object.

Nor I have issues trusting third-party code: a GUI program only needs access to the Screen and Keyboard objects, for example, and I’m certain they can’t do anything more than operating a screen or reading from the keyboard. See Object-capability model - Wikipedia

Not sure how why you believe this object model to be functional, but I wonder if you have read John Backus’ seminal paper “Can programming be liberated from the von Neumann style?” where he described his own “dataflow” system confusingly called function-level programming (but completely different than today’s functional programming).

This is where I’m stuck now. I’m trying to envision an object model whose language is based on concatenative message sends. You start with a receiver, explicit or implicit, and send messages to it. The stack model of Forth & co. I believe are an unnecessary complexity for anything higher level than Forth, so I haven’t managed to find an expressive middle way where you don’t have to bother with stack manipulation (DUP, SWAP, etc.), which do not map very well with message passing.

Yes! That’s the goal. In this system tree of objects, why not have objects which are remote? If objects == actors, you have parallel computation for free.

And if you can send these objects across a wire, you don’t need Kubernetes either.

I don’t know, but I am stuck I feel on the same problem and I wonder if there’s anything to it either. Intuitively, it does make sense to me, more than how we write and evolve systems today.

sph · 1 August 2024 12:30

Adding to my previous post, the idea of this “tree of objects” made my hamster spin.

I’ve been thinking about the concept of reification of the scope, which is fancy talk for “what if the scope in which we create objects and variables would be made explicit?”

In modern compilers, scope is an abstraction through which we interact indirectly when we define a variable and doesn’t really exist in the final program.

My question is what happens when you make it an explicit object? What if defining a variable can be implemented by sending a message to a Scope object? Scopes can be nested within each other and resolve names by traversing the parent link, so you could implement closures very easily this way.

By making the Scope explicit, your entire tree of known objects is an explicit construct you can operate, and change at run time. Scopes can be mounted within each other, turning the set of all objects into an actual “file system” that can be queried as if it were files on disk.

Is this a silly idea?

natecull · 1 August 2024 22:04

Very much not a silly idea, in my opinion. “Scope” is pretty much what is called “environment” in Lisp/Scheme systems , and there are various take on Lisp/Scheme which revolve around making Environments first-class objects that can be passed to functions.

John Schutt’s “vau calculus” is one such idea, and I think it has a lot of promise. The idea is that we can both simplify and clean up the Lisp/Scheme idea of “macros” (which has always been problematic, and various attempts at “macro hygiene” have made it confusing and worse), by allowing a certain class of function to take the current Environment as a parameter (and also not immediately evaluate all its parameters but receive them as raw data). This is the old original Lisp 1.5 idea of “fexprs”, but given a modern spin to rescue it from an old thesis that claimed it was inconsistent (and I think really the main idea is formalising how it works at the calculus level, which I don’t claim to understand).

Vaus/fexprs are also very close to how Smalltalk 72 works: functions/objects (there isn’t really a distinction between them in '72) receive the calling environment plus the unevaluated parameters as data. So Smalltalk '72 is all macros, all the time. Being macros is what makes message passing possible imo.

So yeah. Reifying scope/environment I think is key to making domain-specific languages, and I also think the “messaging” idea behind Kay’s OOP vision is the same as little state machines with their own DSLs. In the Scheme “closure” view of objects, each object is a pair of (function + environment), so environments/scopes are closely linked to objects in that model too.

But more to that, I think separating out and tracking scopes/environments and having processes explicitly operate on them, can also help us bridge the gap between “runtime” and “compile-time”. These “times” would just be two diffferent scopes/environments in my view: and there could and should be many more of them. “Deploy time”, “data-read time”, “parse/validate/typecheck time”, “test time”, “transaction time”, “resolve-conflicts time”, etc. All the stuff we do “outside” of a program and “in between” each component in a large system…managing changes, patches, configuration files, updates, continuous integration, modifying software-defined networks… all of these should just be different scopes, and what’s “compile time” and frozen versus what’s “runtime” and changing would be a matter of relative viewpoints. Everything is some process’s “runtime” somewhere - and it’s also some other process’s previous frozen computation given to them as an unchanging environment (for the duration of their existence, which isn’t forever).

I think there’s potentially some interesting similar ideas in Margaret Hamilton’s “Universal Systems Language” which is basically an OS as a tree of functions and types, but where the function calls have extremely hard realtime bounds on them (and recursion is very strictly managed) so a higher-level function can act like a process and also a scope. Universal Systems Language - Wikipedia

natecull · 1 August 2024 22:43

My (admittedly limited) understanding of Backus’ “function-level programming” is that he was literally just rediscovering “higher-order functions” – in a point-free rather than variable-binding style – but for some reason, despite being a genius, got stuck on a very simple point and somehow didn’t understand that a higher-order function just is a function. Not a special non-function thing! Just an ordinary everyday function that happens to operate on other functions. He would have been writing well after Lisp, so it’s baffling that he didn’t grasp this, but perhaps in a Fortran-heavy context, and before the “ML” family of languages which led to Haskell, and which I think were heavily influenced by his desire for “function-level” programming.

But we now know (or I think we know) that there just isn’t such a separate thing as “function-level”. Just functions. Functions all the way down. I can literally do “function-level” programming in Javascript - and I do. I use pipelines of higher-order functions to make a very simple query language. It’s great, and we couldn’t do this in BASIC in the 1980s.

Possibly what Backus was thinking with “levels” was what’s now called type theory (which is I think quite different from what was called type theory in his day). Ie, putting a type signature on your functions (as ML/Haskell etc love to do - but Lisp doesn’t) to enforce that your higher-order function-making functions can only take functions of the appropriate signature as arguments. And so he’s now got exactly what he wanted.

(Trying not to be too glib here: there were big wars early on over functions and types and whether it was at all “nice” for a function to be recursive. Lambda calculus before Lisp was considered a failed project in logic, a broken toy sitting in a drawer, and received many sideways looks from mathematical theory people before Lisp demonstrated that, paradoxical or not, you could at least use lambdas to compute useful stuff. But Computing was seen as a distinctly lower class than Mathematics. And there was a gender element too: mathematicians were mostly men, while “computers” were mostly women. As a direct result of these wars over completeness and paradox, I think there are deep problems with type theory and the way it goes the other direction, stratifying software into two separate languages: the programming language (dirty working class approximate Computing), and the type system (noble aristocratic exact Mathematics), with a massive monolithic compiler sitting in the middle. This whole “splitting off languages into caste levels which must not mix,” concept was very much of its era - yet it’s still dominating ours. I don’t much like the ML/Haskell tradition because of this. I’ll blame Backus directly for that. There was severe damage done here to the programming mind that we have not yet recovered from.)

(I really am not qualified to be so harsh towards Backus and ML/Haskell! And I probably have deeply misunderstood what his FLP was about. But, ML-derived languages do make me angry in a way I find hard to quantify. They break a principle I consider very precious: that every part of a software system should be “the same kind of thing” and have “the same rights”. Same with the “client/server” distinction. Directional flow of data/evaluation is one thing, but hard category walls are something I dislike, and I think are technological weapons being abused by tech billionaires right now to bring back an actual social class system. “Who gets to write the compilers / run the app stores / host the continuous-integration cloud?” is not an abstract question when every social interaction is mediated by an app: there are very real social consequences, imo, to how we choose to view concepts like the layering of types over functions - and whether the same group of people get access to defining both concepts, or whether one group gets privileged access. The key sticking point to me is: in ML-derived languages, can a mere programmer with no access to the compiler writing a program/function compute a new type and return it as data, passing it to another function? And if she can’t, then you’ve created a social caste barrier right there: between the people who write and run the compilation mechanisms (possibly cloud-hosted, so requiring industrial levels of capital), and the people who write and run the programs.)

But to why I think my model here is “functional” and not just objects:

I’m thinking of a pretty strict one-way information flow, although with some explicitly defined loopbacks that essentially just store one piece of state at that point. This is roughly the functional-reactive view, I think. This differs from normal objects in this way:

Say I’m an object/component/function/view “B” and I hold a reference “A” to an “upstream” object (a parameter passed to me, or a name I’m looking up in my environment). And I’m evaluating/reevaluating myself because one of my inputs (a parameter, or the environment) has changed.

In a normal object-oriented system, this would be a “method call” that I’m executing. But in this system it is not a method call, it’s a reevaluation of a pure function driven by the language runtime. That means the language runtime has scheduled when this occurs (as in a spreadsheet cell) - it was not directly caused by one of the other objects.

Also, in a normal object-oriented system, I as B could evaluate the expression “A.doBehaviour(foo)” and that would cause, immediately, interrupting all other pending computations, A to run its doBehaviour() method, which could mutate itself or any of the graph of objects it knows. I as B would be “reaching upstream” and mutating one of my inputs, and that would violate the referential integrity of B being a pure function over A.

In my sketch of a model, however, I can’t do that. I can perhaps call the pure-functional method-like-thing “A.computeValue(foo)”, which can do some computation but can’t directly change A. If I did want to actually change A, I would need to instead evaluate to a value which is a record suggesting that A - at some point in the future - should possibly change, if that’s all ok with it. I can’t directly define how that record is interpreted or what component interprets it (it would be one of my “upstream” components to which B “belongs”, and that value being integrated back into A, and how it happens, is a bit of a mystery and kind of key to the whole deal).

So Elm-style “records” and how they work, what their syntax and structure is, and how and where they’re interpreted as “messages”, are much more of a big deal in this “reactive” paradigm than in ordinary OOP. They are NOT “behaviour”, acted on instantly and directly by the receiver. They are values/transactions, at best “suggestions”, they are mediated through multiple layers of other objects… and managing them, and the arms-length style of changes that they require, is probably where this model becomes complicated and possibly breaks down entirely.

I don’t know if this model actually works. It’s attractive to me, but I’ve not successfully built an evaluator or anything in it. Evidence so far suggests that it’s very hard to write, eg, a video game in FRP. Whether that’s because the FRP model is subtly broken and not “this model”, or whether the breakage is in “this model”, I’m not sure.

natecull · 2 August 2024 02:01

This is also something I’m also wondering about at the moment, due to Retro/Dusk/Uxn (Uxn being the most fun to play with). Given a tiny stack machine, or if I custom-built a tiny stack machine, how can I do “message sending” so it’s super fast but also capability-secure? And can I accumulate multiple parameters for a message send on a Page 0 stack, without needing to build objects in main RAM? Ideally we don’t want actual stack shuffling, because “the stack” is a system-root-level object that most other objects shouldn’t have direct access to - if you can read another object’s messages, or worse, rewrite the return stack, you’ve broken the capability-security model and you might as well just give up at that point.

My benchmark is: If this were 1977 and I had an 8K Commodore PET 2001 - or 1982 and a Commodore 64 - and my younger self could reach forward in time and grab a new ROM with the best possible language in it: what could that language look like? Could we run a “Tiny Smalltalk-72” with no bitmap display in 8K RAM and 12K ROM?

Specifically, I’d like to get to a tiny command-line system where someone could write a prompt-driven game (David Ahl’s Star Trek, say, or Scott Adams’ Adventureland) entirely using the system prompt and messages/words. Just “activate a context/object/environment” which is your game - or an “input mode” of your game - and all the commands and data you enter at that prompt, until you typed say ‘BYE’, would be parsed by the system parser but executed as verbs/messages by your game code. You could sort of do this in a Forth which had multiple dictionaries; that might be the best achievable in 8K in 1977, but what if…?

(This idea is not the same as “data spaces”, and perhaps contradicts it. I am large, I contain multitudes.)

sph · 2 August 2024 14:50

Yes I have some basic understanding of how FEXPRs work, and their shortcomings however useful they seem to be.

A cool think about Smalltalk’s idea of making everything a message send, is that you don’t even need macros if everything is an object. Remains to be seen if making literally everything, even the scope, a runtime object isn’t a performance dead-end like FEXPRs have been in Lisp dialects.

Thanks for name dropping this, I’ve never heard of it. It’s still funny to me how some research in computer science from the 1960s-1980s nowadays reads like science fiction. We’ve lost a bit of ingenuity and have been digging into the same local maxima I reckon.

I think this is still unexplored territory, so making performance a day 1 goal might be setting yourself up for failure. Maybe it’s better to explore the problem space, and think about performance and elimination of message sends later. It worked for Self, and managed to break new ground in the JIT and dynamic recompilation space.

I have a similar, but crazier pipe dream: what if you could program an entire operating system from within itself? You start from a basic command line, and “exploratively” develop and interact with one object after the other. A bit like the Forth experience, if Forth was object-oriented and message passing.

Hence my quest to find the most minimal yet expressive object semantics (and the most minimal language semantics to fully interact with hardware) and make those the basis of an entire system. The only way out of the tar pit is by starting from scratch!

natecull · 3 August 2024 10:28

(Universal Systems Language)

Yes, USL and Hamilton’s accompanying software engineering philosophy “Design Before The Fact” are not very well known, and all I’ve been able to find out about them are essentially marketing materials, very light on actual details. My guess is that this is because the work she did after Apollo was for military missile systems - so, probably classified, as well as with commercial secrecy on top - and also, a community where they didn’t mind spending a lot on up-front engineering to get guaranteed realtime performance, but what the civilian software community would consider fairly unexciting capability.

I’ll give that some thought. It’s not really speed as such that’s blocking me, but implementation simplicity. The Smalltalk-72 macro approach of “an object is a program that reads tokens from the calling program’s text” intrigues me, but it’s still several orders of complexity higher than a RPN stack machine to understand what it’s doing.

That of course is the Smalltalk promise, but I’ve always bounced off any actually-existing Smalltalks I’ve tried to interact with, because of the oldschool GUI patterns. (I have the same problem with Emacs). So I agree, “a Forth, but objects” with a text interface would be a good beginning target.

Kirtai · 3 August 2024 13:38

I don’t know if that’s possible but I do know that Rosetta Smalltalk (St-72 based) and TinyTalk (St-76 based) both ran on eight bit hardware with very small memories.

Kirtai · 3 August 2024 13:41

Have you seen Ian Piumarta’s Id and COLA systems?

neauoire · 3 August 2024 18:03

I don’t want to suggest that you use Uxn for your project, but have you tried writing something like Uxn objects(where label scoping is all you get)? I use them everyday whenever it’s practical to do so, I found them extremely useful to protect bits of data, and expose clean APIs. They don’t do much but it helps with readability a bit, here’s an overview, you’re already family with the uxn syntax so maybe you could translate this into your dream system:

@some-object

	&some-value $1

	&some-method ( a -- b )
		,&some-value ADD 
		JMP2r

	&some-getter ( -- value )
		[ LIT &&some-private-value $1 ]
		JMP2r

( And to extend it further down the program )

@some-object/buffer $10

This works reliably well, here are some real examples in context(one is from drifblim, the other from a game we’re working on):

@Lambda
	&push ( -- name* )
	[ LIT &count $1 ] INCk ,&count STR
	DUP [ LIT2 &&ptr =&mem ] INC2k ,&&ptr STR2
	STA
	( >> )
	&name ( id -- str* )
	DUP #04 SFT hexc SWP hexc ,&id STR2
	;&sym JMP2r

	&pop ( -- )
	,&&ptr LDR2 #0001 SUB2 LDAk /name <create-symbol>
	,&&ptr STR2
	JMP2r
	&sym cebb &id 0000 $1

@event-warp =&on-draw =&on-collision
	&on-draw ( self* -- )
	POP2 .Screen/y DEI2 #0028 SUB2 .Screen/y DEO2
	;warp-chr !<draw-tall>

	&on-collision ( self* -- fail )
	<animate-warp>
	#00 JMP2r

OOP programming on a stack machine is pretty fun, your communication is done entirely on the stack. The static implementation for something like a OOP frontend that doesn’t need to be GC-ed and has no indirection might be a weekend project at most. And doing capability matching might not be such a stretch to do like:

@blue ( -- Color* )
@pen
	&set-color ( Color* -- )

blue pen/set-color
10,10 120,240 pen/draw-line
...

natecull · 14 August 2024 22:28

Something new (ish) that’s popped up on Hacker News: Tony Garnock-Jones’ (of RabbitMQ) “Syndicated Actors Model”. He’s been working on it for a few years now and I think I bounced off it at least once before, but it seems to share a bunch of features with my half-formed vaporware idea, and he even uses the term “dataspaces”. One of the features I recognise is the need for a “data language” (Preserves, in his case) and that actors would “assert” and “retract” (Prolog-style) statements. The idea of “observations themselves being assertions” is not in my simplistic model and might be an important part of what’s needed.

The language name “Syndicate” I assume means that he’s a fan of Scheme and Racket, with their increasingly gangster-oriented language naming convention.

Syndicate Language: About this project—Syndicated Actors

Synit, an attempt to use Syndicate for the init layer of a Linux OS, which seems very exciting: Introduction - The Synit Manual

HN discussion: The Syndicated Actor Model | Hacker News

emery · 3 December 2024 16:23

If you are interested in a vulgar example of FRP via Syndicate I managed to bolt the Nix evaluator onto Syndicate for reactive Nix. Basically you have an evaluation state that can be published into a space of data and advanced to other states. I’ve consider doing the same thing for reactive Dhall but doing this with Nix is almost useful.

khinsen · 5 December 2024 13:22

This sounds interesting, but unfortunately your writeup requires a lot more prior knowledge than I have. I haven’t even heard of Syndicate before. Do you have pointers to a concise introduction?

Apostolis · 5 December 2024 13:43

https://syndicate-lang.org/
by Tony Garnock-Jones

Apostolis · 5 December 2024 13:48

I do not think I can give a precise description.

It is a number of tuple spaces, that synchronize through bridges. The interesting thing though might be the conversational concurrency it talks about , which I personally haven’t figured out yet.

Maybe he could give a better answer.