Hi Chase. Thanks for the detailed reply, I’m starting to get more of a picture of what PLAN might be. It seems less like a Lisp and more like a pure Lambda Calculus system.
(Which is okay, except that I personally have a particular fondness for Lisp because of its Lists as opposed to its Lambdas. The Lists were there before the Lambdas were - in IPL, the Information Processing Language - and the importance of Lists as the primary construct is what’s missing from the entire ML-derived family of functional programming thought, which owes much more to the compiler-centric Algol way of thinking about things, where source code is considered an almost existentially different class of stuff from the contents of a computer’s executing memory. Lisp, by contrast, says that everything in the entire computing universe, inside or outside the computer, source code or object code, is literally just sequences of other sequences; a sequence is not always a function, but a function is a certain kind of sequence which happens to generate other sequences. I think that’s a cosmic-level revelation about universe-as-parseable-text, and one that ML/Haskell haven’t yet fully digested. That’s my personal prejudice, anyway, which may be too narrow but colours my approach to every functional programming system.)
So with my Lisp-vs-ML prejudice out of the way:
A pin is conceptually a node in a merkle DAG. It’s a hint to the runtime system that tells it to hash whatever value is inside the pin. You can think about it as a memory page where the value inside the pin is stored in a contiguous region in memory (and on disk) and is addressed by its hash.
Ok, that makes sense. I’ve been wondering what would be the best way to go about assigning secure hashes (SHA-256/512 etc) to something small like a function or cons or object, and it seemed to me that chunking lots of small items into a sequentially allocated “page”-like construct would have to be the way of doing it; we need to make sure that the amount of data associated with a large (32- or 64-byte) identifier is enough to absorb the overhead of the identifier and the work of hashing. So a hard-coded operation to do all of this in the VM would be logical. That would also make sure that the programmer can control when and where the work of normalising and hashing takes place.
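To make my mental model concrete, here’s a toy sketch of the “pin” idea as I understand it: serialize a value into one contiguous blob, hash the blob, and use the hash as the value’s address. The serialization format here is entirely my invention for illustration; PLAN’s real on-disk encoding will certainly differ.

```python
import hashlib

# Toy sketch of a "pin": a value serialized into one contiguous blob,
# addressed by the hash of that blob. The encoding is invented here
# purely for illustration - it is NOT PLAN's actual format.

def serialize(value):
    # Naturals become length-prefixed big-endian bytes; pairs recurse.
    if isinstance(value, int):
        body = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
        return b"N" + len(body).to_bytes(4, "big") + body
    if isinstance(value, tuple):  # a (head, arg) pair / application
        left, right = map(serialize, value)
        return b"P" + len(left).to_bytes(4, "big") + left + right
    raise TypeError(value)

def pin(value):
    """Hash the serialized value; the hash is its global address."""
    blob = serialize(value)
    return hashlib.sha256(blob).hexdigest(), blob

# Identical values get identical addresses, regardless of where or
# when they were built - which is what makes a merkle DAG work.
addr1, _ = pin((1, (2, 3)))
addr2, _ = pin((1, (2, 3)))
assert addr1 == addr2
```

Note how the “absorb the overhead” point shows up here: the 32-byte SHA-256 address only pays for itself when the pinned blob is comfortably larger than the address.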
A law is a user defined function. They accept a fixed number of arguments and return a single value. Laws can’t contain nested laws, but can reference other laws as constant values.
Ok so a “law” is roughly a Lambda, but:
- it uses de Bruijn indexes (numbers) and not names for the parameters, so that it doesn’t need a symbol table - okay, that’s reasonable, it makes a lambda easier to normalise
- however, every definition of a lambda always associates a name with it? So it’s what in a Lisp would be a lambda wrapped inside a let? In which symbol table is this name bound, how is that table defined, how is it processed? And can you ever define an anonymous lambda?
- what does “Laws can’t contain nested laws, but can reference other laws as constant values” actually mean? Can a law/lambda create a new lambda and return it as a value, or not? Can it reference a parameter/argument passed to, or a variable defined by, a “parent” lambda/law (i.e. what I would consider a “closure” in Scheme or JavaScript)? Or is it restricted to only being able to refer to its own immediate arguments?
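For readers less familiar with de Bruijn indexes: the point is that parameters are referenced by a number counting binders outward, so no symbol table is needed and alpha-equivalent terms are structurally identical. A toy evaluator (my own term language, not PLAN’s syntax) makes the idea concrete:

```python
# de Bruijn indices: variables are numbers counting how many binders
# out they were bound, so no names and no symbol table are needed.
# Toy term language invented for illustration - not PLAN's encoding.
#
# Terms: ("var", n) | ("lam", body) | ("app", f, x) | ("nat", k)

def evaluate(term, env=()):
    tag = term[0]
    if tag == "nat":
        return term[1]
    if tag == "var":                  # index 0 = innermost binder
        return env[term[1]]
    if tag == "lam":                  # a closure captures its env
        return ("closure", term[1], env)
    if tag == "app":
        _, body, fenv = evaluate(term[1], env)
        arg = evaluate(term[2], env)
        return evaluate(body, (arg,) + fenv)

# The K combinator \x. \y. x, written with indices instead of names:
k = ("lam", ("lam", ("var", 1)))
assert evaluate(("app", ("app", k, ("nat", 7)), ("nat", 9))) == 7
```

Two lambdas that a named syntax would spell differently (`\x.x` vs `\y.y`) are literally the same tuple here, which is exactly the normalisation win.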
Is this actually Lambda Calculus, or is it a restricted form of it, and if so, what’s the implication of the “no nested laws” restriction?
Oh yeah - it’s also a lazy VM, so applying a function to an argument presumably doesn’t immediately reduce to a value but creates an “application” object… which is cool, but at what point does actual reduction take place?
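My working model of what a lazy “application object” means, sketched below: applying a function just builds a thunk, and reduction happens only when something demands the value (with the result cached so the body runs at most once). This is generic call-by-need machinery, not anything PLAN-specific.

```python
# Sketch of lazy application: applying a function builds a thunk (an
# unevaluated "application" object); actual reduction happens only
# when the value is demanded, and the result is cached thereafter.

class Thunk:
    def __init__(self, fn, arg):
        self.fn, self.arg = fn, arg
        self.done, self.value = False, None

    def force(self):
        if not self.done:
            self.value = self.fn(self.arg)   # reduce exactly once
            self.done, self.fn, self.arg = True, None, None
        return self.value

calls = []
def expensive(n):
    calls.append(n)
    return n * n

t = Thunk(expensive, 6)
assert calls == []        # building the application did no work
assert t.force() == 36    # demand triggers reduction
assert t.force() == 36    # second demand reuses the cached result
assert calls == [6]       # the body ran only once
```

So my “at what point does reduction take place?” question is really: what are the demand points in PLAN - pattern matching, pinning, output?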
const k
By “const” I assume you maybe mean what in other programming languages would be called a “literal” rather than a “constant”?
(Since an immutable functional language presumably only has variable bindings of the kind that would be considered “constants” in other languages.)
Pattern match on a natural number (is it zero or (n+1)?).
Hmm, so your built-ins don’t include silicon-friendly mathematics as in Uxn (i.e. add/sub/mul/div/bit operations), but do at least include “increment” and a combined “decrement and test for zero” operation, which is quite cute and I like it.
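To check my own understanding of that primitive: a single zero-or-(n+1) case analysis is enough to rebuild ordinary arithmetic by recursion. A quick sketch (my own formulation of the combinator, not PLAN’s exact signature):

```python
# The "is it zero or (n+1)?" primitive as I read it: one combined
# decrement-and-test. Everything else can, in principle, be built on
# top of it. My own formulation, not PLAN's exact operator.

def nat_case(n, if_zero, if_succ):
    """Return if_zero when n == 0, else if_succ(n - 1)."""
    return if_zero if n == 0 else if_succ(n - 1)

def add(a, b):
    # Addition by structural recursion on the first argument:
    #   0 + b = b;   (n+1) + b = n + (b+1)
    return nat_case(a, b, lambda n: add(n, b + 1))

assert nat_case(0, "zero", lambda n: ("succ", n)) == "zero"
assert nat_case(5, "zero", lambda n: ("succ", n)) == ("succ", 4)
assert add(3, 4) == 7
```

Of course, a real VM would presumably recognise patterns like `add` and run them on silicon rather than recursing a billion times.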
Only having natural numbers is probably very nice for mathematicians, but it raises the question: how exactly is the ordinary programmer going to represent integers, floats, strings, etc? Do all of these have to be lambdas of some kind? If a function is passed a lambda, how does it know what kind of thing it represents? PLAN is basically untyped lambda calculus (or perhaps a very restricted subset of it; see above) extended with literal natural numbers - so is practical usage of it going to need strict typing, because there’s no way at runtime to “unpick” a lambda/law/app and find out what class/kind/type of object it is?
Your choice of “natural number” as the lowest concrete kind of value in the system is, again, interesting to mathematicians, perhaps, but I’m not sure exactly how it’s “hardware friendly”. No computer hardware gives us raw natural numbers - it gives us unsigned integers, yes, but in fixed word sizes. If the VM has to manage unsigned ints of arbitrarily large sizes… that might potentially cause problems? What happens if a piece of source code off the Internet tries to allocate a single natural number whose representation in RAM takes, say, 5 gigabytes? 5 terabytes? 5 petabytes? Can a VM ever throw up its hands and say “sorry, number too big”, or does doing that mean it’s failed to correctly implement the spec?
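On the integers question specifically, one plausible answer (my suggestion, not anything PLAN specifies) is a ZigZag-style mapping, as used by Protocol Buffers, which folds the signed integers onto the naturals:

```python
# ZigZag encoding maps signed integers onto naturals:
#   0, -1, 1, -2, 2, ...  ->  0, 1, 2, 3, 4, ...
# This is my illustration of *one* way to get integers out of
# naturals - PLAN itself doesn't mandate any such convention.

def zigzag_encode(i: int) -> int:
    return i * 2 if i >= 0 else -i * 2 - 1

def zigzag_decode(n: int) -> int:
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

for i in range(-5, 6):
    assert zigzag_decode(zigzag_encode(i)) == i
assert [zigzag_encode(i) for i in (0, -1, 1, -2, 2)] == [0, 1, 2, 3, 4]
```

But that’s exactly my point about interoperability: unless some such convention is specified, two VMs can disagree about what natural 3 “means”.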
We don’t share any of the strange language choices made by Urbit, we’re not opposed to meta-programming or modern type systems, we have no baked in hierarchical naming system, and no one is interested in global monarchy. Urbit is an art project; Pallas is not.
That’s good.
I’m not sure what you mean when you say that it doesn’t define its data types, could you elaborate?
So what I’m referring to is that, in a definition of a VM, I like the base constructs to be fairly concrete, so an implementation can be tested as to whether it’s compliant or not. And in a programming language, I like “data literal” to be a thing that’s a defined part of the language (and therefore of the VM, and of the calculus that the VM implements). And further, because a thing I like to do a lot on my computer is to store and process and retrieve sets of data, I like the “data literals” that my VM processes to include some kind of structured type. While I’m aware that my computer deals fundamentally with bytes grouped into words grouped into sequential pages of RAM and disk… I’d like my VM to give me some way of talking about chunks of data that include numbers, strings, and some kind of structure (preferably not more than one or two of those kinds: Lisp’s conses and JavaScript’s numbers/strings/arrays/objects sound about the right order of magnitude to me, with JavaScript having a few too many base data types).
PLAN appears to define one data literal (“natural number”) of apparently arbitrary size, plus lambdas, applications, and pins - but no data-structure types like “cons/pair”, “sequence”, “array”, “Unicode string” etc. If I have, say, an array of strings or numbers or object-like things to store… do I have to store it as a single massive “natural number”? Or as a lambda/law? I feel like those are two extremes: it would be nice to have something more like “typed data object” that’s a little bit in the middle.
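The “single massive natural number” option is at least mechanically possible - pack a byte string into one natural, little-endian, the way Urbit’s atoms work. A sketch of that trick (my illustration, not a PLAN spec):

```python
# Packing a byte string into a single natural number, little-endian,
# in the style of Urbit's atoms. Illustration only - PLAN may or may
# not bless this particular convention.

def bytes_to_nat(bs: bytes) -> int:
    return int.from_bytes(bs, "little")

def nat_to_bytes(n: int) -> bytes:
    # Note: this naive decoder drops trailing NUL bytes, so a real
    # encoding needs an explicit length somewhere.
    return n.to_bytes((n.bit_length() + 7) // 8, "little")

n = bytes_to_nat(b"hello")
assert nat_to_bytes(n) == b"hello"
# An "array of strings" still needs structure on top of this - e.g.
# a nested tree of such naturals built out of application nodes.
```

Which is my discomfort in a nutshell: the trick works, but the structure lives entirely in convention, invisible to the VM itself.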
there is no scoping, no symbols and no opaque terms.
Hmm. Let me unpick this argument, because I agree strongly with some of what you state here, but query some of it.
What is this #<procedure> thing? If Lisp was fully reflective you would expect functions be encoded as a bunch of cons cells, no?
So: Yes, in principle, I very strongly agree that non-cons objects like “compiled procedures” and “vectors”, which exist in a lot of modern Lisps, are extremely poor, and we can do much better. Yes, non-cons things can’t be sent across a network interface, and even conses need a lot of preprocessing to avoid loops and/or express the distinction between two conses with the same contents but different ids, and that’s very bad. Yes, I’d like a Lisp-like thing that was built entirely out of conses (or something similarly small) and took that restriction seriously.
However: There does exist at least one modern Lisp that does have this quality! PicoLisp really does build absolutely everything out of cons cells. Everything. Including symbols and procedures. So, it is possible. PicoLisp does a number of odd things that I don’t like (it’s dynamically bound, for instance, and extremely not-immutable). But it’s an existence proof that there could be a Lisp family tree that really stuck close to “cons is everything”.
You can’t write a function value (or better yet, a partially applied function value), send it across a socket and have the other side execute it.
Yes, in fact in PicoLisp you can do exactly this, and that’s how its HTML server and GUI library work. Probably this causes heaps of security problems, but yep. At least I think so from glancing at the manual - I’ve never actually tried to do GUI coding in it. It’s a very odd and very retro language in a lot of ways.
We have changed what is essentially a global variable and have changed how the function executes.
Hmm. Yeah, (define) is definitely something that doesn’t fit with the immutable concept, and although it is occasionally very nice to be able to rebind global variables - and the Smalltalk approach is all about late binding and rebinding of names everywhere - I agree that we shouldn’t have that if we’re doing the pure-functional thing.
Vector doesn’t have a canonical form as a list if you try to inspect them.
Yep, I don’t like this, and PicoLisp “solves” this - again in its odd way - by simply not having vectors. If that’s too much of a downside, then I’d argue that another way to solve it in a Lisplike manner is to extend the list serialization syntax with something like “#” - but as an actual magic symbol that exists inside a cons and so can be reflected on at runtime, not as a raw syntax thing. This was the basis of an idea I was playing with a decade or so ago, which I’ve never developed into an actual language, because language development is hard and I’m lazy. I still think it’s cool though.
PLAN, by comparison, encodes function application with arities on top of binary trees of natural numbers.
Hmm. So “a binary tree of natural numbers” sounds a little like a cons cell, but in fact isn’t, quite? How exactly is this binary tree represented at a hardware level? Are there well-known encodings specified for things like, eg, integers, floats, and strings? Or are the encodings to and from natural numbers entirely up to each VM, in which case, how will integers, floats and strings be transferred between VMs?
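Here’s my best guess at what that phrase means, sketched so you can correct it: every value is either a natural (a leaf) or an application node with two children, and a runtime finds the function by walking the left spine. Entirely my reading, not the PLAN spec:

```python
# My reading of "function application with arities on top of binary
# trees of natural numbers": a value is either a natural (leaf) or an
# application node (two children). A cons pair would then have to be
# encoded as a partially applied function, not as a primitive.
# Illustrative only; PLAN's actual representation may differ.

from dataclasses import dataclass
from typing import Union

Value = Union[int, "App"]

@dataclass(frozen=True)
class App:
    head: Value   # function (or further application) being applied
    arg: Value    # the argument

# (0 1 2) as a left-nested binary tree: App(App(0, 1), 2)
tree = App(App(0, 1), 2)

def arity_spine(v: Value):
    """Walk down the head spine, collecting arguments - how a runtime
    finds the function at the head and counts the args applied so far."""
    args = []
    while isinstance(v, App):
        args.append(v.arg)
        v = v.head
    return v, list(reversed(args))

assert arity_spine(tree) == (0, [1, 2])
```

If that’s roughly right, then my cons-cell intuition half applies: an App node is shaped like a cons, but its meaning (application, with arity saturation) is baked in rather than left to the programmer.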
When you suggest encoding a closure with variables encoded in a let block, I presume you mean as a quoted piece of code?
I’m assuming that the VM has more access to what’s inside a function than the user-level programmer does, yes, and that a function has something like a symbol or variable table, and that therefore the VM can add a header to the function that’s the equivalent of “bind this value to this variable”. That’s the thing that I would think of as being roughly equivalent to a “let” declaration that the user could write around code they had access to.
How this is done in Lisps (or Scheme) was quite a chunk of historical work as I understand it - and what led to the use of the term “closure” to describe “functions that carry their environment, or a reference to their enclosing environment, with them, as opposed to old-school dynamically-scoped Lisp 1.5 functions that don’t” so it’s maybe not always an easy thing to get right.
But I imagine your reified “application” objects/terms accomplish the same thing, without needing a symbol table.
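To spell out what I mean by “accomplish the same thing”: the classic move is lambda lifting - hoist the inner function to the top level with an extra parameter per captured variable, then represent the closure as a partial application. Python’s `functools.partial` stands in for the reified application object here; this is my sketch of the general technique, not necessarily how PLAN’s compiler works:

```python
# Closures without a symbol table, via lambda lifting: the inner
# function becomes a top-level function ("law") with an extra
# parameter for each captured variable, and the closure itself is
# just a partial application holding the captured values.
# General technique, sketched - not necessarily PLAN's exact scheme.

from functools import partial

# Source-level:  adder = \x. \y. x + y   (the inner lambda captures x)
# Lifted form:   one top-level 2-argument law, plus a partial app.

def add_law(x, y):          # no free variables: a legal "law"
    return x + y

def adder(x):
    # The "closure" is an application object carrying x along.
    return partial(add_law, x)

add3 = adder(3)
assert add3(4) == 7
assert add3.args == (3,)    # the captured value is inspectable,
                            # hence serializable - no hidden state
```

That last assertion is the payoff: because the environment lives in a plain data structure rather than a compiler-private frame, the partially applied function can be sent over a wire.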
Since deconstructing an actual procedure is not possible, you can’t turn those into serializable values.
Well, but deconstructing a procedure and serializing it could be very possible if you want it to be! There’s no law of VM design that says you must throw all this valuable data away when you compile or execute a function. If nothing else, you could just have the VM attach an AST of the source code or environment of every function to the compiled form of it (cost: one pointer in RAM per function, plus of course the tokenised source code itself - but we have large RAMs these days so why not?) Being able to have all this metadata on hand is perhaps hard to imagine outside of a VM that implements a Lisp cons-cell environment, but almost impossible not to imagine if you do have such a VM. Then, the question becomes not “can” we do this but “should” we - how much reflection about source code and debugging/meta-information should we attach to the runtime form of objects, how could this information be abused by bad code, how do we regulate access to it, etc.
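As a proof of the “cost: one pointer” claim, here’s the trick in Python: compile a function but hang its AST off the function object, so the source can be regenerated and recompiled later. The `__ast__` attribute name is my own invention for illustration:

```python
# Keeping the source attached to the compiled form: compile a
# function, then hang its AST off the function object (one extra
# reference), so it can later be inspected, serialized, or rebuilt.
# The __ast__ attribute name is invented here for illustration.

import ast

src = "def double(n):\n    return n * 2\n"
tree = ast.parse(src)

namespace = {}
exec(compile(tree, "<mem>", "exec"), namespace)
double = namespace["double"]
double.__ast__ = tree          # attach the metadata: one pointer

assert double(21) == 42
# Round-trip: regenerate source from the attached AST and recompile.
src2 = ast.unparse(double.__ast__)
ns2 = {}
exec(src2, ns2)
assert ns2["double"](21) == 42
```

Which is exactly the “should we, not can we” trade-off: the same attached AST that enables serialization also hands every caller the function’s internals.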