I think that the ability to understand a system is a necessary prerequisite to the ability to change it.
One methodology is denotational semantics, attaching meanings to terms.
I think, @khinsen, that you are doing the same with Leibniz.
The overall idea is the same, but Leibniz attaches meanings to terms informally, in prose, whereas denotational semantics maps code to some pre-existing mathematical formalism. For the use cases I have in mind, there is no such pre-existing formalism, so traditional denotational semantics is not an option.
I’d say there are two situations in which one can safely change a system:
- Fully understanding it - that’s for formal systems
- For informal complex systems, explore the system’s reaction to perturbations, starting with very small ones.
In the real world, method 2 is the only applicable one.
Whether it is mathematical formalism or prose is not relevant. The important thing is that it reduces a complex process / data structure into something easier to understand.
The interesting thing in this discussion is the informal representation of Q^2 as lines: the pictures of the lines, what we see with our eyes. We don’t need to have symbols, only pictures.
And then we can use a system like dynamicland to play with those pictures. We can put two lines together, to construct a new one.
My point is that compositionality of software should follow the compositionality of meanings, then we can construct computable systems that do not even require symbols.
Then everyone would be able to construct new computations.
Both 1 and 2 are necessary.
Formal systems do not necessitate full understanding of a physical process. There are levels of precision and it is up to us to decide which one to use.
I feel like the useful part of formal systems is being able to state with confidence what a system will not do when you make small changes to it. It is not enough to just throw up our hands and say, “ennh, predicting the behaviour of a complex system is hard, software is a complex system, therefore, no-one can predict what any software will do, therefore any kind of behaviour is ok. The Halting Problem, lulz, QED.”
For example, we must have software systems that we can play with while being certain that no change we make at the keyboard will accidentally introduce malware that will 1) erase/corrupt our entire hard drive or 2) drain our bank account.
Without some kind of formal methods, we do NOT have these guarantees from most software, not even minimally so.
Remember, we’re now constantly plugged into an Internet of 20 billion computers, each sending thousands to millions of packets a second. Some possibly quite large fraction of those computers and packets are actively infested with malware that is deliberately trying to infect everyone.
We only need our software to be missing a few decimal points of non-bugginess for either of these negative outcomes, 1 or 2, to happen to us with probability approaching 1.
Faced with this constant storm of evil packets that all our computers have to process, it is not really good enough to say “ennh, just fiddle with your software until it works”. Nope. Even if your software is 99.0% safe from security bugs, that means 1 in every 100 evil packets will get through. You are going to be rooted - and your entire life destroyed - within minutes.
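The arithmetic behind “probability approaching 1” is easy to check. A minimal sketch (the per-packet safety figures and packet counts here are illustrative, not measurements):

```python
# Probability that at least one hostile packet gets through, given a
# per-packet chance of being handled safely. Figures are illustrative.
def p_compromise(p_safe_per_packet, n_packets):
    # The system stays clean only if *every* packet is handled safely.
    return 1 - p_safe_per_packet ** n_packets

# 99.0% safe per packet, after only 1000 hostile packets:
print(p_compromise(0.99, 1000))       # ~0.99996: compromise is near-certain

# Even 99.999% per-packet safety erodes over a million packets:
print(p_compromise(0.99999, 1_000_000))
```

The point being that per-event safety compounds: anything short of formally guaranteed behaviour decays to near-certain compromise at Internet packet rates.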
To just be minimally safe on today’s Internet - which in 2023 is literally, not figuratively, an active cyberwar battlefield, involving multiple countries which are at literal physical shooting war with each other, and the drones flying literal bombs are Internet-connected - we need something like formal methods to prove what our personal software on our personal machines will not do in response to all of those hostile packets. Otherwise, we need to physically unplug our personal machines.
We do need formal methods to become a lot easier to use than they currently are, though.
Also, I think formal methods need to be augmented from the low-level side: with virtual machines that severely constrain what kind of unpredictable behaviours the system can do. For example, we shouldn’t really be using raw C or Forth style unmanaged memory access at all, ever, on any machine that processes packets from the Internet; that’s the software equivalent of walking blindfolded down a highway with headphones on. You are going to get hit by a truck (malware) if you do that. No ifs or buts. It is going to happen. So don’t do that. Yes, formal methods and type theory applied to sufficiently complex compilation could “prove” that unmanaged memory access is “safe”, in some very limited contexts, but, I’d feel much safer if there was also a virtual machine or memory manager enforcing those memory safety properties in a brute-force manner as well.
I agree with pretty much all of that. What I see as a main obstacle to improving real-life use of computers is extremism and radicalism among CS researchers and software developers, who turn questions of formal methods, and in particular type checking, into religious issues. You can adhere to the Smalltalk religion or to the Haskell religion, but you cannot build a software system with a safe core and malleability at the UI level.
I find a lot to love in each philosophy, but I also kind of wish I could somehow get a language (or rather first, a programming philosophy) with the best features of all three approaches. I’d like a language and OS to be as malleable as possible (so I can change anything I want), as minimalist as possible (so I can hold it all in my head, because I can’t change what I don’t understand), and also to strictly enforce any logical laws I require it to (so once it’s changed it stays changed and doesn’t get changed by someone else).
Not entirely sure how to get there. I love Smalltalk’s heavily reflective and context-dependent approach - to a point - but there’s a point where “just send a message to an object and hope it doesn’t do something utterly awful as a response” isn’t as helpful as it might be. Microsoft discovered this when they plugged object-oriented scriptability into all of Microsoft Office in 1998, and spent the next 25 years chasing the swarm of macro viruses that came out. And, from the other end, trying to write a Smalltalk VM from scratch - compared to writing a Lisp or a Forth - isn’t really a thing you can do, so it’s not especially minimal.
I guess what I want is a very minimal OO engine, that doesn’t store any state in each object, or almost none, and where methods strictly and provably obey contracts (which can be arbitrarily complex, perhaps defined by other objects)? I don’t know if that’s even possible, yet it seems to be something like what we need.
I’ve just been poking at Glamorous Toolkit, which makes heavy use of Pharo’s “pragma” mechanism, and, seen out of the corner of the eye, a pragma sure looks a lot like a type declaration smuggled into Smalltalk. An arbitrarily complicated expression which describes something about a method. Pragmas aren’t enforced by anything of course, but it doesn’t seem like it would be impossible for them to be, if we wanted. Just write a DoesThisMethodActuallyObeyItsPragmas class maybe. It almost feels like if we built an entire programming language out of something like pragmas, we might have a sector of unexplored language space that might be interesting and useful to inhabit.
A basic Smalltalk VM is not that complicated. There’s a nice demonstration in dbanay/Smalltalk on GitHub, a C++ implementation of the Smalltalk-80 VM written on the basis of the “Blue Book”, which is the official definition of Smalltalk-80. Relatively compact and portable C++ code, a single dependency (SDL for graphics). Not bad.
What makes modern Smalltalk VMs complicated is JIT compilation for efficiency. I suspect that will be a bottleneck for any attempt at creating a simple foundation for large computing systems.
An example of a minimalist but type-safe foundation is Minimacy. I haven’t used it, but I attended a seminar by its author. He’s a software consultant and actually uses Minimacy to develop software for his clients. There’s nothing revolutionary in Minimacy, it’s tried-and-tested ideas carefully chosen for minimalism at each level. It’s basically a minimalist version of OCaml.
Pragmas are a very interesting feature in Pharo. One use case is adding formal annotations on methods, which are often similar to type declarations, and are processed by special-purpose validators rather than some universal type checker. Another use case comes from Pharo’s “code as a database” structure: it is straightforward and cheap to search for pragmas across the whole system, so they are used as labels for plug-ins, documentation, etc.
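Something analogous can be faked in languages with decorators. A Python sketch (the `pragma` decorator and `PRAGMAS` registry are my invention, loosely imitating Pharo’s searchable pragmas, not any real API):

```python
# Sketch of Pharo-style pragmas: lightweight annotations attached to
# methods, cheaply searchable across the whole system.
PRAGMAS = []   # global registry playing the role of "code as a database"

def pragma(name, **args):
    def attach(fn):
        PRAGMAS.append((name, args, fn))
        return fn
    return attach

class Report:
    @pragma('menuItem', label='Export as CSV')
    def export_csv(self):
        return "csv"

    @pragma('menuItem', label='Export as JSON')
    def export_json(self):
        return "json"

    @pragma('deprecated', since='1.2')
    def old_export(self):
        return "xml"

def find_pragmas(name):
    """System-wide search, like Pharo's pragma queries."""
    return [(args, fn) for (n, args, fn) in PRAGMAS if n == name]

for args, fn in find_pragmas('menuItem'):
    print(args['label'], '->', fn.__name__)
```

As in Pharo, nothing enforces the annotations; a validator would just be another consumer of `find_pragmas`.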
Does Minimacy provide ways to sandbox programs? So that the computer owner can restrict the capabilities available to them?
No, but you can run the entire system in a sandbox. A Linux container, for example.
Except a container is an extremely poor sandbox by @natecull’s definition above
Agreed. But then, all of today’s sandboxes are poor by that standard.
The most convincing (in theory at least) approach I know of is object capabilities. I haven’t used it yet, except for some minor tinkering with Newspeak a while ago.
Yeah, totally. A new system like Minimacy has a huge opportunity here to differentiate itself, using either an effect system or object capabilities.
I spent some time a year ago trying to decide whether to build a whole new programming language + platform with sandboxing as a core design goal, but I’m not quite there yet.
Minimacy is designed for single developers writing stand-alone apps for small-scale deployment. Sandboxing within such a system doesn’t make much sense, and would add a lot of complexity.
I agree that a foundational layer with good sandboxing would be nice to have. But one problem it would face is starting from a blank slate: almost no existing code could be used with it, except for purely calculational stuff.
I believe there are some object-oriented and/or functional languages which do reasonably good sandboxing in a capability-security sense, though I can’t remember which ones off the top of my head. They tend to be niche and experimental, though becoming less so with the sudden rise of interest in “reproducible builds” from the Cloud + Big Data crowd (where all the language development money is right now). The Guile Scheme variant used by Guix I think is heading this way.
The main feature of these languages is that they are “pure” object oriented (or Actor, or functional) languages in that they don’t have a global “System” object. If you want an object or method to have input-output, for example - or maybe even just to be able to create new objects - you have to pass a reference to that facility in as a parameter to it or its constructor.
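In code, the discipline looks roughly like this. A Python sketch (class names are mine, and Python cannot actually enforce the absence of ambient authority; a real ocap language would):

```python
# Sketch of the capability-passing discipline: no global "System" object.
# An object can only do I/O through capabilities handed to it explicitly.
class WriteCapability:
    """Grants append-only access to one specific sink, nothing else."""
    def __init__(self, sink):
        self._sink = sink
    def write(self, line):
        self._sink.append(line)

class Greeter:
    def __init__(self, out):          # capability passed in, never looked up
        self._out = out
    def greet(self, name):
        self._out.write(f"hello, {name}")

log = []
greeter = Greeter(WriteCapability(log))  # the owner decides what Greeter may touch
greeter.greet("world")
print(log)    # ['hello, world']
```

Since `Greeter` holds no reference to anything but the capability it was given, the worst it can do is append lines to that one log; that is the whole security argument, made structural.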
What you have seen about Guile is probably Spritely Goblins. Guile on its own is a pretty standard Scheme implementation.
It’s interesting to see you mention computational notebooks. They are very fashionable in my environment (computational biophysics in academia), but nobody seems to consider their security implications. It’s perhaps even more surprising that notebook-based attacks are so far unheard of. One possible explanation is that it’s actually hard work to get a downloaded notebook to run on one’s own computer, because of obscure dependencies that need to be installed first. Security by obscurity rediscovered.
In my use case of Pharo-powered computational notebooks for civic tech (in grassroots innovation/empowerment and information, archival and library sciences), I have not implemented any Goblins-like capabilities, sandboxing or dependency management, beyond declaring Monticello packages at the beginning of the data narrative – see, for example, this data narrative draft on approaching/demystifying AI (“A” for “Apparent”) by building a simple chatbot. Maybe dependency management can be us
But I wonder if capabilities confined to the image, or restricted folder writing/reading, could be implemented. Something like allowing the FileLocator class read-only access to restricted subfolders of the user (some subfolder of ~/Documents, or the image subfolder). For the moment, I think that something like author signatures, as in GNU/Linux distros, could work, with keychains of trusted users that provide levels of trust to each other and, by proxy, to the computational notebook reader/explorer (it has worked pretty well for Linux/Unix package managers).
My temporary approach to social “trust” is having a readable format that you can share on the web, where you can see activity from other notebook readers and the evolution of the data narratives (which are hosted in Fossil, including data narrative history). For example, this is a Hypothesis-powered annotated data narrative, done by learners during my classes and workshops. It is not author-signed, but it showcases readers doing things with the code there and making comments. This allows me to focus my effort and limited resources elsewhere, while providing a social mechanism to know if a data story has been used/trusted.
Combining social and computational mechanisms for trusted computing seems the most organic/progressive approach to this problem. In our case, we’re combining annotated data story reading with data narrative history as a way to do it… a combined practice that I have not seen deployed much in other places. For example, Jupyter notebooks are pretty human-hostile because of the choice of JSON as a storage format, and most data narratives include, at best, their history in Git, but not the social context in which such data stories are read/explored collectively.
While I agree that combining social and technical trust mechanisms is the most promising direction, it’s also badly supported by existing technology. You mentioned Jupyter, but Git isn’t particularly human-friendly either. And if you want people to PGP-sign their commits, it gets a lot worse. There is a lot of experimentation going on around decentralized identities. I hope some of this will end up being usable by non-techies.