I think that the ability to understand a system is a necessary prerequisite to the ability to change it.
One methodology is denotational semantics, attaching meanings to terms.
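As a toy illustration of what "attaching meanings to terms" looks like when done compositionally (my own made-up mini-language, just to fix ideas):

```python
# Minimal sketch of a denotational semantics for a tiny expression language.
# The "meaning" of a term is computed compositionally from the meanings of its parts.
# (Toy example for illustration; not from any particular system discussed here.)

from fractions import Fraction

def meaning(term):
    """Map a term (nested tuples) to its denotation, a rational number."""
    tag = term[0]
    if tag == "num":                      # ("num", 3) denotes the number 3
        return Fraction(term[1])
    if tag == "add":                      # ("add", t1, t2) denotes [[t1]] + [[t2]]
        return meaning(term[1]) + meaning(term[2])
    if tag == "mul":                      # ("mul", t1, t2) denotes [[t1]] * [[t2]]
        return meaning(term[1]) * meaning(term[2])
    raise ValueError(f"unknown term: {term!r}")

# (1 + 2) * 3  denotes  9
print(meaning(("mul", ("add", ("num", 1), ("num", 2)), ("num", 3))))
```

The meaning of a compound term is built only from the meanings of its parts; that compositionality is the whole trick.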
I think that you are doing the same with Leibniz, @khinsen.
The overall idea is the same, but Leibniz attaches meanings to terms informally, in prose, whereas denotational semantics maps code to some pre-existing mathematical formalism. For the use cases I have in mind, there is no such pre-existing formalism, so traditional denotational semantics is not an option.
I'd say there are two situations in which one can safely change a system:
1. Fully understanding it - that's for formal systems.
2. For informal complex systems, exploring the system's reaction to perturbations, starting with very small ones.
In the real world, method 2 is the only applicable one.
Whether it is mathematical formalism or prose is not relevant. The important thing is that it reduces a complex process / data structure into something easier to understand.
The interesting thing in this discussion is the informal representation of Q^2 as lines, meaning the pictures of the lines, what we see with our eyes. We don't need to have symbols, only pictures.
And then we can use a system like Dynamicland to play with those pictures. We can put two lines together to construct a new one.
My point is that compositionality of software should follow the compositionality of meanings; then we can construct computable systems that do not even require symbols.
Then everyone would be able to construct new computations.
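Here is a hedged sketch of one way to read "putting two lines together to construct a new one": treat a line over Q as a function y = m*x + b, and let combining two lines mean composing them as functions, which again yields a line. The names and the choice of composition are mine, just to make the idea concrete:

```python
# A hedged sketch of "composing lines" over Q (the rationals).
# One possible reading of "putting two lines together to construct a new one":
# treat a line y = m*x + b as a function, and combine two lines by composing
# them as functions. The composite is again a line, so the composition of
# meanings stays in the same picture-able domain. (My interpretation only.)

from fractions import Fraction
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    m: Fraction   # slope
    b: Fraction   # intercept

    def __call__(self, x: Fraction) -> Fraction:
        return self.m * x + self.b

    def compose(self, other: "Line") -> "Line":
        # (self after other)(x) = m1*(m2*x + b2) + b1  -> still a line
        return Line(self.m * other.m, self.m * other.b + self.b)

l1 = Line(Fraction(1, 2), Fraction(3))
l2 = Line(Fraction(2), Fraction(-1))
l3 = l1.compose(l2)
print(l3)               # Line(m=Fraction(1, 1), b=Fraction(5, 2))
print(l3(Fraction(4)))  # 13/2
```

A Dynamicland-style system could show only the pictures of these lines, while a structure like this does the bookkeeping underneath.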
Both 1 and 2 are necessary.
Formal systems do not necessitate full understanding of a physical process. There are levels of precision and it is up to us to decide which one to use.
I feel like the useful part of formal systems is being able to state with confidence what a system will not do when you make small changes to it. It is not enough to just throw up our hands and say, "ennh, predicting the behaviour of a complex system is hard, software is a complex system, therefore, no-one can predict what any software will do, therefore any kind of behaviour is ok. The Halting Problem, lulz, QED."
For example, we must have software systems that we can play with while being certain that no change we make at the keyboard will accidentally introduce malware that will 1) erase/corrupt our entire hard drive or 2) drain our bank account.
Without some kind of formal methods, we do NOT have these guarantees from most software, not even minimally so.
Remember, weāre now constantly plugged into an Internet of 20 billion computers, each sending thousands to millions of packets a second. Some possibly quite large fraction of those computers and packets are actively infested with malware that is deliberately trying to infect everyone.
We only need our software to be missing a few decimal points of non-bugginess in order for either of these negative outcomes (1 or 2) to happen to us with probability approaching 1.
Faced with this constant storm of evil packets that all our computers have to process, it is not really good enough to say "ennh, just fiddle with your software until it works". Nope. Even if your software is 99.0% safe from security bugs, that means 1 in 100 evil packets will get through. You are going to be rooted - and your entire life destroyed - within minutes.
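To put rough numbers on "probability approaching 1" (illustrative assumptions only: each hostile packet is treated as an independent trial that gets through with probability p):

```python
# Back-of-envelope sketch of the "probability approaching 1" claim.
# Assume each hostile packet independently gets past your defences with
# probability p; after n packets the chance of at least one compromise is
# 1 - (1 - p)^n. Numbers below are illustrative assumptions, not measurements.

p = 0.01            # "99.0% safe": 1 in 100 evil packets gets through
for n in (10, 100, 1_000, 10_000):
    print(n, 1 - (1 - p) ** n)
# 10     ~0.096
# 100    ~0.634
# 1000   ~0.99996
# 10000  ~1.0 (to within floating point)
```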
To just be minimally safe on today's Internet - which in 2023 is literally, not figuratively, an active cyberwar battlefield, involving multiple countries which are at literal physical shooting war with each other, and the drones flying literal bombs are Internet-connected - we need something like formal methods to prove what our personal software on our personal machines will not do in response to all of those hostile packets. Otherwise, we need to physically unplug our personal machines.
We do need formal methods to become a lot easier to use than they currently are, though.
Also, I think formal methods need to be augmented from the low-level side: with virtual machines that severely constrain what kind of unpredictable behaviours the system can do. For example, we shouldn't really be using raw C or Forth style unmanaged memory access at all, ever, on any machine that processes packets from the Internet; that's the software equivalent of walking blindfolded down a highway with headphones on. You are going to get hit by a truck (malware) if you do that. No ifs or buts. It is going to happen. So don't do that. Yes, formal methods and type theory applied to sufficiently complex compilation could "prove" that unmanaged memory access is "safe", in some very limited contexts, but I'd feel much safer if there was also a virtual machine or memory manager enforcing those memory safety properties in a brute-force manner as well.
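A tiny sketch of what I mean by brute-force enforcement, where every load and store goes through a checked memory object (a real VM would do this in its interpreter or JIT, not in user code; this is just the shape of the idea):

```python
# Minimal sketch of "brute-force" runtime memory safety: every load and store
# goes through a checked VM-level memory object, so out-of-bounds access fails
# loudly instead of silently corrupting memory. (Illustrative sketch only.)

class CheckedMemory:
    def __init__(self, size: int):
        self._cells = bytearray(size)

    def load(self, addr: int) -> int:
        if not 0 <= addr < len(self._cells):
            raise MemoryError(f"load out of bounds: {addr}")
        return self._cells[addr]

    def store(self, addr: int, value: int) -> None:
        if not 0 <= addr < len(self._cells):
            raise MemoryError(f"store out of bounds: {addr}")
        self._cells[addr] = value & 0xFF

mem = CheckedMemory(16)
mem.store(3, 42)
print(mem.load(3))   # 42
# mem.load(999)      # would raise MemoryError instead of reading someone else's data
```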
I agree with pretty much all of that. What I see as a main obstacle to improving real-life use of computers is extremism and radicalism among CS researchers and software developers, who turn questions of formal methods, and in particular type checking, into religious issues. You can adhere to the Smalltalk religion or to the Haskell religion, but you cannot build a software system with a safe core and malleability at the UI level.
It certainly does frustrate me how Software Engineering has splintered into multiple mutually unintelligible philosophies. A multi-front cold war between C ("Real programmers don't need no array bounds-checking, that stuff costs entire bytes and nanoseconds!!"), Haskell ("What part of 'a monoid in the category of endofunctors' don't you understand?") and Lisp/Smalltalk ("The truly enlightened masters just hack their operating system until it becomes their application, then ship their hard drive")… with sub-temples of each philosophy set up at Forth, C++, Java, Javascript, Python…
I find a lot to love in each philosophy, but I also kind of wish I could somehow get a language (or rather first, a programming philosophy) with the best features of all three approaches. I'd like a language and OS to be as malleable as possible (so I can change anything I want), as minimalist as possible (so I can hold it all in my head, because I can't change what I don't understand), and also to strictly enforce any logical laws I require it to (so once it's changed it stays changed and doesn't get changed by someone else).
Not entirely sure how to get there. I love Smalltalk's heavily reflective and context-dependent approach - to a point - but there's a point where "just send a message to an object and hope it doesn't do something utterly awful as a response" isn't as helpful as it might be. Microsoft discovered this when they plugged object-oriented scriptability into all of Microsoft Office in 1998, and spent the next 25 years chasing the swarm of macro viruses that came out. And, from the other end, trying to write a Smalltalk VM from scratch - compared to writing a Lisp or a Forth - isn't really a thing you can do, so it's not especially minimal.
I guess what I want is a very minimal OO engine, that doesn't store any state in each object, or almost none, and where methods strictly and provably obey contracts (which can be arbitrarily complex, perhaps defined by other objects)? I don't know if that's even possible, yet it seems to be something like what we need.
I've just been poking at Glamorous Toolkit, which makes heavy use of Pharo's "pragma" mechanism, and, seen out of the corner of the eye, a pragma sure looks a lot like a type declaration smuggled into Smalltalk. An arbitrarily complicated expression which describes something about a method. Pragmas aren't enforced by anything of course, but it doesn't seem like it would be impossible for them to be, if we wanted. Just write a DoesThisMethodActuallyObeyItsPragmas class maybe. It almost feels like if we built an entire programming language out of something like pragmas, we might have a sector of unexplored language space that might be interesting and useful to inhabit.
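A hedged sketch of what enforced pragmas might look like, using Python decorators as a stand-in for Smalltalk pragmas; the pragma decorator, the __pragmas__ attribute and the Account class are made up for illustration and are not Pharo's API:

```python
# A sketch of "pragmas that are actually enforced", using Python decorators
# as a stand-in for Smalltalk/Pharo pragmas. The pragma here is an arbitrary
# predicate over arguments and result; a checker wraps the method and verifies
# it on every call. (Hypothetical names; nothing here is Pharo's actual API.)

import functools

def pragma(check):
    """Attach a contract-like annotation to a method and enforce it at call time."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            result = method(*args, **kwargs)
            if not check(args, kwargs, result):
                raise AssertionError(f"{method.__name__} violated its pragma")
            return result
        wrapper.__pragmas__ = getattr(method, "__pragmas__", []) + [check]
        return wrapper
    return decorate

class Account:
    def __init__(self, balance):
        self.balance = balance

    @pragma(lambda args, kwargs, result: args[0].balance >= 0)
    def withdraw(self, amount):
        self.balance -= amount
        return self.balance

acct = Account(10)
print(acct.withdraw(4))   # 6
# acct.withdraw(100)      # would raise AssertionError: balance went negative
```

A DoesThisMethodActuallyObeyItsPragmas-style checker is then just something that walks methods and runs (or statically analyses) their attached predicates.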
A basic Smalltalk VM is not that complicated. There's a nice demonstration at GitHub - dbanay/Smalltalk ("By the Bluebook implementation of Smalltalk-80"), which is a C++ implementation of the Smalltalk-80 VM, written on the basis of the "Blue Book", the official definition of Smalltalk-80. Relatively compact and portable C++ code, a single dependency (SDL for graphics). Not bad.
What makes modern Smalltalk VMs complicated is JIT compilation for efficiency. I suspect that will be a bottleneck for any attempt at creating a simple foundation for large computing systems.
An example of a minimalist but type-safe foundation is Minimacy. I haven't used it, but I attended a seminar by its author. He's a software consultant and actually uses Minimacy to develop software for his clients. There's nothing revolutionary in Minimacy; it's tried-and-tested ideas carefully chosen for minimalism at each level. It's basically a minimalist version of OCaml.
Pragmas are a very interesting feature in Pharo. One use case is adding formal annotations on methods, which are often similar to type declarations, and are processed by special-purpose validators rather than some universal type checker. Another use case comes from Pharo's "code as a database" structure: it is straightforward and cheap to search for pragmas across the whole system, so they are used as labels for plug-ins, documentation, etc.
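For the "code as a database" use case, a rough Python analogue of querying annotations across the system might look like this (marked, __labels__ and Report are illustrative stand-ins, not Pharo's actual mechanism):

```python
# Sketch of the "code as a database" use of annotations: walk a set of classes
# and collect every method carrying a given marker, roughly the way Pharo tools
# query pragmas across the whole image. (Illustrative Python stand-ins only.)

import inspect

def marked(label):
    """Attach a pragma-like label to a method."""
    def decorate(method):
        method.__labels__ = getattr(method, "__labels__", set()) | {label}
        return method
    return decorate

class Report:
    @marked("documentation")
    def describe(self):
        return "renders a report"

    @marked("plugin")
    def export_csv(self):
        return "exports the report as CSV"

def methods_with_label(label, *classes):
    """Yield (class name, method name) for every method carrying the label."""
    for cls in classes:
        for name, member in inspect.getmembers(cls, callable):
            if label in getattr(member, "__labels__", set()):
                yield cls.__name__, name

print(list(methods_with_label("plugin", Report)))   # [('Report', 'export_csv')]
```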
Does Minimacy provide ways to sandbox programs? So that the computer owner can restrict the capabilities available to them?
No, but you can run the entire system in a sandbox. A Linux container, for example.
Except a container is an extremely poor sandbox by @natecull's definition above.
Agreed. But then, all of today's sandboxes are poor by that standard.
The most convincing (in theory at least) approach I know of is object capabilities. I haven't used it yet, except for some minor tinkering with Newspeak a while ago.
Yeah, totally. A new system like Minimacy has a huge opportunity here to differentiate itself, using either an effect system or object capabilities.
I spent some time a year ago trying to decide whether to try to build a whole new programming language + platform with sandboxing as a core design goal. But I'm not quite there yet.
Minimacy is designed for single developers writing stand-alone apps for small-scale deployment. Sandboxing within such a system doesn't make much sense, and would add a lot of complexity.
I agree that a foundational layer with good sandboxing would be nice to have. But one problem it would face is starting from a blank slate: almost no existing code could be used with it, except for purely calculational stuff.
I believe there are some object-oriented and/or functional languages which do reasonably good sandboxing in a capability-security sense, though I can't remember which ones off the top of my head. They tend to be niche and experimental, though becoming less so with the sudden rise of interest in "reproducible builds" from the Cloud + Big Data crowd (where all the language development money is right now). The Guile Scheme variant used by Guix I think is heading this way.
The main feature of these languages is that they are "pure" object-oriented (or Actor, or functional) languages in that they don't have a global "System" object. If you want an object or method to have input-output, for example - or maybe even just to be able to create new objects - you have to pass a reference to that facility in as a parameter to it or its constructor.
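A small sketch of that capability-passing style (my own example, not any particular language's mechanism): the host hands a plugin exactly the facilities it should have, and nothing else is reachable, provided the language has no ambient authority to fall back on.

```python
# Sketch of capability-passing style: there is no ambient "System" object, so
# code can only do I/O through capabilities handed to it explicitly. A plugin
# given only a logger cannot touch the filesystem, however buggy or malicious
# it is, as long as nothing smuggles it a broader capability.
# (Illustrative design sketch, not an existing language feature.)

class ConsoleLog:
    def write(self, text: str) -> None:
        print(text)

def run_plugin(plugin, log):
    # The host decides which capabilities to hand over; nothing else should be reachable.
    plugin(log)

def friendly_plugin(log):
    log.write("hello from the sandboxed plugin")
    # open("/etc/passwd")  # only impossible by construction if the language has
    #                      # no ambient authority; Python itself does not enforce
    #                      # this, which is exactly the problem described below.

run_plugin(friendly_plugin, ConsoleLog())
```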
I'd like to see more of these kinds of theoretically unlimited, but restricted-by-default languages. The current crop of "scripting" languages that we have (Node.js, Python etc) are absolutely insane in the amount of terrifyingly powerful, entire-system-destroying capabilities that they hand to arbitrary code. That makes them pretty poor choices for, e.g., computational notebooks, or games platforms - or Web browsers - where the whole point is running and sharing arbitrary code snippets off the Internet. Javascript in a web browser gives out far too many permissions, which is why the last 20 years or so of Web browser development has been about bolting complex and centralizing lockdowns on top of this insecure-by-default foundation.
What you have seen about Guile is probably Spritely Goblins. Guile on its own is a pretty standard Scheme implementation.
It's interesting to see you mention computational notebooks. They are very fashionable in my environment (computational biophysics in academia), but nobody seems to consider their security implications. It's perhaps even more surprising that notebook-based attacks are so far unheard of. One possible explanation is that it's actually hard work to get a downloaded notebook to run on one's own computer, because of obscure dependencies that need to be installed first. Security by obscurity rediscovered.
In my use case of Pharo-powered computational notebooks for civic tech (in grassroots innovation/empowerment and information, archival and library sciences), I have not implemented any Goblins-like capabilities, sandboxing or dependency management, beyond declaring Monticello packages at the beginning of the data narrative. See for example this data narrative draft on approaching/demystifying AI ("A" for "Apparent") by building a simple chatbot. Maybe dependency management can be us
But I wonder if capabilities confined to the image or restricted folder writing/reading could be implemented. Something like allowing the FileLocator class to read only restricted subfolders of the user (some subfolder of ~/Documents or the image subfolder). For the moment, I think that something like author signatures, as in GNU/Linux distros, could work, with keychains of trusted users that provide levels of trust to each other and, by proxy, to the computational notebook reader/explorer (it has worked pretty well for Linux/Unix package managers).
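As a sketch of the kind of restricted, read-only file capability described above (hypothetical helper, not an existing Pharo or Python API; a notebook would receive this object instead of a full FileLocator):

```python
# Sketch of a read-only, root-confined file capability: the notebook gets this
# object instead of general filesystem access, and can only read files
# underneath the allowed folder. (Hypothetical helper for illustration.)

from pathlib import Path

class ReadOnlyFolder:
    def __init__(self, root: Path):
        self._root = root.resolve()

    def read_text(self, relative: str) -> str:
        target = (self._root / relative).resolve()
        # Refuse anything that escapes the allowed root (e.g. via "..").
        if self._root not in target.parents and target != self._root:
            raise PermissionError(f"{relative} is outside {self._root}")
        return target.read_text()

docs = ReadOnlyFolder(Path.home() / "Documents" / "notebook-data")
# docs.read_text("results.csv")        # allowed, if the file exists
# docs.read_text("../../.ssh/id_rsa")  # raises PermissionError
```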
My temporary approach to social "trust" is having a readable format that you can share on the web, where you can see activity from other notebook readers and the evolution of the data narratives (which are hosted in Fossil, including data narrative history). For example, this is a Hypothesis-powered annotated data narrative, done by learners during my classes and workshops. It is not author-signed, but it showcases readers doing things with the code there and making comments. This allows me to focus my effort and limited resources elsewhere, while providing a social mechanism to know if a data story has been used/trusted.
Combining social and computational mechanisms for trusted computing seems the most organic/progressive approach to this problem. In our case, we're combining the reading of annotated data stories with data narrative history as a way to do it… a combined practice that I have not seen deployed a lot in other places. For example, Jupyter notebooks are pretty human-hostile because of the choice of JSON as a storage format, and most data narratives include, at best, their history in Git, but not the social context in which such data stories are read/explored collectively.
While I agree that combining social and technical trust mechanisms is the most promising direction, it's also badly supported by existing technology. You mentioned Jupyter, but Git isn't particularly human-friendly either. And if you want people to PGP-sign their commits, it gets a lot worse. There is a lot of experimentation going on around decentralized identities. I hope some of this will end up being usable by non-techies.