understandability => malleability

I think that the ability to understand a system is a necessary prerequisite to the ability to change it.

4 Likes

One methodology is denotational semantics: attaching meanings to terms.

I think that you, @khinsen, are doing the same with Leibniz.

1 Like

The overall idea is the same, but Leibniz attaches meanings to terms informally, in prose, whereas denotational semantics maps code to some pre-existing mathematical formalism. For the use cases I have in mind, there is no such pre-existing formalism, so traditional denotational semantics is not an option.

1 Like

I'd say there are two situations in which one can safely change a system:

  1. Fully understanding it: that's for formal systems.
  2. For informal complex systems: exploring the system's reaction to perturbations, starting with very small ones.

In the real world, method 2 is the only applicable one.

2 Likes

Whether it is a mathematical formalism or prose is not relevant. The important thing is that it reduces a complex process or data structure to something easier to understand.

affine functions <=> Q^2

(An affine function f(x) = ax + b over the rationals is determined by the pair (a, b), hence the correspondence with Q^2.) The interesting thing in this discussion is the informal representation of Q^2 as lines, meaning the pictures of the lines, what we see with our eyes. We don't need to have symbols, only pictures.
And then we can use a system like Dynamicland to play with those pictures. We can put two lines together to construct a new one.

My point is that the compositionality of software should follow the compositionality of meanings; then we can construct computable systems that do not even require symbols.
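To make that concrete, here is a minimal sketch in Python (the class and names are made up for illustration, not a proposal for an actual system): composing two "lines" is just an operation on their (a, b) representations in Q^2, so the software composition mirrors the composition of meanings.

```python
from fractions import Fraction

class Affine:
    """An affine function f(x) = a*x + b, i.e. a 'line', as a pair (a, b) in Q^2."""
    def __init__(self, a, b):
        self.a, self.b = Fraction(a), Fraction(b)

    def __call__(self, x):
        return self.a * x + self.b

    def compose(self, other):
        # f(g(x)) = a1*(a2*x + b2) + b1 = (a1*a2)*x + (a1*b2 + b1):
        # putting two lines together yields another line.
        return Affine(self.a * other.a, self.a * other.b + self.b)

f = Affine(2, 1)    # the line y = 2x + 1
g = Affine(3, -4)   # the line y = 3x - 4
h = f.compose(g)    # the line y = 6x - 7
assert h(5) == f(g(5)) == 23
```

Nothing here depends on the symbolic notation; a pictorial interface could manipulate the same pairs directly.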

Then everyone would be able to construct new computations.

1 Like

Both 1 and 2 are necessary.

Formal systems do not necessitate full understanding of a physical process. There are levels of precision, and it is up to us to decide which one to use.

1 Like

I feel like the useful part of formal systems is being able to state with confidence what a system will not do when you make small changes to it. It is not enough to just throw up our hands and say, "ennh, predicting the behaviour of a complex system is hard, software is a complex system, therefore no one can predict what any software will do, therefore any kind of behaviour is ok. The Halting Problem, lulz, QED."

For example, we must have software systems that we can play with while being certain that no change we make at the keyboard will accidentally introduce malware that will 1) erase or corrupt our entire hard drive or 2) drain our bank account.

Without some kind of formal methods, we do NOT have these guarantees from most software, not even minimally so.

Remember, we're now constantly plugged into an Internet of 20 billion computers, each sending thousands to millions of packets a second. A possibly quite large fraction of those computers and packets is actively infested with malware that is deliberately trying to infect everyone.

Our software only needs to be missing a few decimal points of non-bugginess for either of these negative outcomes, 1 or 2, to happen to us with probability approaching 1.

Faced with this constant storm of evil packets that all our computers have to process, it is not really good enough to say "ennh, just fiddle with your software until it works". Nope. Even if your software is 99.0% safe from security bugs, that means 1 in every 100 evil packets will get through. You are going to be rooted - and your entire life destroyed - within minutes.
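To put a rough number on "probability approaching 1" (a back-of-the-envelope sketch; the packet rate is an assumption for illustration): if each hostile packet independently gets through with probability p, the chance that at least one out of n succeeds is 1 - (1 - p)^n.

```python
# "99.0% safe" means p = 0.01 per hostile packet. Assume a modest rate of
# 1000 packets per second and compute the odds of at least one compromise:
p = 0.01
rate = 1000  # packets per second, assumed for illustration
for seconds in (1, 10, 60):
    n = rate * seconds
    print(f"{seconds:>3} s: {1 - (1 - p) ** n:.6f}")
# One second already gives ~0.99996; after a minute the result is
# indistinguishable from 1.
```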

To just be minimally safe on today's Internet - which in 2023 is literally, not figuratively, an active cyberwar battlefield, involving multiple countries which are at literal physical shooting war with each other, and where the drones carrying literal bombs are Internet-connected - we need something like formal methods to prove what our personal software on our personal machines will not do in response to all of those hostile packets. Otherwise, we need to physically unplug our personal machines.

We do need formal methods to become a lot easier to use than they currently are, though.

Also, I think formal methods need to be augmented from the low-level side: with virtual machines that severely constrain what kind of unpredictable behaviours the system can do. For example, we shouldn't really be using raw C- or Forth-style unmanaged memory access at all, ever, on any machine that processes packets from the Internet; that's the software equivalent of walking blindfolded down a highway with headphones on. You are going to get hit by a truck (malware) if you do that. No ifs or buts. It is going to happen. So don't do that. Yes, formal methods and type theory applied to sufficiently complex compilation could "prove" that unmanaged memory access is "safe", in some very limited contexts, but I'd feel much safer if there was also a virtual machine or memory manager enforcing those memory safety properties in a brute-force manner as well.
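For what it's worth, the brute-force part doesn't have to be sophisticated. A toy sketch (illustrative only, far simpler than any real memory manager) of what a managed runtime does on every access:

```python
# Every load and store is bounds-checked, so a stray write becomes a clean
# error instead of the silent corruption that malware exploits.
class CheckedMemory:
    def __init__(self, size):
        self._cells = [0] * size

    def load(self, addr):
        if not 0 <= addr < len(self._cells):
            raise IndexError(f"load out of bounds: {addr}")
        return self._cells[addr]

    def store(self, addr, value):
        if not 0 <= addr < len(self._cells):
            raise IndexError(f"store out of bounds: {addr}")
        self._cells[addr] = value

mem = CheckedMemory(16)
mem.store(3, 42)
assert mem.load(3) == 42
# mem.store(9999, 0)  # raises IndexError instead of trampling a neighbour
```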

3 Likes

I agree with pretty much all of that. What I see as a major obstacle to improving real-life use of computers is extremism and radicalism among CS researchers and software developers, who turn questions of formal methods, and in particular type checking, into religious issues. You can adhere to the Smalltalk religion or to the Haskell religion, but you cannot build a software system with a safe core and malleability at the UI level.

1 Like

It certainly does frustrate me how Software Engineering has splintered into multiple mutually unintelligible philosophies. A multi-front cold war between C ("Real programmers don't need no array bounds-checking, that stuff costs entire bytes and nanoseconds!!"), Haskell ("What part of 'a monoid in the category of endofunctors' don't you understand?") and Lisp/Smalltalk ("The truly enlightened masters just hack their operating system until it becomes their application, then ship their hard drive")... with sub-temples of each philosophy set up at Forth, C++, Java, Javascript, Python...

I find a lot to love in each philosophy, but I also kind of wish I could somehow get a language (or rather first, a programming philosophy) with the best features of all three approaches. I'd like a language and OS to be as malleable as possible (so I can change anything I want), as minimalist as possible (so I can hold it all in my head, because I can't change what I don't understand), and also to strictly enforce any logical laws I require it to (so once it's changed it stays changed and doesn't get changed by someone else).

Not entirely sure how to get there. I love Smalltalk's heavily reflective and context-dependent approach - to a point - but there's a point where "just send a message to an object and hope it doesn't do something utterly awful as a response" isn't as helpful as it might be. Microsoft discovered this when they plugged object-oriented scriptability into all of Microsoft Office in 1998, and spent the next 25 years chasing the swarm of macro viruses that came out. And, from the other end, trying to write a Smalltalk VM from scratch - compared to writing a Lisp or a Forth - isn't really a thing you can do, so it's not especially minimal.

I guess what I want is a very minimal OO engine, that doesn't store any state in each object, or almost none, and where methods strictly and provably obey contracts (which can be arbitrarily complex, perhaps defined by other objects)? I don't know if that's even possible, yet it seems to be something like what we need.

I've just been poking at Glamorous Toolkit, which makes heavy use of Pharo's "pragma" mechanism, and, seen out of the corner of the eye, a pragma sure looks a lot like a type declaration smuggled into Smalltalk. An arbitrarily complicated expression which describes something about a method. Pragmas aren't enforced by anything of course, but it doesn't seem like it would be impossible for them to be, if we wanted. Just write a DoesThisMethodActuallyObeyItsPragmas class maybe. It almost feels like if we built an entire programming language out of something like pragmas, we might have a sector of unexplored language space that might be interesting and useful to inhabit.
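A rough sketch of what enforcement might look like, transposed into Python for brevity (all names here are made up; this is not Pharo's actual pragma API): pragmas are attached to a method as plain data, and a separate validator - the DoesThisMethodActuallyObeyItsPragmas of the paragraph above - checks them at call time.

```python
def pragma(**claims):
    """Attach arbitrary claims to a method without enforcing anything."""
    def attach(fn):
        fn.__pragmas__ = claims
        return fn
    return attach

def obeys_its_pragmas(fn, *args):
    """A validator that actually checks declared pre/post conditions."""
    claims = getattr(fn, "__pragmas__", {})
    if "pre" in claims:
        assert claims["pre"](*args), f"{fn.__name__}: precondition violated"
    result = fn(*args)
    if "post" in claims:
        assert claims["post"](result), f"{fn.__name__}: postcondition violated"
    return result

@pragma(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def isqrt(x):
    return int(x ** 0.5)

obeys_its_pragmas(isqrt, 16)    # fine: returns 4
# obeys_its_pragmas(isqrt, -1)  # raises: precondition violated
```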

1 Like

A basic Smalltalk VM is not that complicated. There's a nice demonstration on GitHub, dbanay/Smalltalk, a C++ implementation of the Smalltalk-80 VM written on the basis of the "Blue Book", the official definition of Smalltalk-80. Relatively compact and portable C++ code, with a single dependency (SDL for graphics). Not bad.
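To give an idea of how small the core is, here is a toy sketch in Python (illustrative names, vastly simpler than the actual Blue Book design) of the message-dispatch loop such a VM is built around:

```python
# Everything is an object; computation is looking up a selector in the
# receiver's class chain and running the method found there.
class Obj:
    def __init__(self, klass, slots=None):
        self.klass, self.slots = klass, (slots or {})

class Class:
    def __init__(self, name, methods, superclass=None):
        self.name, self.methods, self.superclass = name, methods, superclass

def send(receiver, selector, *args):
    klass = receiver.klass
    while klass is not None:
        if selector in klass.methods:
            return klass.methods[selector](receiver, *args)
        klass = klass.superclass
    raise AttributeError(f"{receiver.klass.name} doesNotUnderstand: {selector}")

Point = Class("Point", {
    "x": lambda self: self.slots["x"],
    "+": lambda self, other: Obj(Point, {"x": send(self, "x") + send(other, "x")}),
})
p = send(Obj(Point, {"x": 1}), "+", Obj(Point, {"x": 2}))
assert send(p, "x") == 3
```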

What makes modern Smalltalk VMs complicated is JIT compilation for efficiency. I suspect that will be a bottleneck for any attempt at creating a simple foundation for large computing systems.

An example of a minimalist but type-safe foundation is Minimacy. I haven't used it, but I attended a seminar by its author. He's a software consultant and actually uses Minimacy to develop software for his clients. There's nothing revolutionary in Minimacy; it's tried-and-tested ideas carefully chosen for minimalism at each level. It's basically a minimalist version of OCaml.

Pragmas are a very interesting feature in Pharo. One use case is adding formal annotations on methods, which are often similar to type declarations, and are processed by special-purpose validators rather than some universal type checker. Another use case comes from Pharo's "code as a database" structure: it is straightforward and cheap to search for pragmas across the whole system, so they are used as labels for plug-ins, documentation, etc.

3 Likes

Does Minimacy provide ways to sandbox programs? So that the computer owner can restrict the capabilities available to them?

No, but you can run the entire system in a sandbox. A Linux container, for example.

Except a container is an extremely poor sandbox by @natecull's definition above :slight_smile:

Agreed. But then, all of today's sandboxes are poor by that standard.

The most convincing (in theory at least) approach I know of is object capabilities. I haven't used it yet, except for some minor tinkering with Newspeak a while ago.

2 Likes

Yeah, totally. A new system like Minimacy has a huge opportunity here to differentiate itself, using either an effect system or object capabilities.

I spent some time a year ago trying to decide whether to try to build a whole new programming language + platform with sandboxing as a core design goal, but I'm not quite there yet :slight_smile:

Minimacy is designed for single developers writing stand-alone apps for small-scale deployment. Sandboxing within such a system doesn't make much sense, and would add a lot of complexity.

I agree that a foundational layer with good sandboxing would be nice to have. But one problem it would face is starting from a blank slate: almost no existing code could be used with it, except for purely calculational stuff.

I believe there are some object-oriented and/or functional languages which do reasonably good sandboxing in a capability-security sense, though I can't remember which ones off the top of my head. They tend to be niche and experimental, though becoming less so with the sudden rise of interest in "reproducible builds" from the Cloud + Big Data crowd (where all the language development money is right now). The Guile Scheme variant used by Guix is, I think, heading this way.

The main feature of these languages is that they are "pure" object-oriented (or Actor, or functional) languages in that they don't have a global "System" object. If you want an object or method to have input/output, for example - or maybe even just to be able to create new objects - you have to pass a reference to that facility in as a parameter to it or its constructor.
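Sketched in Python (hypothetical names; real capability languages enforce this at the language level rather than by convention), the pattern looks like this:

```python
import tempfile
from pathlib import Path

class ReadOnlyDir:
    """A capability granting read access to one directory tree, nothing else."""
    def __init__(self, root):
        self._root = Path(root).resolve()

    def read(self, name):
        path = (self._root / name).resolve()
        if self._root not in path.parents:
            raise PermissionError(f"{name!r} is outside the granted directory")
        return path.read_text()

# No ambient authority: this function can only touch what it is handed.
def summarize(title, docs):
    return title + ": " + docs.read("report.txt")

with tempfile.TemporaryDirectory() as d:
    (Path(d) / "report.txt").write_text("all quiet")
    print(summarize("today", ReadOnlyDir(d)))    # ok
    # ReadOnlyDir(d).read("../../etc/passwd")    # raises PermissionError
```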

I'd like to see more of these kinds of theoretically unlimited, but restricted-by-default languages. The current crop of "scripting" languages that we have (Node.js, Python etc.) are absolutely insane in the amount of terrifyingly powerful, entire-system-destroying capabilities that they hand to arbitrary code. That makes them pretty poor choices for, e.g., computational notebooks, or games platforms - or Web browsers - where the whole point is running and sharing arbitrary code snippets off the Internet. Javascript in a web browser gives out far too many permissions, which is why the last 20 years or so of Web browser development has been about bolting complex and centralizing lockdowns on top of this insecure-by-default foundation.

1 Like

What you have seen about Guile is probably Spritely Goblins. Guile on its own is a pretty standard Scheme implementation.

It's interesting to see you mention computational notebooks. They are very fashionable in my environment (computational biophysics in academia), but nobody seems to consider their security implications. It's perhaps even more surprising that notebook-based attacks are so far unheard of. One possible explanation is that it's actually hard work to get a downloaded notebook to run on one's own computer, because of obscure dependencies that need to be installed first. Security by obscurity rediscovered.

1 Like

In my use case of Pharo-powered computational notebooks for civic tech (in grassroots innovation/empowerment and information, archival and library sciences), I have not implemented any Goblins-like capabilities, sandboxing, or dependency management beyond declaring Monticello packages at the beginning of the data narrative - see, for example, this data narrative draft on approaching/demystifying AI ("A" for "Apparent" :wink: ) by building a simple chatbot. Maybe dependency management can be us

But I wonder if capabilities confined to the image, or restricted folder reading/writing, could be implemented - something like allowing the FileLocator class to read only restricted subfolders of the user's space (some subfolder of ~/Documents, or the image subfolder). For the moment, I think that something like author signatures, as in GNU/Linux distros, could work, with keychains of trusted users that provide levels of trust to each other and, by proxy, to the computational notebook reader/explorer (this has worked pretty well for Linux/Unix package managers).
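Here is a sketch of the signature side (assuming the Python 'cryptography' package; the names and the flow are mine, not an existing Pharo facility): the reader keeps a keychain of trusted public keys and refuses to open anything that does not verify.

```python
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

# Author side: sign the notebook bytes once, publish the signature with it.
author_key = ed25519.Ed25519PrivateKey.generate()
notebook = b"... data narrative source ..."
signature = author_key.sign(notebook)

# Reader side: a keychain of trusted public keys, checked before opening.
trusted_keys = [author_key.public_key()]

def load_if_trusted(blob, sig):
    for key in trusted_keys:
        try:
            key.verify(sig, blob)
            return blob  # safe to open in the notebook reader
        except InvalidSignature:
            continue
    raise PermissionError("notebook is not signed by any trusted author")

load_if_trusted(notebook, signature)
```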

My temporary approach to social "trust" is having a readable format that you can share on the web, where you can see the activity of other notebook readers and the evolution of the data narratives (which are hosted in Fossil, including data narrative history). For example, this is a Hypothesis-powered annotated data narrative, done by learners during my classes and workshops. It is not author-signed, but it showcases readers doing things with the code there and making comments. This allows me to focus my effort and limited resources elsewhere, while providing a social mechanism to know whether a data story has been used/trusted.

Combining social and computational mechanisms for trusted computing seems the most organic/progressive approach to this problem. In our case, we're combining annotated data story reading with data narrative history as a way to do it... a combined practice that I have not seen deployed much in other places. For example, Jupyter notebooks are pretty human-hostile because of the choice of JSON as a storage format, and most data narratives at best include their history in Git, but not the social context where such data stories are read/explored collectively.

2 Likes

While I agree that combining social and technical trust mechanisms is the most promising direction, it's also badly supported by existing technology. You mentioned Jupyter, but Git isn't particularly human-friendly either. And if you want people to PGP-sign their commits, it gets a lot worse. There is a lot of experimentation going on around decentralized identities. I hope some of this will end up being usable by non-techies.