Locality, modular analysis

@khinsen and @akkartik mentioned locality in another thread, and I wanted to lift that up to its own topic.

I’ll repeat the quotes they shared from Richard P. Gabriel’s Patterns of Software:

The primary feature for easy maintenance is locality: Locality is that characteristic of source code that enables a programmer to understand that source by looking at only a small portion of it.

If I look at any small part of it, I can see what is going on—I don’t need to refer to other parts to understand what something is doing.

If I look at any large part in overview, I can see what is going on—I don’t need to know all the details to get it.

Every level of detail is as locally coherent and as well thought out as any other level.

This idea of locality feels related to the “modular analysis” a compiler may do if it supports separate compilation: the compiler considers one unit of work at a time and does not need to see the whole program. There’s an eventual stage (e.g. linking) that puts everything together, but the compiler at least does not need to think in whole program terms.
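To make the analogy concrete, here is a minimal Python sketch of modular analysis: each unit is checked against only the *interfaces* of its dependencies, never their bodies, and a final "link" step matches names up. All module and symbol names here are invented for illustration.

```python
# Sketch: "modular analysis" as checking one unit at a time.
# Each module is checked against the interfaces of its dependencies,
# never their bodies; a later "link" step only matches names up.
# All module/symbol names are made up for illustration.

modules = {
    "geometry": {"provides": {"area"}, "uses": set()},
    "report":   {"provides": {"render"}, "uses": {"area"}},
}

def check_module(name, interfaces):
    """Check a single module using only dependency interfaces."""
    missing = modules[name]["uses"] - interfaces
    return missing  # empty set means the module checks out locally

def link(order):
    """The whole-program step: accumulate interfaces in link order."""
    seen, errors = set(), {}
    for name in order:
        missing = check_module(name, seen)
        if missing:
            errors[name] = missing
        seen |= modules[name]["provides"]
    return errors

print(link(["geometry", "report"]))  # {} -- every use resolved
```

Note that `check_module` never looks at more than one module at a time; only `link` sees the whole program, and even it only sees interfaces.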


As a reader of programs (whether human or machine-as-a-compiler), I agree locality is great. I want to be able to understand behaviour by looking only at the part in front of me.

As a modifier of programs though, it’s less clear to me. For example, I would like to modify the behaviour of some intermediate library layer. One can imagine various ways (e.g. aspect-oriented programming) that would allow you to reach in and modify the library from the outside, but this would seem to violate locality for future readers (including yourself): you can no longer understand that library’s pieces in isolation, because there’s so-called “spooky action at a distance” to watch out for now.
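A tiny Python sketch of that "spooky action": the library module and its function below are stand-ins I invented, but the pattern (monkey-patching a library from outside) is real. After the patch, reading the library's own source no longer tells you what it does.

```python
# Sketch: modifying a library "from the outside" (monkey-patching).
# `pricing` stands in for some intermediate library layer; its source
# alone no longer explains what `total` actually returns.
import types

pricing = types.ModuleType("pricing")      # a stand-in library module
pricing.total = lambda items: sum(items)   # the "published" behaviour

def patched_total(items, _orig=pricing.total):
    return _orig(items) * 1.2              # quietly add 20% on top

pricing.total = patched_total              # spooky action at a distance

print(pricing.total([10, 20, 30]))         # 72.0, not the 60 the
                                           # library source suggests
```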

Of course, I could also fork the whole library and make my edits, but that’s a lot of work for a small tweak.

How can we support modifications and edits of the code we use while balancing locality as well?

3 Likes
  1. I really like how Kragen Sitaker put it:

    Often you can find Pareto improvements before you have to start weighing tradeoffs.

  2. I sense an implicit question of, what process or representation or tooling change will help us transcend this tradeoff? And I don’t think you can. A tradeoff is often unavoidable, an opportunity to exercise judgement.

    So I’d say it depends on the context. Tools for late-binding locality like LP, AOP and Lisp’s notion of advice can be very helpful in making a program extensible. But the author still has to exercise judgement to use them well.

    In general, the more powerful the mechanisms you have access to, the more important judgement becomes.

    “You thought, as a boy, that a mage is one who can do anything. So I thought, once. So did we all. And the truth is that as one’s power grows and knowledge widens, ever the way one can follow grows narrower: until at last one chooses nothing, but does only and wholly what one must do. . . .”

    – Ursula Le Guin, “A Wizard of Earthsea”

1 Like

Tools and representations can make an enormous difference.

Locality is topological. Everything one hop (or click, or …) away from the starting point is local. There can be as many hops as you want, though our mental capacity probably puts some limitations on the number. But it’s a lot more than the two directions you get in a linear text file.
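A small Python sketch of that topological reading of locality: "local" is everything within k hops of where you are standing, not k lines above or below. The call graph here is invented.

```python
# Sketch: "local" as everything within k hops of a starting point,
# rather than k lines away in a file. The call graph is invented.
from collections import deque

call_graph = {
    "draw":    ["layout", "paint"],
    "layout":  ["measure"],
    "paint":   ["blit"],
    "measure": [],
    "blit":    [],
}

def local_view(start, hops):
    """Breadth-first: everything reachable within `hops` clicks."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == hops:
            continue
        for nxt in call_graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

print(sorted(local_view("draw", 1)))  # ['draw', 'layout', 'paint']
```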

Glamorous Toolkit is the example to look at. Whenever it can determine the class of a message receiver (and that’s surprisingly often), it lets you expand method source code inline. So when you look at a piece of code, most methods being called are just one click away. Compared to old-school Smalltalk environments, that makes a huge difference.

Now imagine a language and code base designed for such tooling. I am sure there is a lot of room for improvement.

5 Likes

:light_bulb:! It hadn’t sunk in for me that what you see as close together could be topological.

1 Like

Staying with Smalltalk, its biggest problem with locality is the absence of any notion of locality above the class level. Tooling doesn’t help there, because there is no way to express any form of large-scale code architecture in the database that holds the code.

In file-based languages, you can often use the file structure to express some aspects of the code architecture. But that’s very limited.

I’d love to see someone explore the idea of “code as a database” in more detail. In particular “code as a graph database”, where different subgraphs express different levels of code structure. Is anyone aware of any such work?
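As a rough illustration of what "different subgraphs for different levels" might look like, here is a minimal Python sketch: code elements stored as a single edge list, with edge kinds projecting out the architectural, module, and behavioural subgraphs. All the names are invented.

```python
# Sketch: code stored as a graph, with edge kinds forming the
# subgraphs for different levels of structure. Names are invented.
edges = [
    ("app",     "contains", "ui"),        # architecture level
    ("app",     "contains", "storage"),
    ("ui",      "contains", "Button"),    # module level
    ("storage", "contains", "save"),
    ("Button",  "calls",    "save"),      # behaviour level
]

def subgraph(kind):
    """Project out one level of structure by edge kind."""
    return [(a, b) for a, k, b in edges if k == kind]

print(subgraph("contains"))
print(subgraph("calls"))   # [('Button', 'save')]
```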

2 Likes

Regarding locality, I’ve been using emacs-hyperbole to narrow distances with regard to syntax.

At its core is its ability to identify grammars and provide corresponding actions depending on the circumstances (which is easily reconfigurable to capture modes of behaviour and context).

While it is focused on PEGs (parsing expression grammars), it also has functionality for operating on regular expressions.
For example, defil provides the ability to launch a command based upon where the cursor is, using three different regular expressions (prior to the cursor; on the cursor; and beyond the cursor).
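The three-regex idea can be sketched outside of Hyperbole too. The following is a Python illustration (not Hyperbole's Elisp, and not its actual API): match text before the cursor, at the cursor, and after it, and fire an action when all three agree. The patterns and action names are invented.

```python
# Sketch (in Python, not Hyperbole's Elisp) of the three-regex idea:
# match text before the cursor, at the cursor, and after it, and
# dispatch an action when all three match. Patterns are invented.
import re

def action_at(line, col, before, at, after, action):
    """Fire `action` if the three regexes match around column `col`."""
    if (re.search(before + r"$", line[:col])
            and re.match(at, line[col:])
            and re.search(after, line[col + 1:])):
        return action(line)
    return None

line = "see issue #42 for details"
hit = action_at(line, 10, r"issue\s", r"#", r"\d+",
                lambda l: "open-issue")
print(hit)  # 'open-issue'
```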

With regard to the concept of “code as a database”, one of the main drivers for my work on Qiuy is to treat both coding and non-coding activity as a mechanism for having everything as a graph (albeit using notations to form a fingerprint as a starting point for final-layer analyses).

A simpler example here demonstrates the use of aliases in Makefiles to ‘inject’ a semantic meaning over common print statements to provide a description of how to download a repo:

The outcome is to generate this section of a readme file:

As such, this approach provides a graph over a language without altering its behaviour.
The advantage is that were the annotations of the command, the naming of the command, or the specifics of the command to be altered, there would be a representation of such alterations.

The new information from such alterations could, as a graph, be used to propose improvements to other IS material, whether for content in an adulterated semantic form or in its ‘true’ and ‘compiled’ form.

Moreover, I’ve established how annotations in Qiuy can exist not as abstract symbols but as a multi-dimensional array in a closed circuit.
My experiments with constraint logic programming in that regard have treated such a graph as a closed circuit.
Particularly so, given that it is flexible: the arrays formed would be specific to the dimensions in a setup (including the applicability of semantic pinning to symbols used within each dimension).

2 Likes

Ah thanks, yes, that’s a new way of looking at it that I hadn’t considered…! :smile:

There may not be a way to resolve the tension between locality and powerful language features that can cause changes far away… That may be okay though, because tooling, visualisations, and other helpers can do a lot to assist the reader. For example, unlike the compiler with separate compilation, editor tooling can be aware of the whole system at once and can annotate parts that have been modified by remote components of the system so it becomes “one click away” local.
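As a sketch of what such whole-system-aware tooling might do, here is a small Python example that scans source text for assignments onto another module's attributes, i.e. the "remote modification" pattern from earlier in the thread. The snippet it scans, and the `pricing` name in it, are invented.

```python
# Sketch: tooling with whole-system sight can flag "spooky" edits
# for the reader. Here, a scan for `somemodule.attr = ...`
# assignments that patch a library from outside. The source
# snippet being scanned is invented.
import ast

source = """
import pricing
pricing.total = my_total   # remote modification of the library
x = 1
"""

def find_patches(src):
    """Report attribute assignments onto other names (module, attr, line)."""
    hits = []
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.Assign):
            for t in node.targets:
                if isinstance(t, ast.Attribute) and isinstance(t.value, ast.Name):
                    hits.append((t.value.id, t.attr, node.lineno))
    return hits

print(find_patches(source))  # [('pricing', 'total', 3)]
```

An editor could run a scan like this across the whole system and annotate the patched definition itself, making the remote modification "one click away" local.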

2 Likes

A notebook, web page or dynamicland table could be such a local view, constructed by people manually, to contain all the information necessary to understand the specific problem.

Changing the code embedded in this view could optimize some properties with regard to this view.

But the same code seen from a different view could in fact deteriorate another property.

In differential geometry, we have a projection of the surface to an R^n space; in other words, we construct a view.

(Some thoughts)

1 Like

The idea behind my HyperDoc project is exactly that: explain a software system by creating views that focus on a specific aspect. For now, those views include code snippets by transclusion, which is much simpler than what e.g. Glamorous Toolkit does. And yet, the combination of transclusions with Miller columns is already a very effective tool to create locality.

There is one more step that I consider necessary but not so easy to do in practice: create formal structures on top of the code and data that provide local structure. My current views are narratives, i.e. informal structures. That’s good enough for simple systems, but for larger or more complex ones, I believe we need structures to be graphs that are explorable by computational tools.

2 Likes

I know that you do the same… :grinning_face:

I am going in this direction as well. I would like the interface to enable communal thinking as in dynamicland, like being at a table, and to be explorable by multiple users.

I also think that this could replace the current web.

I am personally thinking of including live objects/actors with their code. And use the view as a tool to understand dynamic living complex systems.

For example, supply chains, production, understanding of the local economy (with live objects/actors from the local economy and their code)…

1 Like

All this is also connected with citizen science that dynamicland also tries to enable.

1 Like

I also want what you say, code having structure and being explorable.

1 Like

Yes, that seems to be related to the “soup” problem that Smalltalk and other OOP VMs have. There are objects (some of which may be classes; in JavaScript, methods also might be objects themselves), and the objects exist inside a VM, and that’s it; there are no other levels of chunking between “object” and “VM”. Which doesn’t help us when we want to model large, loosely synchronised and chunked things like networks and clouds… or even desktops with multiple processes.

I often wonder what it would be like if we had objects which were a “container” mechanism, like Kubernetes and such, but at the object level. You’d have a reference to an object inside a container but it would just look like any other object. (The container might even physically contain all its objects, not just references to them, if it represents something like a process, a server or a storage volume.) If you mounted/merged another container into, “under”, or “over top of” that container, all object references would seamlessly update to have the properties defined by the equivalent object in the merged container. And if you unmounted the container, they’d go away again.

Ideally this mechanism would be defined just once, and carefully, at the VM level, and then used multiple times, to implement, eg, inheritance, transactions, patches, modules, package management, configuration policies, user vs system files, backups, import/export of sets of objects between systems, etc.

It would need to do some kind of copy-on-write mechanism for the (very common) case where the container “underneath” is read-only (or read-only from the perspective of the layers “above” it).

It also seems that this would just be a kind of runtime-dynamic inheritance chain, with fully live objects and not just “classes” in the chain. Maybe the dynamism would mess up compilation and typechecking… and it probably breaks locality in a number of ways… but it’s a pattern we seem to keep reimplementing anyway.
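The layered-lookup part of this idea can be sketched in a few lines of Python using `collections.ChainMap`, which behaves much like the mount/merge mechanism described above (a real stdlib class, though the container contents here are invented):

```python
# Sketch: "mounting" one container over another via layered lookup,
# with writes going to the top layer so the one underneath stays
# read-only (a copy-on-write-like discipline).
from collections import ChainMap

base = {"color": "blue", "font": "mono"}   # read-only layer underneath
patch = {"color": "red"}                   # the mounted container

view = ChainMap(patch, base)               # merged view: patch wins
print(view["color"], view["font"])         # red mono

view["size"] = 12                          # writes land in the top layer
print(base)                                # base is untouched

view.maps.pop(0)                           # "unmount" the patch...
print(view["color"])                       # ...and blue comes back
```

The interesting property is exactly the one described above: references through `view` seamlessly pick up whatever the top layer defines, and revert when it is unmounted.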

1 Like

The closest I have seen (but not used) is Newspeak with its nested classes.

1 Like

Perhaps do some research into Web Prolog (which is envisaged as a successor of Pengines) - lots of cool design approaches are hidden inside it.

2 Likes

I still believe that locality is ill-defined in our discussion. We generally mean that the structure of our code, whatever that is, closely follows its functionality.

From Mossio Glycemia Regulation: From Feedback Loops to Organizational Closure

“We think that the inadequacy of feedback loops takes four forms. First, they tend to favor the idea of a neat localization of functional components. While it usually works for manmade machines because their parts have been designed separately and assembled, a neat localization applies much less clearly for organisms, in which a given function can be jointly performed by several components and a given component or structure can perform different functions. In addition, some functions can be distributed over the entire system and thus are non-localizable.”

3 Likes

Structure and function are certainly properties of eldritch reality. Our puny brains can only fit a caricature of them.

In this case, however, the very objective function of locality is how well it fits in a human brain.

I just spent a while last week redoing parts of my app that are inherently non-localizable:

  • My text editor needs to respond to a mouse click by moving the cursor and a mouse drag by selecting the text. There are a few more scenarios to address; it gets quite complex. Until now I had a state machine spanning the draw, mouse press and mouse release handlers, but it was a long-standing source of bugs. The approach I tried now was to have mouse press and mouse release simply write their information to extremely thin, extremely timeless global variables. Then I have all the information for the state machine available to execute in a single part of the frame (doesn’t matter where it is exactly). The run-time execution is still smeared out over tens of frames, but the static code is all in one place. And immediately I started to notice ways the state machine could be simpler that had escaped me in spite of several attempts in the past year.
  • I have an app that works on both computers and multitouch screens. The mouse and multitouch events are superficially similar but come with quite different data. On a mobile device, both events trigger. I’d long struggled to chop up all the work that needs to happen between these two handlers. If I put everything into the mouse handler I can’t do things like pinch-to-zoom. If I put everything into the touch handler it doesn’t work on a computer. So I’d end up with ugly state machines that need to suppress one handler from the other. What I ended up doing now is:
    • Make the mouse handlers spawn a coroutine.
    • Make the touch handlers spawn a different coroutine.
    • Have both coroutines live in a shared monitor, so that only one of them is active on any device.
      Now I can reason about each separately. There’s some unfortunate duplication, but locality seems more important here than DRY. The mouse handler doesn’t care about pinch-to-zoom, and it’s nice to be able to write out its responsibilities without worrying about the complexities of multiple ongoing presses.

In both these cases, the context is complex, and the system needs to have some requisite complexity in response. But the audience is unavoidably human, and the system can have good locality by trading off other properties.

1 Like

Hazarding a guess, but this might not be an issue at all for editing that is not purely textual. I normally edit code in something that looks like ST’s object browser.

The locality is handled automatically for me; there is nowhere to search for an object’s capabilities other than this narrow window, and even though the members of the object might physically span a large distance in text files, the reading/editing process shows only what is relevant.

As for the mouse handlers, using globals seems a bit strange; have you considered having your handlers directly in the UI objects that respond to the event? Then you don’t have branching in your handler to figure out what should respond to the event. Just a thought : )

2 Likes

The Smalltalk browser is a big step forward in terms of locality, compared to text files. But it handles locality only for the class you are working on, not for classes whose instances are referenced in your code. That’s what the Glamorous Toolkit editor adds, and it makes a big difference.

2 Likes