A Symbol-Oriented GUI

This is a very unformed idea that I’m trying to crystallise into a better shape, or any shape at all.

It’s been haunting me for decades, though. Probably since the era of OS/2 2.0, and my disappointment when I first saw its “Workplace Shell” and found that it wasn’t like a shell at all. That is, you could use graphical objects, even send messages to them… but you couldn’t make your own, at least not interactively.

What I’ve wanted since the 1980s is a GUI where the components are “text-like”, by which I mean:

  • A window, or any other graphical component, is composed entirely from a set of smaller graphical components. Those components can all be readily seen, and they can also be individually selected, copied and pasted. There is nothing in fact to a window other than all its individual visible components. There are, specifically, no hidden elements: no code, no configuration, no binary files. Everything you see, you get, and only that.

  • Obviously this wouldn’t get us to parity with a full GUI toolkit, because you wouldn’t (at least at first) be able to program your GUI objects. This would just be a view layer. So if there were programmed, reactive components, with configuration or code or hidden state, we’d probably want some kind of “control view” (like right-click, only more powerful) which would let us view this state. And however we viewed it (not necessarily as text - perhaps as some kind of circuit diagram), that view would also be text-like, composed of components each individually selectable.

Perhaps this idea could be called a “symbolic GUI”, because it would be composed of symbols rather than text. But “symbolic” is such an already overloaded term. “Symbol-oriented” maybe?

The important part of this idea is the selectability of all the subcomponents. Probably with some kind of grouping construct (eg a box, as the equivalent of brackets). Components would be not necessarily of a fixed size, but they wouldn’t be arbitrarily sized either. They’d probably expand to fit the size of the components inside them.

To achieve this, we’d need at least three things, and possibly a fourth:

  • A set of symbols, with a basic sort of “visual grammar” on how they can connect (eg: left/right, up/down, inside/outside). The equivalent of Unicode, but for GUI elements.
  • Canvases which can display these symbols, and allow selection, grouping, insert, delete, copy and paste, analogous to text areas. Those canvases must be completely independent of the software which hosts them. You must be able to copy symbols to and from a canvas, ship them between systems, and have them render recognisably and with the same semantics.
  • Probably a strict, one-to-one mapping to and from Unicode. That wouldn’t be necessary if we had decent interoperability between systems that wasn’t just Unicode, but we don’t.
  • Possibly multiple “fonts” which would change how the symbols display but, crucially, NOT their basic identity or behaviour. CSS almost does this - and yet also does too much.
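
The “visual grammar” and the idea of components expanding to fit their contents could be sketched roughly like this. It is only an illustration under stated assumptions: the names Glyph, Beside, Above and Box are made up for this example, sizes are measured in character cells, and a box border is assumed to be one cell thick.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Glyph:
    """A single symbol; a Unicode character stands in for its symbol id."""
    char: str

    def size(self) -> Tuple[int, int]:
        return (1, 1)  # assumption: every glyph fills one character cell


@dataclass(frozen=True)
class Beside:
    """Left/right composition: widths add up, height is the tallest part."""
    parts: Tuple["Node", ...]

    def size(self) -> Tuple[int, int]:
        sizes = [p.size() for p in self.parts]
        return (sum(w for w, _ in sizes), max((h for _, h in sizes), default=0))


@dataclass(frozen=True)
class Above:
    """Up/down composition: heights add up, width is the widest part."""
    parts: Tuple["Node", ...]

    def size(self) -> Tuple[int, int]:
        sizes = [p.size() for p in self.parts]
        return (max((w for w, _ in sizes), default=0), sum(h for _, h in sizes))


@dataclass(frozen=True)
class Box:
    """Inside/outside composition: the box grows to fit what it contains."""
    content: "Node"

    def size(self) -> Tuple[int, int]:
        w, h = self.content.size()
        return (w + 2, h + 2)  # assumption: a one-cell border on every side


Node = Union[Glyph, Beside, Above, Box]

word = Beside((Glyph("O"), Glyph("K")))
print(word.size())       # (2, 1)
print(Box(word).size())  # (4, 3)
```

Every node here is an immutable value with nothing hidden behind it, so any sub-tree can be lifted out and measured on its own, which is the property the list above is asking for.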

This is the fork in the path of GUIs that I’ve been expecting to happen multiple times, and it didn’t. I first thought Workplace Shell might be it, and then OpenDoc, and then HTML, and then Java Abstract Window Toolkit, and then I thought surely KDE or GNOME would, and then maybe OpenOffice Draw… but that path still hasn’t forked, not in the simple, clean, standard, inter-system, composable way that seemed obvious to me as the next step from the 1980s.

Anyway, that’s the idea. Putting it down as a flag because it feels like something that surely has to happen at some point. Until it does, we won’t have an actual “graphical language” because we won’t have the linguistic equivalents of glyphs from which to build larger components.

HTML gets us halfway there. There is a textlike markup language that defines the things on the screen, but there’s also a lot of invisible stuff, and most annoyingly, you can’t select, copy or paste any HTML element from its visible representation and move it elsewhere. And HTML elements all have to exist inside a “page”, rather than on their own.

Postscript/PDF, TeX and OpenOffice XML also, infuriatingly, almost get there, and yet stop well before the point where they would close the loop.

It might be interesting to think about how many of the HTML elements could be repurposed to be completely free of the context of a “page”, and/or how many Unicode codepoints might be able to be used as graphical drawing elements, if things could be put inside them, or they could be used as links between other elements.

Something you might want to look at is Glamorous Toolkit for Pharo Smalltalk. Its unique feature (as far as I know!) is a way to see the code behind any GUI element. For the most frequent special-purpose element types, inspector views, it’s just Alt-Click. For the GUI of more complex tools, you invoke the “scene driller” via Alt-Shift-Click, and then you can inspect all GUI objects (that’s objects in the Smalltalk sense), including the code behind them, but it takes some experience to navigate the object graph.

You can also change any code and see the impact immediately, which is very useful for GUI work.

Hmm, looking at the Glamorous project page, I agree, I really should take a look. Their home page is actually speaking my language, and that’s very rare!

“a uniform environment made out of visual and interactive operators that can be combined inexpensively in many ways.”

“Glamorous Toolkit is powered by the Lepiter knowledge base engine. Lepiter is made out of snippets grouped into pages and can be used for various use cases.”

“Each snippet defines its own language. Languages can be textual, like GraphQL or JavaScript. They can be visual, like a Wardley Map. Or even widgets.”

“Every page defines a shared scope for variables. Combined with an extensible proxy model, variables can be defined and reused across languages.”

Very interesting indeed. I’ll download it and have a play. Thanks!

Ok so my first impression, just reading through the tutorial, is… someone mashed up Smalltalk, Hypercard and Wordpress Gutenberg and Doug Engelbart’s original NLS.

The way all the text seems to be paragraph-based gives me that impression.

It’s also a little like Federated Wiki or Tiddlywiki, with its card-like snippets that pop out new ones when you click on a link, but it doesn’t give me motion sickness like those do.

It’s certainly doing a much better job of Being Hypertext ™ than any other hypertext environment I’ve seen since the 1980s. It might even be doing a better job of Being Smalltalk ™.

What is this strange feeling in my chest cavity? Could it be… hope?

Edit: Um. Well. Maybe not quite yet. I was happily clicking and then I seem to have caused an error, just by clicking. “Error: Detected invalid location!” and up comes something that looks like an entire stack trace, and the tutorial slams to a halt because it was live and I broke it.

(Fortunately I only broke that one page, so doing “next page” gets the tutorial continuing again? But I don’t even know what it was that I did that was wrong. Also, I’m not sure if I finished the tutorial or just fell off an unfinished end of it.)

Another observation: (I guess of Lepiter in particular, since that seems to be what all the “books” are written in): There’s a lot of text and after a while it becomes really, really hard to scroll. The Page Up / Page Down keys aren’t implemented. I’m reading on a notebook with a touchpad, no scrollwheel, no touchscreen, so I have to keep pointing to the scroll bar and dragging it and it’s just… not very pleasant to have to do that all the time. Compared to, say, the Web (even this site) where page up/down just work as you’d expect.

Also, I think Smalltalk syntax will never really feel natural to me. I sort of get the colons and the method calls chopped into multiple chunks, but then there’s all these other random punctuation characters: square and curly brackets, vertical bars, carets? I know huge things have been built on Smalltalk, but the core foundation just doesn’t seem as simple as it ought to be. Especially if this is a grammar we want to host arbitrary domain specific languages over.

First of all, this is Smalltalk. Glamorous Toolkit is a new UI layer for Pharo. There is growing support for working with code in other languages (for now: Python, JavaScript, Rust), but the implementation language and best-supported target language is Pharo Smalltalk.

Next, there are definitely rough edges remaining, even though the authors decided somewhat arbitrarily to put the version number 1.0 on it in August. But it’s very usable in daily practice, once you know your way around a bit. It’s been one of my main platforms for two years, and it’s clearly the one I enjoy most.

I have to say, after spending just a few hours with Glamorous and knowing basically no Smalltalk, I am enjoying the experience of trying to bootstrap my way towards Baby’s First Class Definition a lot more than the “classic” Smalltalk IDE experience. While I have some problems with navigating GT’s UI (it especially doesn’t seem to do small laptop screens well, and isn’t keyboard-centric), it is a lot more approachable. Much more like dealing with a modern blog/CMS or a Wiki than with an IDE.

Have you checked out Oberon?
Some versions, such as the latest one, have a text-based UI.

I remember reading one of the Oberon books back in the 1990s and being enchanted by it, but I have never seen a running Oberon system in the wild. If Oberon is finally having a moment, that’s great!

What I’m reaching for with this admittedly vague concept of a “symbol-oriented GUI” though isn’t quite “just having a text interface”. I mean I grew up on 80s 8-bit systems and then IBM PC clones and I remember Common User Access and the brief joy that was Visual Basic for DOS with its text GUI and even fully text GUI widgets! And I miss the simplicity of character-mode graphics like PETSCII, ANSI escape sequences, and the BBC Micro’s Mode 7. It felt like the sweet spot for creativity for those of us without artistic training and digitizer tablets. Character graphics was like Legos for art. I miss it deeply.

But what I want is more the idea of “copy and paste” extended to graphical interfaces. So that, as with text but differently, the user could select any widget or graph node or area of the display and reliably copy or move it to any other system.

One of the things needed to make this work would be that any code running within a GUI widget would need to be in its own security boundary. I think even Smalltalk did not enforce this, and I’m not sure that Oberon did either.

The other would be a system-level graphics markup language, which I guess is what Display Postscript and NeWS were about. But somehow when NeXT became the iMac, the idea of modelling the graphical interface as a first-class object expressed as a domain-specific language became lost, hidden under layers of (too powerful) global system services and then (too restrictive) app and app-store silos, with security “bolted on” over the top by means of globally centralized corporate gatekeepers. That’s a route to social privacy and information-control disaster on a planetary scale imo.

I want to see a world of many more domain-specific languages that do information security hyper-locally, and I want a language that’s good at defining / expressing / hosting / acting as a grammar for DSLs. I don’t think even Smalltalk is quite that language. I mean, how would I express Display Postscript or TeX or HTML structures in Smalltalk, as Smalltalk code? Or in Oberon, as Oberon code? And send that sequence of code between objects, as a message which itself is also a first-class runtime-constructed object? That’s where I think the Lisp world still has a lot to teach us about being a “minimal grammar” on which fully general-purpose inter-object messaging DSLs can be hosted. I don’t think Common Lisp or Scheme are quite that language, but S-expressions solved a lot of problems that the Algol-syntax world still hasn’t realised are problems. I would like a purer S-exp language, though, that for instance does not have reader macros.

Smalltalk-72 very nearly is a general DSL-hosting language. Smalltalk-80 and after, sadly, are not. Too many hardcoded syntactic compromises were made to appease the compiler at the expense of general data markup and representation.

Web apps written in Lisp using a suitable framework are probably as close to this dream as we can get today. For example Common Lisp with Spinneret and Parenscript. That gets you very close to UI elements that can be copy-paste transferred, with two important missing pieces: a security model (there is none), and a clean interface between UI elements and the rest of the code (no issue in Common Lisp because there is no security model).

I admit I’ve been looking more in the direction of Smalltalk-76/78 recently.

Perhaps you’ll like document-centric environments, where active UI elements can be part of the text.

  1. Research on these dates back to User-tailorable systems: pressing the issues with buttons.
  2. Coda is one modern take on the genre. (Haven’t tried it but was impressed by this interview.)
  3. A more compositional spinoff was Boxer. It also blurs the line between document/UI/app, but adds nesting of collapsible “boxes”, reusing them for every level of structure, from variable scoping up to the level we’d call folders and apps…
    Start from their “Boxer Structures” doc; most relevant/insightful sections are “uses of doit boxes”, “uses of data boxes” & all mentions of “reconstructible interface”.

I’m not sure I fully follow your meaning, but Boxer’s ideal of “naive realism” comes to my mind, as does Self’s UI: the thing on the screen is supposed to be the actual thing.

Getting away from text, Tangible Functional Programming (best to watch the video) does interesting things with visual “grammar” for seeing & combining functions.

“Perhaps you’ll like document-centric environments, where active UI elements can be part of the text.”

It’s actually less about wanting “active UI elements to be part of text” and more the exact opposite: “wanting UI elements to have the ancient properties of text”. Ie: for any element (or cluster/group of elements) to be readable, writeable, and copyable without the mere act of reading, writing or copying them causing (unknown, distant, possibly large and irreversible) things to happen.

The Web browser before Javascript and AJAX had more of the elusive property that I want than it does now: it was document based, but at least you could pull a document down and have it be totally local as long as you didn’t click any links. A PDF file (as long as you don’t use any weird plugins) is somewhat similar. It’s just that you can’t break either a Web page or a PDF document down into, say, pages, or paragraphs, and exchange those elements separately. Or you perhaps could with a lot of tooling and custom scripts - but there’s a strong culture of “it’s not polite/supported to do that, you’ll break it”.

“Document-centric” is possibly a useful buzzword, though I’m not sure I can find a lot of relevant hits about the UI form of this concept. (I can find a lot of very irrelevant hits about corporate policy around using documents inside an organization). And I really want “smaller than a document”. Node-centric.

Basically what I want is a UI defined solely as a very simple recursive, compositional, document definition language. There are of course hundreds of such UI-ish languages in use today, including TeX, Postscript, HTML, SVG, and OpenOffice’s XML formats. On top of “documents” defined in such languages, the modern UI stack also includes a windowing composition language such as X11 (or Wayland), a 3D shader rendering language such as OpenGL (or Vulkan), and a font glyph representation language such as OpenType.

But all of these languages even individually are very large and complex, with the result that none of these languages have become a universal UI layer in the way that ASCII (and then Unicode) has – and none of them even seem to have the ability to be.

First, because we can’t specify any of these UI encoding/rendering languages/layers in terms of any other one. Ie, there’s nothing we could put at the bottom of a “stack” of such recursive-block-structured UI languages/layers, except “literally a sequence of Unicode characters and/or compiled in-memory bytes”, with an entire renderer program attached.

Second, there’s one property that most of these languages/layers lack that I think is very important: that every individual element/node/container that is expressible in that language could be detached from its context and be moved, transmitted or communicated alone as an individual thing.

Some data representation syntaxes do have this detachable-node property in themselves (specific data formats expressed in these syntaxes, however, might not). For example, S-expressions and JSON can have individual nodes extracted and transmitted. However, in-memory Lisp and Javascript objects don’t themselves have this property: they may contain reference cycles, or may be inherently non-serializable.
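
A small sketch of that difference, using Python’s standard json module rather than Lisp or Javascript. The helper name can_detach and the sample window data are invented for the illustration:

```python
import json


def can_detach(node) -> bool:
    """True if `node` survives being serialized alone and read back unchanged."""
    try:
        return json.loads(json.dumps(node)) == node
    except (TypeError, ValueError):
        # TypeError: the node holds something JSON cannot express (e.g. a function).
        # ValueError: the node contains a reference cycle.
        return False


# A JSON-style tree: every sub-node is itself a complete, transmittable value.
doc = {"window": {"title": "big window",
                  "children": [{"title": "a window"}, {"title": "another window"}]}}
print(can_detach(doc["window"]["children"][1]))  # True

# An in-memory object graph that points back at itself cannot be cut loose this way.
cyclic = {"title": "parent", "children": []}
cyclic["children"].append(cyclic)
print(can_detach(cyclic))  # False

# Neither can a node carrying live code.
print(can_detach({"title": "button", "on_click": print}))  # False
```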

XML and HTML almost have this property but in practice mostly don’t, because they add an artificial concept of “document” or “page” over their native “node” structure.

Because of the lack of such a universal UI representation language (and the massive complexity and centralized corporate control over the closest equivalent that we have: HTML), I find myself drifting back to “literally ASCII in a fixed-width font” as about the only thing we have that can construct a simple, universally transferable (ie, copy-pasteable) interface. That’s not great.

If I don’t want “literally fixed-width ASCII”, then about the only other option available for a small system is “literally a bitmap”. That’s technically more powerful, but not really better. We already use PNGs and JPGs of cropped screenshots of regions of our windowing displays to exchange data that our windowing systems don’t let us exchange: the fact that we have to resort to sending bitmaps is an admission of failure. A small system that only used bitmaps would be just more of that same thing. Perhaps nested bitmaps, or some kind of vector system (like SVG) might be almost okay. We’re pretty much back at HTML at that point, though, since HTML5 includes SVG, so we might as well just use a web browser. What I’d like is something a bit smaller than HTML5 if possible.

Could we potentially start with a font definition language like OpenType (since we’re going to need font glyphs), simplify a bit, add some way of linking glyphs to well-known symbol ids like Unicode… and add recursion, so we can group letters into words, words into paragraphs, or glyphs into icons, icons into toolbars, etc, etc? Something like that.

“Coda” looks to be of no interest to me whatsoever. It’s an online cloud software-as-a-service. That’s the exact opposite of the property I just described. (Being able to copy-paste stuff around on purely local systems).

“Boxer” is interesting, but seems to be an entire Smalltalk-style programming language, which is a little too much. Still, the idea of visibly-marked and non-transgressible execution scope is very nice.

Edit: Just started reading the 1985 Boxer paper (https://boxer-project.github.io/boxer-literature/papers/A%20Principled%20Design%20for%20an%20Integrated%20Computational%20Environment%20-%20ACM%20HCI%20(diSessa,%201985).pdf) and I really love the concept of “naive realism”. Much like “WYSIWYG” from the same era except applied to programming. That and the concept of “understandable systems” seems very important, perhaps more important now than ever.

So as a tiny step towards this, I’m wondering about an ultra-minimal, text-based 1970s style windowing protocol.

Imagine we had a fixed-width ASCII screen where the symbols [, ] and | were special, like so:

[big window                 |
 [a window        |[another|| 
  with stuff in it] window ]]

then we could grab and copy-paste that text, run it through a function, and get this:

[big window|[a window|with stuff in it][another|window]]

And be able to reverse the process.

It’s not quite Lisp. It’s a bracket syntax where | just means “carriage return to the start of the current term/window”. The minimalism is so that it takes up as little screen real estate as possible.

Tempted to code this up and see if it works or not.
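
A first rough sketch of the one-line half of this, in Python. It only reads and writes the linear form; laying windows out in two dimensions (and reading them back from the screen) is not attempted. The function names parse and serialize are made up here, a window is assumed to be a list of rows with each row a list of text runs and nested windows, and there is no way yet to escape a literal [, ] or |.

```python
def parse(text):
    """Parse the one-line bracket form into nested lists.

    A window is a list of rows; a row is a list of items; an item is either
    a string of plain text or another window.
    """
    pos = 0

    def window():
        nonlocal pos
        if pos >= len(text) or text[pos] != "[":
            raise ValueError("expected '['")
        pos += 1
        rows, row, buf = [], [], ""
        while pos < len(text):
            ch = text[pos]
            if ch == "[":
                if buf:
                    row.append(buf)
                    buf = ""
                row.append(window())  # the recursion moves pos past the matching ']'
                continue
            pos += 1
            if ch in "|]":
                if buf:
                    row.append(buf)
                    buf = ""
                rows.append(row)  # '|' and ']' both finish the current row
                row = []
                if ch == "]":
                    return rows
            else:
                buf += ch
        raise ValueError("unbalanced '['")

    tree = window()
    if pos != len(text):
        raise ValueError("unexpected text after the closing ']'")
    return tree


def serialize(win):
    """Inverse of parse: turn the nested lists back into the one-line form."""
    return "[" + "|".join(
        "".join(item if isinstance(item, str) else serialize(item) for item in row)
        for row in win
    ) + "]"


src = "[big window|[a window|with stuff in it][another|window]]"
tree = parse(src)
print(serialize(tree) == src)  # True
```

Because a nested window is parsed by the same function as the outermost one, any bracketed piece copied out of the line is itself a valid input to parse.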

I happen to be messing around with a gizmo that might give you some ideas. This toy is by no means anything serious; I just made it this week as a way to play with some Fractran programs I did for a zine.

I’ve also been wondering about writing GUI-type things like you, I’m also interested in OISCs and minimal computation systems specifically, and so I spent the last two days on a ridiculous quest to see if I could write a graphical tic-tac-toe game with only two things:

  • multiset rewrite rules,
  • and a simple mechanic of symbols that can trigger evaluation by being dragged over each other.

This program system does away with anything that is not graphical programming. The interesting bit here is that rewrite rules expose a useful visual hinting system to the player, showing what can and cannot be done.
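
For readers unfamiliar with multiset rewriting, here is a generic sketch in Python of how two rules can behave like a checkbox. This is not the toy’s actual rule format or source code: the symbol names (click, checked, unchecked) and the functions step and run are assumptions made for the illustration, and dropping a "click" symbol into the bag stands in for dragging one symbol over another.

```python
from collections import Counter

# Each rule says: if the left-hand multiset is present in the bag,
# remove it and add the right-hand multiset.
RULES = [
    (Counter(["click", "unchecked"]), Counter(["checked"])),
    (Counter(["click", "checked"]), Counter(["unchecked"])),
]


def step(bag, rules):
    """Apply the first rule whose left side fits in the bag; None if none fits."""
    for lhs, rhs in rules:
        if all(bag[symbol] >= count for symbol, count in lhs.items()):
            return bag - lhs + rhs
    return None


def run(bag, rules):
    """Keep rewriting until no rule applies, then return the final bag."""
    while (nxt := step(bag, rules)) is not None:
        bag = nxt
    return bag


state = Counter(["unchecked"])
state = run(state + Counter(["click"]), RULES)
print(state)  # Counter({'checked': 1})
state = run(state + Counter(["click"]), RULES)
print(state)  # Counter({'unchecked': 1})
```

Since each rule consumes the click symbol, the rewriting stops after one toggle, and the rules that currently fit the bag are exactly the ones a UI could highlight as available actions.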

A checkbox

This shows the rules needed for implementing a checkbox:

Here’s the source code if you’re curious to see how little such a system really needs. I hope you explore your idea further! Keep it up :slight_smile:

Thanks! I have to admit I’m a little baffled by what Tote is doing even by watching it: what do the two symbols in grey to the left of each rule mean/do? Are they just two arbitrary symbols, the equivalent of a “rule name”?

Edit: Ah, I see, the grey area is not related to the rules to the right of it at all, it’s just a kind of “palette” for the symbols.

Yeah, I’m not really sure what I’m reaching for here. Mostly I’m thinking about Node.js, since it’s my current “programmable pocket calculator” equivalent. It has a pretty object display format that is almost yet not quite parseable JSON or Javascript. I’m wondering what (other than S-expressions, or an indented-line syntax like SUSN) might work for displaying somewhat graphically beautiful structured objects that can also be just grabbed, dumped into a different context, and parsed back.

Ah yeah, that’s just a palette of symbols I can use. Maybe this is clearer, how to make a progress bar from only two rules and a couple of assets:

Here’s how to create a checkbox GUI element from scratch:

Are you looking for a textual serialization? Internally you can keep the symbols in whichever format you want; is the export format for versioning?

The “export format” (a very odd way to put it, putting the focus on only machine-to-machine transmission - I would rather just call it a “language”) would be for transmitting objects between systems at a fine-grained level - by either machines or humans. The same way letters and words work to create a human language that can be said, typed, or written.

Without such a clearly defined format/language, I fear that graphical UIs inevitably end up as very pretty and complex interactive machines which are super-tightly bound to the machine context where they were created. We’ve created Sistine Chapels and Eiffel Towers for tourists to be awed and intimidated by, rather than pencils and universal literacy. Art galleries, monuments and museums do have their place, and at best, we can preserve and exhibit these software creations by carefully emulating an entire computer environment in a VM, which is at least one small step towards a computing permaculture… but I’d still rather have the equivalent of human-expressible, cheap and disposable, letters and words.

The Chinese writing system has a similar problem. It needs a lot of effort at an early age to crunch through the complexity of even an initial “alphabet” to begin to communicate tiny things. Chinese ideographs still can be hand-drawn and deconstructed into smaller symbols, though. And they run in parallel with a spoken language, with a mapping of sounds to characters. The graphical icons and GUIs we use every day though, often can’t be composed/decomposed, and can’t be spoken.

The sort of “language” I’d like would probably be something like TeX, Postscript, or Logo, but optimised for humans to write by hand and share in tiny pieces. It would probably have to bottom out to ASCII/QWERTY at some point just because ASCII text is like pencils now, available everywhere - and even the Chinese use it to do character entry - but it maybe needn’t be.

This might not be actually achievable, but that’s what I’d like: an entry/export format for GUIs that humans can also directly write. A “Pinyin for GUIs” sort of thing. It would probably have to be pronounceable, too.
