Hello Worlds: First Encounters of a Malleable Kind

Theme of the thread: Describe an experience getting started with a malleable system.

I’ll start with Uxn, in reply to the seed text mentioned in Bootstrappable Software - #12 by neauoire.

FAFAFXFZFwfaalFBFAAFFZFZXVGfoAAFXFZFFXXfoKgaam&/$AAAFXJgaam&/AFAAFXFFZZDYJF$/FXAKF|Bgaam/AAFXFZFXgBFAYXFEF|BGGoGAFAFXFZDYFEF|B/DAAFX_gBAGYgDAAFXFXAXZEEZXGPgBAAFXFFZZ_]GQFAPGA^GAQFAPgaamFPGAAFXFXFFXXW

In a case of nerd sniping, I’m intrigued by this piece of code and its brief, mysterious description, much like the ciphered manuscript in Poe’s The Gold-Bug (1843).

53‡‡†305))6*;4826)4‡.)4‡);806*;48†8
¶60))85;1‡(;:‡*8†83(88)5*†;46(;88*96
*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8
¶8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡
1;48†85;4)485†528806*81(‡9;48;(88;4
(‡?34;48)4‡;161;:188;‡?;

I’m drawn to understand what it means and how it works. As a learning exercise, I’ll see if I can use this ASCII-compatible binary executable to bootstrap Uxn.


The encoded string looks similar to Base64. Earlier today I was reading the specs for Z85, a format for representing binary data as printable text, apparently with some advantages. A while ago I made a library called base64-compressor to compress and encode text/binary data into a URL-safe variant of Base64.

Using standard Base64 in a URL requires encoding the +, / and = characters as special percent-encoded hexadecimal sequences (+ becomes %2B, / becomes %2F and = becomes %3D), which makes the string longer and harder to read.

Using a different alphabet allows for encoding as Base64 without requiring this extra markup. Typically, + and / are replaced by - and _, respectively. Some libraries encode = as a period (.).

IETF RFC 4648: Base 64 Encoding with URL and Filename Safe Alphabet
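
To make the difference concrete, here is a minimal Python sketch using only the standard library. The four input bytes are an arbitrary example, chosen so that the standard encoding contains all three problem characters; dropping the padding at the end is one common convention, not something RFC 4648 requires.

```python
import base64
from urllib.parse import quote

# Arbitrary example bytes whose standard encoding contains '+', '/' and '='.
data = bytes([0xFB, 0xFF, 0xFE, 0x3E])

standard = base64.b64encode(data).decode("ascii")          # uses + and /
url_safe = base64.urlsafe_b64encode(data).decode("ascii")  # uses - and _

# What the standard form turns into once it is percent-encoded for a URL.
percent_encoded = quote(standard, safe="")

# Optional convention: drop the '=' padding and restore it before decoding.
compact = url_safe.rstrip("=")
restored = base64.urlsafe_b64decode(compact + "=" * (-len(compact) % 4))

print(standard)         # +//+Pg==
print(percent_encoded)  # %2B%2F%2F%2BPg%3D%3D
print(url_safe)         # -__-Pg==
print(compact)          # -__-Pg
```

The percent-encoded form is 20 characters for 4 bytes of data, while the URL-safe form stays at 8 (or 6 without padding).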

It’s common to store application state or dynamic code in the URL, though browsers limit its length (roughly 80 KB to 2 MB, depending on the browser). What if there were a kind of universal identifier with much larger capacity, able to represent every possible program and data structure up to a reasonable length, or else point to its hash?

Each Unison definition is identified by a hash of its syntax tree. Put another way, Unison code is content-addressed. Each definition has a unique and deterministic address (its hash) in this vast immutable address space. Names are like pointers to addresses in this space.

💡 The big idea · Unison programming language
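
As a rough illustration of content addressing, the sketch below hashes the syntax tree of a Python definition instead of its source text. This is not Unison's algorithm (which, among other things, does not depend on the names used); `definition_hash` is a hypothetical helper that only shows how layout and comments can drop out of the address.

```python
import ast
import hashlib


def definition_hash(source: str) -> str:
    """Hash the parsed syntax tree, not the raw text (hypothetical helper)."""
    tree = ast.parse(source)
    # ast.dump() omits line and column attributes by default, so whitespace
    # and comments do not influence the result.
    canonical = ast.dump(tree)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = definition_hash("def double(x):\n    return x * 2\n")
b = definition_hash("def double(x):  # same code, other layout\n\n    return x*2\n")
c = definition_hash("def double(x):\n    return x * 3\n")

print(a == b)  # True: same tree, same address
print(a == c)  # False: different tree, different address
```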


bytecode program where every byte is a valid visible ascii character

I see: the challenge was to code with only the instructions whose opcodes fall within that byte range (0x21-0x7e), a set so limited that numbers had to be conjured up from the void to meet the constraint.

the codes 0x20 to 0x7E were designated “printable characters”. These codes represent letters, digits, punctuation marks, and a few miscellaneous symbols. There are 95 printable characters in total.
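
The constraint can be checked mechanically. This small Python sketch verifies that every byte of the seed lies in 0x21-0x7e, and shows why literals are off the table: the opcode values for BRK (0x00) and LIT (0x80) are taken from the hi.rom hex dump further down in this post, and both fall outside the range.

```python
SEED = (
    "FAFAFXFZFwfaalFBFAAFFZFZXVGfoAAFXFZFFXXfoKgaam&/$AAAFXJgaam&/"
    "AFAAFXFFZZDYJF$/FXAKF|Bgaam/AAFXFZFXgBFAYXFEF|BGGoGAFAFXFZDYFEF|B/"
    "DAAFX_gBAGYgDAAFXFXAXZEEZXGPgBAAFXFFZZ_]GQFAPGA^GAQFAPgaamFPGAAFXFXFFXXW"
)


def is_visible(byte: int) -> bool:
    """True for visible ASCII, i.e. printable characters excluding space."""
    return 0x21 <= byte <= 0x7E


seed_bytes = SEED.encode("ascii")
all_visible = all(is_visible(b) for b in seed_bytes)

# Opcodes as they appear in the hi.rom dump: 00 for BRK, 80 for LIT (#18).
BRK, LIT = 0x00, 0x80

print(all_visible)                       # True
print(is_visible(BRK), is_visible(LIT))  # False False
print(sorted(set(SEED)))                 # the distinct opcodes the seed uses
```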

Curious to learn more, I copy the seed from the browser and paste it into the terminal to create a file.

$ echo 'FAFAFXFZFwfaalFBFAAFFZFZXVGfoAAFXFZFFXXfoKgaam&/$AAAFXJgaam&/AFAAFXFFZZDYJF$/FXAKF|Bgaam/AAFXFZFXgBFAYXFEF|BGGoGAFAFXFZDYFEF|B/DAAFX_gBAGYgDAAFXFXAXZEEZXGPgBAAFXFFZZ_]GQFAPGA^GAQFAPgaamFPGAAFXFXFFXXW' \
  | tr -d '\n' > xh.txt
$ stat -c %s xh.txt # file size in bytes
200
$ cat xh.txt
FAFA..

The value is single-quoted because it contains characters like $, &, and |. Yes, I found out the hard way by naively pasting it and hitting Enter without thinking. I added a tr filter to remove the newline that echo appends, just in case the extra character might get interpreted as an instruction.

The file is named xh.txt because, by the time of writing, I had learned the purpose of the code: it converts text (.txt) into binary (.rom). By chance I found the source code xh.tal, which looks like the same or an equivalent program. So it converts ASCII-encoded hex values to raw binary data that’s executable on a Uxn machine. There’s an opposite command, hx, to convert binary to text.

How to use this seed?

$ cat drifblim.rom.txt | uxncli xh.txt > drifblim.rom

– From Uxntal software: Bootstrapping

I fetch drifblim, the Uxntal assembler, in text format.

$ curl -LO https://wiki.xxiivv.com/etc/drifblim.rom.txt
$ head -n 3 drifblim.rom.txt
a002 ab80 0637 a00b fb60 037b a001 2580
1037 a003 1716 1c20 000a a002 7860 0a0f
a001 0f17 00a0 0217 160b 2000 1d80 1216

Ah, so this is the hex dump of the binary file, where each byte is encoded as its readable ASCII representation. The first byte, value 0xa0, is encoded in two bytes as a (0x61) and 0 (0x30).
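
Here is a minimal Python sketch of that conversion, two hex characters in and one byte out. It follows the logic of the xh.tal listing at the end of this post (accept 0-9 and lowercase a-f, skip everything else), but it is only a model for illustration, not the seed itself; the function name `xh` is borrowed for convenience.

```python
def xh(text: bytes) -> bytes:
    """Convert ASCII hex text to raw bytes, skipping non-hex characters."""
    out = bytearray()
    high = None  # pending high nibble, or None when expecting one
    for ch in text:
        value = (ch - ord("0")) & 0xFF      # 8-bit arithmetic wraps around
        if value >= 0x0A:
            value = (value - 0x27) & 0xFF   # maps 'a'..'f' to 10..15
            if value >= 0x10:
                continue                    # space, newline, anything else
        if high is None:
            high = value << 4               # first digit: keep as high nibble
        else:
            out.append(high | value)        # second digit: emit the byte
            high = None
    return bytes(out)


# First line of drifblim.rom.txt as shown above.
sample = b"a002 ab80 0637 a00b fb60 037b a001 2580\n"
rom = xh(sample)

print(hex(rom[0]))  # 0xa0
print(len(rom))     # 16
```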

Now I need uxncli to run xh.txt. I’ve built it with clang before, but this time I’ll try tcc (Tiny C Compiler). Since learning about Bootstrappable TinyCC, I’ve been gathering “evergreen” tools and libraries in C99/C11 that can be built from source with tcc. That gives me a sense of independence and long-term stability rarely felt higher up the ladder of abstraction.

$ git clone https://git.sr.ht/~rabbits/uxncli
$ cd uxncli && mkdir -p bin
$ tcc src/uxncli.c -o bin/uxncli
$ bin/uxncli
usage: bin/uxncli [-v] file.rom [args..]

Build the assembler from its text description.

$ cat drifblim.rom.txt | uxncli xh.txt > drifblim.rom
$ stat -c %s drifblim.rom
2816
$ hd drifblim.rom
a002 ... 4554
$ uxncli drifblim.rom
usage: drifblim.rom in.tal out.rom

It’s alive! To confirm, I have it assemble itself.

$ curl -LO https://wiki.xxiivv.com/etc/drifblim.tal.txt
$ uxncli drifblim.rom drifblim.tal.txt drifboot.rom
-- Unused: rom/mem
-- Unused: rom/output
Assembled drifboot.rom in 2816 bytes.
$ diff drifblim.rom drifboot.rom

No output from diff means the binaries are identical. Bootstrap success.

As a final step, I create and run a little program.

$ cat << '.' > hi.tal
;text                   ( Push text pointer )
@while                  ( Create while label )
    LDAk DUP ?{         ( Load byte at address, jump if not null )
        POP POP2 BRK }  ( When null, pop text pointer, halt )
    #18 DEO             ( Send byte to Console/write port )
    INC2 !while         ( Incr text pointer, jump to label )
@text                   ( Create text label )
	"Hello 20 "from 20 "Uxn! 0a 00
.
$ uxncli drifblim.rom hi.tal hi.rom
Assembled hi.rom in 31 bytes.
$ hd hi.rom
00000000  a0 01 12 94 06 20 00 03  02 22 00 80 18 17 21 40  |..... ..."....!@|
00000010  ff f1 48 65 6c 6c 6f 20  66 72 6f 6d 20 55 78 6e  |..Hello from Uxn|
00000020  21 0a 00                                          |!..|
$ stat -c %s hi.rom
35
$ uxncli hi.rom
Hello from Uxn!
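
The hex dump can be read back against the source. This Python sketch assumes the ROM is loaded at address 0x0100 (as the |0100 and |100 lines in the listings later in this post suggest) and that the operand of !while is a signed offset relative to the address just after it. Under those assumptions the pointer pushed by ;text and the jump target both land where the source says they should.

```python
ROM = bytes.fromhex(
    "a0 01 12 94 06 20 00 03 02 22 00 80 18 17 21 40 "
    "ff f1 48 65 6c 6c 6f 20 66 72 6f 6d 20 55 78 6e "
    "21 0a 00"
)
LOAD_ADDRESS = 0x0100  # assumption: where the ROM sits in memory

# ;text assembles to a0 (LIT2) followed by a big-endian 16-bit address.
text_address = int.from_bytes(ROM[1:3], "big")
text_offset = text_address - LOAD_ADDRESS

# The string runs from that offset up to the terminating 00 byte.
text = ROM[text_offset:ROM.index(0, text_offset)]

# !while assembles to 40 plus a signed 16-bit relative offset.
jump_at = 15
relative = int.from_bytes(ROM[jump_at + 1:jump_at + 3], "big", signed=True)
target_offset = jump_at + 3 + relative

print(hex(text_address))             # 0x112
print(text)                          # b'Hello from Uxn!\n'
print(target_offset)                 # 3, the LDAk right after @while
print(text_offset, len(ROM) - text_offset)  # 18 bytes of code, 17 of data
```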

At 35 bytes, it’s the smallest hello-world executable I’ve seen. I like the sense of calm and coziness. Next time I want to learn how to draw on a canvas. I’ll also enjoy reading and running the treasure trove of Uxntal programs in uxn-utils, like a fine wine or novel.

Going through the bootstrap from seed gave me ideas and insight into designing the lowest levels of a language runtime and abstract machine. I also love the aesthetics and visual storytelling of the Varvara Zine.


Epilogue - I found a disassembler, uxndis. I built it with drifblim (which I created earlier with xh.txt) and ran it on the original seed text.

$ uxncli uxndis.rom xh.txt
|0100   46         ( DUPr )
|0101   41         ( INCr )
|0102   46         ( DUPr )
|0103   41         ( INCr )
|0104   46         ( DUPr )
|0105   58         ( ADDr )
|0106   46         ( DUPr )
|0107   5a         ( MULr )
|0108   46         ( DUPr )
|0109   77         ( DEO2r )
|010a   66         ( DUP2r )
|010b   61         ( INC2r )
|010c   61         ( INC2r )
|010d   6c         ( JMP2r )
|010e   46         ( DUPr )
|010f   42         ( POPr )
|0110   46         ( DUPr )
|0111   41         ( INCr )
|0112   41         ( INCr )
|0113   46         ( DUPr )
|0114   46         ( DUPr )
|0115   5a         ( MULr )
|0116   46         ( DUPr )
|0117   5a         ( MULr )
|0118   58         ( ADDr )
|0119   56         ( DEIr )
|011a   47         ( OVRr )
|011b   66         ( DUP2r )
|011c   6f         ( STH2r )
|011d   41         ( INCr )
|011e   41         ( INCr )
|011f   46         ( DUPr )
|0120   58         ( ADDr )
|0121   46         ( DUPr )
|0122   5a         ( MULr )
|0123   46         ( DUPr )
|0124   46         ( DUPr )
|0125   58         ( ADDr )
|0126   58         ( ADDr )
|0127   66         ( DUP2r )
|0128   6f         ( STH2r )
|0129   4b         ( LTHr )
|012a   67         ( OVR2r )
|012b   61         ( INC2r )
|012c   61         ( INC2r )
|012d   6d         ( JCN2r )
|012e   26         ( DUP2 )
|012f   2f         ( STH2 )
|0130   24         ( SWP2 )
|0131   41         ( INCr )
|0132   41         ( INCr )
|0133   41         ( INCr )
|0134   46         ( DUPr )
|0135   58         ( ADDr )
|0136   4a         ( GTHr )
|0137   67         ( OVR2r )
|0138   61         ( INC2r )
|0139   61         ( INC2r )
|013a   6d         ( JCN2r )
|013b   26         ( DUP2 )
|013c   2f         ( STH2 )
|013d   41         ( INCr )
|013e   46         ( DUPr )
|013f   41         ( INCr )
|0140   41         ( INCr )
|0141   46         ( DUPr )
|0142   58         ( ADDr )
|0143   46         ( DUPr )
|0144   46         ( DUPr )
|0145   5a         ( MULr )
|0146   5a         ( MULr )
|0147   44         ( SWPr )
|0148   59         ( SUBr )
|0149   4a         ( GTHr )
|014a   46         ( DUPr )
|014b   24         ( SWP2 )
|014c   2f         ( STH2 )
|014d   46         ( DUPr )
|014e   58         ( ADDr )
|014f   41         ( INCr )
|0150   4b         ( LTHr )
|0151   46         ( DUPr )
|0152   7c         ( AND2r )
|0153   42         ( POPr )
|0154   67         ( OVR2r )
|0155   61         ( INC2r )
|0156   61         ( INC2r )
|0157   6d         ( JCN2r )
|0158   2f         ( STH2 )
|0159   41         ( INCr )
|015a   41         ( INCr )
|015b   46         ( DUPr )
|015c   58         ( ADDr )
|015d   46         ( DUPr )
|015e   5a         ( MULr )
|015f   46         ( DUPr )
|0160   58         ( ADDr )
|0161   67         ( OVR2r )
|0162   42         ( POPr )
|0163   46         ( DUPr )
|0164   41         ( INCr )
|0165   59         ( SUBr )
|0166   58         ( ADDr )
|0167   46         ( DUPr )
|0168   45         ( ROTr )
|0169   46         ( DUPr )
|016a   7c         ( AND2r )
|016b   42         ( POPr )
|016c   47         ( OVRr )
|016d   47         ( OVRr )
|016e   6f         ( STH2r )
|016f   47         ( OVRr )
|0170   41         ( INCr )
|0171   46         ( DUPr )
|0172   41         ( INCr )
|0173   46         ( DUPr )
|0174   58         ( ADDr )
|0175   46         ( DUPr )
|0176   5a         ( MULr )
|0177   44         ( SWPr )
|0178   59         ( SUBr )
|0179   46         ( DUPr )
|017a   45         ( ROTr )
|017b   46         ( DUPr )
|017c   7c         ( AND2r )
|017d   42         ( POPr )
|017e   2f         ( STH2 )
|017f   44         ( SWPr )
|0180   41         ( INCr )
|0181   41         ( INCr )
|0182   46         ( DUPr )
|0183   58         ( ADDr )
|0184   5f         ( SFTr )
|0185   67         ( OVR2r )
|0186   42         ( POPr )
|0187   41         ( INCr )
|0188   47         ( OVRr )
|0189   59         ( SUBr )
|018a   67         ( OVR2r )
|018b   44         ( SWPr )
|018c   41         ( INCr )
|018d   41         ( INCr )
|018e   46         ( DUPr )
|018f   58         ( ADDr )
|0190   46         ( DUPr )
|0191   58         ( ADDr )
|0192   41         ( INCr )
|0193   58         ( ADDr )
|0194   5a         ( MULr )
|0195   45         ( ROTr )
|0196   45         ( ROTr )
|0197   5a         ( MULr )
|0198   58         ( ADDr )
|0199   47         ( OVRr )
|019a   50         ( LDZr )
|019b   67         ( OVR2r )
|019c   42         ( POPr )
|019d   41         ( INCr )
|019e   41         ( INCr )
|019f   46         ( DUPr )
|01a0   58         ( ADDr )
|01a1   46         ( DUPr )
|01a2   46         ( DUPr )
|01a3   5a         ( MULr )
|01a4   5a         ( MULr )
|01a5   5f         ( SFTr )
|01a6   5d         ( ORAr )
|01a7   47         ( OVRr )
|01a8   51         ( STZr )
|01a9   46         ( DUPr )
|01aa   41         ( INCr )
|01ab   50         ( LDZr )
|01ac   47         ( OVRr )
|01ad   41         ( INCr )
|01ae   5e         ( EORr )
|01af   47         ( OVRr )
|01b0   41         ( INCr )
|01b1   51         ( STZr )
|01b2   46         ( DUPr )
|01b3   41         ( INCr )
|01b4   50         ( LDZr )
|01b5   67         ( OVR2r )
|01b6   61         ( INC2r )
|01b7   61         ( INC2r )
|01b8   6d         ( JCN2r )
|01b9   46         ( DUPr )
|01ba   50         ( LDZr )
|01bb   47         ( OVRr )
|01bc   41         ( INCr )
|01bd   41         ( INCr )
|01be   46         ( DUPr )
|01bf   58         ( ADDr )
|01c0   46         ( DUPr )
|01c1   58         ( ADDr )
|01c2   46         ( DUPr )
|01c3   46         ( DUPr )
|01c4   58         ( ADDr )
|01c5   58         ( ADDr )
|01c6   57         ( DEOr )

This is a program of hand-written machine instructions within a limited byte range that behaves the same as the following xh.tal. Maybe one day it’d be interesting to step through each instruction with a debugger (beetbug) and understand exactly how it works. I’m guessing that somewhere the number 0x10 is created out of thin air to address the console device.

|10 @Console &vector $2 &read $5 &type $1 &write $1 &error $1

|100

@on-reset ( -> )
	;on-console .Console/vector DEO2
	[ LIT2r 0101 ] BRK

@on-console ( -> )
	.Console/read DEI
	( | make chex )
	[ LIT "0 ] SUB DUP #0a LTH ?{
		#27 SUB DUP #10 LTH ?{ POP BRK } }
	( | hb/lb )
	INCr ANDkr STHr ?{ #40 SFT BRK }
	ORA #18 DEO
	BRK
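
The guess about the console address can be checked by hand for the first ten bytes of the seed (FAFAFXFZFw). The Python sketch below is not a Uxn emulator: it models only the five return-stack opcodes that appear in that prefix, with mnemonics taken from the uxndis output above, and it assumes the stack is a zero-filled circular buffer of 256 bytes, so that popping an empty stack yields 0. It also assumes DEO2r pops the port byte first and the 16-bit value second.

```python
class Stack:
    """Assumed model: 256-byte circular stack, initially all zeros."""

    def __init__(self):
        self.dat = [0] * 256
        self.ptr = 0

    def pop(self) -> int:
        self.ptr = (self.ptr - 1) & 0xFF
        return self.dat[self.ptr]

    def push(self, value: int) -> None:
        self.dat[self.ptr] = value & 0xFF
        self.ptr = (self.ptr + 1) & 0xFF


def run_prefix(code: bytes):
    """Run bytes until DEO2r; return the (port, value) it would write."""
    rst = Stack()
    for op in code:
        if op == 0x46:      # DUPr
            a = rst.pop()
            rst.push(a)
            rst.push(a)
        elif op == 0x41:    # INCr
            rst.push(rst.pop() + 1)
        elif op == 0x58:    # ADDr
            b, a = rst.pop(), rst.pop()
            rst.push(a + b)
        elif op == 0x5A:    # MULr
            b, a = rst.pop(), rst.pop()
            rst.push(a * b)
        elif op == 0x77:    # DEO2r: port byte first, then low and high byte
            port = rst.pop()
            low, high = rst.pop(), rst.pop()
            return port, (high << 8) | low
        else:
            raise ValueError(f"opcode {op:#04x} is not modelled here")
    raise ValueError("no DEO2r in this prefix")


port, value = run_prefix(b"FAFAFXFZFw")
print(hex(port), hex(value))  # 0x10 0x110
```

Under these assumptions the stack goes 0, 1, 1 2, 1 4, 1 16, and the final write sends 0x0110 to port 0x10, which matches ;on-console .Console/vector DEO2 in xh.tal, with the handler sitting at the |0110 line of the disassembly.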
4 Likes

That’s amazing eliot!

I’m glad it actually worked! I used it on my machine but it hasn’t been tested that extensively.

There’s something I meant to write about in the other thread but couldn’t find a place for, and your exploration of bootstrapping from an ASCII ROM reminded me of it.

So, Ribbit Scheme is another system where the bytecode is within the readable range; you might get a kick out of that 🙂

2 Likes

That seed text was a nice hook in a literary sense, to enter the world of Uxn. I was motivated to solve the mystery. Building everything from source was a good way to understand it from the ground up. I’m looking forward to exploring the ecosystem around it.


Ribbit: a compact portable Scheme with runtime implemented in 25 languages. I especially like this rvm.wat written in WebAssembly Text format. A thing of beauty.


  █▀█ █ █▄▄ █▄▄ █ ▀█▀ 
  █▀▄ █ █▄█ █▄█ █ ░█░ 

A really, really small VM

Who could not love this cuteness? By now I know that the smallest, simplest things are often the most powerful.

For example, here is the Scheme REPL encoded with the RVM:

$ ./rsc -t rvm -l max tests/50-repl.scm -o ../presentation/04.txt

R<fi,enifed,adbmal,,,,,,,,,,,*,,,,<,,,,,,,-,,,,,,,,;9:]:9:?Z7?YPZ?^97~Z,^YJ?VvCvR3y]?7#ZI^z]I9(i&:EiS)ai&kkz!S):kw)k]%)_*Z%aC_J^~G^{!I)^8IZ/lbC`^)`~>_J_~G_|]+9)`^YIka_CaJ`.Z+dCbAai$J`^~G_|]K#`kn8:^~i$#`kn8:^~i$#`kn8:^~i$#`kn8:^~WQ^~>w(O^~>kYF^~W^z]5#ZKa_l{])#a_k#k_k~>iS)_{!;.b.:RfCdbw(k~GCaJ`^|!O.R:h-w4k.Rf~>iS)fdAaaa^}(]&*i&^z]**Z&`^{]-*Z*b`^|!=*Z-ca_wC|!.#b`n8OfAi&AbwS'awS'`9+Aea_`~YN_C`.ci$.cTANdwS,ACFcwS&JFa.cACFbwS&~>JFbwS*~GCa_~>wS,^.ci$.cK^.cKZ-TANgwS+wS'wS'Z&Z*`wS'wS%~GNbFa~GCa_~>wS+^.ci%.cK^.cKTi$ANdxP^~GNbFa~GCa_~>xP^8OfNdF`J_`JF`~>wS%^8;cCa_~>wS&^#Z)exN#d~YGbZ(i&:RiS)NeZ%AAfi$i$akYE_nF`~>wA^.:EgZ=ecEfYHdboFa_~>wC^.Z5dYIlbFbYHa_~K>wB_8OfAi&AbwS'awS'`9+Aea_`~YN_C`.ci$.cTANdwS,ACFcwS&JFa.cACFbwS&~>JFbwS*~GCa_~>wS,^.ci$.cK^.cKZ-TANgwS+wS'wS'Z&Z*`wS'wS%~GNbFa~GCa_~>wS+^.ci%.cK^.cKTi$ANdxP^~GNbFa~GCa_~>xP^8OfNdF`J_`JF`~>wS%^8;cCa_~>wS&^#Z)exN#d~YGbZ(i&:RiS)NeZ%AAfi$i$akYE_nF`~>wA^.:EgZ=ecEfYHdboFa_~>wC^.Z5dYIlbFbYHa_~K^~^>wS#^#cFan~>wS.^J_~G_#bYIk``m~YN_|!?1_?H^{]71uy!9)i$89aC_?H^89aC_?H^?HvS#~K>vS#_89aC_?H^89aC_?H^?HvS#~K^~^>vE^89aC_?HvS;?HvS#~>t^89aC_?HvS9?HvS#~>v0^89aC_?HvS5?HvS#~>u^89aC_?H^~Z6`J^~G^{]3)i$)i$93C^?YPJ^~G^?HvC~G^z]'9'ZG^8?vS7vF~ZE^8PZJ^?HvF~ZC^89i$Z$^~Z9^9'ZL^~YN^1vL?Z3C^?YPJ^?HvK~G^8?vLvK~YG^8?vS;vF~>i%^8?vS-vF~Z6^z!P9'^1vE?Pi%Z$^?HvE~Z9^z];9;8L~>u^)^~Ik^Dy!L8L?D)^9;~>vR0^~I_vC)iS-~Z,^Sy!>8>A`^8>Aa^8>Aat~>vS;^8>Aav0~>vS9^8>Aau~>vS5^D~>vS#^9F_~>vE^)i&~Z,^Dz!K*YK^?D)i&~KKIvD`*YK^?D)i&~KK^~^>vL_*YK^?D)i&~K^~^>vK^Sy]8*Z8^YJ)i&?D~>vL^YLy!5)_5BBvRL_M`v3?D~i$)_5BBvRL_M`v3?D~IvS.^~I_vS'5BBvR,_M`v3?D~i$)_5BBvRL_M`v3?D~i$)_5BBvRL_M`v3?D~IvS.^~I_vS'5BBvR,_M`v3?D~IvR<^~I_vR55BBvR%_M`v3?D~i$)_5BBvRL_M`v3?D~i$)_5BBvRL_M`v3?D~IvS.^~I_vS'5BBvR,_M`v3?D~i$)_5BBvRL_M`v3?D~i$)_5BBvRL_M`v3?D~IvS.^~I_vS'5BBvR,_M`v3?D~IvR<^~I_vR55BBvR%_M`v3?D~IvR/^~I_vR$Sz!J9M`)^~^^ZB^Z#AYK^?D9#Ui&?D~>vE^*Ai&YJwS.?D~>vJ^92YJ+Lkk5k?D~>vP^S?D~K>vRM_92YJ+Lkk5k?D~>vP^S?D~K^~^>vS?^)i%?D~>vS;^)i$?D~>vS-^S?D~>vF^98?D~>vK^)^~Ik^YLy!<)^!S$^Dy]4)^!S$iS()^~>iS-^!S$^z!-94^94Z@~>iS(^)^~>iS-^iS$
y!S$iS(],'iS-^z](#l`^{]EYDl]JiF]2#oYE_^z]CYDo]$iF]##nYE_^z]9YDn]0)_)i$)i$90BBvR%`MbuC_~IvR/^~I_vR$J^~G^{]190k^)i$~YG^z]B)i$+_k~^Z1^91C^~>vPJ^)i$~YG^Z$^z].)^9._`~IakAb^Z/BMu``vR%ZDu^{]G9#Z.i&^9#AZ.i&B`kvP~Ik^z?ZHki#]OOi#]>)^]OAjO^ZA^9>Oa_)^~YM`O^YF_~G_{]M9>jO^z]Li8]A#m_i$z!NYDm]<)_9<AaJ_C^~G^{]F9<i&^z!E)k9/YEC_l~G^z!G'i&^z]=8HO^z!H/O^z!788O^z!/8FO^z!,i8!3iF!*#k`^{!0YDk!M)i$)i$)i$)i$8MYFaYF_~YMOaO_~YMQaQ_~W`)i$~>pQ_~W_)^~^>`^{]6'i$^z!D9N)i$'bQ^~W^zz!S(Bmk!S-Blk!):lkl!(:lkm!4:lkn]N:lko!@:lkp!F:lkq!8:lkr!::lks]H:lku!':lkv/!2:lkv0]/:lkv1!+:lkv2!6:lkv3]D:lkv4]@:lkv5!1:lkv6y

You’re right, this puts me in Sherlock mode. “Now what could that mysterious code mean...” I’ll be following the scent.

4 Likes

Looking at the list of VM implementations for Ribbit, it strikes me that they are all for “big” languages and runtimes. What would a Ribbit VM look like in Sector Lisp? Or in Forth?

1 Like