Theme of the thread: Describe an experience getting started with a malleable system.
I start with Uxn, as a reply to the seed-text mentioned in Bootstrappable Software - #12 by neauoire.
FAFAFXFZFwfaalFBFAAFFZFZXVGfoAAFXFZFFXXfoKgaam&/$AAAFXJgaam&/AFAAFXFFZZDYJF$/FXAKF|Bgaam/AAFXFZFXgBFAYXFEF|BGGoGAFAFXFZDYFEF|B/DAAFX_gBAGYgDAAFXFXAXZEEZXGPgBAAFXFFZZ_]GQFAPGA^GAQFAPgaamFPGAAFXFXFFXXW
In a case of nerd sniping, I’m intrigued by this piece of code and its brief, mysterious description, much like the ciphered manuscript in Poe’s The Gold-Bug (1843).
53‡‡†305))6*;4826)4‡.)4‡);806*;48†8
¶60))85;1‡(;:‡*8†83(88)5*†;46(;88*96
*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8
¶8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡
1;48†85;4)485†528806*81(‡9;48;(88;4
(‡?34;48)4‡;161;:188;‡?;
I’m drawn to understand what it means and how it works. As a learning exercise, I’ll see if I can use this ASCII-compatible binary executable to bootstrap Uxn.
The encoded string looks similar to Base64. Earlier today I was reading the specs for Z85, a format for representing binary data as printable text, apparently with some advantages. A while ago I made a library called base64-compressor to compress and encode text/binary data into a URL-safe variant of Base64.
Using standard Base64 in a URL requires encoding the +, / and = characters as special percent-encoded hexadecimal sequences (+ becomes %2B, / becomes %2F and = becomes %3D), which makes the string longer and harder to read.
Using a different alphabet allows for encoding as Base64 without requiring this extra markup. Typically, + and / are replaced by - and _, respectively. Some libraries encode = as ".".
– IETF RFC 4648: Base 64 Encoding with URL and Filename Safe Alphabet
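The difference between the two alphabets is easy to see in a short Python sketch. It uses only the standard library, and the two input bytes are an arbitrary example, chosen so that the standard encoding contains all three problematic characters.

```python
import base64
from urllib.parse import quote

data = bytes([0xFB, 0xFF])  # arbitrary bytes that encode to '+', '/' and '='

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)                  # +/8=
print(quote(standard, safe=""))  # %2B%2F8%3D
print(url_safe)                  # -_8=
print(url_safe.rstrip("="))      # -_8 (padding dropped; restore it before decoding)
```

The percent-encoded form is ten characters where the URL-safe form needs three or four, which is the whole point of the alternative alphabet.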
It’s common to store application state or dynamic code in the URL, though browsers limit its length (80 KB to 2 MB). What if there were a kind of universal identifier with much larger capacity, one that could represent every possible program and data structure up to a reasonable length, or else point to its hash?
Each Unison definition is identified by a hash of its syntax tree. Put another way, Unison code is content-addressed. Each definition has a unique and deterministic address (its hash) in this vast immutable address space. Names are like pointers to addresses in this space.
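As a rough illustration of content addressing, here is a minimal Python sketch. It is not how Unison works internally: it hashes the source text rather than a syntax tree, SHA-256 is only a stand-in hash, and the names `store`, `names` and `put` are hypothetical.

```python
import hashlib

store = {}  # address (hex digest) -> definition source
names = {}  # human-readable name -> address

def put(definition: str) -> str:
    """Store a definition under the hash of its content and return that address."""
    address = hashlib.sha256(definition.encode("utf-8")).hexdigest()
    store[address] = definition
    return address

names["increment"] = put("x -> x + 1")
names["succ"] = put("x -> x + 1")  # a second name for the same content

print(names["increment"] == names["succ"])  # True
print(len(store))                           # 1
```

Two names end up pointing at one address, and renaming a definition never changes what it is stored under.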
bytecode program where every byte is a valid visible ascii character
I see: it was a challenge to write code with the limited set of instructions that can be expressed within that byte range (0x21-0x7e), so limited that numbers had to be conjured up from the void as a workaround to meet the constraint.
the codes 0x20 to 0x7E were designated “printable characters”. These codes represent letters, digits, punctuation marks, and a few miscellaneous symbols. There are 95 printable characters in total.
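A small Python sketch of why numbers are hard to come by under this constraint. The literal opcodes of Uxn (LIT and its variants) all have the top bit set, so none of them falls within the visible range. The helper name `is_visible_ascii` is made up for this example.

```python
VISIBLE = range(0x21, 0x7F)  # 0x21..0x7e inclusive, space excluded

# Literal opcodes in Uxn all have the top bit set.
LITERALS = {"LIT": 0x80, "LIT2": 0xA0, "LITr": 0xC0, "LIT2r": 0xE0}

def is_visible_ascii(rom: bytes) -> bool:
    """True when every byte of the program is a visible ASCII character."""
    return all(byte in VISIBLE for byte in rom)

seed_start = b"FAFAFXFZFw"          # first ten bytes of the seed
push_literal = bytes([0x80, 0x18])  # how `#18` is normally assembled

print(is_visible_ascii(seed_start))    # True
print(is_visible_ascii(push_literal))  # False
print([name for name, op in LITERALS.items() if op in VISIBLE])  # []
```

Without a literal instruction, every constant has to be built from stack operations such as duplicate, increment, add and multiply.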
Curious to learn more, I copy the seed from the browser and paste it into the terminal to create a file.
$ echo 'FAFAFXFZFwfaalFBFAAFFZFZXVGfoAAFXFZFFXXfoKgaam&/$AAAFXJgaam&/AFAAFXFFZZDYJF$/FXAKF|Bgaam/AAFXFZFXgBFAYXFEF|BGGoGAFAFXFZDYFEF|B/DAAFX_gBAGYgDAAFXFXAXZEEZXGPgBAAFXFFZZ_]GQFAPGA^GAQFAPgaamFPGAAFXFXFFXXW' \
| tr -d '\n' > xh.txt
$ stat -c %s xh.txt # file size in bytes
200
$ cat xh.txt
FAFA..
The value is single-quoted because it contains characters like $, &, and |, which the shell would otherwise interpret. Yes, I found out the hard way by naively pasting it and hitting enter without thinking. I added a tr filter to remove the newline that echo appends, just in case the extra character might get interpreted as an instruction.
The file is named xh.txt because, by the time of writing, I had learned the purpose of the code: it converts text (.txt) into binary (.rom). By chance I found the source code xh.tal, which looks like the same or an equivalent program. So it converts ASCII-encoded hex values to raw binary data that’s executable on a Uxn machine. There’s an opposite command, hx, to convert binary to text.
How to use this seed?
$ cat drifblim.rom.txt | uxncli xh.txt > drifblim.rom
– From Uxntal software: Bootstrapping
I get drifblim, the Uxntal assembler, in text format.
$ curl -LO https://wiki.xxiivv.com/etc/drifblim.rom.txt
$ head -n 3 drifblim.rom.txt
a002 ab80 0637 a00b fb60 037b a001 2580
1037 a003 1716 1c20 000a a002 7860 0a0f
a001 0f17 00a0 0217 160b 2000 1d80 1216
Ah, so this is the hex dump of the binary file, where each byte is encoded as its readable ASCII representation. The first byte, value 0xa0, is encoded in two bytes as a (0x61) and 0 (0x30).
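The decoding step can be sketched in a few lines of Python. This is an illustration of the idea, not a port of xh.tal: it keeps only lowercase hex digits, skips everything else (spaces, newlines), and joins each pair of nibbles into one byte. The function name `hex_text_to_bytes` is made up for this example.

```python
def hex_text_to_bytes(text: str) -> bytes:
    """Turn a hex dump such as 'a002 ab80' back into raw bytes."""
    out = bytearray()
    high = None  # pending high nibble, if any
    for ch in text:
        if ch not in "0123456789abcdef":
            continue  # skip spaces, newlines and anything else
        nibble = int(ch, 16)
        if high is None:
            high = nibble  # first digit of the pair
        else:
            out.append((high << 4) | nibble)  # second digit completes the byte
            high = None
    return bytes(out)

first_line = "a002 ab80 0637 a00b fb60 037b a001 2580"
rom = hex_text_to_bytes(first_line)

print(len(rom))          # 16
print(rom[:2].hex(" "))  # a0 02
```

The 39 characters of the first line shrink to 16 bytes, so the text form is a bit more than twice the size of the binary.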
Now I need uxncli to run xh.txt. I’ve built it with clang before, but this time I’ll try tcc (Tiny C Compiler). Since learning about Bootstrappable TinyCC, I’ve been gathering “evergreen” tools and libraries in C99/C11 that can be built from source with tcc. That gives me a sense of independence and long-term stability rarely felt higher up the ladder of abstraction.
$ git clone https://git.sr.ht/~rabbits/uxncli
$ cd uxncli && mkdir -p bin
$ tcc src/uxncli.c -o bin/uxncli
$ bin/uxncli
usage: bin/uxncli [-v] file.rom [args..]
Build the assembler from its text description.
$ cat drifblim.rom.txt | uxncli xh.txt > drifblim.rom
$ stat -c %s drifblim.rom
2816
$ hd drifblim.rom
a002 ... 4554
$ uxncli drifblim.rom
usage: drifblim.rom in.tal out.rom
It’s alive! To confirm, I have it compile itself.
$ curl -LO https://wiki.xxiivv.com/etc/drifblim.tal.txt
$ uxncli drifblim.rom drifblim.tal.txt drifboot.rom
-- Unused: rom/mem
-- Unused: rom/output
Assembled drifboot.rom in 2816 bytes.
$ diff drifblim.rom drifboot.rom
No output from diff means the binaries are identical. Bootstrap success.
As a final step, I create and run a little program.
$ cat << '.' > hi.tal
;text ( Push text pointer )
@while ( Create while label )
LDAk DUP ?{ ( Load byte at address, jump if not null )
POP POP2 BRK } ( When null, pop text pointer, halt )
#18 DEO ( Send byte to Console/write port )
INC2 !while ( Incr text pointer, jump to label )
@text ( Create text label )
"Hello 20 "from 20 "Uxn! 0a 00
.
$ uxncli drifblim.rom hi.tal hi.rom
Assembled hi.rom in 31 bytes.
$ hd hi.rom
00000000 a0 01 12 94 06 20 00 03 02 22 00 80 18 17 21 40 |..... ..."....!@|
00000010 ff f1 48 65 6c 6c 6f 20 66 72 6f 6d 20 55 78 6e |..Hello from Uxn|
00000020 21 0a 00 |!..|
$ stat -c %s hi.rom
35
$ uxncli hi.rom
Hello from Uxn!
At 35 bytes, it’s the smallest hello-world executable I’ve seen. I like the sense of calm and coziness. Next time I want to learn how to draw on a canvas. I’ll also enjoy reading and running the treasure trove of Uxntal programs in uxn-utils, like a fine wine or novel.
Going through the bootstrap from seed gave me ideas and insight into designing the lowest levels of a language runtime and abstract machine. As for the Varvara Zine, I love its aesthetics and visual storytelling.
Epilogue: I found a disassembler, uxndis. I built it with drifblim (which I created earlier with xh.txt) and ran it on the original seed text.
$ uxncli uxndis.rom xh.txt
|0100 46 ( DUPr )
|0101 41 ( INCr )
|0102 46 ( DUPr )
|0103 41 ( INCr )
|0104 46 ( DUPr )
|0105 58 ( ADDr )
|0106 46 ( DUPr )
|0107 5a ( MULr )
|0108 46 ( DUPr )
|0109 77 ( DEO2r )
|010a 66 ( DUP2r )
|010b 61 ( INC2r )
|010c 61 ( INC2r )
|010d 6c ( JMP2r )
|010e 46 ( DUPr )
|010f 42 ( POPr )
|0110 46 ( DUPr )
|0111 41 ( INCr )
|0112 41 ( INCr )
|0113 46 ( DUPr )
|0114 46 ( DUPr )
|0115 5a ( MULr )
|0116 46 ( DUPr )
|0117 5a ( MULr )
|0118 58 ( ADDr )
|0119 56 ( DEIr )
|011a 47 ( OVRr )
|011b 66 ( DUP2r )
|011c 6f ( STH2r )
|011d 41 ( INCr )
|011e 41 ( INCr )
|011f 46 ( DUPr )
|0120 58 ( ADDr )
|0121 46 ( DUPr )
|0122 5a ( MULr )
|0123 46 ( DUPr )
|0124 46 ( DUPr )
|0125 58 ( ADDr )
|0126 58 ( ADDr )
|0127 66 ( DUP2r )
|0128 6f ( STH2r )
|0129 4b ( LTHr )
|012a 67 ( OVR2r )
|012b 61 ( INC2r )
|012c 61 ( INC2r )
|012d 6d ( JCN2r )
|012e 26 ( DUP2 )
|012f 2f ( STH2 )
|0130 24 ( SWP2 )
|0131 41 ( INCr )
|0132 41 ( INCr )
|0133 41 ( INCr )
|0134 46 ( DUPr )
|0135 58 ( ADDr )
|0136 4a ( GTHr )
|0137 67 ( OVR2r )
|0138 61 ( INC2r )
|0139 61 ( INC2r )
|013a 6d ( JCN2r )
|013b 26 ( DUP2 )
|013c 2f ( STH2 )
|013d 41 ( INCr )
|013e 46 ( DUPr )
|013f 41 ( INCr )
|0140 41 ( INCr )
|0141 46 ( DUPr )
|0142 58 ( ADDr )
|0143 46 ( DUPr )
|0144 46 ( DUPr )
|0145 5a ( MULr )
|0146 5a ( MULr )
|0147 44 ( SWPr )
|0148 59 ( SUBr )
|0149 4a ( GTHr )
|014a 46 ( DUPr )
|014b 24 ( SWP2 )
|014c 2f ( STH2 )
|014d 46 ( DUPr )
|014e 58 ( ADDr )
|014f 41 ( INCr )
|0150 4b ( LTHr )
|0151 46 ( DUPr )
|0152 7c ( AND2r )
|0153 42 ( POPr )
|0154 67 ( OVR2r )
|0155 61 ( INC2r )
|0156 61 ( INC2r )
|0157 6d ( JCN2r )
|0158 2f ( STH2 )
|0159 41 ( INCr )
|015a 41 ( INCr )
|015b 46 ( DUPr )
|015c 58 ( ADDr )
|015d 46 ( DUPr )
|015e 5a ( MULr )
|015f 46 ( DUPr )
|0160 58 ( ADDr )
|0161 67 ( OVR2r )
|0162 42 ( POPr )
|0163 46 ( DUPr )
|0164 41 ( INCr )
|0165 59 ( SUBr )
|0166 58 ( ADDr )
|0167 46 ( DUPr )
|0168 45 ( ROTr )
|0169 46 ( DUPr )
|016a 7c ( AND2r )
|016b 42 ( POPr )
|016c 47 ( OVRr )
|016d 47 ( OVRr )
|016e 6f ( STH2r )
|016f 47 ( OVRr )
|0170 41 ( INCr )
|0171 46 ( DUPr )
|0172 41 ( INCr )
|0173 46 ( DUPr )
|0174 58 ( ADDr )
|0175 46 ( DUPr )
|0176 5a ( MULr )
|0177 44 ( SWPr )
|0178 59 ( SUBr )
|0179 46 ( DUPr )
|017a 45 ( ROTr )
|017b 46 ( DUPr )
|017c 7c ( AND2r )
|017d 42 ( POPr )
|017e 2f ( STH2 )
|017f 44 ( SWPr )
|0180 41 ( INCr )
|0181 41 ( INCr )
|0182 46 ( DUPr )
|0183 58 ( ADDr )
|0184 5f ( SFTr )
|0185 67 ( OVR2r )
|0186 42 ( POPr )
|0187 41 ( INCr )
|0188 47 ( OVRr )
|0189 59 ( SUBr )
|018a 67 ( OVR2r )
|018b 44 ( SWPr )
|018c 41 ( INCr )
|018d 41 ( INCr )
|018e 46 ( DUPr )
|018f 58 ( ADDr )
|0190 46 ( DUPr )
|0191 58 ( ADDr )
|0192 41 ( INCr )
|0193 58 ( ADDr )
|0194 5a ( MULr )
|0195 45 ( ROTr )
|0196 45 ( ROTr )
|0197 5a ( MULr )
|0198 58 ( ADDr )
|0199 47 ( OVRr )
|019a 50 ( LDZr )
|019b 67 ( OVR2r )
|019c 42 ( POPr )
|019d 41 ( INCr )
|019e 41 ( INCr )
|019f 46 ( DUPr )
|01a0 58 ( ADDr )
|01a1 46 ( DUPr )
|01a2 46 ( DUPr )
|01a3 5a ( MULr )
|01a4 5a ( MULr )
|01a5 5f ( SFTr )
|01a6 5d ( ORAr )
|01a7 47 ( OVRr )
|01a8 51 ( STZr )
|01a9 46 ( DUPr )
|01aa 41 ( INCr )
|01ab 50 ( LDZr )
|01ac 47 ( OVRr )
|01ad 41 ( INCr )
|01ae 5e ( EORr )
|01af 47 ( OVRr )
|01b0 41 ( INCr )
|01b1 51 ( STZr )
|01b2 46 ( DUPr )
|01b3 41 ( INCr )
|01b4 50 ( LDZr )
|01b5 67 ( OVR2r )
|01b6 61 ( INC2r )
|01b7 61 ( INC2r )
|01b8 6d ( JCN2r )
|01b9 46 ( DUPr )
|01ba 50 ( LDZr )
|01bb 47 ( OVRr )
|01bc 41 ( INCr )
|01bd 41 ( INCr )
|01be 46 ( DUPr )
|01bf 58 ( ADDr )
|01c0 46 ( DUPr )
|01c1 58 ( ADDr )
|01c2 46 ( DUPr )
|01c3 46 ( DUPr )
|01c4 58 ( ADDr )
|01c5 58 ( ADDr )
|01c6 57 ( DEOr )
This is a program of hand-written machine instructions within a limited byte range that behaves the same as the following xh.tal. Maybe one day it’d be interesting to step through each instruction with a debugger (beetbug) and understand exactly how it works. I’m guessing that somewhere the number 0x10 is created out of thin air to address the console device.
|10 @Console &vector $2 &read $5 &type $1 &write $1 &error $1
|100
@on-reset ( -> )
;on-console .Console/vector DEO2
[ LIT2r 0101 ] BRK
@on-console ( -> )
.Console/read DEI
( | make chex )
[ LIT "0 ] SUB DUP #0a LTH ?{
#27 SUB DUP #10 LTH ?{ POP BRK } }
( | hb/lb )
INCr ANDkr STHr ?{ #40 SFT BRK }
ORA #18 DEO
BRK
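The guess about the console address can be checked by hand for the first ten bytes of the seed, FAFAFXFZFw (addresses 0100 to 0109 in the listing). Below is a toy Python model of just the five opcodes involved. It is not a Uxn emulator, and it assumes that duplicating on an empty return stack yields a zero; the function name `trace` is made up for this example.

```python
# Opcode values are copied from the disassembly listing.
DUPr, INCr, ADDr, MULr, DEO2r = 0x46, 0x41, 0x58, 0x5A, 0x77

def trace(program: bytes):
    """Run the modelled opcodes and return (port, value) at the first DEO2r."""
    rst = []  # return stack, top at the end of the list
    for op in program:
        if op == DUPr:
            # Assumption: an empty stack reads as 0.
            rst.append(rst[-1] if rst else 0)
        elif op == INCr:
            rst[-1] = (rst[-1] + 1) & 0xFF
        elif op == ADDr:
            b, a = rst.pop(), rst.pop()
            rst.append((a + b) & 0xFF)
        elif op == MULr:
            b, a = rst.pop(), rst.pop()
            rst.append((a * b) & 0xFF)
        elif op == DEO2r:
            port = rst.pop()                  # device port is on top
            low, high = rst.pop(), rst.pop()  # then the 16-bit value
            return port, (high << 8) | low
        else:
            raise ValueError(f"opcode {op:#04x} not modelled")
    return None

port, value = trace(b"FAFAFXFZFw")
print(f"port {port:#04x}, value {value:#06x}")  # port 0x10, value 0x0110
```

Under this model the stack goes 0, 1, then 1 2, then 1 4, then 1 16, so 0x10 is indeed built from nothing as (1+1+2) squared. The value written to port 0x10, which xh.tal labels Console/vector, is 0x0110, an address inside the seed itself.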

