History as a First-class Citizen

yihanwu1024 · 27 January 2025 12:30

I am surprised that the role of history in metaphysics was never mentioned on this forum, since many of you engage with some kind of “software philosophy”. I myself have been searching for the next philosophy that I can claim to be highly applicable to malleability. Today I am sharing some results with you. I want to be upfront that I have not prepared a complete explanation for many of these arguments, but I still think the content can be inspiring. I appreciate all critiques.

One important conclusion from my journey is that the process of understanding itself only makes the most sense in the context of a history. Admittedly, other factors such as documentation and design matter, but these are incomparable to history. History subsumes everything in a system and makes it meaningful to the user, producing a system that transcends the existing one. The new does not naturally fit within the old. If it is forced into the old, it will most certainly end up with bad properties. This is just one way to state a class of difficult ontological problems in our software and information world.

But computers are useful! We want to use them to manage information. How do we possibly manage both the old and the new in a symbolic way, without somehow fitting the new into a system we already know? Now I can see that it is possible with some imagination. The revelation? Symbolic representation does not entail structural submission.

Computer programmers are extremely familiar with finding information by structure. You have some new idea to write down? Open a file first, and write to it. This file is within a certain location in the filesystem, to which you navigate using steps defined according to the structure of the system. Did you see that current systems force users to think by structure? Because the inputs are immediately “type-checked” by the structure, there is no way to define anything outside.

Alternatively, the system could allow the user to:

assert that the new is of its own kind
pick out relevant information from history
transcendentally maintain the new, as an additional aspect

This “picking out” is different from a structuralist reference as it is not stored in a field of the constructor. These picks are hints to a new reality, and their other aspects are soon to be established, again by attaching to the original objects themselves, at the current time rather than as a field of the original objects. Everything is added to this system at the current time “frontier”, and the effective environment is the cascaded result of the whole timeline.

For history to be real, it must be compulsory. The system will transparently record all nondeterministic procedures, and save the parameters as strictly additional data that cannot be manipulated by userspace code. The system does not overwrite or delete such data on behalf of the user. When data cannot be immediately “type-checked”, keeping a history is necessary for correctness, and this correctness is only judged by the user.
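A minimal sketch of such a compulsory record, in Python. The names `History`, `Event`, `record` and `total` are assumptions made for illustration, not part of any existing system, and Python only protects `_events` by convention, whereas the system described above would enforce it below userspace.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """One recorded outcome of a nondeterministic procedure."""
    index: int   # position on the timeline
    name: str    # which procedure ran
    value: Any   # the outcome that was observed


class History:
    """Append-only log: events can be added and read, never changed or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def record(self, name: str, procedure: Callable[[], Any]) -> Any:
        """Run a nondeterministic procedure and append its outcome to the log."""
        value = procedure()
        self._events.append(Event(len(self._events), name, value))
        return value

    def events(self) -> Tuple[Event, ...]:
        """Read-only view: the returned tuple and its events are immutable."""
        return tuple(self._events)


def total(values: Tuple[int, ...]) -> int:
    """A pure function: the result depends only on the recorded values."""
    return sum(values)


history = History()
history.record("dice", lambda: random.randint(1, 6))
history.record("dice", lambda: random.randint(1, 6))

# Replaying from the log gives the same answer every time, with no new randomness.
recorded = tuple(e.value for e in history.events())
first = total(recorded)
second = total(recorded)
```

The only nondeterministic step is `record`; everything computed afterwards from `history.events()` can be a pure function of the log.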

Perhaps it is not a coincidence that the process above resembles the progressive process of human intelligence. In my opinion, this exactly signifies its potential as a malleable system. It might be time for us to put aside structuralism, a fixed understanding of intelligence, and even some well-established constructs in computer systems.

As an additional benefit, programmers in this system will deal with significantly less I/O and multiplexing, as they are often handled by compulsory history. Many programs will only need to be pure functions. In my opinion, the bloat of boilerplate and multiplexing control flags is best addressed with system-level history. These features were usually requested to increase flexibility of use in the first place, but now a full history enables ultimate flexibility.

Finally and as always, there is a social aspect. A system with first-class history will also give users more control over what they use in a few ways:

History subsumes everything it uses. When you think an app GUI might be manipulative, the ability to keep past and current content (even if only spanning a few minutes) side-by-side is a powerful countermeasure.
This should also make us consider our strategy with manipulation. If our interactions with external systems are not recorded, can any process of manipulation even be systematically clarified? Manipulation seems to always require a forgetful subject.
History demystifies the IT process. New types are no longer “new” things the user must pull out from thin air. As soon as relevant facts are identified within history, a type emerges without much difficulty.

yihanwu1024 · 27 January 2025 13:18

cf. Reversibility
cf. “All application data is automatically persisted” Feedback on a malleable operating system
Some of these motivations are relevant.

akkartik · 28 January 2025 05:33

This sounds interesting but I’m afraid I don’t quite follow what you mean. Could you give a concrete example of what first-class history would look like? Does it look like version control? Or a directory hierarchy by year/month/date/etc.? Is it necessary/sufficient to just save every single thing?

I’m particularly attracted to the goal of countering manipulative apps.

khinsen · 28 January 2025 07:41

They do. But I don’t see how they couldn’t. There are enormous differences in degree, and I am also very much interested in reducing that degree. But information always attaches to prior structure. Textual information builds on plain text encoding. And plain text encoding builds on binary data storage. Which in turn derives from number representations that have been around for many centuries.

yihanwu1024 · 31 January 2025 10:28

Thank you for your reply. What do you mean by “information always attaches to prior structure”? I think you mean “things are always stored in a data structure defined by previous programs”. It actually contains two points: things are defined with previously existing constructors, and each of them has a location.

A certain edition of this understanding does not overlook the algorithmic and progressive aspect of systems. In a system where history is first-class, everything that is done generates something new, namely the new event which is distinct from any past event. This gives everything a unique location in time. Obviously, such an identifier has to be long enough to support a large space.

This is how it will be implemented, not how it will be understood in the significant way. The significance of this system lies in progression.

yihanwu1024 · 31 January 2025 10:37

It is sufficient to save every single nondeterministic procedure. I would like to also skip recording certain unimportant things, but all business logic will definitely be recorded.

It can work as a navigable timeline tree, as a tree is naturally induced by history-hopping. There is no explicit version control, because history is compulsory and it is not possible to not leave a history.
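A minimal sketch of how hopping back in history induces a tree, in Python. `Timeline`, `Node`, `act` and `hop` are hypothetical names chosen for illustration; nothing here is taken from an existing implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One point on the timeline; nodes are compared by identity."""
    label: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


class Timeline:
    """Every action appends a node under the current one; nothing is removed."""

    def __init__(self) -> None:
        self.root = Node("start")
        self.current = self.root

    def act(self, label: str) -> Node:
        """Record an action at the current frontier and move onto it."""
        node = Node(label, parent=self.current)
        self.current.children.append(node)
        self.current = node
        return node

    def hop(self, node: Node) -> None:
        """Go back to an earlier point; the nodes after it stay in the tree."""
        self.current = node

    def path(self) -> List[str]:
        """Labels from the root to the current node."""
        labels = []
        node: Optional[Node] = self.current
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return list(reversed(labels))


timeline = Timeline()
draft = timeline.act("write draft")
timeline.act("delete paragraph")
timeline.hop(draft)              # hop back in history
timeline.act("add example")      # acting from here creates a second branch
```

After the hop, `draft` has two children: the abandoned branch is still there, and no explicit "commit" or "branch" command was needed to create either one.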

khinsen · 1 February 2025 07:12

I certainly agree about progression!

What I meant with “prior structure” is not only “previous programs”. The hardware itself imposes some structure: bits, bytes, memory pages, etc. And then there are relevant cultural artifacts that predate computers: formal logic, formal languages, and ultimately all of language and the semantics and semiotics that come with it.

yihanwu1024 · 1 February 2025 14:30

I would like to say that structures are very, very common. The very concept of a “thing” already entails structure, since there must have been a way for you to distinguish it from its background.

Structures are not a restriction to us. Structures are there, but they can also be subsumed, i.e. made a figurehead. History is one way to do this.

Assume scoping is already taken care of. No matter what data is in the structure you have, you can ask: “have I already seen this?” The system assigns an identifier to every distinct thing it sees, disregarding any invariant structure during the interaction, and taking note of the behavior caused by the interaction. Any result from this process does not use the original structure.
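One possible reading of “have I already seen this?”, sketched in Python. Using a SHA-256 content hash as the identifier is an assumption made for this example, as are the names `Seen` and `observe`; the post does not say how identifiers are derived.

```python
import hashlib
from typing import Dict, Tuple


class Seen:
    """Assigns an identifier to every distinct thing and remembers when it first appeared."""

    def __init__(self) -> None:
        self._first_seen: Dict[str, int] = {}  # identifier -> step of first observation
        self._clock = 0                        # counts observations, not wall-clock time

    def observe(self, data: bytes) -> Tuple[str, bool]:
        """Return (identifier, already_seen) for the raw bytes, ignoring any structure in them."""
        ident = hashlib.sha256(data).hexdigest()
        already = ident in self._first_seen
        if not already:
            self._first_seen[ident] = self._clock
        self._clock += 1
        return ident, already


seen = Seen()
first_id, first_known = seen.observe(b"hello")
again_id, again_known = seen.observe(b"hello")
other_id, other_known = seen.observe(b"world")
```

The identifier depends only on the bytes that were observed, not on the file, field or schema they arrived in, and a 256-bit digest leaves a large identifier space.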

yihanwu1024 · 1 February 2025 14:30

Additionally, please have a look at my previous article, Permissive Navigation. Many of these ideas are related.

amirouche · 4 February 2025 17:48

Can you explain what the words mean in this context?

Indeed.

Very well written. Hence, it is unclear when I stop programming.

I believe real is not the best word, but I can’t recommend something else, nor explain it.

In the thread, you commented that history is not versioning. In my experience, it is. Versioning is a discrete representation of the past, and in the case of my project you can travel through time, relate to it, and build the present upon its consequences. It is not exactly history, as history is continuous.

Maybe you could look into my project and tell me how it relates to your vision and the philosophy around “History as a First-class Citizen”; the latter goes into my next project! First iteration at foundation (formerly copernic), reference at SRFI-168.

The gist of the algorithm is that it allows relating any partial fact to zero or more dense facts in O(log n), and then, depending on the dense dimension construction, one can relate one or more rare dimensions to learn, and relate dense facts to create new structures. Rare dimensions are numerous. Interesting dense dimensions include subject, predicate, object, spatial and temporal extent, sequence index position, timestamp, content-hash, uuid. Not all are useful all the time.

Tracking history, logs, and time spent offers another medium for navigation. It may even offer a replay feature (think of a vectorized record of the act of computing) for learning purposes, but also for healing purposes after brain damage. Replaying the act is navigating history; pausing, then editing, is fixing the present in shared structures.

I really like this topic. I added a couple of comments; I hope you do not see it as merely bragging about my project and hijacking the conversation. I planned to post about it in its own topic, but time flies…

yihanwu1024 · 4 February 2025 18:30

Welcome to the forum, amirouche! I don’t think you are advertising; I think you are here for sincere conversations.

Let me answer your questions now, and I will look at your project soon.

submission: “1. The act of submitting or yielding; surrender.” For current systems, if you want to manage information with it, you have to use its schema, for example. And schema is one way a structure is manifested. Hence “structural submission”.

Symbolic representation is just how you ever have things as symbols, like natural language always does.

I can change that word to “nontrivial”.

You can interpret my notion of history as versioning; I don’t think it is wrong. It just doesn’t need you to do anything explicit like git add, git commit.

Again, thank you for reaching out!

khinsen · 4 February 2025 18:32

Interesting. Your first variant of permissive navigation, also known as Miller Columns, is what I have been playing with recently, inspired by Federated Wiki and Glamorous Toolkit. Going back to standard browser navigation is rather frustrating.

yihanwu1024 · 4 February 2025 18:41

I would distinguish between Miller Columns and permissive navigation variant 1.

Miller Columns restricts the view to a path in a tree. Permissive navigation (any variant) just keeps everything open, more or less plainly. So, it is possible to have more than one leaf open, for example.

I have actually seen permissive navigation implemented on the Web, but I cannot find it now.

khinsen · 4 February 2025 19:15

Federated Wiki lets you keep everything open, or close panes as you like. So does my own experiment. The original Miller Columns are indeed limited to a path, as is Glamorous Toolkit.

natecull · 6 February 2025 02:38

I’m concerned by the phrase “mandatory”.

There is some information which must be deleted, otherwise the user is exposed to security and privacy threats, and also in some cases, very grave legal risk.

A computer which refuses to delete information which must be deleted is, to quote Blade Runner, a hazard, not a benefit.

yihanwu1024 · 6 February 2025 09:38

The repository and SRFI you posted did not have sufficient documentation on the data structure. Can you give me something readable? There are also some mailing list archives, but I don’t know what to read.

Physicists will disagree with you for the following reasons:

There is no continuum. (There are, however, processes that happen very fast.)
Nothing can be “exactly x”, because it is impossible to accurately measure the state of or make an exact copy of any particle.

However, you are right that history is what can be built upon.

yihanwu1024 · 6 February 2025 09:48

I agree that there are things that must be deleted. I intend to only apply “mandatory history” to explorative processes. After all, that is the theme of malleable systems. For example, when you get a new tool (which you would actually want to learn and audit), or develop your own tool (which you want to debug). Once a process has solidified and is about to be put into production, it is time to consider retention and compliance.

amirouche · 6 February 2025 17:21

Thanks. I will be back with the nstore, and vnstore in a swift time, I promise.

I guess I did not want to believe, or submit myself to the fact, and moment that I was precisely reading what I understood, and that it is also what you meant to write. The idea you wanted to share has now a new host.

Back at the original sentence:

Symbolic representation is a proxy of my internal representation. And I need to mine [hard work, struggle with] the environment, hence schemas, in order to manifest the intended structure. Representations that can be measured, and that another intelligence can interpret, are lossy compressions projected into a physical space upon which we try to build shared understanding.

The following is more relatable to me:

Internal representation does not entail structural submission.

Experiencing alienation, and freedom at the same time. I am free to think. To share an idea is getting something from outside to inside, I need to submit to inside structure.

Is that a new [independent] entanglement?

Understanding freedom is a passion of mine, so is understanding intelligence.

In other words:

history of what is measurable enables unprecedented flexibility.

Internally, I call the vnstore project the panopticon.

amirouche · 6 February 2025 17:28

The generic tuple store (nstore) was created to build the versioned generic tuple store (vnstore) as a tool to keep track of the history of knowledge. I wanted to build a knowledge base. I started with EAV because it looked versatile, then figured that sometimes I need to query by value, such as: what has “malleable” as Value? I figured that an RDF triple store can do that. And then I wanted to keep track of changes: there was no RDF database with a 5th column to store time. I asked for help, and someone told me about an algorithm documented by Erdős: nstore was born.

Within a generic tuple store (nstore) the following degrees of freedom are available:

a tuple always has n items;
tuples are sorted in their natural order;
tuples admit a lower and an upper bound;
add(t) adds the tuple t;
ask(t) returns true if tuple t is in the nstore, otherwise false;
del(t) deletes the tuple t; subsequent calls to ask(t) will return false until a new call to add(t);
query(start, end) returns tuples in order between start and end;
if start < end, query(start, end) returns tuples in ascending order;
if end < start, query(start, end) returns tuples in descending order;
query(pattern) returns, in ascending order, the tuples subsumed by pattern. subsume(pattern, tuple) is true if, for all i in [0, n) where pattern[i] is an item, tuple[i] = pattern[i]. Sometimes it is called unification. The complexity is O(log n), where n is the total number of tuples.

That is a description of the nstore, the generic tuple store.
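The interface described above can be sketched in a few lines of Python. This is a naive in-memory illustration, not the SRFI-168 implementation: `NStore`, `Var` and `subsumes` are names assumed for this example, `delete` stands in for `del` (a reserved word in Python), and `query` scans every tuple instead of reaching the logarithmic bound with indices.

```python
import bisect
from typing import List, Tuple


class Var:
    """Placeholder in a pattern: matches any item at that position."""

    def __init__(self, name: str) -> None:
        self.name = name


def subsumes(pattern: Tuple, candidate: Tuple) -> bool:
    """True if every non-variable position of the pattern equals the tuple's item."""
    return all(isinstance(p, Var) or p == item for p, item in zip(pattern, candidate))


class NStore:
    """Sorted set of tuples that all have exactly n items."""

    def __init__(self, n: int) -> None:
        self.n = n
        self._tuples: List[Tuple] = []  # kept in natural (sorted) order

    def _check(self, t: Tuple) -> None:
        if len(t) != self.n:
            raise ValueError(f"expected {self.n} items, got {len(t)}")

    def add(self, t: Tuple) -> None:
        self._check(t)
        i = bisect.bisect_left(self._tuples, t)
        if i == len(self._tuples) or self._tuples[i] != t:
            self._tuples.insert(i, t)

    def ask(self, t: Tuple) -> bool:
        self._check(t)
        i = bisect.bisect_left(self._tuples, t)
        return i < len(self._tuples) and self._tuples[i] == t

    def delete(self, t: Tuple) -> None:
        self._check(t)
        i = bisect.bisect_left(self._tuples, t)
        if i < len(self._tuples) and self._tuples[i] == t:
            del self._tuples[i]

    def query(self, pattern: Tuple) -> List[Tuple]:
        """Ascending tuples subsumed by the pattern (linear scan in this sketch)."""
        self._check(pattern)
        return [t for t in self._tuples if subsumes(pattern, t)]


store = NStore(3)
store.add(("freedom", "author", "amirouche"))
store.add(("freedom", "language", "nuance"))
store.add(("history", "author", "yihanwu1024"))
authors = store.query((Var("subject"), "author", Var("value")))
```

Here `authors` holds the two tuples whose second item is `"author"`, in ascending order. The example tuples use only strings so that the natural order is well defined.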

a triplestore is an nstore with n=3;

Usually, in RDF, when n = 3, item_1 is called subject, I like uid, maybe you prefer structure identifier;
Usually, in RDF, when n = 3, item_2 is called predicate, I like key, maybe you prefer field, or field identifier;
Usually, in RDF, when n = 3, item_3 is called object, I like value, maybe you like value.

a quadstore is an nstore with n=4;

Usually, in RDF, when n = 4, item_0 is called graph, it can also be timestamp.

It is possible to represent how a structure evolves over time:

0: program(0, freedom, language, nuance)
1: program(0, freedom, author, amirouche)

2: program(1, freedom, define, (lambda args (void)))


...
2600: program(2, freedom, define, (vau args env env))

Are tuples 2 and 3 co-existing in the same present, in different namespaces? Are they alternative paths in the same moment? Should the definition in 3 replace the definition in 2? It depends on the reader. It is possible to describe a more complex history logic with one or more items.

edit: fixing ordered list 9 → 10, 10 → 11
edit2: O(complexity) is the minimal O(log n)

amirouche · 6 February 2025 17:37

I forgot to relate to vnstore, and how it is related to first-class history, and how to avoid structural submission.

most queries are unknown, or it is known that all patterns will be useful: that’s why nstore computes all necessary and sufficient indices, the minimal subset of the permutations of the tuple that allows binding any pattern in one key-value range query (hop). Thanks to the Erdős algorithm, that subset is minimal, and results in logarithmic complexity for any nstore and any query(pattern);
nstore columns must be dense dimensions: every tuple has a significant cell for every item, i.e. no item is NULL;
nstore is its own hyper space, when n >= 3, and referencing identifiers;
it is possible to define fields using other fields;
it is possible to define an arbitrary number of fields;
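To make the first point concrete: an index that stores tuples in a permuted order can answer, with one range scan, any pattern whose bound positions form a prefix of that order. The Python sketch below finds a smallest covering set of orderings by brute force. It is not the algorithm referred to above, only an exhaustive search that is practical for small n, and `prefix_sets`, `covers` and `minimal_indices` are names assumed for this example.

```python
from itertools import combinations, permutations
from typing import FrozenSet, Sequence, Set, Tuple

Perm = Tuple[int, ...]


def prefix_sets(perm: Perm) -> Set[FrozenSet[int]]:
    """Sets of bound positions that one index ordering serves with a single range scan."""
    return {frozenset(perm[:k]) for k in range(len(perm) + 1)}


def covers(perms: Sequence[Perm], n: int) -> bool:
    """True if every possible set of bound positions is a prefix of some ordering."""
    served: Set[FrozenSet[int]] = set()
    for perm in perms:
        served |= prefix_sets(perm)
    wanted = {frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)}
    return wanted <= served


def minimal_indices(n: int) -> Tuple[Perm, ...]:
    """Smallest set of orderings covering every pattern, found by exhaustive search."""
    all_perms = list(permutations(range(n)))
    for size in range(1, len(all_perms) + 1):
        for candidate in combinations(all_perms, size):
            if covers(candidate, n):
                return candidate
    raise AssertionError("unreachable: all orderings together always cover")


indices = minimal_indices(3)  # three orderings are enough for a triplestore
```

A counting argument gives the lower bound: each ordering has exactly one prefix of each length, so the C(n, ⌊n/2⌋) patterns with ⌊n/2⌋ bound positions need at least that many orderings, which is 3 for n = 3.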

How is it related to the idea of ‘History as a First-class Citizen’, and the fact that “symbolic representation does not entail structural submission”? I suggest that:

the number of fields can evolve over time;
in a long-living system, there are not enough clues at the start about what the whole thing will be;

If I understand correctly, the requirement is to represent time to manifest history, and then to reference that history when navigating structures that evolve over time in both their fields and their values. So, it seems to me you need a version control system that can represent and track presence and absence at the granularity of structure -> field -> value. That’s what vnstore is when n=3.

Since you also need to reference a tuple, i.e. reifying “user looked at structure -> field -> value”, n=3 is not enough, and you will need n=4 to have another item to store a tuple uid (tid).

Example:

Now:

(freedom, author, amirouche, tid_1) ;; amirouche authored freedom

Later:

(event0, type, event, tid_2) ;; event0 is a navigation event
(event0, type, navigation, tid_3)
(event0, user, amirouche, tid_4) ;; navigation event, by amirouche
(event0, subject, tid_1, tid_5) ;; navigation subject, tid_1
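The same example as a small Python sketch, using a plain set instead of an nstore so that it stays self-contained. `assert_fact` and `about` are hypothetical helper names, and generating tids from a counter is an assumption of this example.

```python
from itertools import count
from typing import List, Set, Tuple

Quad = Tuple[str, str, str, str]  # (subject, key, value, tid)

store: Set[Quad] = set()
_next_tid = count(1)


def assert_fact(subject: str, key: str, value: str) -> str:
    """Store a tuple and return its tuple id, so later tuples can refer to it."""
    tid = f"tid_{next(_next_tid)}"
    store.add((subject, key, value, tid))
    return tid


def about(tid: str) -> List[Quad]:
    """All tuples whose value is the given tuple id, i.e. statements about that tuple."""
    return sorted(q for q in store if q[2] == tid)


# Now: amirouche authored freedom.
authored = assert_fact("freedom", "author", "amirouche")

# Later: a navigation event whose subject is the tuple above, not the thing "freedom".
assert_fact("event0", "type", "event")
assert_fact("event0", "type", "navigation")
assert_fact("event0", "user", "amirouche")
assert_fact("event0", "subject", authored)

references = about(authored)
```

Because the event points at `tid_1` rather than at `freedom`, it records that this particular statement was looked at, which is what reifying the tuple means here.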

The astute reader will recognize there is no timestamp. That is what vnstore does: it adds timestamps. It is implemented on top of nstore with two more columns.

The previous example is stored as:

(freedom, author, amirouche, tid_1, 1738880373 epoch, created) ;; amirouche authored freedom

;; event0 is a navigation event, created at 1738881373 epoch
(event0, type, event, tid_2, 1738881373 epoch, created) 
(event0, type, navigation, tid_3, 1738881373 epoch, created)
(event0, user, amirouche, tid_4, 1738881373 epoch, created) ;; navigation event, by amirouche
(event0, subject, tid_1, tid_5, 1738881373 epoch, created) ;; navigation subject, tid_1

Tuples can be marked as deleted:

(freedom, author, amirouche, tid_1, 1738882373 epoch, deleted) ;; amirouche deleted freedom

At this point, in vnstore, a query for freedom by amirouche returns nothing:

query(freedom, author, amirouche, variable(tid)) = {}

whereas the nstore still has it, and has the record of the deletion:

query(freedom, author, amirouche, variable(tid), variable(epoch), variable(act))
  = {
  (freedom, author, amirouche, tid_1, 1738880373 epoch, created),
  (freedom, author, amirouche, tid_1, 1738882373 epoch, deleted)
  }
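A minimal Python sketch of this layering, with a plain list standing in for the underlying nstore. The function names `create`, `delete` and `visible`, and the rule that the latest act at or before a given epoch decides visibility, are assumptions made for this example rather than a description of the actual vnstore code.

```python
from typing import Dict, List, Tuple

Fact = Tuple[str, str, str, str]           # (subject, key, value, tid)
Row = Tuple[str, str, str, str, int, str]  # a fact plus (epoch, act)

history: List[Row] = []  # append-only: rows are added, never changed or removed


def create(fact: Fact, epoch: int) -> None:
    history.append(fact + (epoch, "created"))


def delete(fact: Fact, epoch: int) -> None:
    # Deleting is itself a new row; the "created" row stays in the history.
    history.append(fact + (epoch, "deleted"))


def visible(at: int) -> List[Fact]:
    """Facts whose most recent act at or before epoch `at` is 'created'."""
    latest: Dict[Fact, Tuple[int, str]] = {}
    for row in history:
        fact, epoch, act = row[:4], row[4], row[5]
        if epoch <= at and (fact not in latest or epoch >= latest[fact][0]):
            latest[fact] = (epoch, act)
    return sorted(fact for fact, (_, act) in latest.items() if act == "created")


authored: Fact = ("freedom", "author", "amirouche", "tid_1")
create(authored, 1738880373)
delete(authored, 1738882373)

before_deletion = visible(at=1738881373)  # the fact is still visible here
after_deletion = visible(at=1738882373)   # the fact is gone from the versioned view
```

The versioned view changes over time while `history` only grows, so the deletion itself remains a queryable fact.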

I just described how to represent vnstore with nstore as an example nstore use-case. It can also be used to track, and query arbitrary tuples (where n is variable).

What makes nstore stand out is the first few points:

most queries are unknown, or it is known that all patterns will be useful: that’s why nstore computes all necessary and sufficient indices, the minimal subset of the permutations of the tuple that allows binding any pattern in one key-value range query. Thanks to the Erdős algorithm, that subset is minimal, and results in logarithmic complexity for any nstore and any query(pattern);
nstore columns must be dense dimensions: every tuple has a significant cell for every item, i.e. no item is NULL;
nstore is its own hyper space, when n >= 3, and referencing identifiers;
it is possible to define fields using other fields;
it is possible to define an arbitrary number of fields;

I am not happy with the code but here is a recent-ish implementation:

If you look around in the repository, there is a webui for that.

Exercise:

Exercise 1: Describe an nstore schema where it is possible to reify tuples in the vnstore’s nstore; in other words, make it possible to annotate a history tuple from within the nstore, that is, make history first-class.
Exercise 2: Optimize it.