Lithification

akkartik · 31 March 2025 16:55

This term pointed out by @Bosmon deserves a separate topic. Some links:

I’m extremely supportive of it, and feel no need to balance it against its opposite, abstraction/reuse. Or at least, I think much more open source activity should involve creating complete products that accomplish tasks rather than libraries for others to accomplish tasks with.

One wrinkle: The first article confuses matters by conflating products vs libraries with proprietary vs open. They’re independent dimensions and need to be reasoned about explicitly. The article seems to want to move human activity from open libraries to proprietary products, while I want to move activity from open libraries to open products.

khinsen · 1 April 2025 05:34

A similar though not-quite-identical concept is Donald Knuth’s “re-editable” as opposed to “reusable” software (see my summary). It is something I have been advocating for in scientific computing for a while, with little success: the mainstream of scientific computing is moving towards the “best practices” of traditional software engineering, in the hope of offloading all software-related questions to another profession rather than having scientists understand software themselves.

khinsen · 1 April 2025 05:37

Maybe “design patterns” are also related to this. I remember lengthy debates in which one side said “design patterns are just patterns that haven’t yet been abstracted properly for inclusion into a library” and the other side maintained that “design patterns are something else than reusable code”.

Apostolis · 1 April 2025 13:36

The existence of a variation is only known after the same software has been used in multiple different contexts.

Thus, the process should be multiple forks first, and then an abstraction if necessary.

But there are problems that need to be solved if we want multiple forks. How can one idea from one fork propagate to the rest of the forks?

I think that we need tools for this that we do not have. Do you have any ideas on this?

khinsen · 1 April 2025 15:39

The only tool I am aware of that supports cherry-picking changes from multiple forks is Federated Wiki. The granularity level is paragraphs, meaning that you can copy a paragraph from another fork into your version if you think it’s an improvement of the page. Non-textual units (images, code, …) count as paragraphs as well for this operation.

My impression is that this corner of collaboration space is underexplored. Most work focuses on automatically merging changes, CRDT style. That supposes that everybody contributes to a single consensual version. Maintaining diversity seems to be an unusual idea.

Apostolis · 15 April 2025 17:15

An idea came to me from the other thread where we discussed automated merging of a repository.

We could define a merge function where we do not have a unique version but an equivalence class of a specific piece of code or data.

So different projects that have diverged could still merge changes between each other.

An initial thought that needs expanding.

Is this the same as abstracting? Or not?

akkartik · 15 April 2025 18:04

It is a huge and fundamental problem with software that describing the equivalence class of programs that would meet one’s needs (I often shorten it to “the intent of a program”) is orders of magnitude harder than just identifying one program that happens to meet one’s needs.

So yes, an equivalence class would be great. It also seems extremely unrealistic. If you work really hard you can get a rough boundary of intention, or a good boundary some of the time.

Apostolis · 15 April 2025 19:57

Let me give an example. By the way, the idea is experimental.

Let us say that we have a function that sorts a list of numbers. Instead of putting it in a library for all to use, everyone just copies it into their own project.

Then we have a multitude of unrelated projects. Some of them change that function to make it faster.

We could define an automated merge method that only merges that specific function.

We could merge two unrelated projects together. We could have both variation and sharing of new ideas.

In fact, we could define a multitude of such automated merge methods for different parts of our code.
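A minimal sketch of what such a function-level merge might look like, assuming Python sources and top-level, undecorated functions. The helper names `extract_function` and `merge_function` are made up for illustration; this is one possible reading of the idea, not an existing tool.

```python
import ast


def extract_function(source: str, name: str) -> str:
    """Return the source text of the top-level function `name`."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return ast.get_source_segment(source, node)
    raise KeyError(name)


def merge_function(target: str, donor: str, name: str) -> str:
    """Replace `name` in `target` with the donor's version.

    Everything else in `target` is left untouched, so the two projects
    can be otherwise unrelated.
    """
    new_text = extract_function(donor, name)
    tree = ast.parse(target)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = target.splitlines()
            # lineno is 1-based and end_lineno is inclusive.
            lines[node.lineno - 1 : node.end_lineno] = new_text.splitlines()
            return "\n".join(lines) + "\n"
    raise KeyError(name)


# Two unrelated projects that each carry their own copy of the function.
project_a = (
    "import sys\n"
    "\n"
    "def sort_numbers(xs):\n"
    "    return sorted(xs)\n"
    "\n"
    "def main():\n"
    "    print(sort_numbers([3, 1, 2]))\n"
)
project_b = (
    "def helper():\n"
    "    pass\n"
    "\n"
    "def sort_numbers(xs):\n"
    "    ys = list(xs)\n"
    "    ys.sort()\n"
    "    return ys\n"
)

merged = merge_function(project_a, project_b, "sort_numbers")
```

Only `sort_numbers` travels from one project to the other; `helper` stays behind and `main` is unchanged. Deciding whether the donor's version is actually an improvement is left to a human or to a test suite.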

khinsen · 16 April 2025 07:46

That would indeed be nice to have!

I suspect that in a Turing-complete universe, this kind of equivalence is untestable in finite time. But maybe there is a less powerful category of machines in which it is? I’d be happy to give up Turing-completeness in exchange for such advantages (assuming the overall trade-off is acceptable of course).

Apostolis · 16 April 2025 08:12

There are multiple equivalence classes that are decidable.

Examples:

1. The fifth line of your source code in this file should be equal.
2. This function f has type F. Type checking is decidable.

The idea is to construct a view of your code and check equality there. Even though I haven’t used it a lot, this is similar to the idea of lenses.

Merging would simply mean writing the values from that view back into the original source code.

Edit:

The idea is to ignore the parts that are not important and to update only the value that changes the equivalence class.

In example 1, we only update the fifth line.
In example 2, we only update the internals of the function that has type F. Here, the rest of the code is irrelevant.
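The “fifth line” case can be sketched as a lens: a pair of functions that reads a view out of a source text and writes a view back into it. The names `Lens`, `line_lens`, `equivalent` and `merge` below are hypothetical, chosen for this illustration only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lens:
    get: Callable[[str], str]       # extract the view from a source text
    put: Callable[[str, str], str]  # write a new view back into a source text


def line_lens(n: int) -> Lens:
    """Lens that focuses on the n-th line (1-based) of a text."""

    def get(src: str) -> str:
        return src.splitlines()[n - 1]

    def put(src: str, value: str) -> str:
        lines = src.splitlines()
        lines[n - 1] = value
        return "\n".join(lines)

    return Lens(get, put)


def equivalent(lens: Lens, a: str, b: str) -> bool:
    """Two sources are equivalent if their views are equal."""
    return lens.get(a) == lens.get(b)


def merge(lens: Lens, target: str, donor: str) -> str:
    """Copy the donor's view into the target; ignore everything else."""
    return lens.put(target, lens.get(donor))


mine = "a\nb\nc\nd\nlimit = 10\nf"
theirs = "x\ny\nz\nw\nlimit = 20\nq"

fifth = line_lens(5)
merged = merge(fifth, mine, theirs)
```

After the merge, `merged` and `theirs` are equivalent under the lens even though every other line still differs. A lens for example 2 would have the same shape, with `get` and `put` operating on the body of the function of type F instead of on a line.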

khinsen · 19 April 2025 09:16

At the source code level, I see many possibilities. Beyond git-style textual comparison, one could move to the AST level, replace variable names by de Bruijn indices, etc.
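As a rough sketch of AST-level comparison, the following renames a function’s parameters and locally assigned variables to positional placeholders before comparing. This is a simplification in the spirit of de Bruijn indices, not a faithful implementation of them, and it assumes each source string holds a single top-level Python function. `canonical_form` is a name made up for this example.

```python
import ast


class _Canonicalize(ast.NodeTransformer):
    """Rename local names to _v0, _v1, ... in order of first appearance."""

    def __init__(self, local_names):
        self.local_names = local_names
        self.mapping = {}

    def _rename(self, name):
        # Globals and builtins (e.g. `sorted`) keep their names.
        if name not in self.local_names:
            return name
        return self.mapping.setdefault(name, f"_v{len(self.mapping)}")

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._rename(node.id)
        return node


def canonical_form(source: str) -> str:
    """Dump the function's AST with local names and its own name normalized."""
    func = ast.parse(source).body[0]
    local_names = {n.arg for n in ast.walk(func) if isinstance(n, ast.arg)}
    local_names |= {
        n.id
        for n in ast.walk(func)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    func.name = "_f"
    _Canonicalize(local_names).visit(func)
    return ast.dump(func)


a = "def total(xs):\n    acc = 0\n    for x in xs:\n        acc += x\n    return acc\n"
b = "def sum_all(values):\n    result = 0\n    for v in values:\n        result += v\n    return result\n"
c = "def product(xs):\n    acc = 1\n    for x in xs:\n        acc *= x\n    return acc\n"
```

Here `a` and `b` differ only in naming and get the same canonical form, while `c` computes something else and does not. A textual diff would report all three as different.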

At the type level, I wonder how practically useful such equivalences would turn out to be. I have seen Haskellers claim that type signatures are often enough to identify a function, and that may well be true for the kind of problems they work on. In my corner of computing, most data are of types “real” or “array of real”. Just think of trigonometry, logarithms, all the stuff called “special functions”. That’s dozens of functions of type Real → Real, but that doesn’t make them equivalent in any useful sense.

Apostolis · 19 April 2025 18:33

Documentation could play a similar role, like on hyperdoc. Most of the time, it will require a review before merging. The idea is that the views will create a specific “view” for the specific topic. It will also create a specific diff, only on the parts that are important on this topic.

I do not know all the details; this idea is new. We need to brainstorm it. I believe that what we want is for every person to have their own version of a part of the code, and to be able to cross-merge from others without having a central repository.

Would that be enough? Is this correct? Is it possible? I don’t really know.

Apostolis · 24 April 2025 08:44

Some more thoughts:

A. Abstraction vs reusability

Assume there is a membrane that separates a component from its environment and that puts constraints on both the component and its environment. We could have big variation in

the environment, with a single implementation of the component, in which case we talk about reusability.
the component, which only works in a specific environment, in which case we talk about abstraction.

I believe that we are interested in abstraction and not so much reusability.

In biology, the structure of a protein and its functionality can remain unchanged under multiple mutations. This has allowed mutations to persist across generations and thus enabled variation.

Similarly, interfaces enable variation by being opaque about the implementation details. Examples are specifications such as HTML and TCP: one specification allows multiple different implementations, which vary in characteristics that the specification does not cover.

B. On merging and views

Thus, a new merge algorithm would take this membrane into account. The implementation of that membrane depends on the language, so there is no single solution. For languages that have interfaces, for example, the interface could be a good place to construct a view/lens that would allow the merging of new changes.

C. Changes that span different levels of abstraction

This is a topic that interests me. Here, a merge would change the interface as well. I think that we could track changes to the interface and, when merging, simply inform the user.
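One way to picture this tracking, assuming Python and treating a function’s parameter list as its interface: compare the signatures on both sides of the merge and report any that differ. The helper names `signatures` and `interface_changes` are hypothetical.

```python
import ast


def signatures(source: str) -> dict[str, str]:
    """Map each top-level function name to its parameter list as text."""
    tree = ast.parse(source)
    return {
        node.name: ast.unparse(node.args)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def interface_changes(mine: str, theirs: str) -> list[str]:
    """Describe functions present on both sides whose parameters differ."""
    old, new = signatures(mine), signatures(theirs)
    return [
        f"{name}({old[name]}) -> {name}({new[name]})"
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    ]


mine = "def sort_numbers(xs):\n    return sorted(xs)\n"
theirs = (
    "def sort_numbers(xs, reverse=False):\n"
    "    return sorted(xs, reverse=reverse)\n"
)

for change in interface_changes(mine, theirs):
    print("interface changed:", change)
```

A merge tool could run a check like this first and only proceed silently when the list is empty, leaving the decision to the user otherwise.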

In any case, an abstraction seems necessary to construct the view that would enable the merge between different projects. Abstractions are also necessary to increase variation.