distributed as in Git - replication of "live" objects

Apostolis · 13 April 2025 14:31

For git, “distributed” is synonymous with replicating code and working in parallel while offline.

What would be the equivalent of that in a live environment where some of the actors/objects/services were located somewhere else?

Consider, for example, Mastodon, a service where other users are interacting as well. What would be the equivalent of git for that live service?

We would first need to be able to download the source code and compile it with a single button click. Even then, that wouldn’t be a live service. We would need to take the environmental closure of that service: a default database with some values, plus code that emulates other users’ interactions.

With a single button click, we would have replicated a live service, and we would be able to work on our code.

Not only that: with a single click, we would also have made our own Mastodon service.
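
The “environmental closure” described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table layout, the user names, and the functions `seed_database`, `simulate_users`, and `replica` are all hypothetical and have nothing to do with Mastodon’s real schema or code. The point is just that a replica needs seeded state plus something standing in for the other users.

```python
import random
import sqlite3


def seed_database() -> sqlite3.Connection:
    """Create an in-memory database with a few default values (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)")
    db.executemany("INSERT INTO users (name) VALUES (?)",
                   [("alice",), ("bob",), ("carol",)])
    return db


def simulate_users(db: sqlite3.Connection, steps: int, seed: int = 0) -> None:
    """Emulate other users by letting randomly chosen accounts write posts."""
    rng = random.Random(seed)  # fixed seed so every replica behaves the same way
    user_ids = [row[0] for row in db.execute("SELECT id FROM users")]
    for step in range(steps):
        author = rng.choice(user_ids)
        db.execute("INSERT INTO posts (user_id, body) VALUES (?, ?)",
                   (author, f"simulated post {step}"))


def replica(steps: int = 5) -> sqlite3.Connection:
    """The 'single button': seeded state plus emulated activity."""
    db = seed_database()
    simulate_users(db, steps)
    return db
```

Calling `replica()` gives a self-contained environment to develop against. Whether the emulated interactions are realistic enough is a separate question that this sketch does not address.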

khinsen · 15 April 2025 05:48

I see the difficulty not so much in the equivalent of Git for live systems, but in the equivalent for diff and merge.

Git is easy. It does snapshots. Ignoring efficiency for a start, download a Smalltalk image and commit it to a git repository. Let people fork it. And then have everyone commit a snapshot of the machine image once per minute. Mission accomplished.

But unless you can compute diffs between two snapshots, and apply them in a principled way to another image, the branches of snapshots are not very useful.

Apostolis · 15 April 2025 08:13

I need to think about it. The problem has many different dimensions.

In the meantime, Irmin is designed to be a distributed data store with git semantics for both data and code.

https://irmin.org/

“Irmin is an OCaml library for building mergeable, branchable distributed data stores.”

Apostolis · 15 April 2025 08:15

If I remember correctly, you can automate conflict resolution in Irmin. Does anyone else have experience with it?

"Dynamic Behaviour

Allows users to define custom merge functions and create event-driven workflows using a notification mechanism."

khinsen · 15 April 2025 09:11

Looking at just the Web page, I don’t see any claim about managing code. It’s all about data. Sure, code is data, but that’s a very superficial way of looking at code.

Also, the only reference I see to the diff/merge issue is “Allows users to define custom merge functions”. A good start, but I’d love to learn more about it. What do typical custom merge functions look like? Are there generic ones (for, say, trees)? Or are they mostly application-specific?
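
As an illustration of the question, here is what a generic three-way merge over trees with a pluggable, application-specific leaf merge could look like. This is a Python sketch, not Irmin’s API. Trees are assumed to be nested dicts, and `merge3`, `Conflict`, and `leaf_merge` are hypothetical names.

```python
class Conflict(Exception):
    """Raised when both sides changed a value and no leaf merge can resolve it."""


_MISSING = object()  # marks "key not present" in a tree


def merge3(base, ours, theirs, leaf_merge=None):
    """Three-way merge of nested dicts relative to a common ancestor `base`."""
    if ours == theirs:   # both sides agree
        return ours
    if ours == base:     # only theirs changed
        return theirs
    if theirs == base:   # only ours changed
        return ours
    if isinstance(ours, dict) and isinstance(theirs, dict):
        base_tree = base if isinstance(base, dict) else {}
        merged = {}
        for key in ours.keys() | theirs.keys():
            value = merge3(base_tree.get(key, _MISSING),
                           ours.get(key, _MISSING),
                           theirs.get(key, _MISSING),
                           leaf_merge)
            if value is not _MISSING:  # _MISSING means the key was deleted
                merged[key] = value
        return merged
    # Both sides changed a leaf differently: delegate to the custom function.
    # The base is passed as None when the leaf did not exist in the ancestor.
    if leaf_merge is not None and ours is not _MISSING and theirs is not _MISSING:
        return leaf_merge(None if base is _MISSING else base, ours, theirs)
    raise Conflict(f"cannot merge {ours!r} and {theirs!r}")
```

The tree traversal is generic, while the leaf function is application-specific. For a “likes” counter one might pass `lambda base, ours, theirs: ours + theirs - (base or 0)`, which adds up both sides’ increments. Without a leaf function, concurrent edits to the same leaf raise `Conflict`.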

Apostolis · 15 April 2025 09:29

It seems that you define a custom type.

Code is just text, right?
Also, it is compatible with git, so you could use the standard git program for code, including push and pull.

I only wish the types were not OCaml-specific, and that it used something like wit instead.

khinsen · 15 April 2025 15:10

Yes, code is text, but for a library managing graph data I’d expect at least to be able to work at the AST level.

Apostolis · 15 April 2025 15:55

I don’t understand. Git does not do that. The AST is specific to a language.

To put it another way, you could do it yourself. You could define the type of the AST of your preferred language and then perform version control on the AST. Of course, you would need the parser of your preferred language.

So, this is a good idea. Maybe it would be better to have both the text and the AST, à la Glamorous Toolkit.

At least in Agda, there are multiple syntax trees, but you lose information going from one to the other.

This is something I am exploring as well: different views of the code, à la GT.
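
The “do it yourself” approach can be sketched with Python as the example language, using the parser in its standard `ast` module. The helper names `same_structure`, `functions`, and `changed_functions` are hypothetical. The sketch also shows the information loss mentioned above: comments and formatting do not survive in the AST, which is one argument for keeping the text next to it.

```python
import ast


def same_structure(a: str, b: str) -> bool:
    """True if two source texts parse to the same AST (layout and comments ignored)."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


def functions(source: str) -> dict:
    """Map each top-level function name to a dump of its AST."""
    return {node.name: ast.dump(node)
            for node in ast.parse(source).body
            if isinstance(node, ast.FunctionDef)}


def changed_functions(old: str, new: str) -> list:
    """Names of top-level functions that were added, removed, or structurally changed."""
    before, after = functions(old), functions(new)
    return sorted(name for name in before.keys() | after.keys()
                  if before.get(name) != after.get(name))
```

A line-based diff would report a change between `return x+1` and `return x + 1  # bump`, while `same_structure` treats them as identical. `changed_functions` reports changes per function instead of per line, which is closer to a view with its own diff.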

khinsen · 16 April 2025 07:43

Sure, git does that. I don’t need a replacement for git (not for that reason at least, I’d be happy to switch for better UX). I am looking at Irmin from the point of view of “better than git”. If it handles code only as well as git, then I am not interested.

I see a lot of room for improvement in handling code in version control, compared to today’s state of the art (git but also mercurial or fossil).

Different views that each have their own diff/merge - that would be interesting news!

computably · 28 April 2025 02:29

Seconding khinsen’s point - spinning up a blank-slate copy of a live service is a solved problem via Git and Docker (and the associated ecosystem). Any piece of software is trivially “decentralized” if it doesn’t bother communicating with the outside world, and any piece of software is trivially “stateless” if it intentionally omits persistence.

The functionality that is needed beyond the status quo is fully decentralizing technologies that are traditionally centralized due to legitimately difficult infrastructural/societal problems. Mastodon and similar examples are ad hoc, as are functional failures like cryptocurrency/blockchain. Achieving “git for live services” means solving a whole slew of fundamentally much harder problems that git totally ignores. Let me be clear: I’m not minimizing the value of git or the difficulty of version control. I’m saying “git for live services” is at least 10x harder, probably 100x or 1000x harder.

Apostolis · 28 April 2025 13:08

There are many different problems which are related.

A. I was mostly talking about copying a process/service the way prototype-based programming languages copy objects, including all the changes that have been made by the service provider. For example, support for math formulas on Mastodon.

I would like to have a button that copies the service as is.

Whether we replicate the state is another thing. If the state is not private then yes.

B. Merging the changes back into the original object/actor is another problem entirely, though it is another interpretation of what git is.

C. Decentralizing a service is more about splitting it into small pieces.