A programming environment with first-class categories

I’m pretty dumb about category theory. From a distance, it feels like a category is almost exactly the same thing as a relation, except somehow not an actual relation because of infinities or some such. But apart from not being a relation, it looks and feels almost exactly like one.

I feel like if we had a programming language that natively did relations, we could express category theory in it fairly simply. But not necessarily the other way around, because category theory is Large and Confusing. For a programming language, I’d prefer to start small, without all the Large and Confusing stuff, and save that stuff to add on later - in a library, rather than a language extension, if possible.

By relations, I mean something like Prolog’s predicates and/or Kanren’s relations: things a little like functions except that

a) they can have multiple return values instead of just one (it’s probably necessary to use a form of lazy evaluation so as to not return all of those values at once, because there will very often be a countably infinite number of them)

b) like a typed function and unlike an untyped function, not every argument/value pair actually exists in the relation (a side effect of there possibly being more than one value is that there could also be zero values). Yet no actual concept of “typing” is required to express this; just some kind of computable rule that says which arguments/values are or aren’t in the relation - and this same kind of rule is exactly how we also do ordinary computing.

c) as a tricky edge case, you can also, potentially, have a naked value that exists in the relation yet doesn’t have any argument attached - it is not part of an argument/value pair. Naively you could think of it as “this value has an argument of a special otherwise-inexpressible guard value”, but even that will, I think, get you in trouble, because algebras don’t tend to like inexpressible special-case values. This edge case is hard to explain clearly, except that it’s what you get if you consider the set of “pairs plus atoms”: ie, if you have a bunch of Lisp terms - some of which are pairs and some of which are just naked values - and shove them all together into a set. That set will contain both pairs and naked values. And if you remove the naked values and try to squeeze everything into a rigid “two column table” datatype, you break the algebraic properties of the Lisp term as a type. It MIGHT be possible to work around this, somehow, with a very clever choice of “guard value” - but it also might not. I feel like this edge case hasn’t been widely discussed, but solving it is quite important to getting really good support for relations as a native computing data structure that doesn’t subtly explode when we put real-world data into it.

d) finally, unlike lambda calculus lambdas, you can iterate relations (getting them to provide you a list of possible arguments) if you don’t know an argument, as well as just evaluating them (getting a value or set/sequence of values for an argument). And that means we maybe can’t compute with them exactly by rewriting one term into another, like we do for functions. Or, maybe we can and we just haven’t found a clear way to express this yet.
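The properties above can be sketched concretely. Here is a minimal illustration in Python (the name `double_rel` and the `UNKNOWN` sentinel are my own, not any existing library’s API): a relation over an infinite domain, computed lazily, that can be evaluated, run backwards, tested for membership, or iterated.

```python
from itertools import count, islice

# A hypothetical encoding: a relation is a function from a query to a lazy
# stream of matching (argument, value) pairs. Passing UNKNOWN for a slot
# means "iterate over all possibilities"; passing a concrete value means
# "evaluate at that argument".
UNKNOWN = object()

def double_rel(arg=UNKNOWN, val=UNKNOWN):
    """The relation {(n, 2n) : n a natural number}, as a lazy generator."""
    if arg is not UNKNOWN and val is not UNKNOWN:
        if val == 2 * arg:          # membership test: zero or one answer
            yield (arg, val)
    elif arg is not UNKNOWN:
        yield (arg, 2 * arg)        # evaluate: function-like use
    elif val is not UNKNOWN:
        if val % 2 == 0:
            yield (val // 2, val)   # run "backwards": invert the function
    else:
        for n in count():           # iterate: countably infinite, so lazy
            yield (n, 2 * n)

print(next(double_rel(arg=21)))       # (21, 42)
print(list(double_rel(val=7)))        # [] -- zero answers is allowed
print(list(islice(double_rel(), 3)))  # first three pairs of an infinite stream
```

Note that laziness (point a) is what makes the no-argument query usable at all, and the empty result for `val=7` is exactly the zero-values case from point b.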

If we had a language built on such relations, I believe we could express ALL of: functions (transformations over data); datasets (table-like things) AND types (restrictions on or precomputed declarations about data), all as instances of one simple computed-collection entity.
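For what it’s worth, the finite case already makes this claim concrete. A sketch (the helpers `apply` and `has` are hypothetical names, not a proposed API): the same set-of-pairs value acts as a function, a dataset, and a type.

```python
# One entity, three uses. A finite relation is just a set of pairs;
# the same object serves as a function, a dataset, and a "type".
squares = {(n, n * n) for n in range(5)}

# As a function: evaluate at an argument (there may be 0, 1, or many values).
def apply(rel, arg):
    return [v for (a, v) in rel if a == arg]

# As a dataset: iterate it like a table.
rows = sorted(squares)

# As a type: a restriction on data, i.e. a membership test.
def has(rel, pair):
    return pair in rel

print(apply(squares, 3))      # [9]
print(rows[:2])               # [(0, 0), (1, 1)]
print(has(squares, (2, 4)))   # True
print(has(squares, (2, 5)))   # False
```

The open problem the post goes on to describe is doing this for infinite, recursively defined relations rather than finite literal sets.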

But finding a decent, clear, idiomatic semantics for how you do the actual computation - the equivalent of a “function call” on a relation - such that you can both iterate or evaluate in the same call format, and do that soundly and efficiently in the presence of recursion and countable infinities - seems like a missing piece.

Prolog (or rather the “resolution via unification and backtracking” algorithm which Prolog implements) can do this! But to do it, it introduces the unusual concepts of “logic variable” and “unification environment” both of which are quite alien to normal functional programming, don’t really behave nicely with “rewrite” semantics, and I don’t think they’ve ever been parallelised entirely successfully.
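To make those two alien concepts concrete, here is a microKanren-style sketch (my own minimal version, not any particular library’s API) of logic variables and a unification environment represented as a substitution dictionary:

```python
# A logic variable is just a placeholder object; the "unification
# environment" is a dict mapping variables to the terms they stand for.
class Var:
    def __init__(self, name): self.name = name
    def __repr__(self): return f"?{self.name}"

def walk(term, subst):
    """Follow variable bindings in the substitution."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Return an extended substitution, or None if the terms clash."""
    a, b = walk(a, subst), walk(b, subst)
    if isinstance(a, Var):
        return subst if a is b else {**subst, a: b}
    if isinstance(b, Var):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return subst if a == b else None

x, y = Var("x"), Var("y")
s = unify((x, 2), (1, y), {})
print(walk(x, s), walk(y, s))   # 1 2
print(unify(1, 2, {}))          # None
```

The awkwardness the post describes is visible even here: the substitution `s` is extra threaded state that has no counterpart in plain term-rewriting semantics.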

Datalog gets a lot of traction in the data world, where it’s a slightly better SQL, but it achieves a lot of its good qualities by throwing away a lot of the good parts of Prolog - particularly by not being Turing-complete. So Datalog itself cannot be the answer we’re looking for in a general-purpose computing (rather than just a limited data-query) language.

Kanren, superficially, embeds the Turing-complete Prolog algorithm into a pure-functional form, and improves on Prolog’s search method (it’s not quite a depth-first search, so it can handle a bit more left recursion, possibly even enough for integer arithmetic). But it cheats a bit to do this: it still lugs around a “variable environment”, and although it calls its things relations, when computed they are actually not set-like but sequence-like “streams”, which can contain the same value multiple times. To my way of thinking, that’s still not the 100% native “relational programming” algorithm that we really want and that could simplify a lot of things if we found it.
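The stream-based search can be sketched in a few lines. Interleaving two answer streams is (roughly) why Kanren tolerates more left recursion than Prolog’s depth-first search, and also why its answers form sequences rather than sets; this toy version is my own illustration, not Kanren’s actual implementation:

```python
from itertools import islice

def interleave(s1, s2):
    """Fair merge of two answer streams: neither can starve the other,
    even if the first one is infinite (unlike depth-first search)."""
    while True:
        try:
            yield next(s1)
        except StopIteration:
            yield from s2
            return
        s1, s2 = s2, s1   # alternate which stream we pull from

def evens():
    n = 0
    while True:
        yield n; n += 2

def odds():
    n = 1
    while True:
        yield n; n += 2

print(list(islice(interleave(evens(), odds()), 6)))   # [0, 1, 2, 3, 4, 5]
# The result is a sequence, not a set: if both streams produced the same
# answer, it would appear twice -- exactly the "not set-like" caveat above.
```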

Category theory is fundamentally, incredibly abstract. Though it may have some applications, they are rare even in theoretical physics and CS, and there are probably zero non-academic applications. There are, perhaps, one or two dozen people in the world who are experts in category theory, computer science, and the practical implementation of software systems. The closest it has come to the “mainstream” of programming is Haskell, which is itself niche and leans academic.

I should also note that, although category theory is a legitimate field of study, most mathematicians think of something like ZFC set theory as the current foundational theory. However, even most pure mathematicians rarely describe things in terms of ZFC.

With all that in mind, I’m highly skeptical of any proposal like “all programming should be built around category theory.” It’s rather vague, without any motivation other than conceptual elegance, similar to “everything is an object” or “everything is pure.”

1 Like

Have you seen the demo of Dynamicland in a biolab?

There you have the physical lab and its experiments on one hand, and the simulation of those experiments on the other. From what I saw, the demo mixes simulation and reality.

The interesting thing is the relation between the model and physical reality. After all, we as humans are physical, and we use computation to change the physical world somehow.

Sorry, I don’t understand how your example relates. Concretely, what does it mean for a software system to be category theoretic? Does Haskell count, and if not, what more is required? What advantages would your vision of category-theory-based software provide over OOP or FP? etc.

1 Like

This is a huge topic and I am very busy at the moment.

I would look at Rosen’s book “Life itself” which I am currently reading.

Also, Conal Elliott has something to say about the IO monad of Haskell.

http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm

He has also proposed a denotational design technique:

Also, many mathematicians and CS professors need this: tools that help write software categorically.

Each person, in their own field, wants it, because category theory is about relating categories with each other.

And this is very useful for many reasons.

Glamorous Toolkit has the tooling as well. Dynamicland as well.

I am interested in modeling reality - both computers and the environment - in a uniform way.

The way I interpret this, and I may be way off from the original author’s intention, so take this as an independent contribution inspired by the discussion so far:

There are “maximally abstract” interfaces for types that you can design, and category theory is a mathematical way to systematically find them.

Take for instance a monoid (not monad, that’s a special kind of monoid, but let’s not go there right now). You can take two integers and add them and get another integer that represents the sum of the other two. You can take two strings (or arrays) and concatenate them, which will give you one new string (or array) that represents the two original strings (or arrays) merged together. There’s an abstract pattern at play here, and that’s what mathematicians decided to name “monoid”. You could go and implement that as a very abstract type class, as for instance Haskell does. And then you get the benefits of easily adding this to anything that you want to have this behavior with the main code already implemented, as Haskell also demonstrates.
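A minimal sketch of the pattern, in Python rather than Haskell: a monoid is just an associative `combine` operation plus an identity element, and one generic fold then works for every instance.

```python
from functools import reduce

# The abstract pattern in one function: fold any monoid's elements together.
# combine must be associative, and identity must be its neutral element.
def mconcat(combine, identity, items):
    return reduce(combine, items, identity)

print(mconcat(lambda a, b: a + b, 0, [1, 2, 3]))        # 6 (integers under +)
print(mconcat(lambda a, b: a + b, "", ["foo", "bar"]))  # foobar (strings)
print(mconcat(lambda a, b: a + b, [], [[1], [2, 3]]))   # [1, 2, 3] (lists)
```

The payoff is exactly what the paragraph above describes: implement the generic code once, and every type that satisfies the monoid laws gets it for free.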

Same thing with those map, reduce, filter, fold, etc. methods you probably know mainly from arrays, but they work with pretty much everything that can be iterated (that’s another one of those patterns, btw) in some way. Again, mathematicians can tell you exactly what rules you have to play by to belong to such a category, and again, you can implement the code for this behavior once and get it everywhere you need it.
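In Python, for instance, these operations already work on arbitrary iterables, not just arrays:

```python
# map and filter accept any iterable: sets, ranges, tuples, generators...
print(list(map(str.upper, {"a", "b"} & {"a"})))      # ['A'] -- a set
print(list(filter(lambda n: n % 2 == 0, range(6))))  # [0, 2, 4] -- a range
print(sum(map(len, ("ab", "cde"))))                  # 5 -- a tuple
```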

One way to build libraries is to come from the maximally abstract and carefully work your way through all these very abstract mathematical objects that all result in useful behavior that can apply to a large number of types and can ideally be combined orthogonally. At some point of course these things start to interact and then it gets messy real quick.

The downside of this approach is that these abstract patterns are not exactly intuitive to deal with if you’re not mathematically trained. However, even if you hate math you have probably at this point picked up on that there’s something magical going on with a map() function and that it often makes more sense to use that over for loops or something like that. Mathematicians will just smile at you and offer a “Told you so” in combination with some other words that don’t seem to make any sense, and you still are none the wiser, except that you have some intuition that there is some pattern at play here that is incredibly versatile.

So in a way I think there is a lot to be gained from designing things from the maximally abstract down to the reasonably concrete, but it’s unintuitive and works against our default cognitive habit of categorizing things at a not-so-abstract level (which I believe I wrote about somewhere else in this forum). And it’s also somewhat against best practices in software engineering, which recommend going from the most specific to the more generic, and only when there are actual reasons to do so.

In such cases with two pretty much opposing extremes I tend to adopt a usual stance that neither extreme is better than the other and that it’s in the oscillations between them that you can learn and improve your craft.

2 Likes

This is what everyone thinks of category theory: that it is about abstractness. It can be used in such a way, but this is not what I am saying.

I would recommend Conal Elliott’s videos on denotational design as a starting point, which deal with elementary mathematics on affine lines.

And then the biologist Robert Rosen. I am reading his work myself, to have a better understanding now. Maybe Louie’s book “More than Life Itself” is better.

I really do not want to associate what I am saying with anything that has to do with Haskell or abstractness.

Because the culture around category theory is that it is used only by mathematicians, whereas I want every programmer / human being to be able to understand and use it. It is the basis of science: relating a model with a physical system.

And this is the goal when I say to have first-class categories: to make it very easy to use them.

Secondly, Haskell has actually abandoned any effort to model the outside world. That is why it has the IO monad. Haskell is a language for computer programs, not a specification language that describes both physical reality and computer programs.

A counterexample:
Leslie Lamport’s Temporal Logic of Actions (TLA) actually models both the environment and the program.

I believe I have been referred to this book before, and while it has had quite some influence on biology, it has never struck me as interesting in its own right for a non-biologist.

In particular, from the summary given by the publisher:

Central to Rosen’s work is the idea of a “complex system,” defined as any system that cannot be fully understood by reducing it to its parts.

As a generic, buzzword-y concept, it’s intuitive. As a concrete property, it’s hopelessly vague. What counts as a “system”? How do you measure if it is “understood”? Understood by whom and according to whom?

I believe physicists would call that “physics”! The problem with any model is that models, by construction, live inside human minds and ignore details of reality. From there, different people begin to use different models, and then…

Again, I think I’m missing quite a bit of context here.

You’ve omitted what is even meant by “writing software categorically,” which is one question I had. Denotational design is perhaps an example, but it’s not clear if that’s the only thing you’re referring to, or if it’s applicable outside Haskell / pure functional programming.

What evidence have you seen that specific people need/want to model their problems using category theory? Given that most mathematicians are probably not very familiar with category theory, let alone computer scientists or researchers in other fields, there must be some additional inference about how category theory would be useful for specific applications.

Are there any writings on how Glamorous Toolkit / Dynamicland are categorical? I’ve found nothing from quick searches. I don’t see the relationship.

I’m still confused about this general claim.

Take for example calculus. Calculus has countless applications and it was invented (in Newton’s case) to study basic human-scale physics. Yet billions of human beings live their lives without ever understanding or using calculus; for many millions, in spite of the fact that they took multiple calculus classes! Although calculus’s contributions are felt - through mechanical engineering, statistics and algorithms, etc. - it is totally unnecessary for them to understand calculus itself.

Compare category theory to calculus. See also my previous comment.

I don’t think I could even justify investing effort into getting every mathematician to learn category theory. What would make category theory multiple orders of magnitude more worthwhile than calculus?

Could you clarify? Almost all scientists go about their work without using category theory.

All of those questions can be answered by reading the book.

Even though I can easily describe to you what category theory is, I am trying to understand the epistemological implications of its use. In other words, we are entering philosophy territory here. The epistemological aspects I haven’t fully understood yet.

If you replace the word “system” with “object” or with “function”, you would get the same questions. In my opinion, it is a piece of a whole that can be understood more easily as a whole than through its parts. Similarly, we write our code as functions for the same reason.

That is exactly why we need category theory: because we have multiple models for the same physical system and, conversely, multiple physical systems that can be described by the same model. (For example, we can create a computer made from crabs.)

We also have relations between models. This is denotational design.

It means being explicit about the relations of models and physical systems, or the relations between models, and switching from one to the other.

By being explicit, you can define a model for reality and use it as part of your model that contains software. For example, you can model an environment with packet loss by a function that receives a Msg and returns a Maybe Msg.
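A minimal sketch of that idea in Python terms (the `loss_rate` and the seeded generator are illustrative assumptions, not from the post): the environment is a function from a message to an optional message.

```python
import random
from typing import Optional

# The lossy environment as a function Msg -> Maybe Msg
# (Optional[str] plays the role of Maybe Msg here).
def lossy_channel(msg: str, loss_rate: float = 0.3,
                  rng: random.Random = random.Random(0)) -> Optional[str]:
    """Return the message, or None if the environment dropped the packet."""
    return None if rng.random() < loss_rate else msg

sent = [lossy_channel(f"msg{i}") for i in range(10)]
delivered = [m for m in sent if m is not None]
print(f"{len(delivered)}/10 delivered")
```

The point of the encoding is that the environment’s behaviour is now an ordinary value in the model, composable with the rest of the software being specified.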

Because they have told me. Here is one problem:

The problem we have with science is that most scientists do not really look into the philosophical/epistemological aspects of what they do, and they should.

What I described above technically doesn’t need any category theory at all; a good understanding of group theory and some abstract algebra is more than enough, and both are concepts you better know something about before tackling category theory anyway. Maybe there are more concrete concepts you can point towards?

I see from your references to Robert Rosen and A. H. Louie that relational biology seems to represent at least an exemplar of what you mean. I don’t know anything about that and its connections to category theory, but it does resonate with something else I’m currently learning about. I’m blindly poking at this, maybe with a little bit of luck there is a connection and you are able to confirm?

Searching the web for those names, I picked up some ideas around the interplay between structure and function, and since this is about living systems I suspect this is ultimately about autopoiesis. There is a pretty interesting self-referential loop between conditions and events that cannot be captured in the usual reductionist theories and requires a relational approach, where properties are tied to neither object nor subject (one of them usually being the environment), but to the relationship between them.

I just wrote a paper where I had to explain “relational”, so here’s a snippet that I had handy — affordances (in the cognitive sense) are such a relational thing:

An important fact about the affordances of the environment is that they are in a sense objective, real, and physical, unlike values and meanings, which are often supposed to be subjective, phenomenal, and mental. But, actually, an affordance is neither an objective property nor a subjective property; or it is both if you like. An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer.
— James Jerome Gibson (1986). The Ecological Approach to Visual Perception. Psychology Press.

Alicia Juarrero wrote two books about complex systems dynamics and has a pretty interesting model of how constraints can practically encode emergent meta-stable properties in complex dynamic systems. I have a hunch that this is related to what you are talking about. I have no idea how that connects to category theory (and even if it does, I don’t think it needs to), but it may very well be that Rosen and Louie came to certain conclusions through category theory whereas Juarrero came to similar conclusions through systems theory.

[… ~2 months passed…]

And as I just rediscovered everything I wrote above sitting as a draft in here for probably months, I have just now come across a single paper that integrates all of those things, shows at least one powerful connection between those concepts, and confirms that my hunch was correct: we probably are talking about very similar things. That paper is: Naturalizing relevance realization: why agency and cognition are fundamentally not computational. Pages 8 and 9 bring together Juarrero and Rosen/Louie.

2 Likes

Yes, we are talking about the same thing.

And we need to understand Rosen and Juarrero, because we need to build a formal language that can describe such phenomena and encode models of them.

If Illich talks about the role of institutions and tools in constraining freedom and creativity - or the opposite, liberating users - complex systems, both natural and anthropogenic, do the same.

Thus the only way to liberate ourselves from such complex systems is to model them, in the same way that we model software, in a unified framework. Then we check the properties of the systems, either with simulation, with formal methods, or by executing our programs in the complex environment.
If there are properties that we do not like, we adapt. In other words, we change our software and the way we interact with the environment.

In other words, I am interested in introducing a democratic feedback loop that allows society to adapt. I consider software to be the mediator of our interactions.

We have a great number of things to learn from them. I think Rosen might have been more abstract than Juarrero, thus the use of category theory.

2 Likes

Incidentally, before even knowing about Rosen, the type system I am trying to construct for actors has efficient causation as the main construct; it is its main feature. In other words, in every step, the computation determines the constraints of the next step. The only way for this to be defined ad infinitum is if there is a loop, i.e. if there is corecursion.

Thus all infinite systems have closure to efficient causation, or closure to constraints.
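As a loose illustration of that idea - my own reading, not Rosen’s formalism or the actual type system being described - here is a corecursive process in which each step’s output includes the constraint that governs the next step:

```python
from itertools import islice

# A corecursive process: each step yields the current state together with
# the constraint (a predicate) that the step after it must satisfy.
# This is an illustrative toy, not a formal model of efficient causation.
def constrained_process(state, constraint):
    while True:
        yield state, constraint
        state = state + 1
        # The computation at this step determines the next step's constraint.
        constraint = (lambda bound: lambda x: x > bound)(state)

steps = list(islice(constrained_process(0, lambda x: x >= 0), 3))
print([s for s, _ in steps])   # [0, 1, 2]
```

The loop is what makes the definition total over an infinite run: there is no base case, only the corecursive step, which matches the “closure” intuition above.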

And because we are working on the type level, this models both pure functions and non-deterministic ones. Thus it can model organisms as well.

With regard to the article on relevance realization, it is for this reason that I cannot understand why closure to efficient causation can only describe organisms and not machines as well.

For me the argument is insufficient since we can have programs closed to efficient causation.

I agree that we are not machines, but I do not agree to the argument.

So it appears that Johannes Jaeger is aware that we could have programs that define their own constraints, or rewrite themselves (Mossio 2009).

As I noticed as well, it is the initial conditions that give the system its unpredictability - the free will of agents.

@Apostolis Thanks for sharing that video.

I’m not sure I follow your thinking here. I understand it the other way around.

In my view, nobody says anything about closure to efficient causation being limited to organisms. I understand it as a requirement for understanding and describing living organisms.

So, sure, we’ll be able to model and simulate systems under closure to efficient causation, i.e. potentially programs that “program themselves” and change their rules, but only up to the point that they still only exist “in a box”, whether that’s your running (and not changing) program, or the higher-level running environment that simulates programs possibly rewriting themselves. As it says in the paper:

it may well be possible to approximate aspects of biological organization through algorithmic simulation, but it will never capture the full range of dynamic behaviors or the evolutionary potential of a living system completely.

I find that more than enough to still have a lot of fun with this, and I don’t feel limited by this at all. Certainly, there is lots more we can accomplish, even if creating artificial life through computation alone may not be part of that.

The required open-endedness for living systems surely is non-deterministic, but I don’t think non-determinism is sufficient for open-endedness. So I find that conclusion a bit premature.

But I don’t want to dwell on the question of if we’ll ever be able to create living things with some form of computation, because I don’t find it that interesting. What I find much more interesting is the different paths this paper opens up to re-think computation and simulation with the dynamic systems approach described.

2 Likes

Just an initial comment:

“A natural system N is a mechanism if and only if all of its models are simulable”

The proof goes like this: life has a model which is not simulable; thus it cannot be a mechanism.


1 Like

As I said, I need to study their work, so I will respond to your other points as soon as I can.

What I really enjoy is that last year I was thinking that a type specification without user input affecting it is static, meaning that it imposes itself on people and no one can change it.

Because of that, I decided that users should be able to change the dynamics of the system, and the type system actually enables that now.

And now I see the concept of agency and autonomy of living organisms based on the same mathematical foundation as mine: efficient causation.

Just to be clear, there are two possible forms of change.

A. An inherent ability of the system to adapt, without mutation.

B. Mutations and evolution.

Here I am talking about A.

1 Like

Mossio has a definition of self-constraint that closely follows Rosen’s closure to efficient causation.
He makes it explicitly clear here.