Rethinking the object-oriented paradigm - what can we throw away?

Many patterns of object-oriented programming are considered good practice and taken for granted, but looking at them more deeply, I wonder if they exist because mainstream object-oriented languages are quite bad and inflexible.

Take inheritance, for example. Inheritance breaks encapsulation, because a child object might need to know internal details of the parent object to make sure no invariants are broken. Additionally, changes to the parent object might break children that expect the parent to behave a certain way. Does object orientation mean inheritance?

See traits, mixins, CompositionInsteadOfInheritance and Lieberman prototypes for alternative approaches to inheritance. But the question is, do we really need inheritance to build a solid and expressive OOP platform?

Encapsulation and information hiding are often mentioned as a good thing™, because they hide internal state so that clients can only operate on an object through its public interface. The downsides of information hiding are, in my view, two quite major ones. The first is that an object has to define and implement how to serialize and deserialize itself from a bit stream, since its internal state is hidden, making communication and networking of objects verbose and harder than necessary.

The other problem with encapsulation is quite philosophical and dives into the reason people dislike OOP so much: data is dumb, static, immutable, infallible and eternal. Behaviour is an interpretation of data, a model through which to interpret reality, and no model of reality is ever perfect. Coupling data and behaviour means coupling data with an imperfect and fallible model to transform this data.

See this rebuttal by the author of Rebol.

Functional programming is very much in vogue these days because it leaves data alone, and the functions that operate upon it can easily be replaced, so one might provide multiple models (functions) that operate on the same data. But to me this feels like throwing the baby out with the bathwater, as providing a clean interface to data is a big plus for managing complexity. Our mind loves to represent reality as well-defined objects and things.

I wonder if loosely-coupled structs of data and functions with an implicit self argument are a more flexible middle ground between mainstream OOP and FP.


It feels to me that message-passing is the only core idea of object-orientation, and all the notions we have attached to it are unnecessary, if not harmful, to the exploration of OOP languages and environments that reflect how people think and explore the world. This was Alan Kay’s motivation for his work on Smalltalk, but after decades of Java and C++ it seems no one wants to work with objects any more; to me they are crucial to building introspectable computing, more advanced forms of networking (e.g. the object-capability model) and more interactive programming environments.

3 Likes

A lot of this resonates.

I want to poke a bit at the claim that “data is infallible and eternal”. Data too is a model through which to interpret reality, always imperfect. So the piece of the argument that attributes the fundamental disconnect between data and behavior to differences in fallibility isn’t very persuasive.

2 Likes

Personally I mostly stopped thinking about OO about 10 years ago. It certainly bears rethinking, but it’s not clear that starting from OO provides a particularly useful vantage to find something better. My preferred approach is to unbundle various concepts that tend to get smushed together under OO, and to think about them independently:

  • namespaces
  • lexical scope
  • dispatch // rebinding names to closures // aspects // advice
  • queues
  • messages
  • registries

I wonder if loosely-coupled structs of data and functions with an implicit self argument are a more flexible middle ground between mainstream OOP and FP.

I’m curious to hear more about what you have in mind there.

6 Likes

Take this piece of data: {name: "sph", location: "Earth", isOnline: true}

Data cannot have “bugs”. Data is a static representation of facts. It is truthful by nature. The values themselves might be incorrect (once again, because all measurements are interpretations of reality through a fallible model), but this collection of values—data—if taken as truth, can be changed and transformed however we want, changing its shape to better suit our understanding or our program. FP is fantastic at data transformation, so turning the above piece of data into another form of data is a trivial problem in computing.
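
For instance, reshaping that record in Python is trivial (a sketch; the target shape here is made up for illustration):

person = {"name": "sph", "location": "Earth", "isOnline": True}

# Reshape the same facts into a different form: the values are
# untouched, only the shape through which we view them changes.
presence = {
    "id": person["name"],
    "status": "online" if person["isOnline"] else "offline",
    "where": person["location"].lower(),
}

print(presence)  # {'id': 'sph', 'status': 'online', 'where': 'earth'}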

Now imagine that this data is hidden and coupled into a behaviour, i.e. you have an instance of a Person object, which exposes three getter methods, name, location, and isOnline.

Behaviour (i.e. computing) is buggy and imperfect. Hard to imagine with such a simple example, but in a larger program you have no guarantee that a method does exactly what it says on the tin. And, most importantly, we have no real way of editing and transforming behaviour with the same ease with which we transform data. Lisp is built on the code = data equivalence, but no one uses Lisp to transform a buggy function into a working one. We just rewrite the entire thing from scratch. Even in Lisp, code is not very malleable.

So the problem with mainstream OO, to me, is that we put something which is perfect, values, together with something which more likely than not is not perfect. And create hierarchies of things out of this flimsy house of cards.

Does this make any sense? I feel there is a deep philosophical concept at play here which I’m unable to name.

1 Like

An example:

struct Pen
  position: struct {x: Int, y: Int}
  angle: Float
  color: enum Red, Blue, Black
  penState: enum Up, Down

Pen.move(x, y)
  self.position.x = x
  self.position.y = y

Pen.down()
  self.penState = Down

Pen.up()
  self.penState = Up

Pen.reset()
  self.position = {x = 0, y = 0}

Here we define a Pen as a structure, and create a set of methods (functions that have an implicit self argument) that are able to operate on it. A user is always able to create new “verbs”, i.e. to expand the model through which we interface with the data. But the user is also free from information hiding, so they might create a better implementation of Pen, which will be 100% compatible with other Pens, which are nothing more than structs. Additionally, a Pen is easily serializable and can be transmitted over the wire, because it’s nothing more than a bag of bytes.

Admittedly this is a stupid example and nothing ground-breaking, but most innovation in programming lies in surfacing certain concepts and philosophies through syntax. The syntax I’m exploring here is OO without inheritance and encapsulation.
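
For what it’s worth, the idea is expressible in today’s languages too. Here is a rough Python sketch (hypothetical code; move and pen_down are ordinary functions taking the struct as an explicit first argument, and strings stand in for the enums):

from dataclasses import dataclass, field, asdict

@dataclass
class Position:
    x: int = 0
    y: int = 0

@dataclass
class Pen:
    position: Position = field(default_factory=Position)
    angle: float = 0.0
    color: str = "black"    # stand-in for the enum
    pen_state: str = "up"   # stand-in for the enum

# "Methods" are plain functions with an explicit self argument;
# anyone can add new verbs without touching the Pen definition.
def move(self: Pen, x: int, y: int) -> None:
    self.position.x = x
    self.position.y = y

def pen_down(self: Pen) -> None:
    self.pen_state = "down"

# No information hiding: serialization is trivial because the
# state is a transparent bag of fields.
pen = Pen()
move(pen, 10, 20)
pen_down(pen)
print(asdict(pen))
# {'position': {'x': 10, 'y': 20}, 'angle': 0.0, 'color': 'black', 'pen_state': 'down'}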

1 Like

Data cannot have “bugs”. Data is a static representation of facts. It is truthful by nature. The values themselves might be incorrect (once again, because all measurements are interpretations of reality through a fallible model)

I don’t understand why you say data is “truthful” even if it is “incorrect”. Seems confusing.

I’m not an expert, but a few years ago I skimmed a book online that I’m having trouble locating now. The premise is that the world is always more complex than can be represented by any data model, and the most important attribute for the modeler is humility. Does that sound familiar at all to anyone?

Have you heard of Authoritarian High Modernism from “Seeing like a State”? The examples predate computers. I think James C Scott would say data is dead until it’s used – and then it’s extremely imperfect, susceptible to dramatic misinterpretation and procrustean distortion. There’s nothing special about OO or structs here. Any use you put data to is bound to be imperfect.

4 Likes

I haven’t heard of that reference, but to me “data is dead” sounds exactly like what I mean by static and eternal.

I don’t know if you had the chance to read the Wikipedia article about “model-dependent realism” (Model-dependent realism - Wikipedia) which touches upon this notion.

Again, this is a larger philosophical problem than simply computing. Reality is perfect, and unknowable in its entirety (see Heisenberg’s uncertainty principle). We are only able to know it through fallible, incomplete models. These models offer measurements, which are literally just numbers. Whether they’re true or not is not a property of the number—the data—itself. A number is a number and will always be a number forever. A number can be transformed. The result is another number, which is eternal and perfect and is neither true nor false.

Computer data is nothing more than a collection of numbers. It is neither true nor false in itself. Is “0x881ad5” a buggy number? It only depends on how you “interpret” this number. Interpretation—behaviour—is where bugs lie.
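
To make that concrete, a small Python sketch reading the same three bytes through three different models (the interpretations are mine, for illustration):

raw = bytes.fromhex("881ad5")

# The same three bytes, read through three different models;
# none of these readings is more "correct" than the others.
as_int = int.from_bytes(raw, "big")   # 8919765
as_rgb = tuple(raw)                   # (136, 26, 213), a shade of purple
as_text = raw.decode("latin-1")       # three Latin-1 characters, mostly gibberish

print(as_int, as_rgb, as_text)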

You might enjoy this discussion between the two great minds of Alan Kay and Rich Hickey: What if "data" is a really bad idea? | Hacker News

EDIT: I find it funny that in my defense of objects, I am closer to Rich’s argument, which is arguing against them. Perhaps that is the reason why I feel we need a middle ground between OOP and FP.

1 Like

There have been a few famous software people in the past who have characterized OOP (in its popularized form, not the Alan Kay version) as sort of a fad, fashion, or market trend that has packaged up well-known concepts into a paradigm and built a successful movement around it — for a while. For instance, here is Niklaus Wirth in 1995:

Reinventing the wheel?
Remarkably enough, the abstract data type has reappeared 25 years after its invention under the heading object oriented. This modern term’s essence, regarded by many as a panacea, concerns the construction of class (type) hierarchies. Although the older concept hasn’t caught on without the newer description “object oriented,” programmers recognize the intrinsic strength of the abstract data type and convert to it.
— Niklaus Wirth • A Plea for Lean Software

When judging OOP in its entirety, or features that are considered to be part of it, I wouldn’t want to miss Moseley and Marks’ thorough analysis from 2006:

The classical ways to approach the difficulty of state include object-oriented programming which tightly couples state together with related behaviour, and functional programming which — in its pure form — eschews state and side-effects all together. These approaches each suffer from various (and differing) problems when applied to traditional large-scale systems.
We argue that it is possible to take useful ideas from both and that — when combined with some ideas from the relational database world — this approach offers significant potential for simplifying the construction of large-scale software systems.
— Ben Moseley, Peter Marks • Out of the Tar Pit

I’ve just been reviewing a lot of papers like that lately, so I thought I’d point to some good resources I’ve come across that have good input on the topic.

In another forum I posted a take on why OOP may be technically inferior, but cognitively superior:

OOP has some peculiar ways to “hack” into our cognitive system. The way we categorize things when we talk (and think) about them feels very much like modeling a domain with OOP. And we categorize things all the time.

When you casually bring up your neighbor’s dog in a conversation, you say “dog”. It would be weird to talk about your neighbor’s mammal. And although possible, if you are super into dogs yourself, you would likely not talk about your neighbor’s cocker spaniel. You pick dog — a natural category that’s not too generic and not too specific, it just feels right. That all happens subconsciously. You don’t really think about it.

When modeling a domain we do something very similar. Without much experience, it’s easy to fall into the trap of believing that good natural categories make good classes in OOP. But experienced programmers tend to pick up on the fact that modeling domains for programming seems to work better with superordinate categories that are more abstract.

Applied to the analogy, programming usually works better if you refer to your neighbor’s mammal, or even better to your neighbor’s animal, or even better to your neighbor’s living being, as often as that’s sufficient. That’s how you can best reap the benefits of polymorphism later. You pick the most abstract type class or interface available that models the behavior you want. That, however, is usually much more abstract than what intuitively feels right, and is therefore in conflict with your intuitive cognitive categorization. It also requires more abstract thinking when discussing the design with fellow programmers.

I think OOP became so popular because it hooks into our cognitive categorization facility so effectively. And it became misunderstood and used “wrongly” because most programmers designed classes to feel like good natural categories that work well in conversation, but then cause problems later because they are too specific to yield the benefits of polymorphism. You can’t really blame them for that. It’s very natural to do exactly that. Literally “natural”.

Now, dog is a category or class or type. There are many dogs. But when you tell a story about your neighbor’s dog, you use that category as a shorthand to refer to a very specific thing that is unique. Unless your neighbor has several dogs, there is only one instance (and if your neighbor has several dogs, you would’ve qualified it more to refer to one of them).
It’s easy to mix up what we’re doing here: We use an abstract category (class) to refer to a specific thing (instance). Human language is super ambiguous, but we usually get the level of abstraction right so we can understand each other. And we don’t really think about that distinction between a category and a unique thing either. We (usually) tend to just know what we mean.

In programming we need to be more precise. The colloquial understanding of OOP is that behavior is modeled in classes (because we want to reuse it), while state is modeled in objects (because we need to distinguish different instances). The confusion with what Alan Kay really meant comes from that. Because in Smalltalk both behavior and state are modeled in objects, and it is kind of understandable why he called it “object-oriented” and why he dislikes classes. And message passing was so important because that was the universal interface for any kind of object to interact with any other kind of object. Message passing was the most abstract behavior that applied to all state — maximally efficient polymorphism, if you will.

Class-focused OOP happens pretty much naturally if we just apply our unreflective, naive understanding of how we categorize things and model our domains like that.

I haven’t worked out whether all that makes OOP a good or bad approach to modeling domains. It’s obviously beneficial to hook into stuff that comes easily to us, but it seems as if in this case we get led down the wrong path too easily, and what we need for good design is somewhat opposed to what feels right intuitively. But then, if you’re aware of that, you can make it work quite well.

— from: Original post in Future of Coding Slack

We usually don’t consider the cognitive side of things much and rather talk about more “objective” properties. But the cognitive side of things is kind of my jam, so I try to point people to it, and to point out that there is quite objective scientific research about it too, so it’s not a purely subjective matter of taste.

The good news is, most of the cognitive aspects of categorization work equally well with abstract data types and type classes instead of OOP classes and objects. And we certainly categorize things into various overlapping categories all the time, so we aren’t cognitively limited to hierarchies the way OOP inheritance is. We aspectualize, and are pretty good at seeing the same thing as very different kinds of thing depending on context and situation, which sounds a lot more like type classes or protocols/interfaces.

So I wouldn’t search for a middle ground between OOP and FP (Moseley and Marks kind of did that already). I’d rather look for a middle ground between programming paradigms and cognition. And I see the main tension here between the need for specificity for cognition and for generality in programming.

7 Likes

Excellent quotes, very much appreciated.

Yeah, I think this is crucial but we are still far away from having made significant progress on this.

My posts are coloured by my belief that runtime, exploratory design of programs would make it easier for humans, not just “scribes” like us, to understand and manage complexity. OOP in regular, static languages like those used today would look very different from OOP in an interactive environment, where people can play with different ideas in real time before committing to a certain design.

In fact, these past few days I was looking to expand my research into a conversational model of programming, because I feel it maps to human cognition even better. Instead of telling my computer what to do, I’d use it to explore the problem space with its help. Indeed, the closer we get to “programming for humans”, the more we need to leverage concepts of human cognition: pattern recognition, pervasive categorization and “objectification”, but also out-of-the-box thinking (a thing is never just one thing for everybody in every context).

FP is good because it’s mathematical, it maps well to how CPUs work, and it is easy to compose, but it is too low-level compared to human thought, which at its baseline is very abstract and high-level. It’s all about things and how they relate to each other.

1 Like

I see data vs. code/objects/whatever from a more pragmatic than fundamental perspective: “dead” data is more long-lived and more interoperable than anything that has execution semantics. Take these as observations, not beliefs or basic principles. They are definitely valid in my field of work, computational science. I suspect they are true elsewhere as well. The problem with the Smalltalk approach of wrapping data access in APIs is then that the wrapper will become obsolete more quickly than the data it wraps, and that two systems have more difficulty agreeing on a common API than on a common data format. (was “data model” in the original post, but that’s a mistake)

2 Likes

Hi, I am interested in an explanation of how you distinguish between a “common data model” and a “common API”, because to me these seem like the same idea. Thanks in advance.

1 Like

Right, I mistyped “data model” where I should have written “data format”.

A data model defines higher-level entities in terms of lower-level data entities. For example, how to represent a protein in terms of strings and integers.

A data model can be implemented as a data format, which is a serialization of the data model in terms of byte arrays (or byte streams).

A data model can also be implemented as an API, which is a set of functions that translate between higher-level and lower-level entities.
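
To make the three-way distinction concrete, a small Python sketch (the protein example is mine, not from the posts above):

import json

# Data model: a protein described in terms of strings and integers.
protein = {
    "name": "insulin",
    "chains": [{"id": "A", "length": 21}, {"id": "B", "length": 30}],
}

# Data format: a serialization of the model into a byte stream.
encoded: bytes = json.dumps(protein).encode("utf-8")

# API: functions translating between higher- and lower-level entities.
def chain_length(p: dict, chain_id: str) -> int:
    for chain in p["chains"]:
        if chain["id"] == chain_id:
            return chain["length"]
    raise KeyError(chain_id)

assert json.loads(encoded.decode("utf-8")) == protein
assert chain_length(protein, "B") == 30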

2 Likes

So with that in mind, I have to ask why a “wrapper” would become obsolete more quickly than “raw data”. To start, one thing is obvious: if you’ve wrapped something, there are probably already more moving parts. However, that’s not much to go on by itself. The reason we’re focused on the data format is that, in the scenario being described, we’re forced to deal with the file system and serialization in any case. That is to say, we’re all using programming environments built around a file system. It’s inevitable and inherent in an environment like this that the serialized data is disconnected from any parser that you might have written. So you might lose your parsing program, or its dependencies might become outdated and break, and so on. It’s quite likely, even; we’ve all probably been there. And this is part of the problem: of course it’s going to be a lot easier to rebuild a parser for your serialized data than an entire programming environment. In the most exaggerated, worst case, your “wrapped data” will be caught up in a VM image (or something like it) rather than a boring, standard serial format. Do you see what I am getting at here?

So, something else that I feel I need to point out here is that, ideally, the API is the same for both methods, right? And if I continue looking at this from a pure OOP “behaviourism” perspective, whether you consume the wrapper or the raw “data” (post parsing) should be transparent in either case. I don’t think that’s the problem here, and the problem isn’t OOP.

2 Likes

Serialized data is not just for file systems. Data communication relies on serialization as well. And databases happily store blobs containing arbitrary byte arrays.

The reason why data formats are more long-lived than APIs is the minimal requirement for data access. If you have a data format defined in terms of byte sequences, then any programming system that can deal with bytes can also work with the data. With an API, the compatibility requirements can be very strict. Smalltalk is an extreme case: if your data model is implemented as a Smalltalk API, then it becomes extremely difficult to access from any language other than Smalltalk. You’d have to write a compiler to Smalltalk (or to the bytecode of a Smalltalk VM), or write a bridge from your system to a Smalltalk image.

Of course I am assuming that a data format is documented in detail, by a specification rather than by a parser. In my experience, that’s always the case. Outside of proprietary products, I have yet to encounter a data format defined by nothing other than the code of a parser.

4 Likes

I wonder whether, given a stable runtime like wasm, binary executables would have the same properties as binary data.

Yes, that is true.

It’s certainly more convenient while we are stuck with lots of languages with different and incompatible implementations, yes. That we find ourselves in this unfortunate circumstance is not fundamental or essential. Your observation that dead data is more interoperable is only true while we are here and while you are stuck dealing with legacy software: legacy languages that I believe should be long dead, but I get that this is an outspoken opinion. Worrying about various custom data formats is an inconvenient necessity only because we have a lasting legacy of languages and environments with different fundamentals that do not support a common API.

I’d like to get out of here, and I recognise that is only going to happen if there is a path for legacy software.

As an aside, that is probably natural, evolution has been chaotic and unorganised. That doesn’t mean we shouldn’t take control though.

Relating back to the original question, I wouldn’t describe “image as the world” as inherent to the design, nor would I say FFI issues are inherent to the approach. It’s usually a matter of principle, and of whether you’re willing to compromise the semantics and safety of your system so that C or what-have-you can bail you out. Preserving the fundamentals of your new language does involve making it inconvenient to add glue using something else; at the very best, the hit will probably be runtime performance. But that’s asking for a lot to go right.

Right, I’m not suggesting we go back to Smalltalk when I say this. But there is more than enough engineering capacity in the world to do that very comfortably. There is not the will.

Maybe for a breakout thread: I would genuinely be interested if you would share with us the scale of the legacy software you are dealing with and the variety of hardware it was written to run on. Maybe you have already written about that before?

2 Likes

You can see all this as a problem with legacy software, but “legacy data” is at least as important, and that’s most of the data that we want to process.

Also, I doubt that legacy languages will ever go away. I remember a discussion on that topic back in Twitter days, where someone challenged the public to name a language that was actively used in some application domain for several years and then disappeared. My proposal was PL/1, but there are still commercial PL/1 compilers with support and regular updates, so apparently even PL/1 isn’t dead.

Yes, we should aim at taking some control, but in the end, it’s software users who decide if legacy software is important to them.

The oldest software packages routinely used in my field are Amber and CHARMM, both going back to the 1970s and written originally in Fortran 66. Many people have moved on to later packages, such as GROMACS or NAMD, which are written in C and/or C++ and are also more than 20 years old. All of these packages share common file formats, which clearly show their roots in Fortran. The Protein Data Bank, the most important collection of experimental data in structural biology, has finally managed to move away from those antiquated formats, but that’s 20 years after their initial announcement. The community only accepted the change when it became strictly impossible to deal with protein structures of the size explored today.

3 Likes

Binaries are data with execution semantics, and it’s the latter that make code so much more fragile than “dead” data.

2 Likes

I am just thinking of a nice example for the importance of “dead data”. In scientific computing, it is common to have datasets that consist of several N-dimensional arrays, which can become quite big. A popular implementation of such data models is the Hierarchical Data Format that has been around since the 1990s. It is the de-facto norm for many kinds of scientific data, from weather observations to diffraction data on protein crystals.

HDF was designed as a library managing an on-disk format that is documented, but so complex and optimized for the access patterns implemented in the library that there is no alternative implementation. Meaning that HDF is really an API-based storage medium, the API being the library.

The current incarnation, HDF5, is about 30 years old. It’s written in C using an OO approach, so it’s a bit cumbersome to use because the OO mechanisms are emulated via C functions. But that’s a minor point. The library also comes with a Fortran wrapper. C and Fortran covered everybody’s needs - 30 years ago.

Accessing HDF data from other languages requires FFI wrappers, most of which are rudimentary and error-prone to use. There are a few good ones though, such as h5py for Python. With new languages becoming popular in science (JavaScript, Julia, Rust, …), the requirement to link to C code becomes increasingly a burden and a reason for disciplines to move away from HDF.
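
For readers who haven’t used it, a minimal h5py session looks roughly like this (a sketch, assuming h5py and NumPy are installed; file and dataset names are made up):

import h5py
import numpy as np

# Writing: every group, dataset and attribute is created through the
# library's API; the on-disk byte layout stays an HDF5-internal affair.
with h5py.File("observations.h5", "w") as f:
    dset = f.create_dataset("weather/temperature", data=np.random.rand(365, 24))
    dset.attrs["units"] = "celsius"

# Reading: without the HDF5 C library (or a binding such as h5py),
# the bytes on disk are effectively opaque.
with h5py.File("observations.h5", "r") as f:
    temps = f["weather/temperature"][:]
    print(temps.shape, f["weather/temperature"].attrs["units"])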

To make it worse, the HDF5 library has basic but nowadays insufficient support for parallel data access, which becomes increasingly important. That’s another reason why people are moving away from HDF.

On the other hand, much of the data that has been archived and published as HDF files still matters, meaning that a lot of effort is invested in maintaining the HDF5 library even though its popularity is in steep decline.

The currently most popular successor to HDF is Zarr, which is a fuzzy hybrid between data format, access protocol, and access API. And it changes too fast for many disciplines. I hope that a stable archival format for Zarr will emerge one day, but it hasn’t happened yet.

4 Likes

Hi! I’m finding this thread really fun to read, and I’m convinced that these forums are made up of some smart people.

My own thoughts on the subject include a few questions that might be interesting to ask as a kind of base for the discussion:

  • What is a ‘good programming paradigm’?
  • What is ‘good software’?
  • What is a ‘good software developer’?

I think that the reason we are seeing exclamations like ‘OOP is a trend, not a real paradigm’ or ‘great software doesn’t restart’, among other commonly repeated opinions, is that businesses differ fundamentally in how they answer these questions, and we can’t seem to find a way to unify some fundamental principles, however much it is our human tendency to try.

Is it possible to have two pieces of software written in two different paradigms interact? I think that’s certainly so! “But why would you do that?”, you may ask, since it seems to cause a bunch of undesired difficulty and inconsistency. The simple answer is that the real-world problems each piece of software is trying to solve can be very different, resulting in a gain from using different paradigms for each, yet they may still need to interoperate quite closely.

As an example, take a GUI application. It’s a run-of-the-mill graphical form-like application with some textboxes, some graphs and buttons for various functions and navigation. Just looking at this GUI application from a practical point of view, it’s going to need to keep some state in memory as the user is using the interface. This state is everything that is needed to draw the interface on the screen, and also the current state of the form data that the user can change in this part of the interface, which is the ‘input’. When the user changes the input, some calculations are run and the stuff on the graphs changes too, i.e. the state also has an ‘output’ part.

We can tell from this description that there are incredibly many points from which the user can change the input. As many as there are textboxes, in fact. Also, every time a textbox is changed, only a small part of the input changes, but we still need to collect all of the current input from the other textboxes and call the calculation function all over again in order to get an updated output state.

Of course, the part of the state that doesn’t really need to change is even bigger. There’s probably some layout system that we fed with some data which resulted in our textboxes ending up where we wanted them on the screen. There’s a whole plethora of parameters that we’ve given the rendering engine to make sure that everything is in the correct color, the lines have the right thickness, et cetera. Only a small piece of this information actually needs to change as the user changes the value in a textbox, and that’s the piece that controls the graphs.

But FP, in its orthodox extreme, says that there is only ever input and output. So it becomes incredibly difficult to define what we want here using FP, at least all the way. The best we could do is to define the calculation function as a single pure input-output function taking all of the input data, and then have a program that is not FP-strict assemble that input data structure and call the function every time we detect that ‘the user has changed the value of a textbox’.
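
A sketch of that arrangement in Python (names are hypothetical; the pure calculation is the FP part, and the handler around it is the non-FP-strict part that owns the mutable state):

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FormInput:
    principal: float
    rate: float
    years: int

# The FP part: a single pure input-output calculation.
def compute_graph_points(inp: FormInput) -> list[float]:
    return [inp.principal * (1 + inp.rate) ** y for y in range(inp.years)]

# The non-FP-strict part: mutable state, poked at from many textboxes.
current_input = FormInput(principal=1000.0, rate=0.05, years=10)

def on_textbox_changed(field_name: str, value: float) -> None:
    global current_input
    current_input = replace(current_input, **{field_name: value})
    points = compute_graph_points(current_input)
    # ... hand `points` to the (stateful) graph widget for redrawing

on_textbox_changed("rate", 0.07)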

It would be completely nonsensical and a huge waste of energy to recreate essentially the whole GUI description every time the user changes the value of a textbox.

This is a property that is inherent to the problem we want to solve: a GUI application that is consistently showing the user a graphical interface. It makes sense to, say, model our graphical components as “objects” that keep a state and are mutable, because that’s what the application needs to do, in essence.

On the other hand, the calculation function defined at a reasonable level, where it takes the input it needs and outputs the data that is needed to present the graphs, would be terribly awkward to define using OOP. That’s because we know that the input data that we’re dealing with has been reduced to what should reasonably be needed to solve the problem, and that data does not have to be used for anything other than actually producing the graph output. Thus, why would we store it in mutable ‘objects’? In this context, it’s only confusing and difficult to keep track of all the different states that those objects might end up in when dealing with them. We’re going to have a much better time dealing with our code if we use invariants and immutables that enforce a certain order in which data is transformed: the code becomes easier to read, the number of states the data takes on is predictable, etc etc yada yada, all the benefits of functional programming basically. So in this case, we may even be better off using a programming language that supports the FP view.

So, what is a good software paradigm? I think it depends on the problem you are trying to solve. I also think that ‘problems’, as software developers understand them, are best defined as data transformations, but also including the context of where the data comes from before being transformed, and where it ends up after it is transformed (in essence: what various memory locations are in fact used for).

On the topic of the other questions, I’ve noticed that there is some confusion around what makes ‘good software’ and a ‘good software developer’. In fact, a lot of software businesses, and by unfortunate extension also some software developers, seem to believe that ‘good software’ is not just software that solves a problem well. Some pervasive beliefs seem to be that ‘good software’ is software that is widely used, long-lived, and that supports the livelihood of those who create and maintain it. In reality, a piece of software could be used by a few people, be abandoned a few months after its creation, and its creators may not have earned a single penny from its users. Yet that software could still be of immense value to those users.

I’m not saying that the incentives above are inherently bad, but I do think that they have, let’s say, “muddied the waters” a bit in the discussion of what ‘good software design’ really is. As has already been stated in this thread, we love to seem perfectly objective in our statements, yet much is still decided by cognitive bias and our real-world conditions.

The truth is, software does nothing on its own. It is not good by grace of existing. It is good because the bits and bytes that it handles are at some point shot back out into the world through some hardware, and deliver a useful impact on the real world.

So, what problems are we trying to solve? Should we even use digital devices to solve them? If so, maybe we should start by thinking about the hardware that collects information from the real world and digitizes it, and what that process looks like. Once we understand the data that our software works with and where it comes from, and have an idea of how the output of our software is used, we should probably decide on the set of hardware devices that we intend for our software to run on, and how that hardware really works. What’s costly on that hardware, and what isn’t? Once we have all this knowledge combined, I think, we can actually start to think about software design seriously.

2 Likes