What do algorithms want? A new paper on the emergence of surprising behavior in the most unexpected places

Sometimes people ask: “with your minds-everywhere framework, you might as well say the weather is intentional too!”. The assumption being that 1) these things can be decided from an armchair (by logic alone), and 2) that this would be an unacceptable implication of a theory (i.e., we can decide in advance, by definition, that whatever else, systems like the weather cannot be anywhere on the cognitive spectrum).

My answer is: “I don’t know, have you ever tried training it? We won’t know until we do.”

I think it is essential that we don’t make assumptions or just have feelings about where things belong on the spectrum of persuadability. We need to do experiments. Fortunately, the tools of behavior science can be applied outside of their normal domain, of brainy animals behaving in 3-dimensional space. Thus, the field of diverse intelligence is emerging, to help us get better at recognizing, predicting, building, and relating to unconventional systems with degrees of intelligence (competencies at solving problems in some space). And one of the earliest lessons it teaches is to have humility about our ability to recognize the beginnings of minds in novel embodiments. This post is about a new paper (in preprint form – not yet peer-reviewed), which explores that idea in silico.

One important aspect of diverse intelligence research is basal cognition: looking for minimal systems that show early, simple versions of intelligent behavior: learning, memory, problem-solving, decision-making, etc. This is crucial not only to understand the evolutionary origin of our own complex cognition, but also for AI and synthetic morphology, which seek to create novel systems with various degrees of agency. Really it’s a fundamental part of the biggest question of all: what are the necessary and sufficient ingredients for forming a mind?

Most such work is done in biological model systems such as slime molds, bacteria, or single cells. The problem with any biological system, no matter how “simple”, is that there are always more mechanisms to be discovered. That is, if you see an unexpected, interesting behavior, it is always possible that it is explicitly baked in and implemented by some mechanism you just haven’t found yet. For this reason, some people are studying the behavior of simple physical systems, like inorganic chemical droplets.

Being interested in surprising competencies in unexpected substrates, I tried to think of the most outrageous example I could. Now, we did something like this once before – in two papers on Boolean and continuous gene-regulatory networks, we showed that even very simple network models had surprising proto-cognitive behaviors – several different kinds of learning including Pavlovian conditioning. But I wanted something even simpler and more minimal.

I landed on the idea of using sorting algorithms – these are short (<10 lines of code), classic algorithms for sorting a mixed up string of numbers so that they end up in perfect order. The nice things about these are that: 1) they are very simple and transparent – you can see all the code in one glance, there is no place for undiscovered mechanisms, 2) they’ve been studied by generations of computer science students for decades, and everyone thinks they know what these algorithms can do, 3) they are totally deterministic – no randomness or any oracle elements, and 4) the process of sorting a bunch of elements that begin scrambled, into an invariant order, bears an uncanny resemblance to the rearrangements that morphogenetic remodeling can accomplish to repair, for example, the tadpole face.
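
For readers who have not looked at one in a while, here is a minimal sketch of classic bubble sort in Python (the function and variable names are mine, for illustration only – this is not the code from the paper), in its usual top-down form: a single controller that sees the whole array.

```python
# Classic, top-down bubble sort: one controller with a God's-eye view of the
# whole array repeatedly swaps adjacent out-of-order elements until a full
# pass produces no swaps.
def bubble_sort(values):
    arr = list(values)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(arr) - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True
    return arr

print(bubble_sort([5, 1, 4, 2, 3]))  # -> [1, 2, 3, 4, 5]
```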

Taining Zhang, Adam Goldstein, and I studied these because I wanted the maximal surprise value. I hypothesize that intelligence (agential behavior implementing problem-solving) is present at the lowest levels in our Universe, not requiring brains or even life per se. Specifically, I am not interested in emergence of mere complexity. That is easy; simple cellular automata (CAs) such as the Game of Life enable huge complexity to come from very simple rules, as do the fractals emerging from short generative formulas. But what such cellular automata and fractals do not offer is goal-seeking behavior – they just roll forward in an open-loop fashion regardless of what happens (although, I must say that I am no longer convinced that we have plumbed the depths in those systems – after what we saw in this study, I am motivated to start looking for goal-directed closed-loop activity in CA’s as well, who knows). What I am after is very minimal systems in which some degree of problem-solving capacity emerges, and specifically, capacity that is not specifically baked in by an explicit mechanism. It is a related set of questions to those studied by Stuart Kauffman in his inspiring books.

We used the methods of TAME, which focuses on perturbative interventions to gauge an agent’s behavior in some problem space, to ask what these sorting algorithms can do. In other words, we treated them as a new kind of animal and characterized how they traverse the sorting space, and what they do when presented with barriers.

We made two small but fruitful changes to the algorithms. First, instead of a top-down God’s-eye view where an omniscient single algorithm controls how each element is moved, we went bottom-up, with a distributed approach: each number (we call them cells, in a 1-dimensional “tissue” array) has its own ability to carry out the algorithm and its own local preferences about what neighbors it wants to see to its left and its right. So this is now (like in biology) a parallel process in which each cell exercises its own agenda, using the standard sorting algorithm as a policy for how to get to the right location by swapping with neighbors until everything is sorted. Thus, we use a cell’s-eye view, not a global top-down view. The first result: bottom-up control (with no global knowledge) works perfectly well to solve sorting tasks.
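
To make the bottom-up version concrete, here is a minimal sketch of what a cell’s-eye bubble sort can look like (my own illustrative reconstruction with hypothetical names, not the implementation from the preprint). Each cell only compares itself to its immediate neighbor and swaps if the two are locally out of order; no step ever consults a global property of the array.

```python
import random

def cell_view_sort(values, seed=0):
    """Bottom-up sketch: each 'cell' holds one value in a 1-D 'tissue'. On its
    turn, a cell looks only at its right-hand neighbor and swaps with it if the
    two of them are locally out of order. Cells take their turns in an arbitrary
    order to mimic parallel, uncoordinated activity; no cell ever sees the
    whole array."""
    rng = random.Random(seed)
    tissue = list(values)
    while any(tissue[i] > tissue[i + 1] for i in range(len(tissue) - 1)):
        turn_order = list(range(len(tissue) - 1))
        rng.shuffle(turn_order)
        for i in turn_order:
            if tissue[i] > tissue[i + 1]:
                tissue[i], tissue[i + 1] = tissue[i + 1], tissue[i]
    return tissue

print(cell_view_sort([5, 1, 4, 2, 3]))  # -> [1, 2, 3, 4, 5]
```

The point is simply that purely local decisions, taken in an arbitrary order, still reliably carry the collective to the sorted state.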

The second change is that we implemented “unreliable computing”. In standard sorting algorithms, it is assumed that when the algorithm issues a command for two cells to switch positions, they do it. There is no notion of failure, and the algorithm never checks to see if it worked. We introduced the concept of “broken cells” – ones that either lack initiative to move themselves, or are so broken they cannot be moved by other cells either.
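
A minimal sketch of the “unreliable computing” idea (again my own illustration with hypothetical names, not the paper’s code): each cell carries a flag marking it as fully frozen, and a commanded swap involving a frozen cell simply fails silently – nothing ever checks whether it worked. In this stripped-down, adjacent-swap-only toy a frozen cell is an impassable wall, so it only illustrates the silent-failure idea; the implementations studied in the preprint are what show the richer work-around behavior described below.

```python
def sort_with_broken_cells(cells, max_passes=1000):
    """cells: list of (value, is_frozen) pairs. A fully frozen cell neither
    initiates swaps nor lets others swap with it; the failed swap is never
    detected or handled, it just silently does nothing."""
    tissue = list(cells)
    for _ in range(max_passes):
        moved = False
        for i in range(len(tissue) - 1):
            (v1, f1), (v2, f2) = tissue[i], tissue[i + 1]
            if v1 > v2 and not f1 and not f2:  # the swap only 'takes' if neither cell is frozen
                tissue[i], tissue[i + 1] = tissue[i + 1], tissue[i]
                moved = True
        if not moved:
            break
    return [v for v, _ in tissue]

# The frozen 9 acts as a point defect: in this simplified version each side of
# the barrier gets locally sorted, but values cannot cross it.
print(sort_with_broken_cells([(5, False), (1, False), (9, True), (4, False), (2, False)]))
# -> [1, 5, 9, 2, 4]
```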

Note that we did not change anything else – we didn’t add code to enable cells to see how well they are doing, whether a move worked or not, or any other embellishment. They’re still short, deterministic algorithms with nothing added to make them any smarter. Everything described below is emergent, and surprising (and hadn’t, to our knowledge, been discovered despite the very wide use of these algorithms). The details of our work (currently in review) are described in this preprint.

We did many experiments to understand the properties of this system (see the preprint for details); I want to focus on just a few specific ones here. Our two changes allowed us to study aspects that have not been studied before. By introducing the notion of broken cells, we get to ask how this system fares under perturbation; this is crucial because, as William James said, the measure of intelligence is whether you can achieve the same goal by different means, when surprising circumstances get in your way. What do these algorithms do, when they are proceeding on their way through sorting space, moving around as needed, and then BAM – they come upon a barrier – a cell that just won’t move? The algorithm itself makes no provision for this scenario. It turns out that not only can they still complete the task, they do it via delayed gratification: the ability to temporarily get further from their goal in order to do better later. This is illustrated in William James’ example of the magnets vs. Romeo and Juliet; shown here in this diagram by Jeremy Guay:

While humans, many animals, and autonomous vehicles can get around obstacles to get to their goal, magnets for example will not – they are simply trying to minimize energy and thus can’t even temporarily go against the gradient, pull further apart, to then go around the barrier to meet each other and get even closer together. The ability to do some degree of delayed gratification is one component of intelligence.

Where would a simple sorting algorithm fit on this scale? As emphasized in my TAME framework, you cannot guess in advance, just by knowing the components of the system (which in this case, are perfectly known – the algorithm is there for all to see): you have to do experiments. And when we did the experiments, we found that the algorithms not only can get around obstacles by temporarily disordering the string further (a weird thing for a deterministic sorting algorithm to do!), but they do more of this when greater numbers of defective cells are introduced (showing that it’s a contingent, contextual response to their unexpected situation, not just random back-pedaling that routinely happens no matter what).
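
One way to operationalize “temporarily getting further from the goal” – a sketch of the idea, not necessarily the exact measure used in the preprint – is to record the state of the array at every step and count the transitions at which a simple sortedness score drops:

```python
def sortedness(arr):
    """Fraction of adjacent pairs that are already in order (1.0 = fully sorted)."""
    pairs = len(arr) - 1
    if pairs == 0:
        return 1.0
    return sum(arr[i] <= arr[i + 1] for i in range(pairs)) / pairs

def delayed_gratification_steps(trajectory):
    """trajectory: the sequence of array states recorded during a run.
    Counts the steps where the system moves *away* from sortedness, i.e.
    temporarily undoes progress on its explicit goal."""
    scores = [sortedness(state) for state in trajectory]
    return sum(1 for before, after in zip(scores, scores[1:]) if after < before)
```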

Another thing that we were able to do, because in our system it is each cell that runs an algorithm, is study chimeric scenarios in which half the cells are running one algorithm, and half are running another. This is actually a critical issue in biology because the field currently has no formalisms for predicting what happens to the anatomy when cells of different genetic make-up (and different policies for action that lead to each species’ target morphology) are combined into a single whole organism. For example, as illustrated below from this review, what head shape will be built in a flatworm that contains stem cells that normally build a flat-headed planarian and ones that normally build a round-headed one? Will one of the shapes be dominant, or maybe an intermediate shape, or perhaps it will just keep morphing, since neither set of cells will ever get to the stop criterion (a completed target morphology of the relevant species)?

We created arrays of mixed-up numbers, where half the numbers belonged to cells executing one algorithm, and half of them executed a different algorithm. The assignment of algotype (a word coined by Adam Goldstein, parallel to genotype and phenotype, indicating the overall behavioral tendencies resulting from a specific algorithm) to each numbered cell was totally random. Crucially, the algorithm didn’t have any explicit notion of this – the standard sorting algorithm doesn’t have any meta-properties that allow it to know what kind of algorithm it is running or what its neighboring cells are running. Its algotype is purely something that is known to us, as third-person external observers of the process. But it guides the cells’ behavior and the decisions they make on when and where to move in their quest to have properly sorted neighbors. The basic result is that chimeric strings sort just fine – the cells don’t all need to be using the same policies for the collective to get to its endpoint in sequence space.
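
As a concrete sketch of the chimeric setup (illustrative only – the algotype labels below are placeholders): scramble the numbers, then randomly tag half the cells with one algotype and half with the other. Nothing in the sorting step ever reads the tag; it is bookkeeping for us, the external observers.

```python
import random

def make_chimeric_tissue(n, algotypes=("bubble", "selection"), seed=0):
    """Return a shuffled list of (value, algotype) cells with a random
    half-and-half assignment of algotypes to positions."""
    rng = random.Random(seed)
    values = list(range(n))
    rng.shuffle(values)
    labels = [algotypes[0]] * (n // 2) + [algotypes[1]] * (n - n // 2)
    rng.shuffle(labels)
    return list(zip(values, labels))

print(make_chimeric_tissue(6))  # e.g. [(3, 'selection'), (0, 'bubble'), ...]
```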

We then asked a weird question. What would the spatial distribution of algotypes within a given string look like, during the sorting process (its journey through sequence space)? This is akin to a traditional morphological analysis of an embryo during its journey through morphospace – an anatomical/histological scan of what kinds of cells are placed where. We thus defined a quantity called “clustering” which simply describes how likely it is that your neighbor is the same algotype as you. So, we knew that at the beginning of the process, the clustering had to be 50% (because the assignment of algotype to position in the array was random). And we also knew that at the very end, when all the numbers have been put into their final correct positions, it would also have to be 50% because there was no relationship between algotype assignment and position in the final numerical order (we assigned it randomly). But what did it look like between those two endpoints, while the algorithm was working hard to do its thing?
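
In code, the clustering measure can be as simple as the sketch below (the preprint’s aggregation index may be defined or normalized somewhat differently):

```python
def clustering_index(tissue):
    """Probability that a cell's right-hand neighbor has the same algotype.
    tissue: a list of (value, algotype) pairs, e.g. from make_chimeric_tissue().
    With a random half-and-half assignment this sits near 0.5 at the scrambled
    start and at the fully sorted end; the question is what happens in between."""
    pairs = len(tissue) - 1
    same = sum(tissue[i][1] == tissue[i + 1][1] for i in range(pairs))
    return same / pairs
```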

Amazingly, what they did during that time was cluster significantly. For any pairwise combination of algorithms, cells of the same algotype tended to hang out together, until the cruel hand of the sorting imperative pulled them apart again (as the numerical ordering must win in the end, since the algorithms are guaranteed to establish sort order eventually). Take a moment to take stock. These simple systems not only have the ability to solve their task despite a novel situation (barriers in their space), they use delayed gratification to do it, and they exhibit a specific behavior that maximizes a meta-property (clustering of algotypes) that is implemented nowhere in the algorithms themselves. I was frankly shocked to see this, even though a gut feeling caused me to plan the experiment in the first place.

Here’s an example. The blue line shows the progress of the sorting. The faint pink line is a negative control to make sure our code isn’t doing something wonky – it wobbles around the usual 50% when we’re not actually combining two different algorithms. The bright red line is the aggregation index – the tendency of each kind of cell to cluster, while it can, with those of its own kind (and the haze around it represents the standard deviation across 100 experiments). You can see here that it goes above 0.6 – a highly statistically significant effect.

This inherent tendency of cells to travel, even for a time, with behaviorally-defined kin is perhaps relevant to a couple of other concepts. First is the Hebbian idea of fire-together-wire-together: could it be a more general property of things to associate with those that behave like them? Second, one interesting thing about copies of you is that they are more predictable than random features of the environment. Chris Fields and I proposed a model of multicellularity based on this idea – cells keep their progeny close because it’s a kind of bulwark against the unpredictability of the outside world. Could this tendency to cluster with similar algotypes be due to a kind of Fristonian surprise minimization, in which Hebbian behavior arises due to an emergent drive to minimize uncertainty of your local microenvironment? This remains to be tested in this model, but it seems reasonable that agents with the same personalities (behavioral algorithms, or algotypes) would be comforting and predictable, to have as neighbors.

One last result to note. Given that these algorithms have a cryptic goal – to cluster with their own kind – how strong is it, really? In our case, we inevitably suppress their ability to pursue this unexpected goal by demanding (via the explicit algorithm) that the numbers get sorted – it is impossible, under the standard system, to do both – keep algotypes segregated and sort the numbers, because it’s 50% likely that the number any cell wants next to it happens to have the wrong algotype. This limits how much clustering they can do. What we then did to let them flex their inherent behaviors a bit more was simply allow duplicate numbers in the string. That way, for example, if you have a string of 555555, it can occur between the 4’s and the 6’s, satisfying the algorithm’s need to sort on numerical value, and also allowing as much clustering as it wants (because for example, the left half of the string of 5’s can all be of algorithm 1 type, while the right half can all be of algorithm 2 type – plenty of clustering with its own kind within each set of repeated digits). When we did that, the clustering did in fact rise, revealing that the explicit sorting criterion was indeed suppressing their innate desire to cluster.
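
A toy illustration of why duplicates free up the clustering (labels hypothetical): with repeated values, many arrangements are equally “sorted”, so the numerical order no longer pins down the arrangement of algotypes.

```python
def clustering_index(tissue):  # same helper as sketched above, repeated so this snippet runs on its own
    pairs = len(tissue) - 1
    return sum(tissue[i][1] == tissue[i + 1][1] for i in range(pairs)) / pairs

# Both arrangements below are perfectly sorted on value, but only the second
# lets same-algotype cells sit next to each other.
mixed     = [(5, "A"), (5, "B"), (5, "A"), (5, "B")]
clustered = [(5, "A"), (5, "A"), (5, "B"), (5, "B")]
print(clustering_index(mixed), clustering_index(clustered))  # 0.0 vs. ~0.67
```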

There is something to be said here about goals and where they come from, in general. Where do animals’ goals come from? Evolution? Where do Xenobots’ and Anthrobots’ goals come from? They have never been specifically selected for; they’re brand new (and we do not yet know their goals – that is very actively being researched). Where do humans’ goals come from? Human children’s goals come from their parents (environment) and some built-in circuitry; how about adults – both average and genius-level – how do they get their goals? And where do goals arise – which animals (down to amoebae and bacteria and the networks within them) have them, and how do they scale? I think these algorithms are teaching us something. They have derived goals – the goal of having a sorted string is given to them by us; they inherit it (i.e., a second-hand goal) by virtue of the algorithm we designed. But they also seem to have intrinsic goals (clustering, and who knows what else that we didn’t think to look for) which come from … I’m not sure where. I have some ideas which I will write about soon. At the very least, if these simple things can have cryptic goals intrinsic to their behavior and not to the algorithm, just think what kind of latent dreams we could find in more complex systems (even algorithmic ones, before we even get to biologicals that might use quantum dynamics and other things not captured in common digital algorithms).

I wonder if we can think of the unexpected clustering behavior of these algorithms as a kind of “subconscious” influence over their behavior (which is otherwise controlled by the policies explicitly implemented by the algorithm). In this paper, we uncovered cryptic drives and motivations for behavior (the algotype of your neighbor), which are not apparent to the agents (the algorithms give the cells no way to query the algotype of their neighbor or themselves). Is what we did here – determining the hidden causes for behavior – a kind of proto-psychoanalysis, in which at least the external observer gets to find out why the agent does what it does (even if they are too simple to take up that insight themselves, as we hope a human psychoanalytic subject would)? And what of the psychological stress (perhaps not visible in this simple system, or maybe we just don’t know how to measure it?) of having your explicit goals (numerical sorting) be in conflict with your implicit goals (clustering)? And the explicit goals, by the way, inevitably win… I’m doing my best not to feel nano-bad about the existential futility of their plight.

Fortunately, I know two amazing people with whom to discuss this sort of thing, both having expertise in psychoanalysis and basal cognition: Mark Solms and Karl Friston. I’ve got talks scheduled with them about this in the next month; I’ll put up the video links once we’ve had a chance to talk. It’s entirely possible that my thinking on all of this will change, after we do more experiments and try to define quantitative metrics of stress, implicit sensing of large-scale behavioral policies, and motivation in these systems.

To summarize: these basic algorithms have unexpected competencies to solve the problem we explicitly designed them for (in sorting space), and also apparently have behaviors (maximizing algotype clustering – a meta-property in morphogenetic space) that we had no idea about until we looked for them. My suspicion (which we are now testing) is that this may be fairly general, and that once we look, many (most? all?) algorithms will prove to have unexpected tendencies and capabilities. I think the continuity we see in development and evolution is far deeper than we realize. As the diverse intelligence field increasingly finds forms of learning, decision-making, goal-directed activity, and other emergent competencies in minimal unconventional substrates, some who want cognition to be a magically unique property of advanced brains will say “that’s not really what we meant by these terms”. Listen closely – can you hear the screech of the goal-posts being moved?

Featured image was made by DALL-E. Planarian schematic was made by Daniel Lobo.

72 responses to “What do algorithms want? A new paper on the emergence of surprising behavior in the most unexpected places”

  1. frank schmidt

    As always, informative exploration beyond the borders of dogma.

  2. Rob Scott

    Now we’re really getting somewhere. Love the subconscious reference. Excited to see the talks you have with Solms and Friston. Well done, sir. 🙂

  3. Benjamin L

    Fascinating work. I love the fact that you have this blog and have been filling it with such interesting content so frequently.

    What you’ve described here is similar to discoveries in motor behavior, where instead of viewing motor behavior as entirely controlled by the brain in a one-to-one fashion, many of our motor behavior capabilities are more usefully attributed to the anatomical order and musculoskeletal system. These systems display motor behaviors without the brain, which is very useful since it frees the brain up to regulate and coordinate motor behavior without having to micromanage it.

    I’ll also add that I’m very interested in modeling purely mathematical structures as “agential behavior implementing problem-solving”, although I’m looking in proof-based rather than algorithmic directions.

    1. Mike Levin

      Would love to hear more about mathematical structures with agential behavior – I’ve been thinking in those directions too, but haven’t done anything seriously on it.

  4. James of Seattle

    It really seems like you’re playing in Alicia Juarrero’s backyard. If you haven’t read her book “Context Changes Everything”, you should. I think you are dealing with context-dependent (given algorithms), and context-independent (clustering) constraints.

  5. Micah Zoltu

    I just wanted to say, these two sentences each made me laugh:

    > I’m doing my best not to feel nano-bad about the existential futility of their plight.

    > Listen closely – can you hear the screech of the goal-posts being moved?

    1. Arjulaad

      🙄👍🏻☯️

  6. Leah

    Love it. This stuff makes me feel just slightly crazy.

  7. Billy

    Thank you for sharing Michael, goal posts are meant to be moved, peace

  8. Mark Heyer

    Excellent article as usual. I’m fascinated to see what can be done by applying the new math of vector spaces/parameters/dot products to the elucidation of these new algorithms. Not to mention discovering the base algorithms that live inside the Markov blanket (or as I like to call it, the “Bejan Envelope” of thermodynamic/information possibility). Looking forward to your interviews. Bring on the new science!

  9. David Morton

    I would be very careful to account for the fact that some sorting algorithms tend (on average) to move a random cell either left or right or are balanced. Bubble sort sounds balanced, except that it first checks one direction and then the other, so it is slightly imbalanced. Insertion sort sounds biased to the left a fair bit, but Selection sort sounds heavily left biased. This appears at first glance to match up with the clustering affinities.

    1. chris m

      i had the same thought, that idiosyncrasies/inefficiencies of the algorithms could cause them to ride together for a time

    2. Craft Link

      I agree with the call for being careful. A simple mechanistic (artifact) explanation seems so much more likely than the rather vague, hand-wavy interpretations of the observations (frankly, they are non-explanations imo). The article is in the open, so let’s get our hands dirty and try to explain the observation with some more rigor. That is not to say that these effects/artifacts have no role in the biology described and their effects are perhaps underappreciated. But let’s first try to rule out a simple mechanistic explanation for the clustering, before getting our heads too far into the philosophical clouds.

      1. Mike Levin

        I don’t disagree with this. But let’s understand what “explanation” by “simple mechanisms” means. Any behavioral act of a human, or a morphogenetic process, can be said to have simple mechanical explanation if you zoom in on the level of the physics. It will always be “just physics and chemistry” if you look closely enough, and it can miss everything that is important. The presence of such mechanisms does not mean that a process does not also have some other interesting features at higher levels such as those studied by behavioral science etc. They are not mutually exclusive. Also, non-explanations is correct – I have not focused here on explanations at that level, I have focused on the emergence of interesting higher-level behavior. For example, in the Game of Life (https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life) there is a simple, deterministic, mechanistic explanation for every single event. Does it tell the whole story – nothing else to see here? No, I don’t think so. It is entirely likely that some mechanistic story about the clustering and the delayed gratification can be told. And yet, it is still interesting that they have these properties and that no one predicted that they would, in decades of study. Why did we not know about it, and what can we do with this (how to exploit it), how to learn to predict others, what it says for more complex algorithms, and what else it helps us build and discover – those are still interesting questions, imho.

        1. Craft Link

          I certainly sympathise with that, but don’t you want to make sure you aren’t looking at a very peculiar, non-generalizable artifact? You can of course claim the artifact is the feature in this case but that doesn’t justify, at least not in my mind, the very foundational claims attributed to this particular behaviour. At least not until we understand the concrete mechanistic underpinning of this particular observation. I think it’s an interesting observation, but I have my reservations regarding its attributed scope, if that makes sense.

          1. Mike Levin

            Yes, I agree. The specifics of this – the clustering – may well be limited in scope, we will find out. We’re currently testing it in a number of other systems. The bigger picture – for example the delayed gratification ability – has already been found in many systems (mostly biology, which I often write about), it’s quite general. What I wanted to do here was to see if I can find it in a truly minimal, simple system for it, in which the components and rules are all known (unlike in the biology). It’s certainly formally possible that I just happened to pick the one simple system that does it, and others won’t. I think it unlikely, but it’s possible and we certainly need to look at others to know how broadly it really goes. In the end, this blog is about possibilities and ideas beyond what we already know, so I feel free to speculate a bit – none of this is meant to be strong statements about what we know for sure but more an exploration of different ways to think about things. In the primary paper, we were much more careful about what we actually claim (at least I tried to be) vs. what are hypotheses for future research.

  10. Alain Schaerer

    Interesting. I coded something on my own this morning before going to work, and it seems to be working if there is an arbitrary number of movable frozen cells, but not when there are fully frozen cells. I also don’t have a sorting evaluation system and I am solely using the bubble sort algo.

    2 questions:
    Is the sorting eval system necessary to make it work with fully frozen cells?
    Are more than one algotype necessary for fully frozen cells?

    https://gist.github.com/swisstackle/f69ce28002b06809c01ddab009a8bf0b

    1. Mike Levin

      sorry, what do you mean by “eval system”? No, you don’t need multiple algotypes to do 1-3 frozen cells. But note that with 2+ fully frozen cells, it may not be 100% solvable. I will release our code soon.

    2. Alain Schaerer

      Ah. I’m seeing in the results section of the paper that it didn’t work with every string combo. I’m assuming biology also doesn’t have 100% error tolerance.

      1. Mike Levin

        Yeah it’s logically impossible to solve all cases: for example, if you have 2 totally frozen cells that are out of order, there’s nothing you can do with the other cells that will bring these two into sequence with each other, one or both of them will always be a “point defect”. Biology has tons of these 🙂 The thing is, can you work around them to sort everything else (and it can).

  11. Alain Schaerer

    Yea I suspected. Cool stuff for sure!

    1. Alain Schaerer

      Would be cool to bring this into 2 dimensions and then 3 dimensions. Possibly make it able to create 3D structure by assigning each cell a value and then they sort each other according to certain rules in a 3D environment.

      Possibly something I might try when I get the time.

  12. Mike Levin

    Cool. We’re doing that kind of thing next too.

  13. Vicente Sanchez-Leighton

    Your “distributed and diverse” sorting algorithm reminded me of the “Byzantine Generals Problem” (1982) https://lamport.azurewebsites.net/pubs/byz.pdf which studies resilience conditions for distributed systems 😉 : (Abstract) “Reliable computer systems must handle malfunctioning components that give conflicting information to different parts of the system. This situation can be expressed abstractly in terms of a group of generals of the Byzantine army camped with their troops around an enemy city. Communicating only
    by messenger, the generals must agree upon a common battle plan. However, one or more of them may be traitors who will try to confuse the others. The problem is to find an algorithm to ensure that the loyal generals will reach agreement. It is shown that, using only oral messages, this problem is solvable if and only if more than two-thirds of the generals are loyal; so a single traitor can confound two loyal generals. With unforgeable written messages, the problem is solvable for any number of generals and possible traitors. Applications of the solutions to reliable computer systems are then discussed.”

    1. Mike Levin

      Super. This is also quite relevant to the next paper (and post) coming soon on inter-embryo communication. Thanks!

  14. Kine

    From our point of view, we ask of the completely determined computer (due to stable hardware) to create an ordered set which has to be consistent with the operations of the algorithm. Now, we normally think of electrons, since they are so constrained by the hardware, as passive slaves of our algorithm. And, if we can only order the set one way, then of course the electrons are heavily constrained by the algorithm. But if we want an ordered set which can be reached in several ways, then what exactly are the constraints put upon what map enables the ordered set? In some sense, the electron is confined to a structure that has to be consistent with a certain ordering that we think of as consistent with boolean algebra. But what if that is not completely true? What if Boolean algebra is a special case of a larger structure of the dynamics of the electrons, and that electrons confined by transistors, when given freedom from the top-down constraints, really are not completely confined to an ordering which is only consistent with boolean algebra?

  15. Arjulaad

    🤔…..

  16. James Marquand

    Very curious to see the code.

  17. eden sharp

    your description of the clustering seeking behavior as cryptic works when analyzing each single cell without taking into account its context, where you could say that its behavior is cryptic because it can not explicitly obtain information about its neighbors’ algotypes anywhere in its own algorithm source.
    however, its behavior can be elucidated and captured locally by analyzing each local few cells as a compound meta cell with an algotype of its own, which in effect *does* have access to its constituents’ probable algotypes via effective decisions of ordering that differ between constituents giving off latent meaningful information about their algotype’s behavior, and that latent information can then be reabsorbed by virtue of how a neighboring cell would have made a different decision had this one been a different algotype.
    this phenomenon of “effective information sharing on average” in effect inserts an extra bit of fuzzy logic akin to ‘if neighbor was behaviorally similar to myself, make one already defined choice, else make another already defined choice’.
    and so we see that viewing these neighborhoods of cells as each cell working with their own algotype’s state and nothing else is not sufficient in actually defining its effective algotypical behavior in its surroundings, and what one needs to do in order to *arrive* at an algorithm that fully defines an algotype is to take a meta-cell of around two or more constituent’s combined behavior, taking into account the *emergent state* which identifies which constituents are of which original algotype, and compiling all of that into one algorithm that describes that local neighborhood’s actual algotype.

    i predict that if you do this you will see that for higher and higher scales you will see a more and more accurate picture of the resulting algotype, and there are three key insights here:
    1. that if there is data transfer between cells that causes them on average to make different decisions based on others’ past decisions, then the resulting behavior of metacells will be heterogeneous to the states of their constituents.
    2. these metacell algotypes are still procedurally defined and deterministic just like the cell’s internal algorithm, and can be written in the same way. the source algorithm viewed at these levels will show you the explicitly defined behavior relevant to subtypes of cells.
    3. the metacell behaviors have **no inherent or universal requirement to sort subtypes towards each other necessarily.**
    and you can easily think of simple deterministic local algorithms that specifically tend to spread apart two subtypes, it is possible.

    this however is to say nothing about the average *relative usefulness* of algotypes that tend to either attract each other or repel each other, across the endless terrain of possible algotypes and their combinations. it might truly be more useful for the efficiency of the collective to have specialized cells form working groups, or for in some combinations of algotypes, to disperse between each other because their combined algorithm has a sophistication that makes them more effective. your postulation of grouping in order to minimize randomness in computation still has grounds to explain both why algotypes in a combination that stick together are more effective, and potentially even to explain any ordered structures that arise from algotypes in combination that disperse themselves but say, utilize their difference in densities and the low entropy produced to transmit information in more distributed computational networks!

    thank you for your time, and i sincerely hope my perspectives have brought some insight!

    1. Mike Levin

      Superb, thanks!! This is along the lines of some things we’re testing next, and there are some great ideas here. I fully agree that there is more going on at the higher levels of “tissues” (multi-cell groups, which ties back to our original framing of this as a system for understanding morphogenetic sorting), and we’re very interested in quantifying meta-cell algotypes and the cryptic (and deterministic) algorithms that end up being executed by them but are not explicitly in the code. Also interested in the emergent “data transfer” and how much the boundaries at each scale shield or do not shield the internal details. We have also some hypotheses about why clustering *might* be a universal pull (if not requirement) but those (derived from active inference arguments) will be tested soon and we’ll see.

  18. Perry Marshall

    This is incredibly interesting and I hope to find time to digest this to my satisfaction in the coming weeks. Meanwhile I believe this is a very productive line of inquiry.

  19. Tori Alexander

    Good answer: “I don’t know, have you ever tried training it? We won’t know until we do.”

    Bruce Clarke, elaborating on Lynn Margulis’ Gaia Hypothesis, makes a pretty good argument that the climate (if not the weather) is intentional (or the product of an intentional system). Eons of interaction between and among living systems has trained it.

    1. Mike Levin

      My gut feeling agrees. But to show it, we need to do experiments, using the tools of behavior science to find evidence of specific intentional properties. I bet they are there to be found.

  20. Michael Kearney

    To what extent could the architecture of the hardware influence what you see with these fascinating experiments? If you compile the program in different ways, or on different chips, could you possibly get different types of emergence? I get the multi-level competency concept, and the notion that what happens at one level of integration doesn’t need to understand what happens at the level below to interact. But to what extent could the physical instantiation of the algorithm lead to the construction of algorithmic environments that in turn lead to some types of emergence, at least transiently? Or can you rule this all out and attribute it purely to the code itself?

    1. Mike Levin

      good question. My gut says it won’t matter, but as I’m learning, our intuitions here are not a reliable guide, so maybe I better check…

      1. Micah Zoltu

        I believe we can assert that the hardware doesn’t matter here because we can fully predict the outcome of the system with 100% reliability. This is because we fully understand the algorithm and we can verify that the hardware is faithfully executing exactly the algorithm we gave it.

        This is quite different from biological systems, weather, and even simple molecular systems where we cannot predict the outcome 100% of the time, and we don’t fully understand the algorithm or underlying hardware.

        1. Michael Kearney

          The outcome is predictable but the transient clustering of the algorithms was unexpected. I was thinking particularly about this clustering and whether the way the compiler optimises the code might somehow lead to this, perhaps because it is indeed optimal in some physical way related to the structure of the hardware. Not that I know much about the translation of high level code to machine code. It would be interesting to see whether you could compile it in different ways so that the clustering varies, and then see what is most optimal in terms of time taken and energy used.

          1. Micah Zoltu

            In the case of computer software, optimizers are designed to ensure (provably) that the behavior of the code does not change. While there can sometimes be bugs in optimizers that result in behavior changes, a properly functioning one will behave exactly the same, just faster.

            That being said, you could tweak the algorithm in ways that us humans think shouldn’t change behavior but *could* change secondary behaviors like the clustering.

        2. Mike Levin

          > we can fully predict the outcome of the system with 100% reliability

          well sort of… We can predict the actual sorting functionality, and I’m sure if that was impacted by hardware implementation or compilers, this would have been found and fixed long ago. But actually no one had predicted the clustering, and it’s not 100% obvious to me (although, seems very likely) that unexpected behavior like that couldn’t actually be less constrained and wobble a bit due to underlying details, if no one is looking for it. We just don’t know enough yet I think about where it comes from in general or its properties, to know how tightly it’s constrained by the underlying levels that are totally, successfully hidden from the “canonical” function of a given algorithm.

          1. Micah Zoltu

            I can appreciate the line you are drawing near the edge of the spectrum and your desire to avoid claims of certainty, but we can formally verify modern computer hardware. This means that aside from a bug in an implementation, there would have to be something we got wrong in mathematics for unexpected results to occur. This makes the wobbliness around the edges seem very unlikely to me (near the point of not being worth mentioning).

            Note: By “unexpected results” here I just mean that a human sitting and stepping through the code using logic, mathematics, or a computer would all get the exact same very predictable outcome. We may be surprised by what we find at the end, but it is 100% deterministic from information we (humans) currently know and are capable of fully understanding (unlike biology).

            1. Mike Levin

              I need to be more specific. I’m not claiming that walking through with a machine code debugger will see any magic or errors – it won’t. The steps are being executed faithfully. But if you take that lens, you can do it to a human brain too and never see anything but chemistry. It’s not the only lens, and cognitive science gives us others (in the case of biological beings). What I’m saying is that unexpected things are observed using other lenses (such as, asking about clustering etc., which is normally not done for sorting algorithms) and that those lenses could, perhaps (I’m not claiming they do, just open to the possibility) detect surprising dependencies of *higher order* observables (not the micro states of the registers and the variables) on details of compiler/hardware. There is no magic or indeterminism at the low level – that’s why I like this model system. The surprise comes by looking for higher-level patterns, a certain “synchronicity” if you will, compatible with the low-level chance and necessity. I will do a longer piece on this.

              1. Micah Zoltu

                On all of this we agree! I am very rapidly coming around to the idea of higher order patterns that we humans fail to notice/recognize, and this study illustrates it well.

        3. Frank Schmidt

          Even a hint of agency/intention is a game changer. Can’t wait to see what further research reveals.

          1. Micah Zoltu

            To be clear, I wasn’t arguing against Michael’s claim of mild agency! Only that computers are deterministic and well understood. Determinism doesn’t prevent agency.

            1. Frank Schmidt

              “Determinism doesn’t prevent agency” – that’s the nut to crack. What is really going on?

              1. Mike Levin

                Well, what we mean by “agency” isn’t “random, unpredictable actions” so simple non-determinism doesn’t help. I’m working on a longer piece on relationship of agency and determinism, but others have written on this (Kevin Mitchell’s new book etc.).

                1. frank schmidt

                  Gonna get that Mitchell book pronto. Love this stuff. AI development will pivot on what you are doing.

                2. frank schmidt

                  Started listening to Mitchell’s latest book. Luv it! “Information vs. energy”. How and when did matter begin to tap into information and morph into life? And how and when did energy take on the cloak of information? And what roles do information and heat entropy play in the energy-information relationship? Is there some kind of phase transition going on between energy and information? All I can offer is questions.

                  1. Mike Levin

                    In that case, you will also want Nick Lane’s book.

                    1. frank schmidt

                      We dispute the phenomenon that enables us to dispute that phenomenon.

  21. Mahault Albarracin

    Really interesting study. I wonder if we could add more dimensions and consider deontic cues, along with semantic functions – the assumption that groups are attempting similar things, given a perspective (like group theory of mind)

    1. Mike Levin

      Please say more! How would we do it? We’re testing some active inference-related ideas, but I am sure we can use your suggestions. What would you measure in this system, for the group theory of mind?

      1. Mahault Albarracin

        https://www.preprints.org/manuscript/202312.1770/v1
        We tap into this idea here and https://www.mdpi.com/1099-4300/24/4/476

        But essentially we would try to determine what gets mapped by an individual node as signals for similarity of objectives (in goal direction) – given some time depth of policy (it does not have to be a perfect overlap, or overlap at every scale).
        The node should be able to map this signal as a reduction for group coordination: it is not simply recognizable as the signal for an objective, but for a coordinated objective, which then allows the node to recognize the distance to this objective given an individual signal (how well the signal represents distance to the group, and group alignment).

        We are thus looking for the capacity to predict a sheaf, and establish the space on that sheaf by any number of other nodes, by not having to compute the entirety of their trajectories, but rather the signals which signify a larger expected sheaf, one with known or desired outcomes.
        But each node would come with a specific perspective, and potentially, its own internal logic (giving rise to roles) – Thus we could essentially see these perspectives as sheaves which compose toposes, and if logically consistent or unobstructed, share enough of a boundary to be reconstitutable to lead to natural predictions of common goals.

        Such signals could come from the node identifying nodes with which it will have more precision, even if these nodes do not behave the way that it does necessarily.

      2. Frank Schmidt

        “Semantic relationships” – Yow!…This is getting better all the time. The quintessential mystery novel. I can’t wait for the next post. Keep it cranking. I will plug into Claude or OpenAI for “their” take on the chatter. Great stuff!

      3. frank schmidt

        Are you plugging these ideas into claude or openai just for kicks? I realize that the AI chats can hallucinate but I am finding that they are capable of some interesting output. If only to regurgitate (in a new way) ground already trod on. I have done this and it’s a hoot. Ya’ never know…

  22. Olli

    I am super interested to see the algorithms. The link provided in the preprint ('https://github.com/Zhangtaining/cell_research') doesn’t seem to work. Is there another way to look at the code? Thank you.

    1. Mike Levin

      Let me check; I think it might be because the paper is still under review and the link gets released when the paper publishes. I’ll look into it.

      1. Olli

        Thank you! I found this on Taining Zhang’s public Github: https://github.com/Zhangtaining/sorting_with_noise/tree/main

        Seems very much related but was written 2 years ago.

  23. Michael Thudén

    Really interesting!

    The sorting algorithm you use puts the numbers in order on a number line representing the natural numbers. The number line is hence restricted to R1. It would be interesting to see what kind of paths the stepwise changes in the algotypes would trace in R2 in a model I made where the natural numbers are represented in a layered circular binary grid.

    Each “compartment” in the grid can be unambiguously described by a binary string in a vector-like manner starting from the center; I called these Radial Combinations (RC). An RC from the center to the edge of the grid can point out a direction, and the grid can move along that RC. The grid can follow a set of RC:s (binary strings) forming a path in R2. One interesting property is that the grid can find its way back to the starting point by just flipping the most significant bit in every binary string in the set.

    Some other properties:

    * The grid can internally represent truth tables
    * The normal π/2 to a given RC can be found by adding a 1 to the second most significant bit (to find a π/4 RC to a given RC you add a 1 to the third most significant bit and so on).
    * The complement to a given RC will always be mirrored by a polarity line. So, if you flip all bits in all RC binary strings that forms a path you will get a mirror path.

    * A path described by a set of RC:s can be mapped onto a Cartesian coordinate system.

    The system can be expanded to R3 in two different ways (not illustrated in the presentation below).

    The presentation also includes an example of how the grid can perhaps be used to mimic ant path integration, which might be useful in biorobotics.

    The presentation is in PowerPoint format since it includes some animations that can’t be rendered in a PDF document.

    Hope you find something interesting!

    Best regards,
    Michael Thudén

    https://docs.google.com/presentation/d/16HecrthVPPyPFOBKHDOP8G31Zq6PYYZl/edit?usp=sharing&ouid=115162670909464159602&rtpof=true&sd=true

    1. Mike Levin

      Looks very interesting, let me take a look!

    2. Kine Hjeldnes

      I don’t know if I understand you correctly, probably I don’t, but is it possible, then, to see some connection to some type of “radix mechanics” or as formulated by Garcia-Morales (2014): The principle of least radix economy?

      1. Mike Levin

        I had never heard of the radix economy, I’ll check it out.

  24. Cameron Reynoldson

    Were the individual cells’ movements individually witnessed over time, or is it possible that the goals themselves moved/clustered/swapped hosts?

    Awesome stuff, keep it up.

    1. Mike Levin

      Good question! In the current implementation, the cells cannot change algotype, but that is indeed a next step – what happens if we allow them to change types (and the algorithms thus swap hosts).

  25. Bence Krénusz

    I have a question regarding the distributed concept of the solution.

    1. Are the cells distributed uniquely, meaning that every cell with its own ID is parallelized into a unique thread, or is there a sorting-class-based distribution where one unique thread represents a unique sorting mechanism?

    If it is the second scenario, I wonder whether there is an underlying CPU- or GPU-architecture-based mechanism where, as a result of an inner loop, cells with the same sorting type tend to be located next to each other.

    Thank You,
    Bence

    1. Mike Levin

      Each cell has its own thread.

      1. Bence Krénusz

        I have another question in mind, which I don’t know how it would be possible to test. If there were an infinite number of cells, what would the curve look like? Would the probability converge to 1?

        1. Mike Levin

          Great question. I suppose we can just try some large numbers and see the trend and how it scales?

          1. Bence Krénusz

            I think that would certainly give an estimate, thank you. Looking forward to the code.

  26. Bastian

    Very interesting work. I have the following questions/remarks regarding the clustering behaviour:
    Since you have different algorithm types that have to take different paths through the solution space, isn’t it probable that they cluster during the journey?
    In addition, the cells and algorithms do share information via the place and value of the cells. Of course this is just implicit since they share the same environment. But this feels a bit like message passing on a graph or coupled oscillators. Still unexpected, interesting emergent behaviour.

    1. Mike Levin

      Thanks. It’s not clear to me that the different algorithm types take very different paths through the solution space, and I think even if they took the same path with respect to sortedness, there are lots of ways to achieve that without clustering *spatially within* the array. And yeah, it’s entirely possible that they are implicitly doing something that is also well-described by message-passing or coupled oscillators – we’re checking those kinds of things now.

      1. frank schmidt

        Electron transport chains come to mind.
