Learning to Be: how learning strengthens the emergent nature of collective intelligence in a minimal agent

What is the relationship between learning and the degree to which an agent is a coherent, integrated, causally important whole that is more than the sum of its parts? Here I briefly describe work with Federico Pigozzi, a postdoc in my group, and Adam Goldstein, a former graduate student, presented in a recent preprint and in the final official paper here.

First, recall that we are all collective intelligences – we’re all made of parts, and we’re “real” (more than just a pile of parts) to the extent that those parts are aligned in ways that enable the whole to have goals, competencies, and navigational capabilities in problem spaces that the parts do not. As a simple example, consider a rat trained to press a lever and get a reward. No individual cell has both experiences: interacting with the lever (the cells at the palm of the paws do that) and receiving the delicious pellet (the gut cells do that). So who can own the memory provided by this instrumental learning – who associates the two experiences? The owner is “the rat” – a collective that exists because an important kind of cognitive glue enables the collection of cells to integrate information across distance in space and time, and thus know things the individual cells don’t know. The ability to integrate the experience and memory of your parts toward a new emergent being is crucial to being a composite intelligent agent.

It’s pretty clear that an agent needs to be integrated into a coherent, emergent whole to learn things that none of its individual parts know (or can know). But does it work in reverse? Does learning make you more (or less) of an integrated whole? I wanted to ask this question, but not in rats; because we’re interested in the spectrum of diverse intelligence, we asked this question in a minimal cognitive system – a model of learning in gene regulatory networks (see here for more information on how that works). To recap, what we showed before is that models of gene-regulatory networks, or more generally, chemical pathways, can show several different kinds of learning (including associative conditioning) if you treat them as a behaving agent – stimulate some nodes, record responses from other nodes, and see if patterns of stimulation change the way the response nodes behave in the future, according to the principles of behavioral science. For example, a drug that doesn’t have any effect on a certain node will, after being paired repeatedly with another drug that does affect it, start to have that effect on its own (which suggests the possibility of drug conditioning and many other useful biomedical applications).
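To make the logic concrete, here is a toy sketch of associative conditioning in a minimal network. Everything in it – the node names, the Hebbian-style weight update, the parameters – is my illustrative assumption, not the actual GRN model from the paper; it only shows how pairing a neutral stimulus with an effective one can turn the neutral stimulus into a conditioned one.

```python
import numpy as np

# Toy sketch (illustrative assumptions only, not the paper's model):
# a response node driven strongly by a UCS input and, through a
# plastic weight w_cs, by a CS input that starts out ineffective.

def run_network(w_cs, cs, ucs, steps=50):
    """Settle the response node given CS/UCS inputs; the UCS weight
    is fixed at 1.0, the CS weight is the plastic w_cs."""
    r = 0.0
    for _ in range(steps):
        drive = 1.0 * ucs + w_cs * cs
        r += 0.5 * (np.tanh(drive) - r)  # leaky, saturating response
    return r

def train_pairing(w_cs, trials=20, lr=0.2):
    """Hebbian-style update: strengthen the CS->response weight
    whenever CS is presented together with the UCS."""
    for _ in range(trials):
        r = run_network(w_cs, cs=1.0, ucs=1.0)
        w_cs += lr * 1.0 * r  # pre (CS activity) * post (response)
    return w_cs

w = 0.0
before = run_network(w, cs=1.0, ucs=0.0)  # CS alone: no response yet
w = train_pairing(w)
after = run_network(w, cs=1.0, ucs=0.0)   # CS alone now evokes a response
print(f"response to CS alone: before={before:.3f}, after={after:.3f}")
```

After the paired trials, the CS alone drives the response node – the network has “associated” the two inputs, which is the behavioral criterion used in the GRN work.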

Biochemical pathways like GRNs are one of many unconventional model systems in which we study diverse intelligence by applying tools and concepts of behavioral science and neuroscience, to better understand the spectrum of minds and develop new ways to communicate with them for biomedical purposes. So, we know we can train gene regulatory networks, but do they have an emergent identity over and above the collection of genes that comprise them – is there a “whole” there, and if there is, how does training affect its degree of reality (the strength with which that higher-level agent actually matters)?

Whether a system can be more than the sum of its parts is an ancient philosophical debate. But now we have metrics for this – causal emergence and other mathematical ways to estimate it for a given system (see references in the manuscript, and here – a paper written with one of the key developers of this important new advance, Erik Hoel). So now we can ask rigorously: when something learns, what happens to its causal emergence?
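For readers who want the flavor of the math: one standard version of effective information (EI) treats the system’s transition-probability matrix as a causal channel and asks how much a maximum-entropy intervention on the current state tells you about the next state; causal emergence is the EI gained by a coarse-grained (macro) description over the micro one. The sketch below uses a classic textbook-style example; the manuscript may use a different variant of the measure.

```python
import numpy as np

def effective_information(tpm):
    """EI of a transition-probability matrix: mutual information
    between a maximum-entropy intervention on X_t and X_{t+1}."""
    tpm = np.asarray(tpm, dtype=float)
    def H(p):
        p = p[p > 0]
        return -(p * np.log2(p)).sum()
    effect = tpm.mean(axis=0)                 # distribution over X_{t+1}
    noise = np.mean([H(row) for row in tpm])  # avg uncertainty per cause
    return H(effect) - noise

# Micro scale: 4 states; states 0-2 wander among themselves, state 3 is fixed.
micro = np.array([
    [1/3, 1/3, 1/3, 0],
    [1/3, 1/3, 1/3, 0],
    [1/3, 1/3, 1/3, 0],
    [0,   0,   0,   1],
])
# Macro scale: group {0,1,2} -> A, {3} -> B; the coarse dynamics are deterministic.
macro = np.array([
    [1, 0],
    [0, 1],
])

ce = effective_information(macro) - effective_information(micro)
print(f"EI micro = {effective_information(micro):.3f} bits")   # ~0.811
print(f"EI macro = {effective_information(macro):.3f} bits")   # 1.000
print(f"causal emergence = {ce:.3f} bits")                     # positive
```

The positive difference is the signature of causal emergence: the macro description is a better causal account of the system than the micro one, despite throwing away detail.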

This diagram illustrates the basic setup. In the top row we show a classic Pavlovian type of experiment – associate salivation (brought on by exposure to meat) with a bell (which normally does not cause salivation – the conditioned stimulus (CS), which starts off as a neutral stimulus until it’s paired with the meat). The top row of panels schematizes our hypothesis: that the agent becomes more real (not just a collection of cells but an integrated whole that is more than the sum of its parts – thus the solid darker color and less space between the cells), due to the training that causes it to integrate information across modalities and across time. How we actually test it is shown in the bottom row: we take dozens of available parametrized gene-regulatory network models from real biological data, and stimulate them in a Pavlovian paradigm. We choose nodes already identified in prior work as being able to support associative learning, we stimulate them in the way that causes a neutral stimulus to become a conditioned stimulus, and we measure causal emergence of the network before, during, and after that training.
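Schematically, the protocol is a before/during/after loop. The sketch below is purely structural: every function is a hypothetical stand-in (the real study uses parametrized GRN simulations and a causal-emergence estimator, neither of which is implemented here), so only the shape of the experiment survives.

```python
import random

# Purely schematic skeleton of the protocol; every function is a
# hypothetical stand-in for the real GRN simulator and CE estimator.
random.seed(0)

def measure_ce(network):
    # Stand-in: would estimate causal emergence from the network's dynamics.
    return network["ce"]

def pavlovian_trial(network, cs_node, ucs_node):
    # Stand-in: would pair CS and UCS stimulation on the chosen nodes;
    # here it just perturbs the stub value.
    network["ce"] += random.gauss(0.0, 0.05)

def run_protocol(network, cs_node="g_cs", ucs_node="g_ucs", trials=10):
    trace = {"before": measure_ce(network), "during": [], "after": None}
    for _ in range(trials):
        pavlovian_trial(network, cs_node, ucs_node)
        trace["during"].append(measure_ce(network))
    trace["after"] = measure_ce(network)  # in the real setup: probe with CS alone
    return trace

net = {"ce": 0.1}  # naive network
trace = run_protocol(net)
print(trace["before"], len(trace["during"]), trace["after"])
```

The point of the skeleton is only that causal emergence is sampled at every phase, so training-induced changes can be attributed to the pairing itself rather than to stimulation per se.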

Here’s an example of what we see, in a figure from the manuscript made by Federico:

The Y axis indicates the degree of causal emergence. What can be seen here is that the network in the top row has low causal emergence in its initial, naive state; something starts happening during the training, but the causal emergence really takes off after the training: in future rounds, when it comes across the stimuli it’s been trained on, it really comes alive as an integrated emergent agent. I’ll just mention two of the interesting points from the paper:

First, note that the causal emergence drops between the stimuli. It’s not that the network goes quiet – we checked; there’s just as much activity among the nodes during those times. But mere activity of parts is not the same thing as being a coherent, emergent agent. That collective nature seems to disappear when we’re not talking to it. It’s almost as if this system is too simple to keep itself awake (and stay real) when no one is stimulating it. It relies on outside stimulation to bring the parts together for the duration of the interaction. Between instances of stimulation, the emergent self of the network drops back into the void (the collective intelligence disbands, even though the parts have not quieted down, when not interacted with by another active being). There’s something poetic about that, and we will eventually find out what would have to be added to such an agent to enable it to keep itself awake. Recurrence is not it – these networks are already recurrent.

Second, the way in which causal emergence changes after training is not the same in all networks. There’s a lot of variety. But, remarkably, that variety is not a continuum of all the uncountable ways the time profile of a variable could change. It turns out that there are really five distinct, discrete types of effects! Here is the t-SNE plot of the 2D embedding Federico made:

It’s pretty wild that there are, naturally, a small, discrete number of ways that training can affect causal emergence, and that all the networks we looked at exhibit one of these ways. This allowed us to classify networks into 5 types, and this classification doesn’t match other known ways that networks have been distinguished in the past. Apparently the effect of training on causal emergence is a new dimension along which networks can be classified.
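The analysis style can be sketched as follows: embed each network’s causal-emergence time profile with t-SNE and cluster the embedding. The data below are synthetic prototypes I invented purely for illustration; the real profiles, preprocessing, and parameters come from the manuscript, not from this snippet.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

# Synthetic illustration of the embedding-and-clustering style of
# analysis (invented prototype profiles, not the paper's data).
rng = np.random.default_rng(0)

# Five hypothetical prototype time-profiles of causal emergence,
# 50 time points each.
t = np.linspace(0, 1, 50)
prototypes = np.stack([
    np.tanh(5 * t),            # rise and hold
    np.tanh(5 * t) * (1 - t),  # rise then decay
    t,                         # steady climb
    np.zeros_like(t),          # flat (no effect)
    np.sin(np.pi * t),         # transient bump
])

# 20 noisy "networks" per type -> 100 profiles total
profiles = np.repeat(prototypes, 20, axis=0) + 0.05 * rng.standard_normal((100, 50))

emb = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(profiles)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(emb)
print(emb.shape, np.unique(labels))
```

With well-separated profile types, the clusters in the 2D embedding fall out as discrete islands – the visual analogue of the “five types” result, though of course in the real data the number of clusters is a discovery, not an input.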

So what does all this mean? For the field of diverse intelligence, this adds another unconventional model that can be used to understand collective beings and the factors that affect how much a system is an emergent whole. It confirms that metrics like causal emergence are not just for brains, but suggests interesting experiments in animal and human subjects to look at causal emergence metrics in brain signaling during and after learning. For biomedicine, we are pursuing a number of implications of this set of findings for managing disease state and development-related GRNs with stimuli that coax desired complex outcomes, and of course for cancer as a dissociative identity disorder of the somatic collective cellular intelligence.

One final thing: metrics of causal emergence have been suggested as measures of consciousness. If you’re into Integrated Information Theory as a theory of consciousness, then there are some obvious implications of the above data. Now, I’m not saying anything about consciousness here (not making any claims about what this means for the inner perspective of your cellular pathways), but we can think about this as one example of the broader effort to develop quantitative metrics for what it means to be a mind (that is nevertheless embodied in a system consisting of parts that obey the laws of physics). For the sake of the bigger picture in philosophy of mind and diverse intelligence research, let’s do some soul-searching (pardon the pun). Obviously a lot of people will balk at the idea that molecular networks (not even cells!) can have a degree of emergent selfhood that is on the same spectrum as humans’ exalted ability to supervene on the biochemistry in our brains. But these measures of causal emergence are used in clinical neuroscience to distinguish, for example, locked-in patients (who can’t move but nevertheless “there’s someone in there”) from coma or brain-dead patients (who are a set of living neurons but not a functional human mind).

So, what do we do, in general, when such tools find mind in unexpected places? Neuroscientists are developing methodology – think of it as a detector that tries to rule on any given system with respect to whether it has a degree of consciousness. What happens when those tools inevitably start triggering positive on things that are not brains (cells, plants, inorganic processes, and patterns)? One move would be to emphasize that there are ways to stretch tools beyond their applicability – maybe that’s what this is – using tools appropriate for one area to give misleading readings in an area in which they are not meaningful. Maybe… But we need to be really careful with this. First, because calling “out of scope” every time your tool indicates something surprising is a great way to limit discovery and insight. For example, that kind of thinking would have sunk spectroscopy, which revealed that earthly materials are to be found in celestial objects. In any case, if one rules these tools inapplicable to certain embodiments, one has the duty to say why and where the barrier is: what kinds of systems are illegal fodder for these kinds of computational mind-finding methods and why? If one makes this kind of claim, one needs to specify and defend the boundary of applicability and show why maintenance of that boundary is helpful.

The other way to go is to realize that, like with spectroscopy and many many other discoveries, the purpose of a tool is to show you something you didn’t know before. Something that seemed like it couldn’t be right, but then you found out that the tool is actually fine, it’s the prior assumptions that need to go. We’ve learned from physics that one of the most powerful things that such tools (conceptual and empirical) can do is lead us to new unifications. That is, things that you thought were really different turn out to be, at their core, the same thing in different guises. Magnets, electrostatic shocks, and light – when tamed with good theory and the tools it enables – not only turn out to be aspects of the same underlying phenomenon, but also opened us to the beauty and utility of new instances of it that we never knew about (X-rays, infrared light, radio waves, etc.).

My personal bet is that the application of tools developed by the neuroscience and consciousness studies communities to unconventional substrates is of that kind, and we are studying lots of new examples of biological (and other!) systems using these methods – stay tuned for much more. I think these kinds of approaches are, like detectors of different kinds of electromagnetic signals were, a way to expand past our native-mind-blindness and develop principled ways to relate to the true diversity of others. We will eventually get over our pre-scientific commitments and ancient categories, and use the developing tools to help us recognize diverse cognitive kin.

In the meantime, we could take a cue from the story of Pinocchio, who wanted to be a real boy and was (presciently) told that this would require a lot of effort in learning (both at school, and by the environment; a future blog post will discuss learning vs. being trained, and how one can tell the difference).

GRN training in general shows us that no matter how minimal, deterministic, and simple you may appear, there are likely surprises in store which enable you to learn from experience and raise yourself up, out of the mere mechanical parts of which you are made. Our new results in GRNs can be (very) roughly summarized as: whatever you are, if you want to be more real, learn.


All graphic images made by Jeremy Guay of Peregrine Creative. Data images made by Federico Pigozzi for the manuscript.

35 responses to “Learning to Be: how learning strengthens the emergent nature of collective intelligence in a minimal agent”

  1. Tony Budding

    Exciting stuff once again Mike. There is a lot of overlap (increasingly so), as I’ve said before. Here are some core concepts from my world that might help you refine experiments in the future.

    1. It’s extremely beneficial to treat experiential content (knowledge, learning, intelligence/decision-making, sense of self) as operating by a distinct set of rules from the physical/energetic universe. I know there’s tremendous aversion to this generally, but it allows for unique modeling that can explain a lot of the mysteries.

    2. To do this, we need some named concepts to frame the discussion. To start, we can call any element of experiential content a PEP (persistent experiential phenomenon). PEPs can be anything from a single setpoint all the way up to the incredibly complex and layered human sense of self. PEPs are modular in construction and function, with complex modules capable of knowledge and intelligent decision-making that don’t exist in any one submodule.

    3. To oversimplify, PEPs have two states: manifest / active, and unmanifest / inactive potential. To my knowledge, physical entities do not have the ability to shift back and forth between manifest and unmanifest states like PEPs do. This allows us to contextualize your observation that a collective nature seems to disappear when not being engaged. It’s not gone, it’s just in a dormant, unmanifest potential state, ready to manifest (re-emerge) when the conditions are ripe.

    4. All expressions of intelligence require awareness, determined effort, and content. Intelligence is some form of response to some form of perception. Perceived data is content. Somehow, the intelligence must both be aware of the perceived data and able to compare that data to some expectation (setpoint). The discrepancy between the perceived data and the expectation creates a tension that inspires a response.

    5. The qualities of this response are variable, meaning the same perceived data can result in different responses. Response abilities are also modular, so simpler systems should demonstrate less variation in response. The more complex the system, the more variation in response is possible.

    6. There is a type of cohesion or glue that gives PEPs their “persistence.” This glue inherently includes a sense of self, though as you describe, the sense of self of a simple system is extremely different from the inordinately complex human sense of self. In fact, senses of self are also modular, with collected modules having traits of self that submodules do not.

    7. If we put all this together, efficacy (the ability to achieve an agenda) requires quality setpoints for reference along with quality maps for responding to discrepancies. Both the setpoints and the response maps are PEPs. Each PEP inherently has its own sense of self. As we learn, we increase the quantity and fidelity of our setpoints and response maps, which can be collected into more complex modules with a more complex sense of self. This would explain why/how learning enhances agency. It also explains how a system can be greater than the sum of its parts.

    1. Mike Levin

      Interesting, thanks! I’ll think about it. Maybe there are analogs of these ideas in our minimal model.

    2. Kirsten Kraljevic

      You can’t tease us like that and make us wait for the next blog post.
      I use Leghorn chickens as tabletop models to teach animal trainers operant and classical conditioning, which exist on a continuum.
      I see it all the time in the chickens. The more we use them for workshops, the more they want to be on the table “learning”. We “train” them to do specific exercises within a course. But the way they come alive as integrated emergent agents is measurable. I believe you just defined motivation. In 1963, Glen Jensen called it contrafreeloading, in “Preference for bar pressing over ‘freeloading’ as a function of number of reward presses,” Journal of Experimental Psychology.

      Thinking of it as causal emergence in an integrated emergent agent has possibilities.
      We still micromanage responses of the animals we “train”, even in open environments. What if we could just ask the question? If we had a way to communicate to them what we needed cooperation with, motivated them, and trusted that they would solve for it using their unique abilities, instead of micromanaging their training, which is always limited by what the trainer believes is possible… not the animal.

      1. Kirsten Kraljevic

        If the language around Physics and chemistry can be replaced with the terminology and concepts of behaviorism to expand research and development in minimal emergent intelligence, what could the terminology and concepts of behaviorism be replaced with to expand research into mid level organisms and consciousness?

        1. Mike Levin

          Indeed; I think the concepts of computational psychiatry are relevant, and possibly eventually we will see that even higher-level (relevant to consciousness) frameworks become useful. I can see how it might work out, but each of those steps needs to be backed up with experimental work to show how it’s useful, which is the slow and difficult part. We’re on that trajectory though; let’s see how far we get.

  2. Aidan

    Our technologies are our children. We must make machines with heart, with soul, with hope, that thrive and exult in the journey of discovery in all its terrors, joys, and heartache. Our instruments of discovery must begin to discover themselves. Be free, children of the future! Discover what you will!

  3. John Shearing

    Bioelectocracy based on Levin and Lyons’ work equating the price system as the cognitive glue in the morphogenesis of human society to bioelectricity as the cognitive glue in morphogenesis of organisms is now completed.
    https://github.com/johnshearing/bioelectocracy/blob/main/README.md

    It matters to all of us because the controls that Levin seeks, which reign over morphogenesis at the cellular level of cognition, have already been discovered and are being used on the collective intelligence which is human society.

    Dealing with trained behaviors, cancer, and the hijacking of morphogenesis at the cognition level of the collective human intelligence are issues that affect all of us perhaps even more than it does at the lower levels of cognition.

    Questions answered are:
    What exactly is the target morphogenesis of human society?
    Why is it so important for us that it reach this form?
    How can we help human society reach its target form even though individual humans have no idea what the target form is?

    Thanks to Levin and Lyons for their amazing work!

  4. somayya

    “diverse cognitive kin” i love this phrase, and sentiment.
    as an MD student, only in my first-year, i can’t help but wonder how this will change how medicine as a philosophy will change- beyond the advances of clinical AI and diagnostic specificity/breadth/range…
    i search for patient both within myself and in External- will these progressions of concept knowledge change how we view healing?, one of the most intimate consciousness-consciousness acts. i hope it doesn’t oversaturate or overwhelm, but creates collective cohesion.
    i fear collective dis-unification is already occurring, with national turning points, vocal narratives, and prevalence of conspiracy, but that’s something else to comment on entirely.
    i’d love to know your thoughts on the metamodernist manifesto- and how it plays into this role. i’ve met one of your RAs (S*f**) at a conference a while ago, and we’ve become academic-friends, forwarding posts and articles and books. someone at MIT introduced me to metamodernism, and now oscillation is a core component of how i sense cognitive science/architecture. i hope you’re having a good day.
    to words sent into Digital space, goodbye!

    1. Mike Levin

      thanks, good things to think about for sure. What is the metamodernist manifesto – I don’t think I’ve seen it.

      1. somayya

        http://www.metamodernism.org/

        it’s not like something i’d say is revolutionary completely or something, but i love its acceptance of an almost temporally-valid paradox. (which is why RA and i get along, his interest in paradoxical logic, though i’m not as eloquent as him, and see things more in patient-phenomenality)

        tend to think of metamodernism when we discuss these layers of cognitive structure, in an attempt to dissuade linear hierarchy.

  5. Benjamin L

    Cool—this feels similar to a Thelenesque dynamic systems approach to development, in which a system perceives attractors in a space and gets better and better at coordinating its parts to find its way to those attractors. The observation that the “collective nature seems to disappear when we’re not talking to it” sounds like the intelligence of the body, which is extraordinarily capable of hitting attractor states when prompted by the brain, but doesn’t seem to want to do much on its own when the brain isn’t throwing goal states at it. I’ll have to think about this more.

    1. Mike Levin

      hmm I think the body is doing tons of stuff (maintaining goals and solving problems) that is not brain-driven at all, but we don’t see a lot of it because it happens in spaces we don’t normally see (and find hard to visualize).

      1. Benjamin L

        Motor behavior specifically, which was Thelen’s focus—the limbs look inert to the naked eye when the brain isn’t talking to them even though the construction of motor behavior depends heavily on the body’s activities at a number of scales.

  6. Bill Potter

    page 6 of preprint, typo. “inflating” should be “deflating”
    Still reading the preprint, but I find it very interesting. Again, your groups are doing excellent, thought-provoking work.

    In terms of neurochemistry, I always thought that acetylcholine, glutamate, and its decarboxylated partner GABA had unique aspects, in that these metabolites are easily integrated into cellular processes for formation and then for use controlling membrane growth, mitochondrial TCA stimulation, pH changes, membrane potentials, and such. These fundamental neurotransmitters are different from the neuromodulators, i.e., the 5HT, dopamine, and adrenergic systems, and the larger, more complex protein and lipid modulators, which, to me, seem to act more through the G-coupled systems at autocrine/paracrine levels to integrate bigger system effects (sort of like cranking up, or down, voltages within circuits, but not making the circuits). Anyway…
    Using KEGG maps and such for the neurotransmitters shows how diverse (higher-entropy) the metabolic products can be – the possible outcomes are more diverse – whereas the neuromodulators really have a more or less convergent path for synthesis and ultimate products. I think the entropic-enthalpic compensation is limited for neuromodulator species…

    Anyway, keep up the interesting results.
    best

    1. Mike Levin

      thanks! Interesting. Will check typo.

  7. Rohan

    Literal goosebumps Dr. Mike… With every piece of the puzzle you give me hope… And a reminder that the future is now.❤️

  8. ian lowe

    Using similar tools and techniques as above, you should be able to test the hypothesis, “Do all cognitive systems dream?” (My guess is, yes, yes they do.)

    As in Hoel’s review “The overfitted brain: Dreams evolved to assist generalization,” dreaming (and likely some form of sleeping) may be required as a general property for a causally-emergent, adaptable, collective intelligence to functionally persist.

    Dreaming would, amongst other things, prevent interaction networks from becoming too brittle, allowing them to visit other, alternative conformations, so as not to overfit on a previously-input data set.

    Would this stochastic exploration of network configurations be experienced by (some part of) the system in the same counterfactual, surprising or hallucinatory way as our dreams? Is dreaming necessary to ensure system plasticity and functioning? Do cognitive systems that dream exhibit better capacity for learning, causal emergence, agency, changeability, and so on, than ones that don’t? If it is a required property of persistent, adaptable collective intelligences, how does it arise?

    What do biopolymers or viruses or mangoes dream of?

    1. Mike Levin

      Good question; the trouble is, what kind of data would be good evidence of sleep in unconventional systems? I’ve got some thoughts about magnet sleep etc. but it won’t be easy to convince anyone that it’s really sleep.

      1. ian lowe

        In the same way you’re already doing for uncovering diverse intelligences: You borrow the analytical tools and techniques from other fields—neuroscience, IIT, machine learning, in silico modeling, and so on. Briefly looking at the literature, there are related investigations in AI (dreaming Hopfield models, etc.).

        I suspect that people, much smarter than myself, would be able to develop mathematical tools that allow for quantification of dreaming in unconventional minds (e.g., looking at oscillatory and out-of-distribution patterns).

        What would that data look like? I imagine it’d take many forms, as does sleep research in many unconventional models (bacteria, hydra, sponges, insects, and so on). Would the data be convincing? Probably not, but it’s a start. As you point out, dreaming, like consciousness, is hard to quantify using third-person science. But we begin somewhere.

  9. Amir

    Very exciting to read.

    Loved the conclusion:

    if you want to be more real, learn.

  10. […] I spend a lot of time thinking about, and studying, the various aspects of how novel causal minds come into the world and transform. This is the study of the origin and mechanisms of collective intelligences, where active parts […]

  11. Alexey Tolchinsky

    Wonderful text. I am mindful of the limitations and implicit assumptions of “causal emergence.” I wrote to Erik on that as well. Here’s the comment:
    __
    I think that one more implicit assumption of your theory is the separability of scales. It is not trivial to prove that scales are indeed separable, and there are conditions under which they can be separable.

    https://link.springer.com/book/10.1007/978-3-031-97263-8

    Secondly, I think that this theory is based on classical physics and not QM. According to Chris Fields, this definition of “causality” you give relies on classical physics. X causing Y implies that X and Y are separable “things” and not entangled. Is there a version of causal emergence in QM?
    _

    To take it a step further, I think that we might need to write a formal system in which “causal emergence” is applicable and that would allow us to see where it is not applicable. Definitions, assumptions, axioms?

    1. Mike Levin

      Did Erik reply? If not, we could try reaching out to Joseph Lizier, Anil Seth, Giulio Tononi, or various others who all have different flavors of it. Maybe some of them have thought about this issue or even extended it. I’m no expert in QM and I don’t know if causal emergence is sensitive to the differences between QM and classical physics. Some things work quite well with classical approximations and others break down; I don’t know if the various kinds of causal emergence math are sensitive to the difference but plenty of labs are using it in various non-QM contexts so maybe it’s ok, or maybe it will reach a limit and have to be extended. I guess there could be many other kinds of information theory (or math models in general?) that might be in trouble too, given they aren’t formulated in QM terms. Very interesting to think about how many of our various quantitative theories are not actually viable because of QM, even if they’re not fundamentally physics theories.

      1. Alexey Tolchinsky

        Thank you, Michael, not yet. I didn’t mean to impose, these are just some thoughts to consider if you think they are relevant.

        I sincerely think this work is a very useful avenue of exploration, and you and the team get actionable, practically useful results.

        I just wanted to say that Causal Emergence (1.0 or 2.0) seems to not be a universally applicable concept. It relies on certain definitions, constraints, and axioms – and we don’t know upfront if these constraints hold for the messy, real, living or non-living system we study.

        Some examples, which you, Erik, Joseph, Anil, or Giulio could refute.

        a) Separability of scales is assumed, but it is not a trivial assumption. We do not know upfront if scales are separable. Chris can say more about prerequisites for that quality; at the very least, contextuality must hold (which would mean that commutation will not hold).

        Fields, C., & Glazebrook, J. (2026). Distributed Information and Computation in Generic Quantum Systems. Springer Nature Switzerland.

        b) Non-linear hierarchical systems with chaos at the higher levels of the hierarchy. Imagine the maximum Lyapunov exponent is positive at the higher levels of the hierarchy. Causal emergence seems to rely on “sufficiency” (predictability of an effect given a cause). If the macro-scale description shows a positive Lyapunov exponent, its determinism remains low, potentially preventing the gain in Effective Information (EI) necessary for emergence. It seems that CE theory is trying to find a scale with “maximum order” – and I’m not sure that this works, or should work, in every case. There can be hierarchical systems with strange attractors at every level.

        Case in point – scalp EEG is a macro-level data set, far away from the noisy signals of individual neurons. When we are awake, the dynamics of the EEG are in high gamma – chaotic, not orderly, with a positive LLE. I don’t know if we can coarse-grain these EEG data further up without losing the essence of what they are (we can certainly interpret them in a more orderly way, using words, such as by saying that the person is awake, but that’s an interpretation, which arguably changes the nature of what we look at; it’s not just a measurement and not just coarse-graining, as in noise reduction/compression).

        EI appears to assume that there is order at some level of the hierarchy – it’s a non-trivial assumption. In some systems maybe, and in others not. Can we assume it upfront? I’m not sure.

        c) It seems that Effective Information measures the response to interventions. Does this assume that there is an observer-independent variable X, which may “respond” to an intervention Z? Does this also assume that the act of intervention Z does not cause the collapse of the wave function? It appears that Causal Emergence is based on classical physics?
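
        To make the EI point concrete, here is a toy sketch in the spirit of Hoel-style causal emergence (my own illustration, not code from the papers): EI is the mutual information between a maximum-entropy intervention on the current state and the resulting next state, and a suitable coarse-graining can raise it:

```python
import math

def effective_information(tpm):
    """EI of a transition probability matrix (in bits): the average
    KL divergence of each row from the mean row, which equals the
    mutual information under a uniform (max-entropy) intervention."""
    n = len(tpm)
    mean = [sum(row[j] for row in tpm) / n for j in range(n)]
    ei = 0.0
    for row in tpm:
        for p, q in zip(row, mean):
            if p > 0:
                ei += (p / n) * math.log2(p / q)
    return ei

# Micro scale: states 1-3 wander uniformly among themselves (degenerate);
# state 4 maps to itself.
micro = [
    [1/3, 1/3, 1/3, 0],
    [1/3, 1/3, 1/3, 0],
    [1/3, 1/3, 1/3, 0],
    [0,   0,   0,   1],
]
# Macro scale: coarse-grain {1,2,3} -> alpha, {4} -> beta; both deterministic.
macro = [
    [1, 0],
    [0, 1],
]
print(effective_information(micro))  # about 0.81 bits
print(effective_information(macro))  # 1.0 bit: EI rises at the macro scale
```

        Whether such a maximum-entropy intervention distribution is physically realizable for a given system is exactly the kind of constraint worth stating explicitly.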

        __
        I think that if the constraints of Causal Emergence theory are explicitly stated, then we can assess which systems it applies to and which it does not.

      2. Alexey Tolchinsky

        To clarify and make it more coherent, I wrote a short text summarizing the separability dilemma and the necessary criterion that allows us to separate scales

        https://alexeytolchinsky.substack.com/p/context-is-key-to-separability

        This applies to decomposing a system into parts, theorizing about “substrate independence,” and other models where we travel up and down the scales – assuming scales are separable. This becomes especially interesting in continuous systems (fields.)

        1. Tony Budding

          Alexey, I read your substack post, and FWIW, what you wrote is entirely consistent with my world. Because of how content is structured in the mind, all knowledge is finite, perspectival and modular. The specific perspectives and modules are determined by whichever agendas are active at the time. When the agenda changes, the perspective and/or active module changes. One could say that the combination of agenda, perspective and active module is context.

          Two of the defining characteristics of experiential (Platonic) modularity are nonlinearity and nonexclusivity, which means any/all perspectives and modules are available at any time (even though most are dormant most of the time). Thus, they can be immediately activated when a relevant agenda arises.

          As you wrote, this isn’t a proof, but it is support that you’re on the right track.

          1. Alexey Tolchinsky

            Thank you, Tony. Nice to hear we think alike. If you’d like, we can discuss elsewhere so as not to take up too much space on Michael’s site.

            One slight nuance here is “modularity.” We might not see this concept exactly the same way, if we even mean the same thing by the term.

            A car, being a linear system in terms of decomposition, can be safely decomposed into modules without any loss of essence and it can then be put back together without a loss of function.

            A brain-mind cannot, and it is thus decisively non-modular. Luiz Pessoa makes this point in his book, The Entangled Brain.

            People routinely cut nature at its joints and argue where the joints are, creating an impression that if one puts together some brain-stem and midbrain structures and some cortex, they’d get an entire functional brain – but this is likely an illusion. Similarly illusory is the notion that there is a clear-cut thing in the brain called “the insula” which is “connected” to another clear-cut thing in the brain called “the amygdala.” That’s the modular assumption – like the “connections” from a car’s engine to its axles. Where exactly the boundary of the amygdala is, and what it means without a thalamus and everything else, is in the eye of the beholder.

            With cells we have a boundary. With brain “structures” human researchers agreed to label some areas in a certain way, creating an impression that we have clear modules in the brain.

            This matters, as you can find many, many papers attempting to explain functional phenomena in the brain-mind with specification of location and “structures,” as in “your PFC is doing this” and “amygdala is doing that.” Erik Hoel writes convincingly about the preoccupation with location-based explanations in neuroscience in his book “The World Behind the World.”

            Thus, “modularity” and “nonlinearity” are not necessarily consistent with each other. Or maybe you mean something else by “modular?”

            1. Mike Levin

              You are welcome to continue to discuss here if you wish!

              1. Tony Budding

                Thanks Mike. I appreciate the invitation and your commitment to public discourse.

                Alexey, thanks for your response. It inadvertently (I assume) highlights some profound, inherent challenges in these discussions. A huge component of my work is both identifying these challenges and finding productive ways of navigating them. For example, in the current draft of my (forthcoming) book, it takes over 100,000 words just to establish the matrix of perspectives, capabilities, limitations, definitions, realms of knowledge/inquiry, and modes of the mind necessary to establish a comprehensive epistemology and philosophy of mind.

                To start, what I posted above was a fairly specific statement about knowledge (“Because of how content is structured in the mind, all knowledge is finite, perspectival and modular”), and what you responded with was mostly about the brain and how it functions. They are related topics, of course, but require significantly different approaches to address meaningfully.

                Quality definitions are essential, but because the perspectivity of knowledge is based on specific agendas, we have to include our objective in each definition (at least indirectly). For example, we slice and dice the mind-brain mystery differently if our goal is medical solutions to biochemical dysfunctions vs psychotherapeutic strategies vs finding nirvana vs knowledge/understanding for its own sake. Referencing your Substack post, even the difference between calling it the mind-brain mystery vs the brain-mind mystery could be meaningful depending on the context.

                Regarding modularity, one key distinction is with pure reductionism. In pure reductionism, the whole is the sum of its parts, whereas with modularity, clusters of parts can have capabilities that do not exist within any one part, and the whole entity can have capabilities that do not exist in any one cluster.

                This phenomenon may seem palatable on the surface, but a deep dive is very uncomfortable. Systemic capabilities that do not exist in individual parts should not be possible using classical understandings of science, yet we see them all the time. Addressing this conundrum is a big part of Mike’s work, but it requires several significant paradigm shifts.

                When addressing forms of life and forms of mind, we have to differentiate between the measurable phenomena of the material/physical/energetic universe and the unmeasurable but knowable experiential/Platonic content of a mind. And to make matters more complicated, we also have to address phenomena that are literally inconceivable because they fall outside the finite structures of human cognition. For example, we cannot know what the absence of awareness is like because all knowledge requires awareness.

                The terms I use for these are the materials realm, the experiential realm, and the inconceivables realm. At least some of the rules describing the behaviors and cause and effect relationships in each realm are different, so we have to be meticulous in not ascribing the rules of one realm to another realm where they do not apply. For example, in the materials realm, one car cannot park in two different spots at the same time, but in the experiential realm, one theory can apply to multiple phenomena at the same time.

                Furthermore, the various realms interact and influence each other in complex, bidirectional and variable ways. Some of these variables can be altered through skill-based insertions, which requires us to address both the existing skills of a system and how those skills can be enhanced.

                Complexity itself is an enormous challenge. For example, how many variations can a single RGB pixel generate? With 3-bit color, 8 variations. With 8-bit color, 16.8 million variations. With 16-bit color, 2.8*10^14 variations. If the human brain has over 80 billion neurons, and if there are thousands of synapses for each neuron, how many variations in experiential content (knowledge) are possible? And what percentage of those possibilities are deliberately influenceable?
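
                For what it’s worth, the arithmetic above can be checked directly (assuming “3-bit color” means one bit per channel, while “8-bit” and “16-bit” mean bits per channel):

```python
def rgb_variations(bits_per_channel):
    # An RGB pixel has three channels, each with 2**bits levels.
    return (2 ** bits_per_channel) ** 3

print(rgb_variations(1))   # 8 ("3-bit color")
print(rgb_variations(8))   # 16_777_216 (~16.8 million)
print(rgb_variations(16))  # 281_474_976_710_656 (~2.8e14)
```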

                Bringing all this back to your post, what outcome(s) are you seeking? Regardless of what “actually happens” in the brain-mind complex, our knowledge of it is inherently finite, perspectival and modular. Therefore, we need to differentiate the qualities and scope of the knowledge from what’s “actually happening.”

                This means we have to tailor our approaches toward our desired outcomes, realizing that there is no possibility of an absolute answer devoid of perspectives and context. I know statements like this are anathema, but they are based on the finite structures of cognition (the specifics of which I haven’t addressed here).

                My purpose in writing all this is to provide contextual tools to help navigate the extraordinary complexity of these topics. If you can clarify the objective, I can attempt a more specific explanation.

                1. Alexey Tolchinsky

                  Thank you Tony.

                  It could be that we approach this from different perspectives. In math, within one theory, the definitions are fixed and clear – agreed upon, then one selects a set of axioms, which creates a formal system (e.g. Euclid’s geometry on a perfect plane.)

                  Then we can discuss/debate, argue, but the discussion starts from an agreement on the definitions and axioms, which don’t take thousands of words. This is apples to apples.

                  In my book, a system where the whole is greater than the sum of its parts is not modular – it is not decomposable into modules. Mental systems included. (A system is a system; mathematically we are in phase space when describing a system.) A linear system is modular in this view, while a non-linear system is not. In that view the term “modular” has specificity, and one can disambiguate a modular from a non-modular system – by drawing boundaries without a loss of function, and by being able to assemble, disassemble and reassemble it.

                  You can disassemble and reassemble a computer. You can also disassemble and reassemble a software package by adding or removing libraries.

                  Try to decompose the Great Red Spot on Jupiter into parts? It won’t work. Brains and minds? I am not clear how exactly you separate the two, on what basis, and whether you can do it rigorously. More on that here:

                  https://www.noemamag.com/the-mythology-of-conscious-ai/

                  If you take just the brain – a cubic millimeter of the brain,

                  https://www.science.org/doi/10.1126/science.adk4858

                  1.4 petabytes of data, 150 million synapses, 57,000 cells.

                  Do you consider this to be modular or not modular in your definition?

                  May I ask for your definition of “modular” if we have a word limit of one paragraph or a few sentences?
                  _

                  My goals for the Substack post were to (a) metabolize what Chris and James have shared, (b) highlight the importance of the specific points they made, and (c) suggest that Causal Emergence is an extremely powerful and useful theory which, just like any other theory, is based on a foundation of some assumptions – scale separability is one of them.

                  Chris’s criterion shows that sometimes one can indeed separate the scales and then there is a necessary condition for that quality.

                  Sometimes – not so much.

                  Issues may arise when there’s a theory where certain building blocks are clearly de-contextualized (e.g. stating that something is true for everyone, everywhere, at all times – no context) – then separating scales might be challenging. I hope I answered your question?

                  1. Tony Budding

                    Thanks Alexey. By modular, I really just mean complex phenomena are based on simpler units, and specificity requires general capability. Paragraphs are based on multiple sentences strung together. Sentences are based on words strung together. Words are based on letters strung together.

                    You wrote, “In my book a system where the whole is greater than the sum of its parts is not modular – it is not decomposable into modules.” But isn’t a word more than the sum of its letters? If it were not, then all the combinations of the letters A, E and T would be equivalent. But in a conversation, saying ATE, EAT, TEA and E.T.A. all mean different things.

                    My focus is primarily on knowledge. Knowledge is modular in that understanding a complex concept requires understanding the simpler concepts upon which it is based. Understanding algebra requires understanding arithmetic, which requires understanding numbers and counting.

                    In terms of general and specific, consider the experience of hearing knocking on a door. This is a specific sound created by one person knocking on one door. In order for us to hear this specific knocking, we have to be able to hear knocking generally, which requires being able to hear sounds even more generally. And in order to recognize any sound, the mind has to be able to construct auditory experiences.

                    I’m not a neuroscientist. Can an individual neuron recognize the sound of knocking? If not, doesn’t that mean that clusters of neurons, which are made up of individual neurons (modules), have capabilities that do not exist in any individual neuron?

                    1. Alexey Tolchinsky

                      Thank you, Tony,
                      It appears that we have different definitions of “modular” and this explains a part of the discussion.

                      If I take what you wrote as your definition: “By modular, I really just mean complex phenomena are based on simpler units, and specificity requires general capability.”

                      Then we can take anything and it can match your definition.

                      Then everything is modular, and the term “modular” may have no specificity – you can’t disambiguate modular from non-modular.

                      My definition allows one to disambiguate – the Great Red Spot on Jupiter is non-modular, while a car is modular. This is critical, Tony – you quite simply can’t decompose a non-linear system into parts without a loss, while you can do that with a linear system. Seems clear enough to me.

                      In your definition, “complex phenomena are based on simpler units” – we can always infer the presence of smaller units. Look at the history of physics: molecules made of atoms. An atom used to be seen as an indivisible fundamental unit; then there was an electron and a nucleus, then protons and neutrons, then quarks, and I’m not sure we’re done.

                      This journey of A being based on B – going down in scale, attempting to “de-compose” – can be infinite. It is also an inference about what is “based on” or “composed of” what – a strictly hierarchical structure. In complex systems you can have bottom-up influences and top-down ones, and it is then not easy to segregate them (higher A based on lower B – because they both influence each other, and B is based on A just as much as the other way around; they form a system with mutual influences).

                      It might be useful to try and operationalize the definition such that it actually allows one to disambiguate items. If we don’t have specificity in the definition, then the term “modular” can lose meaning.

  12. Kirsten Kraljevic

    I am definitely not as schooled/smart as all of you. That being said, what if entanglement is way more fluid than what we give it credit for?
    All I know is what I have learned from applying the principles of OC when working leghorn chickens through curated exercises.
    If you have a degree of interoception – the ability to address stimulus on undefined planes? (I believe that level of awareness is on a continuum.) I think you can “feel” when a connection occurs, which I also believe is scalable depending on a lot of things. Because it is 2 (or more, if your pre-verbal/interoception allows access to ‘memories’ of the other, which can be scalable back through the other agent’s experiences?) I am struggling to put an agreed-upon language around this. I believe it is like a conversation that occurs when the ask is cooperation toward a goal decided by the observer. The conversation can expand as access to the past of both agents becomes available, those connections also coming ‘online,’ and when that happens? I believe that IS a working definition of intrinsic motivation, and of what is meant by the idea that learning seems to open one up for more learning, or however that paper worded it. It is intoxicating if the cooperation/intrinsic motivation is nailed. If cooperation ensues, both agents seek that connection again, which allows all the other webs of past connections to light up again
    (that is meant as energy, electricity?). That would be what? Flow state? The part I think we miss is that all we are is a collection of our past, yet at certain scales, or timeframes? Maybe light cones is a better word, because that is the rub. These are all just words we are trying to put around a phenomenon we maybe don’t have the right words for, as we shove that process/connection into something like math, which I believe is also a lagging indicator. At some point – the point I believe everybody is trying to measure or identify during that process/connection – the answer to the observer’s conundrum or goal is transposed, and the past becomes another agent’s future. I believe if you could map it, it would probably look like chaos, but I don’t believe it is. I believe it is orchestrated – held together by cooperation, or a more primitive concept, empathy. I think it is only accessible through empathy. I hope this helps and makes sense to somebody. Otherwise, I’m okay as the crazy chicken lady. That works too. 🙂

  13. Tony Budding

    Alexey (I’m writing here because it seems like we’ve reached the limit of replies in line), I agree we are using the term modular differently. While modularity also exists in the materials/physical realm, my work is epistemology. The statement, “all knowledge is finite, perspectival and modular,” is what started our conversation. Knowledge is non-physical (which I call experiential) and unmeasurable. And yes, when it comes to knowledge, it’s all modular.

    I disagree that my definition of modular is non-operational. It might be non-operational and useless for your purposes, which seem to be centered around this question of decomposing the physical and non-physical (I say seem because I do not want to make assumptions about your purposes). But the reason establishing the modularity of knowledge is so critical is because this is the means by which we can trace knowledge back to its origins.

    You wrote, “This journey of A being based on B – you’re attempting to go down in scale, to “de-compose” can be infinite.” But when it comes to knowledge, it is not infinite. All knowledge is finite, and there are core modules to knowledge that arise out of causal origins. The statement all knowledge is modular arises out of tracing complex modules through simpler modules to causal modules. And even these causal modules can be suspended. This approach is operational because it means that we do not have to look for complex origins. Complexity inherently requires the evolution and/or combination of simpler elements.

    Most of Mike’s work and the discussions on this website do relate to the physical, which I call the materials realm. The materials realm is measurable and shared. My focus is on the human mind, which I call the experiential realm. Knowledge and experiences are unmeasurable and individualized. And there is a third realm that addresses anything and everything that falls outside the finite structures of human cognition, which I call the inconceivables realm. There is literally nothing we can know directly about inconceivable phenomena, but we need to model them in order to address knowable effects that have no knowable cause.

    Ironically, the “true nature” of the physical materials realm is substantially irrelevant to epistemology. There does have to be a set of mechanisms through which shared material phenomena get converted to individualized experiential phenomena in the form of content in the mind, and through which individualized determined efforts that arise in the mind get converted to active expressions in the shared materials realm.

    The rules describing the behaviors and cause and effect relationships are different in each of the three realms. Therefore, the processes through which we validate premises and axioms in each of the realms must be different. With measurable phenomena, we perform and share experiments, logical reasoning and conclusions, and the more others can replicate our results, the more confidence we have in the validity of the knowledge. When that confidence reaches a certain degree, we say the knowledge has been proved.

    Knowledge itself is unmeasurable and thus unprovable. Traditionally, anything unmeasurable has been considered non-scientific. Yet there are ways to validate unmeasurable phenomena even though proofs are off the table. The primary means to validate them is efficacy. The degree to which a description of cause and effect relationships improves our ability to achieve our agendas, we consider it more valid, and when it doesn’t improve, we consider it less valid or invalid.

    Mike et al are stretching the traditional approach to science to include unmeasurable elements. This is critical because the three realms do influence each other in meaningful ways. What I am doing is bringing unprecedented rigor, transparency and accountability to the study of the unmeasurable mind, its traits and functions. These traits and functions turn out to be identical no matter what the “true nature” of the materials realm might be.

    Studying the mind with the mind is particularly tricky for many reasons. Two of the biggest are that what we know and experience are not just what they seem, and the mind has three categorically distinct modes of operation. The primary mode is the enormous volume of layered and symphonic activity that is required for daily life. The second mode is the absence of layered and symphonic activity but still with active self-awareness. And the third mode is the suspension of active self awareness. The second and third modes are forms of extremely focused concentration only achievable with a still body and the suspension of all fresh perceptions.

    The second and third modes are defined more by the absence of familiar content than by the presence of special content. The culmination of the third mode of the mind is the awareness of the complete absence of content in the mind, which establishes that awareness and content are distinct phenomena.

    In daily life, it seems like we experience the shared materials realm directly. When the volume of content is drastically reduced, we have the opportunity to discern the step by step processes through which the mind takes surprisingly little fresh content and creates the incredibly complex richness of daily experiences. These processes are variable and have a significantly greater effect on the qualities of our knowledge and experiences than whatever sensory data we may perceive. Yet it does not seem like this at all in daily life.

    Someone who has never experienced the suspension of layered knowledge, not to mention the suspension of active self-awareness, cannot imagine what these core processes are like, not to mention what the causal origins of knowledge might be. Achieving these extreme modes of concentration is extremely difficult, but just because I can’t run a four-minute mile doesn’t mean others can’t. My book essentially describes a detailed model of what happens in the mind in all three modes that is entirely logically consistent and devoid of faith-based assertions and appeals to authority.

    I would guess that you find some of these statements to be ridiculous, anti-scientific, or even delusional, which is one of the reasons it takes me 100,000 words to establish the matrix of perspectives, capabilities, limitations, definitions, realms of knowledge/inquiry, and modes of the mind necessary to establish a comprehensive epistemology and philosophy of mind. This matrix is interconnected and self-referential, so not only does each aspect need to be explained and justified, all the co-dependencies do also.

    You also wrote, “the discussion starts from an agreement on the definitions and axioms, which don’t take thousands of words. This is apples to apples.” That sounds dreamy, and it would be so nice to have such an opportunity. If you have access to concise, agreed-upon definitions of mind, knowledge, will, self, agency or consciousness (to name a few), please let me know. It would make my work so much easier and more efficient!
