Can you watch yourself thinking?

Consider how you are reading this book, or any book for that matter. You probably go back and forth across a page, and when you meet a complicated part, you may ask yourself: What is the gist of this sentence? What is the main idea here? And that familiar part of your mind springs to life, highlighting and summarizing, making decisions, doing the hard work. You can watch it think and go through its usual paces. You have probably done that thousands of times. But the other part, the one that makes the requests: can you watch it think? Where is it? What is it? It is there. Inaccessible.

Clearly, these parts, one working and one directing, are related to each other. They speak the same language, after all. They are almost the same, but different. What if we could figure out how they are related? Then, we could work backward from what we observe to what must be there.

Consider the fundamental constraints on energy, on space, on time that we have. You don’t have infinite energy. Your biological brain is probably about 1.4 liters in volume and uses 20 watts of power. So every one of your thoughts costs calories, but the world is infinite. There are more books than you could ever read in your limited lifetime, as simple as that. So you must select, by necessity. Everyone does that, but not everyone does it the same. What makes our filters different? What do we filter for exactly?

Language models can do work our minds can do. Is this intelligence? What kind of intelligence? How does it compare to mine and yours? Not every intelligence is equal. Say, a toad operates on a separate level of comprehension compared to the insects it eats: a fly has no idea what hit it. What are these levels? It’s not about brain size; some species, such as elephants and whales, have larger brains than humans, yet they are not as smart. So there’s something else that makes us smarter. What is the structural reason we are as smart as we are?

But first, we have to establish the ground rules. Surprisingly, there are only a few rules.

The Constraints

Boundary

I’m only interested in living systems with a defined boundary: the boundary created the first distinction, inside versus outside. Viruses, and the self-reproducing enzymes that predated early protocells, are out of scope.

Sufficiency

Our environment has infinite ways to kill us. In a world full of threats and competition, a simple life didn’t stand a chance. Simplicity wasn’t getting a pass.

On the other hand, life always had finite means to persist. Every invention, every bit of structure, every pathway took a toll on finite energy, nutrients, and time. Complexity has an upper bound.

Life had to increase complexity only as much as required. Sufficiently so.

Correspondence

If there’s an apple in your hand, there’s an apple in your mind. In the same way, if there’s a physical distinction, then there’s a conceptual distinction. If the distinction counts, it is in the language. I’m using the physical structure as a map to find the conceptual structure and discover meaning. Our language is the evidence that the map is accurate.

Let’s see how these constraints were in play 4 billion years ago.

The Membrane

When the first protocell appeared four billion years ago, its membrane barely contained the self-reproducing enzymes of early life, but that was enough to create an individual unit of natural selection. Without a membrane, a better RNA molecule would benefit every other molecule in the pond equally.

Just as it created the first physical distinction between what was “self” and what was “not self,” it made the first categorical distinction: inside vs. outside.

To use this categorical distinction as a tool for organizing concepts, we need a test.

  • Internal ideas are concepts that relate to what exists or happens within the boundary. If an outside observer must cross an opaque boundary to perceive it, the idea is internal.
  • External ideas are concepts that relate to what exists or happens outside the boundary. If an outside observer can perceive it without crossing the boundary, the idea is external.

Relative to a given boundary, every concept, no matter how complex, can be sorted according to this most basic distinction: is it “internal” or “external” to the boundary, just as when you cut a piece of paper in two, you have a left side and a right side. There is no “third side” created by the scissors.

A book’s weight is an external concept, measurable without opening it. The narrative within requires boundary crossing: an internal idea. A person’s thoughts? Internal, hidden behind the boundary of the skull.

A conversation is internal to the room (an outsider must cross to perceive it), but external to each participant at the table (they perceive it directly, with no boundary to cross).

Early life didn’t stop at the fragile fatty acid shells; it needed semipermeable membranes that were selectively open, and they uncovered more categories.

The Four Operations

Even a primitive membrane created a tiny physical separation, allowing an ion gradient to form. The charge gradient generated a voltage across the membrane, and, as the membrane is an effective insulator, the cell accumulated electric charge.

With a charge difference, you have a work capacity: the cell moved from being prey to its environment to being an agent that could manage inputs and outputs, inner and outer states.

The early cells went from

  • passive diffusion to active harvesting for nutrients and building materials (moving ions from outside to inside through the membrane)
  • simple osmotic pressure regulation to active waste management (moving ions from inside to outside through the membrane)
  • background noise to active sensing (by detecting ions attached outside)
  • Brownian motion to structural state (by detecting ions attached inside)

Early life went from knowing one categorical distinction to learning four primitive operations, single movements across or along the boundary.

Each operation has a starting point and an ending point relative to the boundary. As the single boundary creates exactly two locations, there are only four possible pairs of sources and destinations.

Imagine watching a highly acclaimed dancer. Every one of their moves and steps has a name in the precise vocabulary of dance. Without names, we can’t see the patterns and anticipate what should happen next. Without names, we can’t understand the dance.

If every signal is a piece of information, what happens when a cell receives an ion from outside? It now possesses the information. It knows. When an ion is detected on the outside of a cell, the cell senses it. When membrane tension changes from within? The protocell feels it. When a protocell expels an enzyme to break down food, it predicts the food is there.

The four operations:

  • Know: External information internalized across the boundary.
  • Sense: External signals detected without crossing.
  • Feel: Internal states detected at the boundary.
  • Predict: Internal information externalized across the boundary.

At a glance:

The four operations

| | To Inside | To Outside |
|---|---|---|
| From Outside | Know | Sense |
| From Inside | Feel | Predict |
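
The counting argument can be checked mechanically. The following is a minimal sketch, not part of the original argument: the names `SIDES` and `OPERATIONS` are illustrative assumptions, and the mapping simply restates the four operations listed above.

```python
from itertools import product

# One boundary gives exactly two locations.
SIDES = ("outside", "inside")

# (source, destination) -> operation name, restating the list above.
OPERATIONS = {
    ("outside", "inside"): "Know",
    ("outside", "outside"): "Sense",
    ("inside", "inside"): "Feel",
    ("inside", "outside"): "Predict",
}

# Two locations give 2 x 2 ordered (source, destination) pairs.
pairs = list(product(SIDES, repeat=2))

for source, destination in pairs:
    name = OPERATIONS[(source, destination)]
    print(f"{source:>7} -> {destination:<7} : {name}")

print(len(pairs))  # 4
```

Every pair maps to exactly one operation, and no pair is left over, which is the sense in which a single boundary yields only four operations.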

The Centipede Test

For each physical distinction, there’s a conceptual category. We started with one: internal versus external. Now we have four new ways to organize ideas.

How do we know we are discovering nature rather than imposing categories? Just as a centipede doesn’t need to understand leg coordination to walk, a protocell knew nothing about signals and information. Every pattern we identify must operate in systems that have zero concept of what they’re doing. If it requires understanding to work, we failed.

But how do we tell if an idea is an idea of knowing or feeling? One question isn’t enough.

New Attributes

Where are these operations relative to the boundary? If we look at the final points of each operation’s movement, Know and Feel belong to internal ideas, while Sense and Predict belong to external ideas.

| Operation | Movement | External or Internal? |
|---|---|---|
| Know | External → Internal | Internal |
| Sense | External → External | External |
| Feel | Internal → Internal | Internal |
| Predict | Internal → External | External |

Knowing whether an idea is external or internal to the boundary does not allow us to classify it into four buckets. As each yes-or-no question cuts the possibility space in half, we need at least one more question or attribute to single out exactly one of four classes: say, whether an idea belongs to knowing or feeling.

Let’s call attribute A the external/internal distinction.

  • Know and Feel share A=Internal, so they must differ on B
  • Sense and Predict share A=External, so they must differ on B

One way to do this: Know and Sense share one value of the unknown attribute B, while Feel and Predict share the opposite value.

Alternatively, Know and Predict could share the same value for a different unknown attribute B’, while Feel and Sense share the opposite value of attribute B’. Both attributes, B and B’, satisfy our requirement.

If we look at the truth table with all three attributes, we immediately recognize the binary logical function XOR (exclusive OR).

| A | B | B’ |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |

So B and B’ are independent of each other, but the three attributes together are not: knowing A and B determines B’, while knowing A and B’ determines B.
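
A short sketch of the truth table, assuming the bits 0 and 1 stand for the two values of each attribute. The helper `determined_by` is a hypothetical name introduced here for illustration. It enumerates the four rows and checks that any two columns fix the third.

```python
from itertools import product

# A and B range freely over {0, 1}; B' is forced to be A XOR B.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]

def determined_by(rows, i, j, k):
    """Return True if columns i and j together fix column k in every row."""
    seen = {}
    for row in rows:
        key = (row[i], row[j])
        if seen.setdefault(key, row[k]) != row[k]:
            return False
    return True

print("A B B'")
for a, b, b_prime in rows:
    print(a, b, b_prime)

# Any two of the three columns determine the remaining one.
checks = [
    determined_by(rows, 0, 1, 2),
    determined_by(rows, 0, 2, 1),
    determined_by(rows, 1, 2, 0),
]
print(all(checks))  # True
```

The rows come out in a different order than in the table above, but they are the same four combinations.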

How does XOR work in action? Close one eye while keeping the other open - you’ve made a wink. This gesture exists only when your eyelids are in different states. Both eyes open? No wink. Both closed? No wink. Different states? Wink appears. That’s XOR, creating something new from differences.

We got more than we bargained for: a third binary attribute that, paired with the first, classifies the operations just as well. B and B’ each play the role of the third attribute relative to the other.

Whichever attribute we choose as the canonical attribute (a standard representation, not a superior one) for the classification basis, the other emerges as the third. Neither B nor B’ is “more primary” because they both describe the same structural requirement: the second bit.

For simplicity’s sake, let’s take the grouping where Know and Sense share a value as our canonical attribute B, in addition to attribute A, so that the third attribute will be A^B.

For operations, we have

  • One inherited attribute (external/internal), A - inherited from the membrane
  • Two emergent attributes (B, A^B) - they appeared first for operations
  • And two canonical attributes (A, B)

| Operation | A | B | A^B |
|---|---|---|---|
| Know | Internal (0) | 0 | 0 |
| Sense | External (1) | 0 | 1 |
| Feel | Internal (0) | 1 | 1 |
| Predict | External (1) | 1 | 0 |

This constrained triple’s redundancy is practically convenient: classify using any two, then test it with the third. If the third disagrees, it means you made an error somewhere.

But as with dance moves, the next logical step is to name B and A^B.

Naming Attributes B, A^B

Attribute B

| B = 0 | B = 1 |
|---|---|
| Know: External → Internal | Predict: Internal → External |
| Sense: External → External | Feel: Internal → Internal |

What is common between Know and Sense, and separates them from Predict and Feel? What is common between Predict and Feel?

The starting point, the origin of the signal.

  • For Know and Sense, it is the external world, objective reality.
  • For Predict and Feel, it is the cell itself, the subject.

So Know and Sense are objective operations, while Predict and Feel are subjective operations, and the categorical distinctions are for objective ideas and subjective ideas.

Attribute A^B

| A^B = 0 | A^B = 1 |
|---|---|
| Know: External → Internal | Sense: External → External |
| Predict: Internal → External | Feel: Internal → Internal |

What is common between Know and Predict, and separates them from Sense and Feel? What is common between Sense and Feel?

Sense stays outside the boundary, Feel stays inside the boundary. These operations preserve the immediate, particular nature of the signal, capturing raw, unmediated experience. They just are, and can be grasped directly without comparison.

So, Feel and Sense are absolute operations, complete in themselves.

Some examples of absolute ideas:

  • Blue: You perceive blue directly; there is no need to see a different color alongside it.
  • Triangle: A recognizable shape existing without comparison.
  • Hunger: When you’re hungry, that’s just the absolute experience.
  • Pain: The sensation exists in itself, incomparable.
  • Wall: Lean against it, and it’s just there.
  • Love: The feeling exists whole, not as a comparison to non-love.

Know and Predict cross the boundary, one in each direction. When signals cross the boundary, the information’s context changes: it is integrated with the subject or it influences the environment. These signals are now defined relative to either the system’s internal state or its environment.

Know and Predict are relative operations that exist through a relationship to context.

Some examples of relative ideas:

  • Structure: Only exists as a sum of parts. No parts, no structure.
  • “Taller than”: Only makes sense when comparing two heights.
  • Speed: Relative to the ground? To other cars? Air speed?
  • Progress: Only meaningful relative to a starting state.
  • Champion: Can’t be champion without others to beat.
  • Friendship: You can’t be a friend in isolation.

All Three Attributes

| Operation | Movement | External or Internal? | Objective or Subjective? | Absolute or Relative? |
|---|---|---|---|---|
| Know | External → Internal | Internal | Objective | Relative |
| Sense | External → External | External | Objective | Absolute |
| Feel | Internal → Internal | Internal | Subjective | Absolute |
| Predict | Internal → External | External | Subjective | Relative |

If a distinction counts, it’s in the language. These words existed before. Philosophers have argued about the objective and subjective for millennia, and the distinction between absolute and relative runs through physics and ethics alike. We didn’t name the categories - we found that we already had them named. They were in the language all along.

Next, let’s talk uncertainty. I chose clear-cut examples to make a point, but our language is much more diverse than three binary partitions or a taxonomy with four simple buckets. The same word often has different meanings depending on context, and even within the same context, not every choice is obvious - probabilistic inference to the rescue.

Probabilistic Inference

When considering complex concepts like friendship, we often struggle to place them definitively. Is friendship more internal or external? More objective or subjective?

Infinite precision is infinitely expensive. By converting this uncertainty into confidence levels, we’ll get usable answers from partial certainties on the cheap.

Degrees, Not Boxes

Let’s replace the binary yes-or-no with degree measurements. Instead of “Is this internal or external?” we ask “How internal and how external is this?”

Then, the probability for a given category is simply the product of the probabilities of its three attribute values.

\[ \begin{aligned} P(Know) &= P(Internal) \times P(Objective) \times P(Relative) \\ P(Sense) &= P(External) \times P(Objective) \times P(Absolute) \\ P(Feel) &= P(Internal) \times P(Subjective) \times P(Absolute) \\ P(Predict) &= P(External) \times P(Subjective) \times P(Relative) \end{aligned} \]

Think of the friendship that you had with your best friend. You express it through visible actions (external), but base it on personal feelings (subjective). You understand friendship in comparison: it is closer than acquaintanceship, less intense than romantic love (relative).

  • P(Internal) = 0.35, P(External) = 0.65 (not fully external, but more outside than inside)
  • P(Objective) = 0.3, P(Subjective) = 0.7 (more subjective, starts with you)
  • P(Absolute) = 0.15, P(Relative) = 0.85 (strongly relative)

P(Predict) will have the highest score with 0.39 (39%), followed by P(Know) at 0.09, P(Feel) at 0.04, and P(Sense) at 0.03.
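
The arithmetic behind these scores can be reproduced with a few lines. This is a minimal sketch using the friendship estimates given above; the dictionary names `p` and `OPERATIONS` and the helper `score` are illustrative assumptions.

```python
from math import prod

# Degree estimates for "friendship", taken from the list above.
p = {
    "Internal": 0.35, "External": 0.65,
    "Objective": 0.30, "Subjective": 0.70,
    "Absolute": 0.15, "Relative": 0.85,
}

# Attribute values of each operation, as in the three-attribute table.
OPERATIONS = {
    "Know":    ("Internal", "Objective", "Relative"),
    "Sense":   ("External", "Objective", "Absolute"),
    "Feel":    ("Internal", "Subjective", "Absolute"),
    "Predict": ("External", "Subjective", "Relative"),
}

def score(attributes):
    """Multiply the degrees of the three attribute values."""
    return prod(p[a] for a in attributes)

scores = {name: score(attrs) for name, attrs in OPERATIONS.items()}

for name, value in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"P({name}) = {value:.2f}")
# P(Predict) = 0.39
# P(Know) = 0.09
# P(Feel) = 0.04
# P(Sense) = 0.03
```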

You did not need to decide if friendship is definitely external. Predicting still won with 39%, with everything else following far behind. You did not need 100% certainty to make decisions that work.

That 16% attributable to other operations is what we lost to aliasing, across many dimensions of friendship (a complex socio-biological phenomenon), to fit it into one of the four boxes.

Selective Filter

If we compute the totals, we find that almost half the probability (46%) landed outside any operation, more than the highest answer. It isn’t aliased-away information about friendship; it’s information that meant nothing, void.

Those combinations did not correspond to anything: no operation is all internal, subjective, and relative, but this contradiction accounts for 21% of the total probability. The system excluded background noise using structural constraints between its own attributes.
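
To see where the remaining probability goes, one can enumerate all eight attribute combinations instead of only the four that name an operation. The sketch below assumes the same friendship estimates as above; `AXES`, `VALID`, and `mass` are hypothetical names.

```python
from itertools import product
from math import prod

# Degree estimates for "friendship", as given earlier in the text.
p = {
    "Internal": 0.35, "External": 0.65,
    "Objective": 0.30, "Subjective": 0.70,
    "Absolute": 0.15, "Relative": 0.85,
}

AXES = [
    ("Internal", "External"),
    ("Objective", "Subjective"),
    ("Absolute", "Relative"),
]

# Only four of the eight combinations correspond to an operation.
VALID = {
    ("Internal", "Objective", "Relative"): "Know",
    ("External", "Objective", "Absolute"): "Sense",
    ("Internal", "Subjective", "Absolute"): "Feel",
    ("External", "Subjective", "Relative"): "Predict",
}

# Probability mass of every combination, valid or not.
mass = {combo: prod(p[a] for a in combo) for combo in product(*AXES)}

meaningful = sum(m for combo, m in mass.items() if combo in VALID)
void = sum(m for combo, m in mass.items() if combo not in VALID)

print(f"meaningful: {meaningful:.0%}, void: {void:.0%}")
# meaningful: 54%, void: 46%

for combo, m in sorted(mass.items(), key=lambda kv: kv[1], reverse=True):
    label = VALID.get(combo, "(no operation)")
    print(f"{m:.3f}  {' / '.join(combo):<34} {label}")
```

The largest excluded combination is Internal / Subjective / Relative at about 0.208, the 21% mentioned above.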

Recall the sufficiency constraint. If the environment has infinite ways to kill you, then most structures of matter don’t persist. Life has to find the tiny subset of designs that work against perpetual chaos.

The more attributes you add, the more selective the filter becomes. A classification with more types and attributes that selects only 5% meaning from the chaos suggests that coherent structures are rare.

Protocells did not stop at learning four operations just to listen to the world - they discovered how to act, to respond, in the same four-verb language.

From Operations to Functions

For early enzymes, it was a great fortune to stumble upon a membrane. They were never parting ways. Being able to let stuff out selectively, or let in only the good things, is excellent, but letting toxins in along with the good stuff is a death sentence. Sometimes you just need to get away from the danger, or from being digested alive. Protocells that connected detection to action gained an edge, an evolutionary advantage.

When you have a vocabulary and make a sentence, do you invent a new word every time? No, you use the words you already know following the rules of grammar. If a protocell reads the world in categories of either knowing (K), feeling (F), sensing (S), or predicting (P) - the vocabulary, then the rules for how we can combine them are the grammar.

Grammar of Survival

In a world full of threats and competition, the smallest possible grammar with a two-word sentence already puts any protocell ahead of the pack. It is also the cheapest and the fastest one. Just add a word, and the metabolic costs jump, and the reaction time increases. If a protolife had to react, the reaction had to be spot on.

For this, we want the smallest possible set of pairs of events and reactions (or functions that map specific inputs to specific outputs) that can handle everything that life throws at a bounded system.

16 Possible Pairs

With just four categories in our vocabulary, we have only 16 possible pairs.

\[ \begin{aligned} K &\to K & S &\to K & F &\to K & P &\to K \\ K &\to S & S &\to S & F &\to S & P &\to S \\ K &\to F & S &\to F & F &\to F & P &\to F \\ K &\to P & S &\to P & F &\to P & P &\to P \end{aligned} \]

Repeating that a cat is a cat adds nothing new. For a protocell, it is a wasted effort. Pairs such as \(K \to K\) and \(F \to F\) do not create a selective advantage and do not belong to our grammar, so we are left with 12 pairs.

\[ \begin{aligned} K &\to S & S &\to K & F &\to K & P &\to K \\ K &\to F & S &\to F & F &\to S & P &\to S \\ K &\to P & S &\to P & F &\to P & P &\to F \end{aligned} \]
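
The count is easy to verify by enumeration. A minimal sketch, assuming the single-letter labels K, S, F, P used above; the variable names are illustrative.

```python
from itertools import product

OPS = "KSFP"  # Know, Sense, Feel, Predict

# Every ordered (event, reaction) pair over the four-word vocabulary.
all_pairs = list(product(OPS, repeat=2))

# Drop the pairs that map a category to itself, such as K -> K.
non_trivial = [(src, dst) for src, dst in all_pairs if src != dst]

print(len(all_pairs), len(non_trivial))  # 16 12
print(", ".join(f"{src}->{dst}" for src, dst in non_trivial))
```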

The Halting Problem

Four of these are fragile. For example, a jump from Know (Internal, Objective) to Feel (Internal, Subjective) stays internal, going from objective to subjective in a single step, changing the meaning of a signal (a molecule) without doing any work on it on the boundary.

Such a jump changes the internal state, which can change it again, and so on. Whether it converges depends on the specific chemistry, the specific initial signal, and the specific internal conditions. Predicting whether this loop will halt resembles the halting problem: in general, the cell would have to run the loop to completion to find out whether it ever stops, and Turing proved that no general procedure can decide this in advance. Simple life did not take chances and excluded these jumps completely.

A counterexample: an internalized molecule (K) creates a sensation in a membrane protein (S) beyond the activation threshold. It just validated the existing inconclusive opinion about the environment: the boundary serves as a reality check, grounding the action in certainty. The cell responds and persists, without any risk of non-termination.

Here’s another fragile example: a protocell detects (S) a chemical at its surface, sensing something. Instead of internalizing it (K) or triggering an internal response (F), it directly secretes (P) a molecule outward based on what it sensed. Now the secreted molecule is in the same external space that the cell is sensing, and the cell senses its own secretion. Predicts again, senses that prediction. Rinse and repeat, burning scarce fuel and reducing the chances of survival. It might halt, it might not, but the cost is not recoverable.

The Rule of the Boundary

Changes in the meaning of a signal (from objective to subjective, or vice versa) are only allowed after crossing the boundary because, without it, the flip is a relabeling, the same as if the system declared that the signal is more than it is, based on nothing but the system’s own prior state, creating an unstable function.

Following this rule, we have to exclude these four:

  • Know (Internal, Objective) \(\to\) Feel (Internal, Subjective)
  • Sense (External, Objective) \(\to\) Predict (External, Subjective)
  • Feel (Internal, Subjective) \(\to\) Know (Internal, Objective)
  • Predict (External, Subjective) \(\to\) Sense (External, Objective)

Final Eight

We’re left with a minimal, sufficient yet stable set of 8 functions.

\[ \begin{aligned} K &\to S & \qquad S &\to K \\ K &\to P & \qquad P &\to K \\ F &\to P & \qquad P &\to F \\ S &\to F & \qquad F &\to S \end{aligned} \]

I arranged the functions so you can see the pattern: it is a cycle graph on four vertices (\(K\text{-}S\), \(S\text{-}F\), \(F\text{-}P\), \(P\text{-}K\)), where only moves to adjacent positions are valid.

Cycle graph of four operations
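
The rule of the boundary can be written as a filter over the twelve non-trivial pairs. The sketch below is one possible formalization, not necessarily the only reading of the rule: it assumes that "crossing the boundary" means the internal/external attribute differs between the two operations, and that a "change of meaning" means the objective/subjective attribute differs. `ATTRS` and `allowed` are hypothetical names.

```python
from itertools import permutations

# (internal/external, objective/subjective) for each operation.
ATTRS = {
    "K": ("Internal", "Objective"),
    "S": ("External", "Objective"),
    "F": ("Internal", "Subjective"),
    "P": ("External", "Subjective"),
}

def allowed(src: str, dst: str) -> bool:
    """Assumed rule: meaning may flip only if the side of the boundary changes too."""
    side_src, meaning_src = ATTRS[src]
    side_dst, meaning_dst = ATTRS[dst]
    crosses = side_src != side_dst
    meaning_flips = meaning_src != meaning_dst
    return crosses or not meaning_flips

# Start from the 12 ordered pairs with distinct ends and keep the stable ones.
functions = [(s, d) for s, d in permutations("KSFP", 2) if allowed(s, d)]
print(len(functions))  # 8

# Ignoring direction, the surviving pairs form the edges of a 4-cycle.
edges = {frozenset(pair) for pair in functions}
degree = {op: sum(op in edge for edge in edges) for op in "KSFP"}
print(sorted("".join(sorted(edge)) for edge in edges))  # ['FP', 'FS', 'KP', 'KS']
print(degree)  # every operation has exactly two neighbours
```

Under these assumptions the filter removes exactly the four pairs listed above (K and F in both directions, S and P in both directions) and leaves each operation with two neighbours.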

You can’t reason about eight unlabeled things, even if they are the fundamental verbs of existence, valid moves a living thing can make to stay alive. Billions of years old, they left a deep imprint on our language, and if a move is real, we have a word for it.

Naming Functions

A bunch of cars are moving erratically, some standing still, and some barely moving. Brake lights are lighting up and fading, as far as you can see. Clearly, it is a traffic jam. If you know the pattern, you know how to reason about it: every traffic jam has an average flow rate and total length, and if we know the properties of this emergent phenomenon, we can predict when we’ll get there. Very useful indeed!

The make, model, and exact position of every car on the road describe the recipe of a traffic jam, but do a poor job of explaining the outcome, just as a list of ingredients alone can’t explain the chiffon cake. In the same way, Know to Sense or Feel to Predict are only recipes, not the flavors that emerge from these simple parts.

But we can’t just use any names for these flavors. If a name, an idea, falls strictly into one of the pre-existing four buckets, it is too primitive to capture the emergent flavor accurately. Rather, good names must follow a cause-and-effect structure, linking concepts from the earlier four buckets.

Ideally, we want the names to be memorable and distinct from each other, while avoiding confusion. Say, if an idea can be subjective or objective depending on the context, it won’t work.

My actual process for coming up with these mnemonics was much less straightforward and much messier. I started with some initial ideas, used them to uncover attributes, and then iterated to find better names. I’ll list these bridge ideas where they make sense.

Know → Predict: Impression

You are in a windowless room. You know there are three apples in front of you, red, green, and yellow. Now, lit by pure blue light. Your best guess is that one deep-black apple is probably a red variety, and the lighter blueish-gray apples are either green or yellow. Blue light turns off, and a very warm light comes up; now you can see the slight green tint of one of the apples. The warm light cools to daylight, and in a matter of seconds, your vision adjusts so you can tell the green apple from yellow or red.

You are listening to a piano. You hear a note and another, and you start guessing if the melody is somber or playful. Just two notes are probably too few to predict the vibe. A few more notes, and the probability collapses to a melody you know all too well.

Photons or sounds, objective external signals, cannot paint the whole picture alone. Only after your eyes or ears can collect a few, can you make a reliable guess, a prediction, about the color or the tone.

I used “color” as a bridge idea to study how the Know to Predict function works. Still, I ultimately settled on Impression because it captures what the function does: it internalizes objective signals (like light wavelengths) and externalizes subjective qualities. “Color” names only one kind of result this function produces.

Know → Sense: Identification

You are a seeker in a hide-and-seek game. You enter a living room, and after a few moments, you hear a rustling sound of a candy wrapper. Turn around, and there it is - a familiar shape behind a curtain. For adults, this is where the real game of joyful suspense begins: pretend you did not find the child, keep looking around.

But for us, let’s focus on the sound reaching your ears. From it, you sensed the direction in which the child was hiding. That’s what I call Identification. By comparison, if several children were hiding, Impression emerges as you try to guess which suspicious shadow corresponds to whom.

An ancient hominid saw a patch of yellow-orange in the forest and, recognizing it as a tiger, ran off for help. That’s Identification. New Year’s Eve, something flashed in the windows of a building - that’s fireworks. Again, Identification.

To wrap my head around the Know to Sense function, I used “form” as a bridge, but it tied the function too closely to particular inputs and outcomes. The other noteworthy mnemonic is “Effectiveness” - it also focuses on the outcome of recognizing the form or effect, not the essence of the function.

Feel → Predict: Opportunity

It’s almost lunch, and you are beyond hungry: that salad place you saw on your way here is very tempting. That’s certainly what you want!

You’re parched, walking through the city. There’s a vending machine, finally. That convenience store sign far ahead also glows like a beacon. You were on the same street yesterday. This vending machine was there all along, but your internal state led you to project usefulness onto everything around you.

An organism that could act only on external data would be purely reactive. But an organism that can project internal states outward as predictions about external advantage becomes proactive. It reaches for things before the environment tells it to. So, selective pressure favors organisms that sense their own metabolic deficit and extend themselves toward whatever is available, even if at random.

Just like that, speaking is essentially pointing literal sound waves outwards. Imagine you are taking your partner on a dinner date, feeling excited and anticipating, when you whisper, “You look amazing tonight.” It works for loneliness, too. You feel lonely, you call your friend, and the words you say carry the feeling outward, “I’m so glad you picked up.”

When our internal feelings project meaning into external things to make them useful, or to make use of them, that’s Opportunity.

I used “advantage” and “usefulness” as bridge ideas to understand what the Feel to Predict function produces, before settling on Opportunity.

Feel → Sense: Calibration

If you are peckish, focusing on that lecture becomes harder with every minute. It might not be just hunger; you could be coming down with something. Moments later, the fever makes the room suddenly feel too cold, even though it was fine just a moment ago.

The other day, you were feeling great after a long chat with an old friend, and the air at dusk was so sweet and delicious. You open a new book, and it is already midnight - it feels like 30 minutes. It has probably been a full hour, or even more, since you plunged into the story.

You are at a concert, energized in anticipation. The band starts playing your favorite song, and it sounds better than ever. A sudden call from your partner, worrying words on the phone. You are anxious to get out to a quiet place. The same song now sounds like nothing or, worse, a nuisance.

You are at a clinic, waiting. Time moves slowly as if minutes take hours. The nurse comes out with reassuring words. You start to relax, and suddenly, an hour has passed. The improved feelings recalibrated the sense of time.

When our feelings affect our senses, that’s Calibration. I chose this word because it works both ways: with peripheral senses, such as warmth, and with experiential senses, such as time passing.

To wrap my head around the Feel to Sense function, I used “time” and “wellbeing” as bridges, but they described the function’s common content, whereas Calibration describes what the function does.

A book by Alexey Kopytko. Content licensed under CC BY-SA 4.0