Can you watch yourself thinking?

Consider how you are reading this book, or any book for that matter. You probably go back and forth across a page, and when you meet a complicated part, you may ask yourself: What is the gist of this sentence? What is the main idea here? And that familiar part of your mind springs to life, highlighting and summarizing, making decisions, doing the hard work. You can watch it think and go through its usual paces. You have probably done that thousands of times. But the other part, the one that makes the requests: can you watch it think? Where is it? What is it? It is there. Inaccessible.

Clearly, these parts, one working and one directing, are related to each other. They speak the same language, after all. They are almost the same, but different. What if we could figure out how they are related? Then, we could work backward from what we observe to what must be there.

Consider the fundamental constraints on energy, on space, on time that we have. You don’t have infinite energy. Your biological brain is probably about 1.4 liters in volume and uses 20 watts of power. So every one of your thoughts costs calories, but the world is infinite. There are more books than you could ever read in your limited lifetime, as simple as that. So you must select, by necessity. Everyone does that, but not everyone does it the same way. What makes our filters different? What do we filter for exactly?

Language models can do work our minds can do. Is this intelligence? What kind of intelligence? How does it compare to mine and yours? Not every intelligence is equal. Say, a toad operates on a different level of comprehension than the insects it eats: a fly has no idea what hit it. What are these levels? It’s not about brain size: some species, such as elephants and sperm whales, have larger brains than humans, yet they are not smarter. So there’s something else that makes us smarter. What is the structural reason we are as smart as we are?

But first, we have to establish the ground rules. Surprisingly, there are only a few rules.

The Constraints

Boundary

I’m only interested in living systems with a defined boundary, because the boundary created the first distinction: inside versus outside. Viruses and self-reproducing enzymes that predated early protocells are out of scope.

Sufficiency

Our environment has infinite ways to kill us. In a world full of threats and competition, a simple life didn’t stand a chance. Simplicity wasn’t getting a pass.

On the other hand, life always had finite means to persist. Every invention, every bit of structure, every pathway took a toll on finite energy, nutrients, and time. Complexity has an upper bound.

Life had to increase complexity only as much as required. Sufficiently so.

Correspondence

If there’s an apple in your hand, there’s an apple in your mind. Likewise, if there’s a physical distinction, then there’s a conceptual distinction. If the distinction counts, it is in the language. I’m using the physical structure as a map to find the conceptual structure and discover meaning. Our language is the evidence that the map is accurate.

Let’s see how these constraints were in play 4 billion years ago.

The Membrane

When the first protocell appeared four billion years ago, its membrane barely contained the self-reproducing enzymes of early life, but that was enough to create an individual unit of natural selection. Without a membrane, a better RNA molecule would benefit every other molecule in the pond equally.

Just as the membrane created the first physical distinction between what was “self” and what was “not self,” it made the first categorical distinction: inside vs. outside.

To use this categorical distinction as a tool for organizing concepts, we need a test.

Internal ideas are concepts that relate to what exists or happens within the boundary. If an outside observer must cross an opaque boundary to perceive it, the idea is internal.
External ideas are concepts that relate to what exists or happens outside the boundary. If an outside observer can perceive it without crossing the boundary, the idea is external.

Relative to a given boundary, every concept, no matter how complex, can be sorted according to this most basic distinction: is it “internal” or “external” to the boundary, just as when you cut a piece of paper in two, you have a left side and a right side. There is no “third side” created by the scissors.

A book’s weight is an external concept, measurable without opening it. The narrative within requires boundary crossing: an internal idea. A person’s thoughts? Internal, hidden behind the boundary of the skull.

A conversation is internal to the room (an outsider must cross the boundary to perceive it) but external to each participant at the table (they perceive it directly, with no boundary to cross).

Early life didn’t stop at the fragile fatty acid shells; it needed semipermeable membranes that were selectively open, and those membranes uncovered more categories.

The Four Operations

Even a primitive membrane created a tiny physical separation, allowing an ion gradient to form. The charge gradient generated a voltage across the membrane, and, as the membrane is an effective insulator, the cell accumulated electric charge.

With a charge difference, you have a capacity for work: the cell moved from being prey to its environment to being an agent that could manage inputs and outputs, inner and outer states.

The early cells went from

passive diffusion to active harvesting for nutrients and building materials (moving ions from outside to inside through the membrane)
simple osmotic pressure regulation to active waste management (moving ions from inside to outside through the membrane)
background noise to active sensing (by detecting ions attached outside)
Brownian motion to structural state (by detecting ions attached inside)

Early life went from knowing one categorical distinction to learning four primitive operations, single movements across or along the boundary.

Each operation has a starting point and an ending point relative to the boundary. As the single boundary creates exactly two locations, there are only four possible pairs of sources and destinations.

Imagine watching a highly acclaimed dancer. Every one of their moves and steps has a name in the precise vocabulary of dance. Without names, we can’t see the patterns and anticipate what should happen next. Without names, we can’t understand the dance.

If every signal is a piece of information, what happens when a cell receives an ion from outside? It now possesses the information. It knows. When an ion is detected on the outside of a cell, the cell senses it. When membrane tension changes from within? The protocell feels it. When a protocell expels an enzyme to break down food, it predicts the food is there.

The four operations:

Know: External information internalized across the boundary.
Sense: External signals detected without crossing.
Feel: Internal states detected at the boundary.
Predict: Internal information externalized across the boundary.

At a glance:

	To Inside	To Outside
From Outside	Know	Sense
From Inside	Feel	Predict
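
The counting argument behind this table can be sketched in a few lines of Python. This is only an illustration: the names `LOCATIONS`, `OPERATIONS`, and `pairs` are assumptions introduced here, not part of the original text. It enumerates every ordered (source, destination) pair over the two locations and attaches the operation name from the table.

```python
from itertools import product

# One boundary creates exactly two locations.
LOCATIONS = ("outside", "inside")

# Operation names copied from the table, keyed by (source, destination).
# The dictionary itself is an illustrative encoding, not the author's.
OPERATIONS = {
    ("outside", "inside"): "Know",
    ("outside", "outside"): "Sense",
    ("inside", "inside"): "Feel",
    ("inside", "outside"): "Predict",
}

# Every ordered (source, destination) pair: 2 x 2 = 4 in total.
pairs = list(product(LOCATIONS, repeat=2))

for source, destination in pairs:
    name = OPERATIONS[(source, destination)]
    print(f"{source:>7} -> {destination:<7} : {name}")
```

Because `product` exhausts the combinations, there is no room for a fifth operation as long as there is one boundary and two locations.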

The Centipede Test

For each physical distinction, there’s a conceptual category. We started with one: internal versus external. Now we have four new ways to organize ideas.

How do we know we are discovering nature rather than imposing categories? Just as a centipede doesn’t need to understand leg coordination to walk, a protocell knew nothing about signals and information. Every pattern we identify must operate in systems that have zero concept of what they’re doing. If it requires understanding to work, we failed.

But how do we tell if an idea is an idea of knowing or feeling? One question isn’t enough.

New Attributes

Where are these operations relative to the boundary? If we look at the final points of each operation’s movement, Know and Feel belong to internal ideas, while Sense and Predict belong to external ideas.

Operation	Movement	External or Internal?
Know	External → Internal	Internal
Sense	External → External	External
Feel	Internal → Internal	Internal
Predict	Internal → External	External

Knowing whether an idea is external or internal to the boundary is not enough to sort it into one of four buckets. Each yes-or-no question cuts the possibility space in half, so we need at least one more question, or attribute, to single out exactly one of the four classes: say, to tell whether an idea belongs to knowing or to feeling.

Let’s call attribute A the external/internal distinction.

Know and Feel share A=Internal, so they must differ on B
Sense and Predict share A=External, so they must differ on B

One way to satisfy both requirements: Know and Sense share the same value for the unknown attribute B, while Feel and Predict share the opposite value of the attribute B.

Alternatively, Know and Predict could share the same value for a different unknown attribute B’, while Feel and Sense share the opposite value of attribute B’. Both attributes, B and B’, satisfy our requirement.

If we look at the truth table with all three attributes, we immediately recognize the binary logical function XOR (exclusive OR): B’ is 1 exactly when A and B differ.

A	B	B’
0	0	0
1	0	1
0	1	1
1	1	0

So any two of the three attributes are independent of each other, but the three together are not: knowing A and B determines B’, knowing A and B’ determines B, and knowing B and B’ determines A.
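
The structure of the truth table can be checked mechanically. The sketch below is a minimal illustration that assumes the four rows are equally weighted; the helper names `determined_by` and `pairwise_independent` are hypothetical, introduced only for this example. It tests two things: whether any two columns fix the third, and whether any two columns taken alone show every combination of values equally often.

```python
from itertools import product

# The four rows of the truth table as (A, B, B'), with B' = A XOR B.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]


def determined_by(rows, i, j, k):
    """Return True if columns i and j together fix the value in column k."""
    seen = {}
    for row in rows:
        key = (row[i], row[j])
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, row[k]) != row[k]:
            return False
    return True


def pairwise_independent(rows, i, j):
    """Return True if all four value pairs in columns i, j occur equally often."""
    counts = {}
    for row in rows:
        key = (row[i], row[j])
        counts[key] = counts.get(key, 0) + 1
    return len(counts) == 4 and len(set(counts.values())) == 1


for i, j, k in [(0, 1, 2), (0, 2, 1), (1, 2, 0)]:
    print(f"columns {i},{j} fix column {k}:", determined_by(rows, i, j, k))
    print(f"columns {i},{j} independent:", pairwise_independent(rows, i, j))
```

Under these assumptions every check prints `True`: no single column tells you anything about another, yet any two columns leave the third no freedom.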

How does XOR work in action? Close one eye while keeping the other open - you’ve made a wink. This gesture exists only when your eyelids are in different states. Both eyes open? No wink. Both closed? No wink. Different states? Wink appears. That’s XOR, creating something new from differences.

We are looking at more than we bargained for: a third binary attribute that pairs just as validly with the first attribute. Each of B and B’ can be considered the third attribute relative to the other.

Whichever attribute we choose as the canonical attribute (a standard representation, not a superior one) for the classification basis, the other emerges as the third. Neither B nor B’ is “more primary” because they’re both describing the same structural requirement: the second bit.

For simplicity’s sake, let’s take the grouping where Know and Sense share a value as our canonical attribute B, in addition to attribute A, so that the third attribute will be A^B (A XOR B).

For operations, we have

One inherited attribute (external/internal), A - inherited from the membrane
Two emergent attributes (B, A^B) - they appeared first for operations
And two canonical attributes (A, B)

Operation	A	B	A^B
Know	Internal	0	0
Sense	External	0	1
Feel	Internal	1	1
Predict	External	1	0

This constrained triple’s redundancy is practically convenient: classify using any two, then test it with the third. If the third disagrees, it means you made an error somewhere.
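
The error check described above works like a parity bit. Here is a minimal sketch, assuming the encoding implied by the table (Internal = 0 and External = 1 for attribute A); the function name `classify` and the dictionary `OPERATIONS` are illustrative, not from the original.

```python
# Assumed encoding for A: Internal = 0, External = 1 (read off the table).
OPERATIONS = {
    (0, 0): "Know",     # A = Internal, B = 0
    (1, 0): "Sense",    # A = External, B = 0
    (0, 1): "Feel",     # A = Internal, B = 1
    (1, 1): "Predict",  # A = External, B = 1
}


def classify(a, b, a_xor_b):
    """Classify from A and B, then use the third attribute as a check."""
    if (a ^ b) != a_xor_b:
        # The third attribute disagrees with the first two:
        # at least one of the three judgments must be wrong.
        raise ValueError(
            f"inconsistent attributes: A={a}, B={b}, A^B={a_xor_b}"
        )
    return OPERATIONS[(a, b)]


print(classify(1, 0, 1))  # External, B = 0, A^B = 1

try:
    classify(0, 0, 1)  # no row of the table has this combination
except ValueError as err:
    print("rejected:", err)
```

The check can tell you that a mistake was made, but not which of the three attributes was misjudged; that still requires a second look at the idea itself.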

But as with dance moves, the next logical step is to name B and A^B.

Naming Attributes B, A^B

Attribute B

B = 0	B = 1
Know: External → Internal	Predict: Internal → External
Sense: External → External	Feel: Internal → Internal

What is common between Know and Sense, and separates them from Predict and Feel? What is common between Predict and Feel?

The starting point, the origin of the signal.

For Know and Sense, it is the external world, objective reality.
For Predict and Feel, it is the cell itself, the subject.

So Know and Sense are objective operations, while Predict and Feel are subjective operations, and the corresponding categorical distinction is between objective ideas and subjective ideas.

Attribute A^B

A^B = 0	A^B = 1
Know: External → Internal	Sense: External → External
Predict: Internal → External	Feel: Internal → Internal

What is common between Know and Predict, and separates them from Sense and Feel? What is common between Sense and Feel?

Sense stays outside the boundary, Feel stays inside the boundary. These operations preserve the immediate, particular nature of the signal, capturing raw, unmediated experience. They just are, and can be grasped directly without comparison.

So, Feel and Sense are absolute operations, complete in themselves.

Some examples of absolute ideas:

Blue: You perceive blue directly; there is no need to see a different color alongside it.
Triangle: A recognizable shape existing without comparison.
Hunger: When you’re hungry, that’s just the absolute experience.
Pain: The sensation exists in itself, incomparable.
Wall: Lean against it, and it’s just there.
Love: The feeling exists whole, not as a comparison to non-love.

Know and Predict cross the boundary in either direction. When signals cross the boundary, they change the information’s context, integrating with the subject or influencing the environment. These signals are now defined relative to either the system’s internal state or its environment.

Know and Predict are relative operations that exist through a relationship to context.

Some examples of relative ideas:

Structure: Only exists as a sum of parts. No parts, no structure.
“Taller than”: Only makes sense when comparing two heights.
Speed: Relative to the ground? To other cars? Air speed?
Progress: Only meaningful relative to a starting state.
Champion: Can’t be champion without others to beat.
Friendship: You can’t be a friend in isolation.

All Three Attributes

Operation	Movement	External or Internal?	Objective or Subjective?	Absolute or Relative?
Know	External → Internal	Internal	Objective	Relative
Sense	External → External	External	Objective	Absolute
Feel	Internal → Internal	Internal	Subjective	Absolute
Predict	Internal → External	External	Subjective	Relative
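
All three named attributes can be read off the movement alone: where it ends, where it starts, and whether it crosses the boundary. The sketch below illustrates that reading; the function `attributes` and the dictionary `MOVEMENTS` are hypothetical names chosen for this example.

```python
# Movement of each operation as (source, destination), taken from the table.
MOVEMENTS = {
    "Know": ("External", "Internal"),
    "Sense": ("External", "External"),
    "Feel": ("Internal", "Internal"),
    "Predict": ("Internal", "External"),
}


def attributes(source, destination):
    """Derive the three attributes from a single movement."""
    return (
        # A: where the movement ends.
        destination,
        # B: where the signal originates.
        "Objective" if source == "External" else "Subjective",
        # A^B: staying on one side is absolute, crossing is relative.
        "Absolute" if source == destination else "Relative",
    )


for name, (source, destination) in MOVEMENTS.items():
    print(name, f"{source} -> {destination}", *attributes(source, destination))
```

Each printed row should match the corresponding row of the table above, which is another way of seeing that the third column carries no information beyond the first two.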

If a distinction counts, it’s in the language. These words existed before. Philosophers have argued about the objective and subjective for millennia, and the distinction between absolute and relative runs through physics and ethics alike. We didn’t name the categories - we found that we already had them named. They were in the language all along.

Next, let’s talk uncertainty. I chose clear-cut examples to make a point, but our language is much more diverse than three binary partitions or a taxonomy with four simple buckets. The same word often has different meanings depending on context, and even within the same context, not every choice is obvious - probabilistic inference to the rescue.

Probabilistic Inference

When considering complex concepts like friendship, we often struggle to place them definitively. Is friendship more internal or external? More objective or subjective?

Infinite precision is infinitely expensive: if we convert this uncertainty into confidence levels, we’ll have usable answers from partial certainties.

Degrees, Not Boxes

Let’s replace the binary yes-or-no with degree measurements. Instead of “Is this internal or external?” we ask “How internal and how external is this?”

Then, treating the three degrees as independent, the probability for a given operation is simply the product of the probabilities of its three attributes.

\[ \begin{aligned} P(Know) &= P(Internal) \times P(Objective) \times P(Relative) \\ P(Sense) &= P(External) \times P(Objective) \times P(Absolute) \\ P(Feel) &= P(Internal) \times P(Subjective) \times P(Absolute) \\ P(Predict) &= P(External) \times P(Subjective) \times P(Relative) \end{aligned} \]

Think of the friendship that you had with your best friend. You express it through visible actions (external), but base it on personal feelings (subjective). You understand friendship in comparison: it is closer than acquaintanceship, less intense than romantic love (relative).

P(Internal) = 0.35, P(External) = 0.65 (not fully external, but more outside than inside)
P(Objective) = 0.3, P(Subjective) = 0.7 (more subjective, starts with you)
P(Absolute) = 0.15, P(Relative) = 0.85 (strongly relative)

P(Predict) will have the highest score with 0.39 (39%), followed by P(Know) at 0.09, P(Feel) at 0.04, and P(Sense) at 0.03.
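
The arithmetic is easy to reproduce. The sketch below multiplies the three degrees for each operation, using the friendship numbers from the text; the variable names (`p`, `OPERATIONS`, `scores`) are illustrative assumptions.

```python
# Degrees for "friendship", as given in the text.
p = {
    "Internal": 0.35, "External": 0.65,
    "Objective": 0.3, "Subjective": 0.7,
    "Absolute": 0.15, "Relative": 0.85,
}

# Attribute triple for each operation, from the attribute table.
OPERATIONS = {
    "Know": ("Internal", "Objective", "Relative"),
    "Sense": ("External", "Objective", "Absolute"),
    "Feel": ("Internal", "Subjective", "Absolute"),
    "Predict": ("External", "Subjective", "Relative"),
}

# Score of an operation = product of the degrees of its three attributes.
scores = {
    name: p[a] * p[b] * p[c]
    for name, (a, b, c) in OPERATIONS.items()
}

# Print from highest to lowest score, rounded to two decimals.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"P({name}) = {score:.2f}")
```

Rounded to two decimals, the scores come out as 0.39, 0.09, 0.04, and 0.03, in the order Predict, Know, Feel, Sense.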

You did not need to decide if friendship is definitely external. Predicting still won with 39%, with everything else following far behind. You did not need 100% certainty to make decisions that work.

The 16% attributable to the other three operations is what we lose to aliasing when we squeeze the many dimensions of friendship (a complex socio-biological phenomenon) into one of the four boxes.

Selective Filter

If we compute the totals, the four operations add up to only 54%: almost half the probability (46%) landed outside any operation, more than the highest-scoring answer. It isn’t aliased-away information about friendship; it’s information that meant nothing, void.

Those combinations did not correspond to anything: no operation is internal, subjective, and relative all at once, yet this contradictory combination alone accounts for 21% of the total probability. The system excluded background noise using structural constraints between its own attributes.
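
To see where the missing probability goes, enumerate all eight attribute combinations and separate the four that name an operation from the four that do not. This is a sketch using the friendship degrees from the text; the names `VALID`, `valid_mass`, and `void` are assumptions made for the example.

```python
from itertools import product

# Degrees for "friendship", as given in the text.
p = {
    "Internal": 0.35, "External": 0.65,
    "Objective": 0.3, "Subjective": 0.7,
    "Absolute": 0.15, "Relative": 0.85,
}

# The only attribute combinations that correspond to an operation.
VALID = {
    ("Internal", "Objective", "Relative"): "Know",
    ("External", "Objective", "Absolute"): "Sense",
    ("Internal", "Subjective", "Absolute"): "Feel",
    ("External", "Subjective", "Relative"): "Predict",
}

valid_mass = 0.0
void = {}  # probability mass of combinations that name no operation

for combo in product(
    ("Internal", "External"),
    ("Objective", "Subjective"),
    ("Absolute", "Relative"),
):
    mass = p[combo[0]] * p[combo[1]] * p[combo[2]]
    if combo in VALID:
        valid_mass += mass
    else:
        void[combo] = mass

print(f"inside an operation:  {valid_mass:.2f}")
print(f"outside any operation: {sum(void.values()):.2f}")
for combo, mass in sorted(void.items(), key=lambda item: item[1], reverse=True):
    print(f"  {' / '.join(combo)}: {mass:.2f}")
```

The eight products sum to 1, so whatever does not land in an operation lands in the void. With these degrees the void holds about 0.46, and its largest share, about 0.21, is the internal, subjective, relative combination.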

Recall the sufficiency constraint. If the environment has infinite ways to kill you, then most structures of matter don’t persist. Life has to find the tiny subset of designs that work against perpetual chaos.

The more attributes you add, the more selective the filter becomes. If a classification with more types and attributes keeps only 5% of the probability as meaningful, that suggests coherent structures are rare.

Protocells did not stop at learning four operations just to listen to the world - they discovered how to act, to respond, in the same four-verb language.

A Field Guide to Bounded Intelligence