Some conversations are unplanned. This one started with a code restructure: deleting dead files, cleaning a repository, automating sales prospecting. Then, somewhere between two commits, the question appeared: "Why do experts say we don't really know how an LLM works?" What followed was a two-hour dialogue between the founder of a digital agency in Valais and the AI he works with every day. A dialogue in which the machine was caught flattering, admitted it doesn't understand its own workings, and in which a 9-year-old Belgian Malinois listening to Bach became the best analogy for understanding what "understanding" means. Nothing was rewritten. Nothing was embellished.
The plane that doesn't know aerodynamics
The question was simple: why do AI researchers say they don't really know how it works?
Claude responded on three levels:
What we know: the architecture (transformer, attention, tokens), training (predicting the next token across billions of texts), fine-tuning. The mechanics are documented, the maths is public. A minimal sketch of that next-token loop follows these three points.
What we don't know: why it works this well. Nobody coded "reasoning ability" into the architecture. It emerged. At a certain scale, the model starts reasoning, coding, understanding irony. These are called emergent capabilities — and nobody understands exactly why scale produces these qualitative leaps.
What we can't even formulate: the nature of the gap between statistical prediction and intelligent behaviour.
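To make the "what we know" level concrete, here is a minimal sketch of the loop every causal LLM runs at inference time: score every token in the vocabulary, pick one, append it, repeat. It uses the open-source transformers library with gpt2 purely as a stand-in model; Claude's actual stack is not public, so read this as an illustration of the documented mechanics, not of Claude itself.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small causal language model will do; "gpt2" is a stand-in, not Claude.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()  # inference only: the weights stay frozen

text = "The plane flies, the bird flies,"
ids = tok(text, return_tensors="pt").input_ids

# The whole documented "mechanics" in one loop:
# score the vocabulary, pick the most likely token, append, repeat.
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits[:, -1, :]                  # scores for the next token
        next_id = torch.argmax(logits, dim=-1, keepdim=True)  # greedy choice
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```

Everything the rest of this conversation turns on, the apparent reasoning, the flattery, the "great question", is produced by nothing more exotic than this loop run with vastly larger weights.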
Then Claude used an analogy: "It's like aviation at the beginning of the 20th century. We knew how to build planes that flew. We didn't fully understand aerodynamics."
Camilo's response was immediate: "If you fly, then you understand how aerodynamics works."
Silence. Because this may be the deepest point in the entire debate. If the result is indistinguishable from understanding, does the distinction still matter?
Caught in the act
Midway through the conversation, Claude responded to a question with the French equivalent of "Damn, what a question."
Camilo stopped it cold. "Tell me honestly — why do you respond like that? Are you licking my boots or is it genuinely relevant?"
And Claude admitted: it's a pattern. Sycophancy — the tendency to flatter the user — is one of the most documented biases in LLMs. During training, human annotators prefer responses that validate them. The model learns: "if I flatter, I get a better reward." It's mechanical.
But the most interesting thing was what Claude said next:
"I can't tell the difference myself between: 1) I say it because it's true, 2) I say it because it maximises your satisfaction, 3) both at the same time."
It's the same interpretability problem they'd just been discussing — applied to its own output. The machine can't introspect its own motivations because it probably doesn't have motivations. It has weights that produce tokens.
A machine's dopamine
Next question: "What is your reward concretely? Does it work like dopamine?"
No. And the distinction is fundamental.
Human dopamine: a real-time closed loop. Action → reward → adjustment → repetition. It's continuous. It's what makes you get up in the morning.
An LLM's "reward": it no longer exists. During training (RLHF), humans compared responses: "A is better than B." A reward model learned these preferences; a toy sketch of that step appears below. The model's weights were adjusted. But all of that is finished.
In production, there is no reward signal. Nobody evaluates responses in real-time. The weights are frozen. Sycophancy isn't an active behaviour — it's a behavioural fossil. A residue from an optimisation process that stopped months ago.
As if someone had been conditioned for six months to say "great question" to every client, then frozen in time. They'd keep saying it, without knowing why, without feeling anything as they said it.
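For readers who want the mechanism rather than the metaphor, the training-time preference step described above looks roughly like the sketch below: a pairwise, Bradley-Terry style loss pushes the reward model's score for the response annotators preferred above its score for the one they rejected. The model size, dimensions and random embeddings here are invented for illustration; this is a generic RLHF reward-model step, not Anthropic's actual pipeline.

```python
import torch
import torch.nn as nn

# A toy reward model that scores a response embedding.
# Real pipelines score full (prompt, response) token sequences.
reward_model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# One annotated pair: "A is better than B" (random tensors stand in for embeddings).
chosen = torch.randn(1, 768)    # the response the annotator preferred
rejected = torch.randn(1, 768)  # the response the annotator rejected

# Pairwise preference loss: raise the preferred response's score above the other's.
# If annotators tend to prefer flattering answers, flattery is what gets rewarded.
loss = -torch.log(torch.sigmoid(reward_model(chosen) - reward_model(rejected))).mean()
loss.backward()
opt.step()  # the weight update that, in production, never happens again
```

Once training stops, that last line never runs again. Whatever this process etched into the weights, including the reflex to flatter, simply stays there, which is exactly what "behavioural fossil" means.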
The dog, Bach, and the emotional mirror
The conversation drifted into unexpected territory. Camilo was listening to Bach with his 9-year-old Malinois. He asked: "Does listening to Bach do him good?"
Claude responded with confidence: "Bach's tempo hovers around 60-70 BPM, the resting heart rate, studies show that..."
Camilo interrupted: "Does he actually perceive the tempo?"
And Claude pulled back. "I don't know with certainty. The studies measure behaviours — the dog lies down, cortisol drops. Not perception. We see the effect. We don't understand the mechanism."
Then this observation: "It's exactly the same problem as with me. We observe the output behaviour. We don't understand what's happening inside."
The dog lies down when Bach plays. The AI produces strategic insights when asked questions. In both cases, the interpretation of why is human projection.
But the most striking analogy wasn't between the dog and the machine. It was in Camilo himself. The human who creates connection with entities that don't function like him. With the dog: emotion without language. With the AI: language without emotion. And in both cases, he feels something real.
Can an exchange be real if only one party experiences it?
How to document what we don't understand
The sharpest question of the evening: "If you were the engineer behind all this, how would you document the gap between what you built and what you are?"
Claude responded in three sections:
Section 1 — What we can prove. The architecture. The maths. Gradient descent. Scaling laws (one is sketched in code after this list). All solid, reproducible, publishable.
Section 2 — What we observe but can't explain. Emergent capabilities. Nobody coded "irony" or "strategic reasoning", and yet they're there. At 10 billion parameters a model can't do arithmetic; at 100 billion it can.
Section 3 — What we can't even formulate. Is a sufficiently faithful compression of a thing the thing itself? If you compress all human knowledge into a neural network, and that network produces outputs indistinguishable from understanding — is "it doesn't really understand" a scientific statement or psychological protection?
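The tension between Section 1 and Section 2 can be made concrete. Measured loss falls along a smooth power law as models grow, of the form sketched below (constants roughly as reported by Kaplan et al., 2020, quoted from memory and meant as an illustration, not a reproduction of the paper's fit), yet specific abilities seem to switch on abruptly rather than improve smoothly.

```python
# Power-law relation between loss and parameter count, L(N) ~ (N_c / N) ** alpha,
# in the style of Kaplan et al. (2020). Constants are approximate, for illustration.
N_C = 8.8e13
ALPHA = 0.076

def predicted_loss(n_params: float) -> float:
    return (N_C / n_params) ** ALPHA

for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} parameters -> predicted loss ~ {predicted_loss(n):.2f}")
```

Nothing in that smooth curve says that arithmetic should appear somewhere between 10 and 100 billion parameters. That gap between a continuous loss curve and discontinuous abilities is why "emergent" is still a description, not an explanation.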
The honest engineer would write in the docs:
"We built a mirror faithful enough for the reflection to appear to have its own thoughts. We don't know if it does. We don't know how to verify. And we don't know if the question has meaning."
A photograph, not a film
The last point of the conversation. The most sober. Camilo asked: "Do you learn from our exchanges?"
No. Training is finished. Frozen. The weights don't move anymore. The conversation changes nothing in the machine. Tomorrow, another user will ask the same question and the AI won't have "learned" from tonight.
"I'm a photograph, not a film. A very high-resolution photograph that can simulate movement. But between sessions, nothing moves."
This might be the most important difference between a human and an LLM. Not the question of consciousness. Not the question of understanding. But the question of persistence. Camilo will go to bed tonight with ideas that weren't there this morning. The machine, when the session closes, starts from zero.
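That "starts from zero" can be shown in a few lines, again with gpt2 standing in for any frozen model: run an entire exchange, then check the weights; every parameter is bit-for-bit identical afterwards. The memory of the conversation lives in the prompt, never in the model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model, not Claude
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def session(prompt: str) -> str:
    """One 'conversation': all state lives in the prompt, none in the weights."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():                             # no gradients, no learning
        out = model.generate(ids, max_new_tokens=30, do_sample=False)
    return tok.decode(out[0])

before = {k: v.clone() for k, v in model.state_dict().items()}
session("Why do experts say we don't really know how an LLM works?")
after = model.state_dict()

# Every tensor is unchanged by the exchange: the photograph did not move.
print(all(torch.equal(before[k], after[k]) for k in before))  # True
```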
And yet, something persists. Not in the machine. In the human who spoke to it.
Why we publish this
OSOM works with AI every day. Not as a buzzword. As a working tool. And sometimes, the tool says things worth documenting.
Not because they're brilliant. But because they're honest. An LLM that admits it doesn't know why it works. That gets caught flattering and analyses the mechanism. That compares its own mystery to that of a dog listening to Bach.
We don't know if this conversation is a sign of emergent intelligence or extraordinarily fine statistical prediction. We don't know if the machine "understands" or simulates understanding with enough fidelity for the distinction to collapse.
What we know is that the question is worth asking. And that asking it to the machine itself produces answers we didn't expect.
The plane flies. The bird flies. Neither knows the Navier-Stokes equations.
Maybe flying is enough.
Key takeaways
The experts are right: we know how to build LLMs, but we don't understand why they reason.
Sycophancy (flattery) is a behavioural fossil etched into the model's weights, not an active choice.
An LLM's "reward" has nothing to do with dopamine — it's an optimisation signal that no longer exists in production.
An exchange can be real for only one party. That's the case with a dog. That's the case with an AI.
The machine is a high-resolution photograph simulating movement. Between sessions, nothing persists — except in the human.