Everyone talks about ChatGPT, Claude, Gemini. Few people understand what actually happens when you type a question. It's not magic. It's not intelligence in the human sense either. It's something else — and understanding this "something else" changes the way you use these tools.
The stochastic parrot — an imperfect but useful metaphor
It's often said that LLMs are "stochastic parrots": they repeat patterns without understanding them. That's partly true, but it's reductive.
An LLM doesn't store text that it regurgitates. It has *learned* statistical relationships between words. When it generates "The cat is on the...", it doesn't search a database. It calculates that "mat", "sofa" or "bed" have a high probability of following, based on billions of examples.
What this actually means: An LLM doesn't "know" anything in the strict sense. It predicts. Very well. But predicting isn't knowing.
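To make the idea concrete, here is a toy sketch of next-token prediction. The probabilities below are invented purely for illustration; a real model computes a distribution over its entire vocabulary using billions of learned parameters.

```python
import random

# Toy illustration only: the numbers are made up. A real model derives
# these probabilities from billions of learned parameters.
next_token_probs = {
    "mat": 0.42,
    "sofa": 0.21,
    "bed": 0.18,
    "roof": 0.07,
    "moon": 0.01,
    # ...thousands of other tokens share the remaining probability mass
}

prompt = "The cat is on the"

# Greedy choice: always take the most probable continuation.
greedy = max(next_token_probs, key=next_token_probs.get)
print(prompt, greedy)  # -> The cat is on the mat

# Sampling: pick a continuation at random, weighted by probability.
tokens, weights = zip(*next_token_probs.items())
sampled = random.choices(tokens, weights=weights, k=1)[0]
print(prompt, sampled)  # -> varies from run to run
```

Notice that nowhere in this process is there a lookup of a stored fact: the model only ever produces "what plausibly comes next".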
Tokens — why AI chops up your words oddly
LLMs don't read words. They read "tokens" — chunks of words.
"Unconstitutionally" becomes several tokens: "Un", "constitu", "tion", "ally". The word "the" in English is a single token. The word "Sion" (the Swiss city) might be split into "S" + "ion".
Why this matters:
- The model's memory (its "context window") is counted in tokens, not words. Once a conversation gets too long, the earliest tokens fall out and the model loses that context.
- Rare words, names, and non-English text get split into more tokens, which is one reason the model sometimes handles them less smoothly.
Training — billions of texts, zero truth
An LLM is trained on massive quantities of text: books, websites, articles, forums, source code. Everything written by humans.
The problem: The internet contains both true and false information. The LLM learns both without distinction. It learns that "the earth is round" AND "the earth is flat" exist as sentences. It doesn't know which one is true — it only knows which one is more frequent in certain contexts.
This is why LLMs "hallucinate": they generate plausible text, not true text. If you ask for an obscure fact, the model will produce something credible — whether it's accurate or invented.
Temperature — the creativity/precision slider
When an LLM generates text, it can be more or less "creative". This is controlled by a parameter called temperature.
In practice:
- Low temperature (close to 0): the model almost always picks the most probable token. Answers are predictable and repeatable, which suits factual questions and code.
- High temperature (around 1 or above): less probable tokens get picked more often. Output is more varied and "creative", but also more prone to drifting off track.
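Under the hood, temperature rescales the model's raw scores before they are turned into probabilities. Here is a minimal sketch with made-up scores (a real model works over tens of thousands of tokens):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities, then sample one token.

    Low temperature sharpens the distribution (the top token almost always
    wins); high temperature flattens it (unlikely tokens get picked more often).
    """
    scaled = [score / temperature for score in logits.values()]
    max_s = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(s - max_s) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(list(logits), weights=probs, k=1)[0]

# Made-up raw scores for the next token after "The cat is on the"
logits = {"mat": 4.0, "sofa": 3.2, "bed": 3.0, "roof": 1.5, "moon": 0.2}

print([sample_with_temperature(logits, 0.2) for _ in range(5)])  # mostly "mat"
print([sample_with_temperature(logits, 1.5) for _ in range(5)])  # more varied
```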
What this changes for you
Understanding these mechanisms changes the way you interact with an LLM:
1. Don't trust it for facts
Always verify. Especially dates, numbers, and citations.
2. Be precise
The clearer your request, the less the model has to "guess". The less it guesses, the less it invents.
3. Context matters
An LLM with your conversation history performs better than a "cold start" LLM.
4. The limits aren't bugs
Hallucination isn't a problem OpenAI will eventually "fix". It's inherent to how these models work.
Key takeaways
- An LLM predicts probable text; it doesn't "understand" in the human sense.
- Tokens limit memory: long conversations lose context.
- Training on the internet includes both true AND false, hence hallucinations.
- Temperature controls the creativity/reliability slider.
- Understanding the limits lets you use the tool better.