Kevin Champlin

Glossary

The vocabulary of frontier AI.

Every metric on the chat side rail, every benchmark on the dashboard, every concept in the editorial all link back here. Pedagogy is the spine of this site.

Tokens

tokens

Tokens are the units a language model reads and writes. Roughly four characters of English per token, give or take. Every cost and every limit on this site is denominated in tokens, not words or chara...

token, input tokens, output tokens
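
The four-characters-per-token rule of thumb can be turned into a quick estimator. This is a minimal sketch, not a tokenizer: `estimate_tokens` and `CHARS_PER_TOKEN` are illustrative names, and real token counts depend on the model's tokenizer and on the language of the text.

```python
import math

# Rough English average taken from the definition above; an approximation only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for English text (hypothetical helper)."""
    if not text:
        return 0
    # Round up so short strings never estimate to zero tokens.
    return math.ceil(len(text) / CHARS_PER_TOKEN)


sample = "Tokens are the units a language model reads and writes."
print(estimate_tokens(sample))
```

An estimate like this is only good for budgeting; for billing, rely on the token counts the API itself reports.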

Context window

context-window

The context window is the maximum amount of text, in tokens, that a model can consider in a single request. Everything you send in plus everything the model writes out has to fit. Larger windows enabl...

context length, max context, context size
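
Because input and output share the same window, a request can be checked against it before it is sent. The sketch below assumes you already have token counts for the input; the function names and the 200,000-token window in the usage lines are hypothetical.

```python
def fits_in_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """True if the input plus the requested output cap fits in the window."""
    return input_tokens + max_output_tokens <= context_window


def output_budget(input_tokens: int, context_window: int) -> int:
    """Tokens left for the model's output after the input is counted."""
    return max(0, context_window - input_tokens)


WINDOW = 200_000  # hypothetical window size, for illustration only

print(fits_in_context(150_000, 4_000, WINDOW))
print(fits_in_context(198_000, 4_000, WINDOW))
print(output_budget(198_000, WINDOW))
```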

Prompt caching

prompt-caching

Prompt caching is a feature that lets the model store a static prefix of your prompt (system instructions, retrieved documents, conversation history) and re-read it at a fraction of the normal input c...

prompt cache, cache

Cache read

cache-read

Cache read tokens are input tokens that were served from the prompt cache rather than processed fresh. They are billed at roughly 10 percent of normal input price. A high cache read count on a turn me...

cache read tokens, cache_read_input_tokens
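
The effect of the roughly 10 percent rate on a turn's input bill can be worked out directly. This is a sketch under stated assumptions: the multiplier comes from the definition above, while `input_cost` and the price of 3.00 per million input tokens are made up for illustration.

```python
# "Roughly 10 percent of normal input price", per the definition above.
CACHE_READ_MULTIPLIER = 0.10


def input_cost(fresh_tokens: int, cache_read_tokens: int, price_per_million: float) -> float:
    """Input cost of one turn, splitting fresh tokens from cache reads."""
    fresh = fresh_tokens * price_per_million / 1_000_000
    cached = cache_read_tokens * price_per_million * CACHE_READ_MULTIPLIER / 1_000_000
    return fresh + cached


PRICE = 3.00  # hypothetical price per million input tokens

# 10,000 input tokens, of which 9,000 were served from the cache.
with_cache = input_cost(1_000, 9_000, PRICE)
without_cache = input_cost(10_000, 0, PRICE)
print(f"{with_cache:.4f} vs {without_cache:.4f}")
```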

Cache creation

cache-creation

Cache creation tokens are input tokens that were written to the prompt cache for reuse on later turns. They cost slightly more than normal input on the way in (roughly 25 percent premium), but every s...

cache creation tokens, cache_creation_input_tokens
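
Whether the write premium pays off depends on how often the cached prefix is reused. The sketch below combines the roughly 25 percent premium with the roughly 10 percent read rate from the Cache read entry, and assumes the cache entry does not expire between uses; the function names are illustrative.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # roughly 25 percent premium on the first send
CACHE_READ_MULTIPLIER = 0.10   # roughly 10 percent on each later read


def relative_cost_with_cache(uses: int) -> float:
    """Cost of sending one prefix `uses` times, in units of one uncached send."""
    if uses < 1:
        raise ValueError("uses must be at least 1")
    # One cache write, then (uses - 1) cache reads.
    return CACHE_WRITE_MULTIPLIER + (uses - 1) * CACHE_READ_MULTIPLIER


def break_even_uses(max_uses: int = 100):
    """Smallest number of uses at which caching is cheaper than not caching."""
    for uses in range(1, max_uses + 1):
        # Without caching, each use costs exactly one uncached send.
        if relative_cost_with_cache(uses) < uses:
            return uses
    return None


print(break_even_uses())
print(relative_cost_with_cache(10))
```

Under these approximate rates, a prefix that is sent only once costs more with caching than without, and the second use already recovers the premium.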

Stop reason

stop-reason

The `stop_reason` field in an API response tells you why the model stopped generating. Common values: `end_turn` (the model decided it was done), `max_tokens` (it hit the output cap and was cut off), `s...

stop_reason, end reason
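
A caller will usually branch on this field. The sketch below assumes the response has already been parsed into a dictionary with a `stop_reason` key, which is an assumption about the response shape; `describe_stop` is a hypothetical helper. It handles only the two values spelled out above and passes anything else through.

```python
def describe_stop(response: dict) -> str:
    """Classify a parsed API response by its stop_reason (illustrative helper)."""
    reason = response.get("stop_reason")
    if reason == "end_turn":
        # The model finished on its own; the output is complete.
        return "complete"
    if reason == "max_tokens":
        # The output cap was hit; the text is cut off mid-answer.
        return "truncated"
    # Any other value is surfaced unchanged rather than guessed at.
    return f"other: {reason}"


print(describe_stop({"stop_reason": "end_turn"}))
print(describe_stop({"stop_reason": "max_tokens"}))
```

Treating `max_tokens` as a distinct case matters because a truncated answer can look complete at a glance.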

Latency to first token

latency-to-first-token

Latency to first token (TTFT) is the time between sending a request and seeing the first token of the response start to stream back. It is what makes a chat interface feel "fast" or "slow" to a user,...

TTFT, first token latency, time to first token
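
TTFT can be measured by timing how long a streamed response takes to yield its first chunk. This is a minimal sketch: `measure_ttft` is a hypothetical helper, and `fake_stream` stands in for a real streaming client, which is not shown here.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_ttft(start_stream: Callable[[], Iterable[str]]) -> Tuple[Optional[float], str]:
    """Return (seconds until the first chunk, full text) for a streamed response."""
    started = time.perf_counter()
    ttft = None
    chunks = []
    for chunk in start_stream():
        if ttft is None:
            # The first chunk has arrived; record the elapsed time once.
            ttft = time.perf_counter() - started
        chunks.append(chunk)
    return ttft, "".join(chunks)


def fake_stream():
    """Stand-in for a streaming API call: waits, then yields two chunks."""
    time.sleep(0.05)
    yield "Hello"
    yield ", world"


ttft, text = measure_ttft(fake_stream)
print(f"TTFT: {ttft:.3f}s, text: {text!r}")
```

The clock starts before the stream is opened, so connection setup counts toward the result, matching the definition above (from sending the request to the first token).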

Hallucination

hallucination

Hallucination is when a model confidently asserts something false. It is not a glitch or a bug; it is the natural output of a system trained to produce plausible text. The model has no internal flag d...

confabulation, making things up, factual error

Agentic

agentic

"Agentic" describes a system that can take multiple actions in sequence, use tools, observe outcomes, and adjust its plan to reach a goal. It is a descriptor of behavior, not a claim about consciousne...

agent, AI agent, autonomous agent

Consciousness

consciousness

Consciousness, in this glossary, refers specifically to subjective experience: the property of there being something it is like to be the system. Current AI systems show no evidence of having this pro...

subjective experience, sentience, awareness