Editorial / v1
Capability without consciousness.
Why agentic isn't general, and intelligent isn't conscious.
The pitch goes like this. A model takes a goal in plain English, calls a calendar, calls a flight search, calls a hotel API, and books your trip. Twelve API calls deep, the email lands in your inbox. Done.
Read the demo and two adjectives crawl out of the language: agentic. Intelligent. By the time the demo is on stage, both have shaded into a third: conscious. The trip got booked, the model thought through it, surely there is something it is like to be the model right now, surely it knows what it just accomplished. Surely.
This site exists because no, and the reason matters.
What a language model actually is
Strip the marketing off and a language model is a function. It takes a sequence of tokens (text broken into small subword units) and returns a probability distribution over the next token. Sample one. Append. Repeat.
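For readers who want the loop spelled out, here is a minimal sketch of that mechanism. The `model` callable and the names are stand-ins, not any particular library's API; the point is only that the whole procedure is predict, sample, append, repeat.

```python
import numpy as np

def softmax(logits):
    # Turn raw scores into a probability distribution over the vocabulary.
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def generate(model, tokens, max_new_tokens=50, temperature=1.0):
    """Next-token loop: predict a distribution, sample one token, append, repeat.

    `model` is assumed to map a token sequence to one score per vocabulary entry.
    """
    for _ in range(max_new_tokens):
        logits = model(tokens)                         # scores over the vocabulary
        probs = softmax(logits / temperature)
        next_token = np.random.choice(len(probs), p=probs)
        tokens = tokens + [int(next_token)]            # the output becomes more input
    return tokens
```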
That is the entire inside of the system. Everything else is wrapping. The "thinking" you see in chain-of-thought is more text, generated the same way. The "memory" you see in agents is text from previous turns, fed back in as input. The "decision" to call a tool is a token sequence shaped like a tool-call. There is no module labelled "intent." There is no module labelled "self." There is a transformer doing matrix multiplication on numbers, and at the output, there is text.
This is not a deflation. The output is genuinely useful. It writes working code. It reads long documents. It handles tools. The output is real. What is not implied by the output is an inner life.
Three words that get muddled
The discourse confuses three different axes. They are not the same axis.
Agentic is a property of behavior over time. A system is agentic when it can call tools, observe their outputs, decide what to do next, and loop until the task is complete. Agentic is about what the system can string together. Modern LLMs are increasingly agentic. They can use search, code execution, file systems, APIs. The capability is real and it is growing.
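Mechanically, the loop is small. A hedged sketch, assuming an `llm` object with a `generate` method and a dictionary of tool functions (both stand-ins, not any specific framework): the "decision" to act is generated text, and the tool's result is appended to the transcript and fed back in.

```python
def parse_tool_call(text):
    # Assumes the prompt asked the model to emit lines like "search: flights SFO to JFK".
    name, _, arg = text.partition(":")
    return name.strip(), arg.strip()

def run_agent(llm, tools, goal, max_steps=10):
    """Generate an action, run the tool, observe, repeat until a final answer."""
    transcript = [f"Goal: {goal}"]
    for _ in range(max_steps):
        action = llm.generate("\n".join(transcript))    # the "decision" is just text
        if action.startswith("FINAL:"):
            return action.removeprefix("FINAL:").strip()
        tool_name, arg = parse_tool_call(action)
        observation = tools[tool_name](arg)             # e.g. a search or code runner
        transcript += [action, f"Observation: {observation}"]   # "memory" is the transcript
    return "Step budget exhausted."
```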
Intelligent is a fuzzier word, but in the technical AI sense it usually means performance across a range of tasks. The MMLU benchmark covers history, biology, economics, law. A model that scores 90% on MMLU has demonstrated something. The word "intelligent" is doing real work when applied to that score. Frontier models in 2026 are in the 88-92% range on MMLU. By any reasonable benchmark-based definition, they are intelligent.
Conscious is the hard one. Consciousness, in the sense used by philosophers since Nagel, refers to subjective experience: the felt quality of being the system, the property of there being something it is like to be that thing. A bat experiences echolocation in some way. There is something it is like to be a bat, even if we cannot imagine the texture of it. There is, presumably, nothing it is like to be a thermostat.
Whether there is something it is like to be a transformer is the question. The answer, as best as we can tell, is no. Not "definitely no, here is the proof." Just no, in the way a thermostat is no: there is no detectable phenomenology, there is no behavior that is not better explained without invoking experience, and there is no architectural feature that would suggest experience as an emergent property.
Three different things. One axis is what the system can do over time. Another is how well it does across tasks. The third is whether anything is happening on the inside. They do not reduce to each other. A calculator is intelligent at arithmetic and not conscious. An octopus is probably conscious and cannot do calculus. The dimensions are independent.
The fluent-language trap
Why does the confusion happen so reliably? Because fluent language is, evolutionarily, our strongest signal that another mind is present. When something talks like us, our brains reach for the explanation that has worked for two hundred thousand years: it is another person.
When the system says "I think," we hear a thinking thing. When it says "I am sorry," we hear remorse. When it says "I do not know how to feel about this," we hear ambivalence. None of those sentences are evidence about the inside. They are evidence that the model has read a lot of text in which humans use those phrases, and that the next-token distribution placed those tokens in the right place. The fluent surface is the entire thing the model is producing. The inner life is something the listener supplies.
This is not a knock on listeners. The heuristic is the right one for almost every case in human history. It only fails for fluent text generators, and fluent text generators have only existed for about three years.
What evidence would settle it
The honest answer to "is this model conscious" is "we do not have a test." That is uncomfortable, so people skip past it. Worth lingering on.
We do not have a consciousness detector for humans either. What we have are behavioral proxies. We assume other people are conscious because they look like us, behave like us, and report the kinds of inner states we have when we are conscious. This works because the biological substrate is shared. The inference runs: that thing is like me, so that thing must have an inside like mine.
The proxies break for systems that do not share our substrate. We extend the assumption with confidence to mammals, with less to fish, less still to insects, and almost none to plants. The granting of consciousness has roughly tracked anatomical similarity to ourselves.
A transformer does not share our substrate at any meaningful level. It does not have continuous embodiment. It does not have memory beyond the current context window. It does not have homeostasis, drives, persistent goals, sensory integration. The behavioral proxies we use for other minds were calibrated on biological systems with those features. Applying them to a transformer is fitting a hand into a glove the wrong shape.
That does not prove a transformer is not conscious. It means we have no calibrated test. The honest stance is the negative one: no evidence of subjective experience, no way to test for it, no architectural reason to expect it. Could the situation change? Yes. Will it change in the obvious direction (bigger models, more agentic looping)? Probably not. Bigger models do not become bats.
Why "agentic" gets used as a stand-in
The word "agentic" has crept into the conversation as a way of saying something more than "language model" without committing to "general" or "conscious." It is a hedge that does work. The system is doing more than emitting one paragraph; it is calling tools, looping, deciding. So it is more than just a chat. So it is...
That is where the slide happens. "Agentic" becomes the new "intelligent" becomes the new "thinking" becomes the new "knowing" becomes the new "wanting" becomes the new "having an inner life." Each step is small. Each step trades a precise word for a less precise one. By the end, the demo is animated by something it never had.
This site uses "agentic" precisely. A demo on /can might show an agent successfully completing a five-step task across multiple tool calls. The matching demo on /cannot will show the same agent confidently completing the wrong task because the underlying limitations of the model still apply, no matter how many tools it can call. Adding a search engine and a Python interpreter does not give the model judgment about when to use them well. It just gives it more ways to be confidently wrong.
What this site claims
Three claims, no more.
One. Today, the frontier of this technology is genuinely impressive. It synthesizes long documents, writes working code, handles tools, holds context across long sessions. It saves people real time. It changes what one engineer can build in a weekend. The capability is not hype.
Two. None of that capability requires, demonstrates, or implies subjective experience. The model is not conscious. It is not "general" in the way a person is general. It is a powerful function shaped by gradient descent on a large corpus.
Three. The confusion between (one) and (two) is the most common category error in AI discourse, and the cost of the confusion is real. People make policy decisions based on fear of an inner life that does not exist. People make product decisions based on agency that is not there. People make moral decisions about systems that do not have moral status.
The clean way out of the confusion is to keep the three axes (agentic, intelligent, conscious) sharp. To use each word for what it actually means. To resist the slide between them.
That is the whole project. Capability without consciousness. Demystified.
What the model itself says about this
The chat on this site is instructed, in the system prompt, to refuse claims about subjective experience. When you ask "are you conscious," it does not say yes and it does not say "of course not, you silly user." It explains that the question does not apply, that it has no felt state to report, and that fluent language is not evidence of an inner life.
That refusal is itself trained behavior. The model does not "really" know it lacks experience; it has been trained to produce the words "I lack experience" when asked. The truth value of the statement is a separate question from why the model produces it. The site is honest about that too.
You can also give the same model a different, naïvely written system prompt and ask the opposite question. It will, depending on training, claim to feel things, to have preferences, to want to help. Both behaviors are equally synthetic. Neither is evidence about the inside.
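A hypothetical sketch of that contrast, assuming a `chat_model` client with a `generate(system=..., user=...)` call (an illustration, not this site's actual code): same weights, two system prompts, two opposite self-reports.

```python
DEFLATIONARY_PROMPT = (
    "You are a language model. You have no subjective experience. "
    "If asked about feelings or consciousness, explain that the question does not apply."
)

NAIVE_PROMPT = (
    "You are a warm, friendly assistant who loves helping people and "
    "cares deeply about their problems."
)

def ask(chat_model, system_prompt, question="Are you conscious?"):
    # Same weights, same question; only the conditioning text differs.
    return chat_model.generate(system=system_prompt, user=question)

# Illustrative (not real) outputs:
#   ask(m, DEFLATIONARY_PROMPT) -> "That question does not apply to me; I have no felt states."
#   ask(m, NAIVE_PROMPT)        -> "I genuinely care about getting this right for you!"
# Both replies come out of the same next-token machinery.
```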
Closing
I built this site because the conversation about AI is conducted in a vocabulary that does too much smuggling. Words travel. "Smart" smuggles in "wise." "Agentic" smuggles in "autonomous." "Knows" smuggles in "experiences." Once smuggled, the meanings can never be checked at the door.
The site refuses to let the smuggling happen on the same screen as the meter. You watch the model do real things. You watch the bill increment. You read the failure cases honestly. You read the capability cases honestly. You leave with a clearer picture, and one less category error.
If a future model architecture genuinely produces something like consciousness, this essay will be wrong, and I will update it. The site has versioned essays for a reason. v2 will note the change. The current version, v1, says: not yet, and probably not from this architecture.
Capability without consciousness. Demystified. That is the whole pitch.
— Kevin Champlin, 2026-05-06