Field Notes · 2026-04-25
Post 01 of 03
First Principles

AI is not intelligent.
Yet.

It is, however, a phenomenally fast librarian with a photographic memory and absolutely no idea what to do on its day off.

~7 min read · Setting expectations

People walk into AI consults with their eyes shining. They've been told their business is about to be replaced by a sentient toaster, that the singularity is queued behind their morning coffee, and that whoever ships the next chatbot wins the universe.

I love the enthusiasm. I have to gently dismantle most of it.

Here's the unromantic truth: today's frontier AI is not particularly intelligent. What it is, and this part is genuinely impressive, is the fastest, broadest, most ridiculously well-read intern humanity has ever produced. It has read everything. It has understood almost none of it. And the gap between those two things is where my consulting work lives.

The smoking gun: a benchmark a child beats

If you only click one link in this post, click this one: the ARC-AGI leaderboard.

ARC-AGI is the benchmark created by François Chollet specifically to be the thing AI can't brute-force. It's a set of small visual puzzles, grid in, grid out, that require you to spot a pattern from a couple of examples and apply it. A bright eight-year-old can solve most of them in an afternoon. The third generation, ARC-AGI-3, raises the bar further: tiny custom-built 2D games with no instructions. You have to figure out the rules by playing.
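To make "grid in, grid out" concrete, here is a toy sketch in Python. It is not the real ARC-AGI task format or data: the grids, the candidate rules, and the names `infer_rule` and `solve` are all made up for illustration. It looks for one rule that explains every example pair, then applies that rule to a new grid. Real puzzles use transformations far too varied to list in advance like this.

```python
# Toy illustration only: hypothetical grids and a hand-picked list of rules.

def identity(grid):
    return [row[:] for row in grid]

def flip_horizontal(grid):
    # Mirror each row left-to-right.
    return [row[::-1] for row in grid]

def flip_vertical(grid):
    # Reverse the order of the rows.
    return [row[:] for row in grid[::-1]]

def transpose(grid):
    # Swap rows and columns.
    return [list(col) for col in zip(*grid)]

CANDIDATES = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}

def infer_rule(examples):
    """Return the name of the first candidate consistent with every example pair."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in examples):
            return name
    return None

def solve(examples, test_input):
    """Apply the inferred rule to a new grid, or return None if no rule fits."""
    name = infer_rule(examples)
    return CANDIDATES[name](test_input) if name else None

# Two made-up example pairs that both mirror the grid left-to-right.
examples = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0], [0, 4, 0]], [[0, 3, 3], [0, 4, 0]]),
]

print(infer_rule(examples))
print(solve(examples, [[5, 0, 0], [0, 6, 0]]))
```

The sketch only works because the answer happens to be on a list of four rules written ahead of time. A puzzle whose rule is not on the list returns nothing.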

0.6% · Top frontier model · ARC-AGI-3 (interactive)
100% · A motivated human · same benchmark

That's the entire industry's current ceiling on a task a primary-schooler shrugs off. Six tenths of one percent. The models that write your sales emails, draft your contracts, and refactor your codebase cannot work out the rules of a game they haven't seen before, and working out unfamiliar rules is, depending on how you squint, the actual definition of intelligence.

ARC-AGI is the only formal benchmark that has resisted brute-force memorisation since 2019. The test puzzles are kept private. There is no Stack Overflow answer to copy. There is no Reddit thread to scrape. The model has to think, and that turns out to be the part it's worst at.

So what is AI actually doing?

Stripped of the marketing, a large language model is doing one thing astonishingly well: compressing the internet into a probability function. You give it some text, and it produces the statistically most plausible continuation, conditioned on every byte of human writing it ate during training.
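For the curious, here is the "most plausible continuation" idea shrunk to a toy. This is a minimal Python sketch, assuming a word-level bigram counter and a made-up one-line corpus. It is nothing like how a production model is built (those use neural networks over tokens, not lookup tables), and the function names are invented for illustration. The basic move is the same, though: turn observed text into probabilities for what comes next.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Turn raw follow-counts into a probability distribution over next words."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

def most_plausible_next(counts, word):
    """Pick the single most likely continuation, or None for an unseen word."""
    probs = next_word_probs(counts, word)
    return max(probs, key=probs.get) if probs else None

# Hypothetical training text, standing in for "the internet".
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)

print(next_word_probs(model, "the"))      # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(most_plausible_next(model, "the"))  # cat
print(most_plausible_next(model, "dog"))  # None
```

Note what happens with "dog": the toy has never seen the word, so it has nothing to say. Scale hides that failure in real systems but does not remove it.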

That sounds reductive. It is reductive. It's also why these systems work. Almost every model you've used since ChatGPT launched is the same core idea trained on more data with more compute. The current race is a scaling race, not an intelligence race, and so far nothing has come along that genuinely changes that.

What you're paying for when you call GPT-5 or Claude Opus 4.7 is:

What you are not paying for:

Why this is good news for your business

If frontier models were genuinely intelligent, you wouldn't be reading this. You'd be reading a redundancy notice.

Because they aren't, the value you extract from AI is almost entirely bottlenecked by the human pointing it at the right problems. Three things that don't bottleneck you anymore:

What does bottleneck you:

This is the part nobody putting up billboards about "AI transformation" wants to say out loud, so I will: the most expensive component of any working AI deployment is still the human pilot. The model is the engine. The pilot decides where the plane goes, when to pull up, and whether the runway is even pointing in the right direction.

The honest test for "real" intelligence

I've adopted a personal benchmark, and you can borrow it: I'll start calling AI "intelligent" when a frontier model crosses 5% on ARC-AGI-3 without bespoke fine-tuning. Not 100%, 5%. The threshold where the system has demonstrably learned something from scratch on a problem it has never seen.

Until then, what we have is a sublime statistical mimic. Worth every cent. Just not the thing the press release said it was.

Where I come in

If your business is going to get value out of AI in 2026, it will be because someone, internal or external, knows the difference between the parts of the system that are genuinely capable and the parts that are theatre. That's the work I do. Two-hour proving ground, no retainer, no PowerPoint. We pick a real bottleneck, we build something that works, and you keep it.

If that sounds like the conversation you've been meaning to have, drop me a line.
