Have We Reached AGI? A Brutally Honest Look at Today's AI

Everyone's debating 'sparks of AGI,' but that's the wrong conversation. This is a practical breakdown of what models like GPT-4o can actually do and where they still completely fail.

May 3, 2026 · 4 min read · SuperThinking team

A detailed illustration of a human brain composed of glowing digital circuits.

The question is suddenly everywhere: "Is this AGI?" Every time a new model drops, the debate reignites. We see a demo of an AI smoothly translating a conversation in real-time or generating a website from a napkin sketch, and the hype cycle spins up again.

Let's cut through it. Asking if we've achieved Artificial General Intelligence is the wrong question. It's a fuzzy, sci-fi goalpost that keeps moving. The better question is: what fundamentally new, complex tasks can we accomplish today that were impossible last year?

Because that’s where the real story is. Not in a philosophical debate, but in tangible, world-changing capabilities that are here right now.

The "Sparks of AGI" Argument

There's no denying that today's top models, like OpenAI's GPT-4o or Google's Gemini 1.5, do things that feel like genuine reasoning. They aren't just regurgitating text or identifying cats in photos anymore.

We're talking about sophisticated, multi-step problem-solving. You can give a model a 200-page PDF of API documentation and ask it to write a Python script that uses five different endpoints to achieve a specific goal. And it will, mostly correctly, on the first try. That's not a parlor trick.

This is called chain-of-thought reasoning, where the model "thinks" step-by-step. The latest models are also multi-modal from the ground up. You can show GPT-4o a screenshot of a live graph from a monitoring tool and ask, "What's the likely cause of this spike, given our system architecture?" It can see the image, process your question, and reason about a potential cause, like a database connection pool getting exhausted.

Consider this workflow, which is now trivial:

  1. Take a photo of a whiteboard diagram for a new app feature.
  2. Upload it to the model.
  3. Prompt it with: Generate the React components for this user interface. Use Tailwind CSS for styling and create placeholder functions for the API calls.

Five years ago, this was pure science fiction. Today, it's a Tuesday afternoon. This ability to fluidly move between modalities—image, code, text, audio—and synthesize information across them is what people are calling "sparks of AGI."

A photo of a whiteboard covered in a complex, hand-drawn system diagram.
A photo of a whiteboard covered in a complex, hand-drawn system diagram.

It feels like a form of general intelligence because it's flexible. It's not a narrow AI trained only on one thing. It's a general-purpose problem-solving tool that can be applied to a seemingly infinite number of domains. That's powerful, and it's easy to see why the AGI conversation gets started.

But it's not the whole story.

Where It All Falls Apart

For all their impressive skills, these models have glass jaws. They fail in ways that reveal they aren't truly thinking in the human sense. Their intelligence is a mile wide and, in some places, an inch deep.

First, they have no persistent memory or true learning. Every chat session starts from scratch. A model can't remember a mistake you corrected it on yesterday unless you include that correction in today's context window. It doesn't learn from experience; it only processes the data given to it in the moment. A human toddler learns that touching a hot stove is bad once. An LLM will make the same logical error a million times unless its training data or prompt is updated.

Second, their reasoning is brittle. They are incredible at solving problems that are similar to what they've seen in their training data. But introduce a novel constraint or a slight twist on a classic logic puzzle, and they often break down. They give confident, plausible-sounding answers that are completely wrong. They don't have a robust world model or an understanding of cause and effect. They have a statistical model of what words tend to follow other words.

A close-up photograph of a disorganized and tangled knot of electronic wires.
A close-up photograph of a disorganized and tangled knot of electronic wires.

Third, there's no agency. An LLM has no goals, no intentions, no desires. It isn't trying to solve your problem. It's just completing a sequence of text based on a probabilistic model. This is the biggest philosophical gap. Intelligence, for humans, is intrinsically tied to agency—the desire to achieve a goal. A model is a tool, like a hammer. An incredibly advanced, talkative hammer, but a hammer nonetheless.

Here's a simple test. Ask a model to invent a new color. It will describe something like "a shimmering, reddish-green" by combining concepts it already knows. It cannot have a novel qualia or subjective experience. It's manipulating symbols, not understanding them.

So, What Do We Call This Thing?

If it's not AGI, what is it? I prefer a more descriptive, less loaded term: an Extremely Competent General-Purpose Pattern Matcher.

This isn't meant to diminish the achievement. Building a machine that can find and manipulate patterns across nearly the entire corpus of human knowledge is one of the greatest technical accomplishments in history. It just helps us frame the capabilities correctly.

It excels at tasks that rely on pattern recognition and transformation:

  • Summarizing text
  • Translating languages (and code)
  • Generating variations on a theme (e.g., writing five different marketing emails)
  • Answering questions based on a provided context

Where it fails is where the task requires true reasoning from first principles, long-term planning, or understanding the physical and social world we inhabit. You can trust it to refactor your code, but you can't (and shouldn't) trust it to give you life advice or devise a novel business strategy from scratch.

Forget the AGI label. It's a distraction. Focus instead on mapping the real-world capabilities of these pattern-matching machines to your problems. Use them as powerful assistants, as tireless junior developers, as creative partners. But don't mistake their fluent prose for true understanding.