Is This AGI? Let's Get Real About Today's AI

Today's AI can write code, pass the bar exam, and even sound empathetic. But it still fails at basic reasoning. We break down what the latest models can actually do and where they fall apart.

June 2, 2026 · 4 min read · SuperThinking team

A curious robot stands before a puddle, its metallic head tilted in confusion.

Everyone is asking if we've reached AGI. It’s the wrong question.

The real question is: what can these tools actually do, and where do they consistently and hilariously fall flat on their faces? The demos are slick. You see GPT-4o reading emotions from a video feed or solving math problems in real time. It feels like magic.

Then you ask it to plan a simple trip with a few constraints, and it confidently books you a flight that lands after your hotel check-in closes. Or you ask it a simple logic puzzle and watch it tie itself in knots.

This is the paradox of modern AI. It can ace the bar exam, yet it can still flub a question as basic as which of two boxes is heavier when one holds a rock and the other a feather. So let's skip the philosophical debate and look at what's really happening under the hood.

The "Wow" Moments Are Real

Let's be clear: the latest models are astonishingly capable. They are not just souped-up autocomplete. We've moved from generating plausible text to performing genuine cognitive labor, as long as that labor is well-defined and lives in the digital world.

Just a couple of years ago, getting an AI to write a simple Python script felt like a breakthrough. Now, you can give it a blurry photo of a web app's UI scribbled on a napkin, and it will generate the React and Tailwind code to build it. That's not a small leap.

Multi-modality is the big story right now. You can talk to it, show it things, and have it respond in a natural voice. This makes the interaction feel incredibly fluid and, dare I say, human. I fed the Claude 3.5 Sonnet model a 10,000-word technical paper, and it produced a solid five-bullet summary in about eight seconds. Cost? Less than a penny.

These are powerful force multipliers for specific tasks:

Drafting: Emails, blog posts, marketing copy, legal documents. It provides a C+ draft instantly, letting you focus on editing and refining, not staring at a blank page.
Coding: Writing boilerplate, generating unit tests, explaining foreign codebases, and debugging weird errors. It's like a patient, infinitely available junior dev.
Summarization: Condensing long articles, transcripts, and reports into key takeaways.
Brainstorming: Generating ideas for names, taglines, or different angles on a problem.

This is why people are whispering about AGI. When a machine can do this much white-collar work, it starts to feel like a new kind of intelligence.

A computer monitor is completely covered in a chaotic arrangement of colorful sticky notes.

Where It All Breaks Down

But it's a fragile intelligence. It's a mile wide and an inch deep. The moment a task requires genuine understanding, common sense, or persistent state, the illusion shatters.

These models are masters of syntax and pattern matching, not semantics and causation. They know what word is likely to follow another based on trillions of examples, but they don't know what anything means. This leads to a few classic failure modes.

First, there's the reasoning problem. Ask a simple riddle: "I have two coins that total 30 cents, and one of them is not a nickel. What are the two coins?" Many powerful models will stumble and say it's impossible. They read "one of them is not a nickel" as "neither coin is a nickel." The answer, of course, is a quarter and a nickel: the quarter is the one that is not a nickel, and the other coin is.
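The gap between the two readings is easy to see if you brute-force the riddle. Here is a minimal sketch, assuming only common US coins are in play; the function names are made up for illustration.

```python
from itertools import combinations_with_replacement

# Assumption: the riddle is about common US coins (values in cents).
COINS = {"penny": 1, "nickel": 5, "dime": 10, "quarter": 25, "half dollar": 50}

def pairs_totaling(total_cents):
    """Every unordered pair of coins whose values add up to total_cents."""
    return [
        pair
        for pair in combinations_with_replacement(COINS, 2)
        if COINS[pair[0]] + COINS[pair[1]] == total_cents
    ]

def at_least_one_not_nickel(pair):
    """What the riddle actually says: one of the coins is not a nickel."""
    return any(coin != "nickel" for coin in pair)

def neither_is_nickel(pair):
    """The misreading: the constraint is applied to both coins."""
    return all(coin != "nickel" for coin in pair)

candidates = pairs_totaling(30)
correct_reading = [p for p in candidates if at_least_one_not_nickel(p)]
misreading = [p for p in candidates if neither_is_nickel(p)]

print(correct_reading)  # [('nickel', 'quarter')]
print(misreading)       # []
```

Under the literal reading there is exactly one answer. Under the misreading there is none, which is why a model that slips into it declares the puzzle impossible.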

Second, they have no persistent memory or world model. Each call to the model is stateless: it sees only what is in the current prompt, so anything it should remember has to be sent again. This is why building autonomous agents is so hard. An AI agent tasked with "researching the best coffee makers and ordering one" will often get stuck in a loop, re-reading the same review page or forgetting the budget you gave it three steps ago. We use complex prompt engineering and external memory stores to fake persistence, but it's a brittle workaround.
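To make the "external memory store" idea concrete, here is a rough sketch of the pattern, not any particular framework's implementation. The `AgentMemory` class and its methods are hypothetical, and no real model API is called: the point is only that the budget and the list of visited pages live outside the model and get re-injected into every prompt.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Hypothetical external store that outlives any single model call."""
    constraints: dict = field(default_factory=dict)
    visited: set = field(default_factory=set)

    def remember(self, key, value):
        """Record a constraint the agent must not forget, e.g. the budget."""
        self.constraints[key] = value

    def mark_visited(self, url):
        """Return True if the page is new, False if it was already read."""
        if url in self.visited:
            return False
        self.visited.add(url)
        return True

    def build_prompt(self, step):
        """Re-send everything the stateless model needs for this step."""
        facts = "; ".join(f"{k}={v}" for k, v in sorted(self.constraints.items()))
        return (
            f"Known constraints: {facts}\n"
            f"Pages already read: {len(self.visited)}\n"
            f"Next step: {step}"
        )

memory = AgentMemory()
memory.remember("budget_usd", 150)
first_visit = memory.mark_visited("https://example.com/reviews/coffee-makers")
second_visit = memory.mark_visited("https://example.com/reviews/coffee-makers")
prompt = memory.build_prompt("compare the top three coffee makers")
```

The brittleness is visible even here: the agent stays on budget only if every prompt is built through `build_prompt`, and it avoids loops only if every page fetch goes through `mark_visited`.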

Third, they are ungrounded. They have no concept of the physical world. They can tell you the boiling point of water, but they don't understand what "wet" feels like or why dropping a glass on the floor causes it to break. This is why we're not seeing AI-powered robots doing complex chores in our homes yet. Navigating the messy, unpredictable physical world requires an understanding that text-based training can't provide.

A technical blueprint is spread out over a workbench cluttered with tools and parts.

A Skilled Intern, Not a CEO

So, what's the right way to think about these tools? The best metaphor I've found is a brilliant, incredibly fast, and slightly unreliable intern.

This intern can draft reports, write code, and research topics faster than any human. But you can't trust their work without reviewing it. You can't give them a vague, multi-step project and expect them to manage it to completion. And you definitely wouldn't let them make a final, critical decision.

Use AI for what it's good at:

Acceleration: Turn a 10-hour task into a 2-hour task.
Scaffolding: Get the basic structure of a project in place quickly.
Translation: Convert information from one format to another (e.g., code to documentation, transcript to summary).

Don't use it for:

Strategy: High-level planning that requires deep domain knowledge.
Accountability: Anything where a mistake has serious consequences.
Autonomy: Fire-and-forget tasks that require long-term goals.

The AGI conversation is a distraction. It leads us to think in terms of replacing humans instead of augmenting them. The real work isn't about building a god-in-a-box, but about designing robust systems that use AI as a powerful component, with humans firmly in the loop.
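What "AI as a component, with humans in the loop" can look like in practice: a minimal sketch, assuming the model's output arrives as a structured proposal. All names here are hypothetical, and the `human_approves` callback stands in for whatever real review step you use. A deterministic check catches the hotel check-in mistake before a person ever sees the proposal, and nothing is approved without a human saying yes.

```python
from datetime import datetime

def check_itinerary(flight_lands, checkin_closes):
    """Deterministic check: return a list of constraint violations."""
    problems = []
    if flight_lands > checkin_closes:
        problems.append("flight lands after hotel check-in closes")
    return problems

def review_gate(proposal, problems, human_approves):
    """Reject on any failed check; otherwise defer to a human reviewer."""
    if problems:
        return "rejected: " + "; ".join(problems)
    if human_approves(proposal):
        return "approved"
    return "rejected: reviewer declined"

# A proposal like the one an AI planner might produce (made-up values).
proposal = {
    "flight_lands": datetime(2026, 6, 2, 23, 30),
    "checkin_closes": datetime(2026, 6, 2, 22, 0),
}
problems = check_itinerary(proposal["flight_lands"], proposal["checkin_closes"])
result = review_gate(proposal, problems, human_approves=lambda p: True)
print(result)  # rejected: flight lands after hotel check-in closes
```

The model drafts, plain code verifies what can be verified, and a person makes the final call.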

Forget AGI. Focus on building tools that are predictably useful, not vaguely intelligent.