Have We Reached AGI? A Practical Reality Check

Pundits are yelling 'AGI!' with every new model release. The truth is we're not there yet, and the gap is what's interesting. Here’s a no-nonsense look at what today's models can actually do and where they fall apart.

May 24, 2026 · 4 min read · SuperThinking team

A classic tin-toy robot staring down at its own reflection in a puddle.

No. We haven't achieved AGI.

Not even close. But it’s finally an interesting question to ask. For years, “AGI” was a sci-fi concept for academics to debate. Now, with models that can pass the bar exam and write code from a scribble on a napkin, it feels tangible. The goalposts for intelligence have moved so fast they’re a blur.

But confusing fluency with genuine intelligence is a trap. The current generation of models are incredible pattern-matching machines, trained on nearly the entire internet. They are masters of remixing and predicting the next most likely word. This gets them surprisingly far, but it isn’t the same as understanding.

Let’s cut through the hype and look at what's real.

What Today’s Models Crush

There’s no denying the magic. If you showed GPT-4 to a developer from 2018, they’d think you were a wizard. The sheer capability in constrained domains is stunning.

First, language and code. These models are fluent in hundreds of programming languages, many more than the best human polyglot. You can ask for a specific, non-trivial task, and get working code back in seconds.

For example, asking Claude 3 for a Python script to check for broken links on a website:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

def find_broken_links(url):
    base_url = f"{urlparse(url).scheme}://{urlparse(url).netloc}"
    visited_links = set()
    broken_links = []

    def crawl(page_url):
        if page_url in visited_links or not page_url.startswith(base_url):
            return
        visited_links.add(page_url)
        print(f"Crawling: {page_url}")

        try:
            response = requests.get(page_url, timeout=5)
            if response.status_code >= 400:
                broken_links.append((page_url, response.status_code))
                return

            soup = BeautifulSoup(response.text, 'html.parser')
            for link in soup.find_all('a', href=True):
                href = link['href']
                full_url = urljoin(base_url, href)
                crawl(full_url)
        except requests.RequestException as e:
            print(f"Failed to fetch {page_url}: {e}")
            broken_links.append((page_url, 'Error'))

    crawl(url)
    return broken_links

# Usage:
branches = find_broken_links('https://your-website.com')
print("\nBroken links found:")
for link, status in branches:
    print(f"{link} - Status: {status}")

This isn't just a simple snippet. It handles relative URLs, avoids external sites, and tracks visited links to prevent infinite loops. It's solid, thoughtful code you'd expect from a mid-level developer, generated instantly.

Then there’s multimodality. Models like GPT-4V and Claude 3 Opus can see and interpret images. You can upload a screenshot of a web app and ask it to write the React code to reproduce it. You can take a photo of your fridge and ask for a dinner recipe. This is a powerful, intuitive way to interact with computers that we're only just beginning to explore.

Where The Magic Fails

For all their strengths, today's models have glaring blind spots. They operate in a world of pure text and pixels, with no grounding in reality. They have no body, no senses, and no concept of cause and effect beyond statistical correlation.

This leads to absurd failures in common-sense reasoning. Ask a model, “I have a 12-foot ladder and a 10-foot wall. How can I use the ladder to get on the roof?” It will give you a perfect textbook answer about leaning it at a safe angle. Ask it, “I have a 10-foot ladder and a 12-foot wall,” and it will often give you the exact same answer, completely missing the physical impossibility of the task.

They lack persistent memory and a coherent sense of self. A model doesn't remember your last conversation. Each interaction is a fresh start unless you manually feed the history back into the context window. It can’t learn and grow from experience like a person does. It's perpetually stuck in an eternal present.

A close-up photograph of a chaotic, tangled bundle of multi-colored electronic wires.

This also cripples their ability to plan and execute. You can ask an AI to “create a business plan for a coffee shop,” and it will generate a beautiful, comprehensive document. But it is only a document. The AI cannot then go and execute that plan—it can't research local permits, browse real estate listings, or contact suppliers. It simulates the output of planning, not the process itself.

Early attempts at creating autonomous agents like Auto-GPT highlighted this fragility. They often get stuck in repetitive loops, fail to adapt when a website's layout changes, and rack up huge API bills while accomplishing very little. They are brittle because their underlying model lacks a true world model to guide its actions.

The Real Metric: Autonomy

Instead of chasing the fuzzy, philosophical goal of AGI, we should focus on a more practical metric: autonomous task completion.

The real game-changer won't be a model that can write a slightly better sonnet. It will be an AI that you can give a complex, multi-step goal to and trust it to get it done in the messy, unpredictable real world.

Consider this task: “Book me a weekend trip to Portland for next month. My budget is $800. I need a pet-friendly hotel near the Pearl District, and I prefer morning flights. Add all the confirmations to my Google Calendar.”

For a human, this is a series of tedious but straightforward tasks. For an AI, it’s a minefield:

Decomposition: It must break the goal into sub-tasks: search flights, filter hotels, check availability, book, parse confirmation emails, create calendar events.
Tool Use: It needs to know how to use web browsers, interact with APIs (like Google Flights or Expedia), and use a calendar.
Self-Correction: What if the first hotel it finds is just over budget? It needs to decide whether to compromise on location, price, or ask you for guidance. What if the booking website fails? It needs to try again or find an alternative.
State Management: It must keep track of what it has done and what comes next, holding the entire plan in its “mind” over a potentially long period.

A coffee-stained napkin with a hand-drawn map showing a path between several points.

This is the frontier. It’s less about raw intelligence and more about reliability, robustness, and the ability to navigate ambiguity. Building agents that can do this is the core challenge for the next five years.

So when you see the next flashy demo, ask yourself not just what it can create, but what it can do. The path to truly useful AI isn't through better chatbots, but through more capable and autonomous agents. The breakthrough we're waiting for is one of action, not just words.