The math on artificial intelligence agents doesn’t add up

✨ Explore this trending post from WIRED 📖

📂 **Category**: Business,Business / Tech Culture,Backchannel

The big artificial intelligence companies promised us that 2025 would be “the year of the AI agents.” It turned out to be the year of talking about AI agents, and of pushing that transformative moment back to 2026 or perhaps later. But what if the answer to the question “When will our lives become fully automated, with generative AI robots doing our tasks for us and essentially running the world?” is, like that New Yorker cartoon, “How about never?”

That was basically the message of a research paper published without much fanfare a few months ago, in the middle of a hyped-up year for “agentic AI.” Titled “Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models,” it aims to show mathematically that “LLMs are unable to implement computational and agentic tasks beyond a certain complexity.” Although the science is beyond my abilities, the authors (a former CTO at SAP who studied artificial intelligence under one of the field’s founding minds, John McCarthy, and his teenage prodigy son) have punctured the vision of an agentic paradise with mathematical certainty. Even reasoning models, which go beyond the pure word-prediction process of LLMs, won’t solve the problem, they say.

“There is no reliable way,” Vishal Sikka, the father, told me. After a career that included, in addition to SAP, a stint as CEO of Infosys and a seat on Oracle’s board, he currently heads an AI services startup called Vianai. “So we should forget about AI agents running nuclear power plants?” I ask. “Exactly,” he says. Maybe you can have an agent file some paperwork or something to save time, but you may have to resign yourself to some mistakes.

But the AI industry disagrees. For one thing, the big success for AI agents has been coding, which took off last year. Just this week in Davos, Demis Hassabis, Google’s Nobel Prize-winning head of AI, announced breakthroughs in reducing hallucinations, and hyperscalers and startups alike are pushing the agent narrative. Now they have some backup. A startup called Harmonic has announced a major breakthrough in AI coding that is also based on mathematics, and that tops benchmarks for reliability.

Harmonic, co-founded by Robinhood CEO Vlad Tenev and Tudor Achim, a Stanford-trained mathematician, claims that this latest improvement to its product, called Aristotle (no hubris there!), is an indication that there are ways to ensure the trustworthiness of AI systems. “Are we doomed to be in a world where AI generates waste and humans can’t really verify it? That would be a crazy world,” Achim says. Harmonic’s solution is to use formal methods of mathematical reasoning to verify the outputs of the LLM. Specifically, it encodes the output in the Lean programming language, which is known for its ability to verify code. Harmonic’s focus to date has certainly been narrow: its main mission is the pursuit of “mathematical superintelligence,” and coding is a fairly organic extension. Things like history essays, which cannot be verified mathematically, are beyond its boundaries. For now.

However, Achim doesn’t seem to think reliable agent behavior is as much of a problem as some critics believe. “I would say that most models at this point have the level of pure intelligence required to think through booking an itinerary,” he says.

Both sides are right, and may even be on the same side. For one thing, everyone agrees that hallucinations will remain an uncomfortable reality. In a paper published last September, OpenAI scientists wrote: “Despite significant progress, hallucinations continue to plague the field, and are still present in state-of-the-art models.” They substantiated this unhappy claim by asking three models, including ChatGPT, to provide the title of the lead author’s dissertation. All three made up false titles, and all misreported the year of publication. In a blog post about the research, OpenAI stated bleakly that in AI models, “accuracy will never reach 100 percent.”


🕒 **Posted on**: 1769405964
