If you tuned in to Google I/O, OpenAI’s Spring Update, or Microsoft Build this month, you probably heard the term AI agents come up quite a lot. They’re quickly becoming the next big thing in tech, but what exactly are they? And why is everyone talking about them all of a sudden?
Onstage at Google I/O, Google CEO Sundar Pichai described an artificial intelligence system that could return a pair of shoes on your behalf. Microsoft announced Copilot AI systems that can act independently, like virtual employees. Meanwhile, OpenAI unveiled an AI model, GPT-4o (the “o” stands for “omni”), that can see, hear, and talk. Before that, OpenAI CEO Sam Altman told MIT Technology Review that helpful agents hold the technology’s greatest potential. These systems are the new benchmark every AI company is chasing, but that’s easier said than done.
Simply put, AI agents are AI models that do something independently. Think Jarvis from Iron Man, TARS from Interstellar, or HAL 9000 from 2001: A Space Odyssey. They go a step further than the chatbots we’ve become familiar with, which just generate a response – agents take action. To start, Google, Microsoft, and OpenAI are trying to develop agents that can tackle digital tasks. That means teaching AI agents to work with the various APIs on your computer. Ideally, they can press buttons, make decisions, autonomously monitor channels, and send requests.
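To make that concrete, here is a rough sketch of the loop most agent systems are built around: a language model reads the task, picks an action from a set of available tools (API calls), the program executes it, and the result is fed back to the model until it decides the task is done. Everything here is hypothetical – `call_model` is a toy stand-in for a real LLM call, and none of the function names reflect any particular company’s API.

```python
import json

# Toy "tools" an agent could call. A real agent would hit actual APIs here.
def search_web(query: str) -> str:
    return f"Top result for '{query}': example.com"

def send_email(to: str, body: str) -> str:
    return f"Email sent to {to}"

TOOLS = {"search_web": search_web, "send_email": send_email}

def call_model(history: list[dict]) -> dict:
    """Stand-in for a language-model call. A real agent would send the whole
    conversation to an LLM and parse the action it picks; this stub just
    searches once and then declares the task finished."""
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "search_web", "args": {"query": "shoe return policy"}}
    return {"done": "Looked up the return policy and drafted the request."}

def run_agent(task: str, max_steps: int = 10) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):            # cap the loop so the agent can't run forever
        decision = call_model(history)
        if "done" in decision:            # the model decided the task is complete
            return decision["done"]
        result = TOOLS[decision["tool"]](**decision["args"])  # take the chosen action
        history.append({"role": "tool", "content": json.dumps({"result": result})})
    return "Gave up after too many steps."

print(run_agent("Find out how to return these shoes"))
```

The hard part isn’t the loop itself – it’s getting the model to pick the right action, with the right details, every single step.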
“I agree that the future is agents,” said Echo AI founder and CEO Alexander Kvamme. His company builds AI agents that analyze a business’ conversations with customers and deliver insights on how to improve that experience. “The industry’s been talking about it for years and it hasn’t materialized yet. It’s just such a hard problem.”
Kvamme says a truly agentic system needs to make dozens or hundreds of decisions independently, which is a hard thing to automate. To return a pair of shoes, for example, as Google’s Pichai described, an AI agent may have to scan your email for a receipt, pull out your order number and address, fill out a return form, and take various other actions on your behalf. That process is full of decisions you don’t even think about but are making subconsciously.
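To see how many of those hidden decisions there are, here is a hypothetical breakdown of just the shoe-return flow Pichai described, with the points where an agent either has to choose correctly or stop and ask. Nothing here reflects Google’s actual system; every function and field name is an assumption for illustration.

```python
# Hypothetical sketch: the "return a pair of shoes" task as the small
# decisions an agent has to get right. All names are illustrative only.

def return_shoes(inbox: list[dict], start_return, submit_return, ask_user) -> str:
    # Decision 1: which email is actually the receipt for the right order?
    candidates = [m for m in inbox
                  if "receipt" in m["subject"].lower() and "shoe" in m["body"].lower()]
    if len(candidates) != 1:
        order = ask_user("Which of these orders should I return?", candidates)
    else:
        order = candidates[0]

    # Decision 2: pull the order number and address and fill out the return form.
    form = start_return(order_number=order["order_number"],
                        address=order["shipping_address"],
                        reason="wrong size")

    # Decision 3: the irreversible step. A cautious agent confirms before acting.
    if ask_user(f"Submit this return for order {order['order_number']}?", form):
        return submit_return(form)
    return "Return cancelled."
```

Each of those steps is trivial for a person and a potential failure point for a model acting on its own.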
As we’ve seen, large language models (LLMs) are not perfect even in controlled environments. Altman’s new favorite thing is calling ChatGPT “incredibly dumb,” and he’s not exactly wrong. When you ask LLMs to work independently out on the open internet, they’re prone to mistakes. Making them reliable enough to act on their own is what countless startups, including Echo AI, are working on, alongside larger companies like Google, OpenAI, and Microsoft.
If you can create agents that work digitally, there’s not much of a barrier to creating agents that work in the physical world as well. You just have to hand the task off to a robot. Then you really get into the stuff of science fiction, as AI agents offer the potential to assign robots a task like “take that table’s order” or “install all the shingles on this roof.” We’re a long way from there, but the first step is teaching AI agents to do simple digital tasks.
There’s an oft-discussed problem in the world of AI agents: making sure an agent doesn’t do its task too well. If you built an agent to return shoes, you’d have to make sure it doesn’t return all your shoes, or perhaps everything you have a receipt for in your Gmail inbox. It sounds silly, but a small and loud cohort of AI researchers worries that overly determined AI agents could spell doom for human civilization. I suppose when you’re building the stuff of science fiction, that’s a valid concern.
On the other side of the spectrum are optimists, like Echo AI, who believe this technology will be empowering. The divergence in the AI community is stark, but the optimists see AI agents having a liberating effect comparable to that of the personal computer.
“I’m a big believer that a lot of the work that [agents] are going to solve is work that humans would prefer not to do,” Kvamme said. “And there’s higher value use for their time in their life. But again, they have to adapt.”
Another use case for AI agents is self-driving cars. Tesla and Waymo are currently the frontrunners in this technology, in which cars use AI to navigate city streets and highways. Though niche, self-driving is a fairly developed area of AI agents, and one where we’re already seeing AI operate in the real world.
So, what is going to get us to this future where AI can return your shoes? First, the underlying AI models likely have to get better and more accurate, which means updates to ChatGPT, Gemini, and Copilot will probably precede fully functioning agent systems. AI chatbots also still have to get past their huge hallucination problem, which many researchers don’t see a clear solution to. And the agent systems themselves need work: OpenAI’s GPT Store is currently the most fleshed-out effort to develop a network of agents, but even that is not very advanced just yet.
While advanced AI agents are definitely not here yet, they’re the goal for many AI companies nowadays, large and small. They could be the thing that makes AI significantly more useful in our everyday lives. It sounds like science fiction, but billions of dollars are being spent to make agents a reality in our lifetime. Still, it’s a tall order for AI companies that have struggled to get chatbots to reliably answer basic questions.