There’s been a simple question going around AI circles in recent months: “What’s next?”
Since the launch of OpenAI’s GPT-4 in March last year, each passing month has made it clear that the ChatGPT-maker’s top model set a performance bar that has been curiously difficult for competitors to clear.
Big Tech giants like Google and Meta have released rival models like Gemini and Llama, which have proved competitive at best. So, too, have younger startups like Anthropic and Mistral. None of them quite seemed to introduce a step-change in capabilities the way GPT-4 did, though.
Gary Marcus, a professor examining AI, suggested last month that this was a sign AI models were reaching a “point of diminishing returns.”
Questions have also emerged about whether we’ve already seen the limits of what AI can do.
While ChatGPT has shown its usefulness everywhere from the workplace to the classroom, a technology touted by Bill Gates as something “entire industries will reorient around” should probably have more to show for it than chatbots prone to hallucinations.
AI companies seem ready to end that chatter, though; this past week, they have tried to address the “what’s next?” question head-on.
OpenAI and Google present their vision
OpenAI kicked off a week of showcases by revealing a powerful new model: GPT-4o.
Though it wasn’t quite the GPT-5 many hoped for, the “o” attached to GPT-4 — which stands for “omni” — introduced some serious advances in OpenAI’s technology while hinting at the direction in which it planned to take the AI revolution.
In a presentation on Monday, OpenAI’s chief technology officer, Mira Murati, talked through the new flagship model’s ability to “reason across audio, vision, and text in real time” to create what the company described as a “much more natural human-computer interaction.”
Through several demos, OpenAI introduced a version of ChatGPT powered by GPT-4o reminiscent of Samantha, the AI assistant voiced by Scarlett Johansson in the 2013 movie “Her.”
Nathan Lambert, research scientist at the Allen Institute for AI, wrote on his Interconnects Substack page that the reveal shows OpenAI leading us toward a world where “intelligence, attention, positive feedback,” qualities “fundamentally craved by humans,” come from AI.
“GPT-4o’s demo showcases that we are intentionally marching toward this reality with no shadow of regret,” Lambert wrote. While some have been blown away by the chance to have a slightly flirty-sounding AI by their side at all times, others have been a bit weirded out.
Google took its turn at revealing updates to its AI during the annual I/O event for developers on Tuesday.
The search giant, which has been seen as playing catch-up to OpenAI, showed off its new Project Astra agent, which aims to make AI “truly helpful in everyday life.”
Like the new ChatGPT, Google’s multimodal assistant — powered by its AI model Gemini — has been built to respond to queries in real time. It has vision and audio capabilities, on top of the usual text responses chatbots have become known for.
In practice, that means you can point your camera at a scene and get Astra to pretty quickly identify, say, something that walks on four legs, and then go on to have a conversation with you about said tetrapod.
Google’s announcements didn’t stop there. The company has also shaken up the way its core product works, integrating AI into search so that Google will “do the searching for you.”
Liz Reid, head of Google Search, explained it like this: “Sometimes you want a quick answer, but you don’t have time to piece together all the information you need. Search will do the work for you with AI Overviews.”
Collectively, the reveals from both companies have made one thing clear: the AI revolution is about to feel a lot more real.
Chatbots will start to function more like companions who engage with you in longer periods of dialogue rather than machines that respond to ad hoc requests. Familiar services like Google Search, meanwhile, could actually begin to feel different.
The consequences of all these developments, though, are not yet clear. Social interactions could feel different in the future if people interact more with AI chatbots. Publishers might feel pain if a Google newly powered by AI diverts traffic away from them.
It’s clear, though, that a new phase in the AI era is just getting started.