Vladimir Lenin is cited as having said, “There are decades where nothing happens. And then there are weeks where decades happen.” The previous week was one such week in the world of AI, even by the standards set by OpenAI’s 2022 launch of ChatGPT. It started with a bang on Monday, 13 May, when OpenAI unveiled its new flagship model, GPT-4o.
Tuesday was Google’s turn, with 121 mentions of AI over 110 minutes at its I/O conference. The same day, Ilya Sutskever, OpenAI’s co-founder and chief scientist, who had raised warning flags about safety, left the company. The ripples from each of these events could have a profound impact on AI’s future.
Let’s start with the GPT-4o announcement, for which OpenAI built effortless multimodality into its existing flagship product. The model has its expected share of gee-whiz features: fluid simultaneous translation, an ability to detect human emotion not just from voice but also from facial expressions, and an enhanced ability to write code, among others.
Quickly, people started discovering even more impressive use cases: two GPT-4o AIs talking to each other, personalized step-by-step trigonometry tutoring, and helping a visually impaired man hail a cab in London.
Many people had expected a new GPT-5 or GPT-4.5 model, but to me, this is bigger. The reason is simple. Gartner made an insightful statement about GenAI: “It is not a technology or a trend. It is a profound shift in the way humans and machines interact.” Bill Gates followed it up with, “AI is the new UI.” User interface, that is.
With GPT-4 and its peers, with their text interfaces and laggy voice responses, you could sense you were talking to a machine. With GPT-4o, if you didn’t know it was an AI bot, you would believe you were conversing with a human: one seeing the same things you see, feeling the same emotions you feel and cracking the same jokes your friends do. With GPT-4o, the Sound Turing Test has been passed; the model has moved beyond voice to sound.
The next day, Google picked up the gauntlet OpenAI had thrown down. It made a plethora of impressive announcements, though many were prototypes: Ask Photos, which allows intuitive search through Google Photos; a more powerful and advanced version of its LLM Gemini; an intriguing AI agent that can return products you have bought; and another that alerts you in real time to scam phone calls.
Project Astra impressed onlookers with its ability to recognize code and cities, and even recall where you left your glasses. Google also showed off improved text-to-image, text-to-music and text-to-video generation from its AI tools.
The subtext: basically, there is no Google product that is not going to be baptized with AI. Google Search, with over 2 billion users and 6 million searches a minute, gets a GenAI makeover. Gmail, with 1.8 billion users, gets a strong dose of Vitamin AI. YouTube’s 1.8 billion users can get AI-generated text summaries of the nearly 4 billion videos that the site hosts. Another 4 billion Android users get AI on tap. The list goes on.
Ironically, however, it seems that Google is following the Microsoft playbook here. Microsoft famously had an EEE strategy of ‘Embrace, Extend and Extinguish’: first, it embraced open standards with a compatible product; then it extended that product with proprietary features, which quickly gained dominance through Microsoft’s brute distribution strength and ownership of the PC market; finally, it used this dominance to swamp the market and extinguish competitors. The latest example: Microsoft 365 has 345 million users, 320 million of whom get Teams free, while rival Slack languishes at 39 million.
So OpenAI, the plucky innovator, can launch eye-popping products galore, like ChatGPT, Sora and GPT-4o, but what it lacks is distribution reach. The ChatGPT needle is stuck at 100 million plus users: impressive, but small potatoes compared to Google’s sway over the internet, with billions of users everywhere. Thus, Google does not need to out-innovate OpenAI. It just needs to out-distribute it, and that is precisely what its I/O huddle demonstrated.
While all this was exciting, the canary in the coal mine could be Ilya’s exit from OpenAI. With him went other prominent researchers, and the superalignment team responsible for ensuring that artificial general intelligence (AGI) remains safe has been disbanded. This signals OpenAI’s transition from an idealistic research lab to a capitalist entity driven by shareholder returns. This is where OpenAI and Google are similar: when it started, Google flaunted its “Don’t be evil” motto, but over the years, that got a quiet burial in the graveyard of capitalism.