Sunday, December 22, 2024

‘Project Jarvis’ leak highlights Google Gemini 2.0’s superpower


Key Takeaways

  • A new report has shed light on Google’s upcoming AI agent codenamed “Project Jarvis.”
  • Project Jarvis can reportedly automate web-based tasks, such as booking flights, and will be powered by Gemini 2.0.
  • Google is expected to make Project Jarvis official in December, but its rollout could be limited to a few people for testing purposes.



Google has lofty ambitions for AI, as is evident from the numerous tweaks and upgrades it has made to the Gemini chatbot over the past few months. During the I/O developer conference in May, the company briefly spoke about a “universal AI agent helpful in everyday life” and said that some of this agent’s functionality could land in Gemini this year. A fresh report over the weekend has revealed new details about Google’s plans for the agent.



According to exclusive reporting by The Information (paywalled), this under-development project, supposedly codenamed Project Jarvis, will use a user’s web browser to perform tasks such as booking flights, researching information, or buying a product (via The Verge). Google plans to introduce Project Jarvis in December, with the experience tailored for Google Chrome, the report claims.

It will be powered by Gemini 2.0, which is expected to land by December, so the timing couldn’t be better. Google wants to roll out this AI agent’s capabilities to a small batch of users initially for testing, so we’re not expecting to find broad access to Jarvis when it’s officially introduced. It’s also worth remembering that the December release timeline is not set in stone, and Google may choose not to show off Jarvis and its capabilities by then, as The Information points out.


So how does it work?

Gemini Live running on the Google Pixel 9 Pro XL


Based on the publication’s reporting, Project Jarvis is designed to “automate everyday, web-based tasks” by continuously capturing screenshots of the screen and interpreting them before taking actions such as clicking a button or typing into a text field. However, the report notes that responses are somewhat slow right now “because the model needs to think for a few seconds before taking each action.” This suggests that Jarvis may not be ready for primetime just yet.
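For readers curious what such a screenshot-then-act loop might look like in practice, here is a minimal Python sketch. To be clear, this is not Google's code: `FakeBrowser`, `choose_action`, and `run_agent` are hypothetical names made up for illustration, the "screenshot" is just a snapshot of toy page state rather than pixels, and a simple rule stands in for whatever Gemini 2.0 would actually do.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    kind: str       # "click" or "type"
    target: str     # label of the button or text field
    text: str = ""  # text to enter; only used by "type" actions


@dataclass
class FakeBrowser:
    """Toy stand-in for a browser tab (hypothetical, not a Chrome API)."""
    fields: dict = field(default_factory=dict)
    clicked: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here we copy the page state.
        return {"fields": dict(self.fields), "clicked": list(self.clicked)}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click":
            self.clicked.append(action.target)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def choose_action(shot: dict, goal: dict) -> Optional[Action]:
    """Hypothetical stand-in for the model: pick one next step, or None when done."""
    for name, value in goal["fields"].items():
        if shot["fields"].get(name) != value:
            return Action("type", name, value)
    if goal["submit"] not in shot["clicked"]:
        return Action("click", goal["submit"])
    return None


def run_agent(browser: FakeBrowser, goal: dict, max_steps: int = 10) -> list:
    """Observe, decide, act; one decision per screenshot."""
    history = []
    for _ in range(max_steps):
        shot = browser.screenshot()          # observe
        action = choose_action(shot, goal)   # decide (the slow step in a real system)
        if action is None:
            break
        browser.apply(action)                # act
        history.append(action)
    return history


if __name__ == "__main__":
    goal = {"fields": {"from": "SFO", "to": "JFK"}, "submit": "Search flights"}
    for step in run_agent(FakeBrowser(), goal):
        print(step.kind, step.target, step.text)
```

In this sketch, every pass through the loop makes exactly one decision per screenshot. If the real agent works along similar lines, a model that needs a few seconds per decision would add that delay to every click and keystroke, which would be consistent with the sluggishness the report describes.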

The publication reportedly spoke with three people who had direct knowledge of the matter, though there are no images or videos available to demonstrate how Jarvis would work. But as our very own Will Sattelberg noted in his coverage of I/O 2024, this AI agent looks like “a functional version of what Humane and Rabbit promised on their dedicated hardware.” With December almost here, we hope to learn more about Project Jarvis and what it can do over the coming weeks.

