Sunday, November 3, 2024

Some questions about Pixel Screenshots: Gemini integration, Ask Photos overlap?


From what has leaked, Google is positioning Pixel Screenshots alongside Gemini and Circle to Search as the Pixel 9’s biggest assistive features. 




What we know so far 

According to leaked official advertising, “Pixel Screenshots helps you save info that you want to remember later – like events, places and more. So you can find what you need, right when you need it.” Meanwhile, Google’s teasers positioned it as solving the problems of:

  • Forgetting what restaurant your friend liked 
  • Forgetting the movie your friend recommended
  • Forgetting the show your friend recommended

When “AI processing” is enabled, according to the set-up experience, Pixel Screenshots will “use AI to summarize your new and existing screenshots and answer your questions about the information in them.” Existing captures are presumably those transferred from your old phone. Meanwhile, new screenshots taken after this feature is turned on will “save metadata like web links, app names, and when the screenshot was taken.”

Notably, you can use Pixel Screenshots without AI processing and presumably have the app serve as a faster, more dedicated way to access screenshots than Google Photos. We also see what could be the app’s icon: three blue screenshots.

Design-wise, you get an app that shows a grid of screenshots, with some featuring the overlaid text/AI summaries: “Don’t forget to buy milk at the shop,” “Book vet appointment,” “Days trip idea,” and “WiFi Password for new apartment.” Most captures are badged with the responsible app icon in the bottom-right corner, like Google Photos, Messages, Google Search, and Maps. Others, however, do not note the app.

Material You is leveraged throughout, with new vertical pill-shaped buttons for back (?) and what appear to be grid options up top. There’s a floating “Search Screenshots” bar at the bottom with voice input and a FAB. The FAB is presumably for adding non-screenshots, like images you download from the web, for analysis.

You can search in a conversational manner with the results page showing the screenshot with your desired information. Notably, it’s branded with Gemini’s sparkle in the top-right corner. 

What’s the Gemini connection?

I assume Google is leveraging the on-device Gemini Nano with Multimodality teased at I/O 2024 to understand what’s happening in a screenshot, like the included media and any surrounding text that appears.

However, does Pixel Screenshots in any way integrate with the Gemini app? For example, can you ask Gemini via its panel or “Hey Google” for information from screen captures, or is Pixel Screenshots siloed?

A walled-off experience might make sense for privacy, and the manual nature of captures already puts it ahead of Microsoft’s Recall, but that doesn’t feel like a cohesive experience.

Ask Photos competitor?

The other question I have is whether the functionality of Pixel Screenshots competes with the Gemini-powered Ask Photos. That Google Photos feature might be more focused on actual photographs you take, but all your screenshots appear there already. It would be weird for that not to be leveraged to learn more about your world and therefore answer more questions about it.

There seems to be a lot of overlap between what the two are ultimately trying to do: surface your information in a conversational manner. Additionally, I’m curious whether cloud processing allows for more in-depth analysis than the on-device processing of Pixel Screenshots.

