Thursday, September 19, 2024

Getting started with Google Gemini: The basics of Android’s latest assistant


Google has been pushing Gemini, part artificial intelligence chatbot and part digital assistant, on Android phones for some time. You can now pick whether you’d like to use the Google Assistant or Gemini, but that doesn’t exactly tell the whole story. If you want to use the Gemini app on an Android device, you have to use it as your Android assistant. This requirement makes the choice of whether to use Gemini on your smartphone a tough one. However, there are plenty of cool and exciting use cases for Gemini — and they might just be compelling enough to make you switch.





Gemini is far from the first AI chatbot to come out, and it’s not even the only multimodal interface available on Android. With that said, a lot of new AI users are struggling to figure out exactly how to incorporate all these features into their daily lives. After learning the basics of Google Gemini and how it fits into Android, you’ll be able to get started on your Gemini journey like a seasoned pro.


Setting up Gemini as your main assistant

Google Assistant is still there by default, but Gemini can take over instead

A Samsung smartphone on a kitchen counter, showing the Google Gemini interface

The first thing you need to do is install the Gemini app, if you haven’t already. Google is still working out some of the kinks with Gemini, so it isn’t the Android voice assistant by default. After downloading the app, you’ll be greeted with setup screens that explain the differences between Google Assistant and Gemini. This is also where you’ll accept Gemini’s terms and conditions, so be sure to look them over carefully. Human reviewers at Google will look at your interactions with Gemini to make improvements, so don’t share anything with Gemini that you wouldn’t feel comfortable handing over to a stranger.


As we briefly mentioned earlier, Google made the unfortunate decision to tie the Gemini app to Android’s digital assistant. When you set up the Gemini app, you have to make Gemini your default Android assistant. You can always go back to the Google Assistant, but you’ll lose access to the Gemini app when you do. If you want to use both Google Assistant and Gemini, your only option is to access Gemini through the web client.

Google will warn you that Gemini can’t do everything that Google Assistant can, so it’s not a complete replacement. While this is true, there is a Google Assistant extension built into Gemini that tackles some digital assistant functions that Gemini isn’t ready for. Most people should be able to use Gemini and the Assistant extension without missing any noticeable features.


How you can interact with Gemini

It’s easy to use voice, text, or images to get help from Gemini

There are three main ways to interact with Gemini: voice, text, and images. In general, you trigger Gemini the same way you trigger the Google Assistant, such as by holding the power button on your smartphone. You can also say “Hey, Google” to call upon Gemini, as long as you have the feature enabled and Voice Match turned on. After triggering Gemini by the method of your choice, you can start interacting with the assistant the way you usually would.

You’re probably used to using voice to interact with the Google Assistant, so not much will be new when you switch to Gemini. One way Gemini really separates itself from the Google Assistant is through its image-based capabilities. You can share your screen with Gemini, and this allows it to answer questions in context based on what you’re viewing. This can be helpful if you can’t understand something, need background context, or just want a quick summary. Think of it like Circle to Search supercharged with Google’s large language models.


To get an idea of how Gemini’s context-aware image capabilities work in the real world, consider this example. I gave Gemini access to the Android Police article I was reading about Dave Burke stepping down from the helm of Android’s engineering team, and started to ask some questions about it. Keep in mind that Gemini won’t automatically have access to your screen. You need to manually tap “Add this screen” to share its contents. This is to protect your privacy and ensure you’re okay with showing Google what’s on your screen. After you tap that share button, you can start asking Gemini about what you’re looking at.


The questions can be as basic or complex as you’d like. To start, I asked Gemini who this article was about. The chatbot quickly replied that it was about Dave Burke and gave some background on his role at Google. You can expect results to be more accurate since you’re supplying Google with contextual information, but as with any AI tool in 2024, there is still the potential for Gemini to slip up. Luckily, you can view the source of the information right below Gemini’s answer. In this case, the Android Police article we started with is where you’d go to fact-check the information.

A better real-world use case for Gemini’s multimodal capabilities is summarizing the contents of your screen, especially if you are reading a long article. I asked Gemini to describe this article in two sentences, and it did so flawlessly. Being specific with your prompts helps, and adding context when possible will also improve results. You can continue the conversation as long as you’d like, and Gemini can loop in web results and data in its knowledge base to provide the best possible answers.


Google Gemini creating a summary of an Android Police article.

There are a handful of excellent ways to use Gemini, including many of the ways you’re familiar with using Google Assistant. You can ask Gemini the time, the weather, or the name of the current US president, but those are all things we’ve been able to do with digital assistants for what feels like forever. Gemini’s calling card is the ability to answer questions and complete actions in context. To have the best experience with Gemini, share your screen whenever it makes sense, or connect Gemini with other Google services, which we’ll go over next.

Don’t forget about extensions

Gemini can interact with plenty of other Google services

The YouTube extension in the Gemini app for Android.


For situations where you need to give Gemini more information than just what’s on your screen, there are Extensions. These tie into other Google services, like Workspace or YouTube. You’re able to ask Gemini about a Google Doc you have access to, or to check up on information from Google Flights. This lets you get help from Gemini with your own personal situations, not just ones you might stumble upon on the internet. That can be really useful, but it is another good time to point out that Gemini shouldn’t be used with any sensitive information.

Another great use case for Gemini is discovery or ideation. You can tell the chatbot exactly what you need to do, and it’ll give you a few ideas about how to get it done. Or, you can ask for suggestions about what to watch on YouTube. For example, I asked Gemini to give me a few YouTube videos that recapped Apple’s WWDC 2024 conference.


To get the most out of Gemini, you need to play around with it and find its limits. For the average Android user, the best Gemini features help you with everyday tasks you didn’t even know could be improved. These include summarizing long articles or answering questions in context with Extensions. Having a multimodal AI model built right into your smartphone opens up a lot of possibilities, and hopefully we’ve given you a few ideas on how you can use Gemini on your device.
