My skepticism is immediately piqued when companies start talking about how they are integrating AI into their products. Because artificial intelligence is a hot topic right now, many of the announcements and demos about AI have a “yeah, us too” feel to them, regardless of whether they add any real value.
This has become a bit of an occupational hazard for those of us who write about mobile phones for a living, as it seems like no phone can be released these days without rattling off a laundry list of AI features, many of which are just fancy tricks that provide little lasting value to people who actually use the phone.
So when last week's Made by Google event kicked off with a discussion of the company's Gemini AI model and what it means for mobile devices, I felt my brain staging a sit-down strike. "Here we go again," I thought, bracing myself for a set of features that looked great on the demo stage but had little practical use.
To my surprise, once the event was over and I had a chance to see some additional demos up close, I was pretty impressed with what Google's doing with Gemini and what it means for devices like the Pixel 9 models that launch later this week. While I still have doubts about the overall potential of AI in mobile devices, I'd be pretty stubborn not to acknowledge that many of Gemini's features can help us do more with our smartphones in ways that were unimaginable until recently.
Multimodal
(Image courtesy of Google)
On mobile devices, Gemini is multimodal, meaning it can recognize not only text prompts but also images, code, and video, and pull information from those sources — which makes the Gemini-powered Assistant on Pixel devices incredibly powerful.
For example, in one demo, a woman showed Gemini a handwritten note about the Pixel 9's August 22 ship date, had it create a reminder to pre-order the phone, and then had it add the date to her calendar. Sure, you could do all of these things yourself, without the digital assistant's intervention. But that would require you to stop what you're doing and launch at least two apps to manually enter reminders and events. Gemini does it for you, without interrupting your flow.
(Image courtesy of Future)
In another demo at the Made By Google event, Gemini was tasked with watching a YouTube video about Korean restaurants and putting together a list of each individual dish in the Notes app. As an avid watcher of cooking videos, I can see this feature being a real time saver. Instead of writing down the ingredients and prep steps myself — and then scrolling back through the video to see if I missed anything — I can let Gemini do it for me. When the video is over, a clear list is waiting for me, so while I'm watching I can focus on what's being said.
Helpful AI
(Image courtesy of Future)
These examples resonate because they show how AI can save you time by performing tedious tasks for you. Where device makers go wrong is when they bring that time-saving mentality to tasks that require more creativity — a mistake Google knows well, having drawn notoriety for an Olympics ad that featured AI-generated fan mail.
Missteps aside, Google seems to really understand that Gemini works best when it removes obstacles that keep users from getting things done. Take the Research feature, which Google plans to add to Gemini Advanced in the coming months. With it, you can tell Gemini to search for specific information online. In the demo I saw, Gemini was being used to research after-school programs for kids interested in martial arts.
Before you begin, Gemini with Research shows you a plan of action that lists what it intends to research, which you can review and adjust. It then scours the web for relevant information and compiles it into a report available through Google Docs, with links to the online sources Gemini found.
This is a very appealing approach to online research, especially since Gemini can probably compile information faster and more thoroughly than I can. But the AI doesn't shut me out of the process: I can fine-tune the research plan, suggest what to look for, or check a source to see if it meets my criteria. Rather than plodding through every task myself, I let Gemini do the heavy lifting, leaving me to make an informed decision at the end.
An assistant you can talk to
(Image courtesy of Google)
I also like the look of the Gemini Live feature, which was shown off at Made by Google and is now rolling out to Gemini Advanced subscribers. This is the voice component of Google's chatbot, designed to sound natural and conversational. That's great news for those of us who have struggled to communicate with voice assistants like Siri because we used the wrong words or didn't describe our situation clearly enough.
A potentially useful aspect of Gemini Live is that you can use the voice assistant as a brainstorming tool. You ask Gemini a question (say, what you should do this weekend), and the assistant starts rattling off ideas. You can interrupt at any time to dig deeper into ideas that seem promising, and because Gemini understands the context of the conversation, your questions can be informal and improvised.
Google Gemini Live… The audio sounds supernatural. Here are some options #MadeByGoogle pic.twitter.com/YJWGEuqSAX — August 13, 2024
It's hard to say whether you'll really like Gemini Live until you've tried it for yourself, but the conversational aspects are fascinating. Like the ability to tweak Gemini's plan in Research, this strikes me as a more collaborative use of AI, with the chatbot responding to user input and tailoring the information it presents to the user's specifications.
The outlook for Gemini
I still have doubts about AI on phones in general, and Gemini in particular. Features like image generation have never piqued my interest, and there's always the risk that particularly lazy or unethical users will pass off the output of something like Gemini with Research as their own. And of course, we all need to stay vigilant about whether the promised privacy safeguards are actually implemented.
But my general complaint about AI features on phones is that most of them feel like fancy party tricks, which is definitely not the case with the Gemini features I saw this week. Google is clearly developing tools to let your phone do even more, and there's enough here to get you excited about the possibilities.