Google's generative AI tools are receiving some of the enhancements the company previewed at Google I/O. Starting this week, the company is rolling out the next-generation version of its Imagen image generator and reintroducing the ability to generate images of people (after an embarrassing controversy earlier this year). Google's Gemini chatbot is also getting Gems, the company's approach to bots with custom instructions, similar to ChatGPT's custom GPTs.
Google's Imagen 3 is an upgrade to the company's image generator in Gemini. The company says this next-generation AI model will “set a new standard for image quality,” with built-in guardrails to avoid the kind of over-correction for diversity that produced the strange historical AI images that made headlines earlier this year.
“Across a range of benchmarks, Imagen 3 performs well against other image generation models,” Dave Citron, product manager at Gemini, said in a press release. If you don't like the image the tool initially generates, you can refine it with additional prompts.
Citron says Imagen 3 is “superior” to competing products, and it also includes Google's SynthID tool to watermark images, making it clear they were created with AI and are not genuine.
Citron said the people generation feature will return for paid users in the coming days, months after Google removed it. He said the new guardrails will prevent the generation of “photorealistic, identifiable individuals,” a far cry from the problematic deepfakes generated by Elon Musk's Grok. They also prohibit images of children and (as with other image generation tools) gore, violence, and sexual scenes. The product manager tempered expectations by saying Gemini's images won't be perfect, but promised the company will continue to listen to user feedback and improve the tool accordingly.
Starting this week, the Imagen 3 model will be available to all users, but the reintroduction of imagery featuring people will begin with paid users. English-speaking Gemini Advanced, Business, and Enterprise users can expect people image generation to resume “in the coming days.”
First previewed at Google I/O 2024, Gems are Google's custom chatbots with user-created instructions. They're essentially Gemini's answer to OpenAI's custom GPTs, which Google's competitor rolled out late last year. Gems will start rolling out in the coming days.
“With Gems, you can create a team of experts to help you think through tough projects, brainstorm ideas for an upcoming event, or write the perfect caption for a social media post,” Citron wrote. “Gems can also remember detailed instructions, saving you time on tedious, repetitive, or difficult tasks.”
In addition to the blank slate of custom Gems, Gemini also includes pre-made Gems that are “helpful to get you started” and can inspire new ideas. Pre-made Gems include:
Learning Coach – Helps you understand complex topics
Brainstormer – Generates new ideas
Career Guide – Guides you through skill upgrades, decisions, and goals
Writing Editor – Provides constructive feedback on grammar, tone, and structure
Coding Partner – Helps you improve your coding skills and inspires new projects
Gems are available on desktop and mobile starting today, but only to Gemini Advanced, Business, and Enterprise subscribers, so you'll need a paid plan to try them out.