Google Whisk is an innovative way to create AI visuals by using image prompts. Here’s how you can try it

December 31, 2024

Google’s new AI tool, Google Whisk, makes it easier to create visual concepts. Instead of asking you to describe what you see in your head, Whisk lets you supply three image prompts: one for the subject, one for the scene, and one for the style. Whisk handles the rest, making it easier to experiment with new ideas.

While the best AI image creators require you to write a detailed prompt to get started, Whisk does that behind the scenes. Google’s Gemini model analyzes the images you upload to the web-based Whisk interface and automatically writes a caption for each. Those captions are then fed to the Imagen 3 model, which creates a matching picture.
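The caption-then-generate flow can be pictured as a small two-stage pipeline. The Python sketch below is only an illustration under stated assumptions: Whisk is a web tool and nothing here is a Google API. The names `WhiskInputs`, `caption_image`, `build_prompt`, and `generate_images` are hypothetical stand-ins, the file names are made up, and the stubs return placeholder strings rather than real captions or images.

```python
from dataclasses import dataclass


@dataclass
class WhiskInputs:
    """The three image prompts Whisk asks for (file paths are hypothetical)."""
    subject: str
    scene: str
    style: str


def caption_image(path: str) -> str:
    """Hypothetical stand-in for the captioning step (Gemini in Whisk).

    A real captioner would look at the pixels; this stub only echoes the path.
    """
    return f"caption of {path}"


def build_prompt(inputs: WhiskInputs) -> str:
    """Combine the three captions into a single text prompt."""
    return (
        f"Subject: {caption_image(inputs.subject)}. "
        f"Scene: {caption_image(inputs.scene)}. "
        f"Style: {caption_image(inputs.style)}."
    )


def generate_images(prompt: str, count: int = 2) -> list[str]:
    """Hypothetical stand-in for the image model (Imagen 3 in Whisk).

    Returns placeholder labels instead of real images; the default of two
    mirrors the article's note that results arrive in pairs.
    """
    return [f"image {i + 1} for: {prompt}" for i in range(count)]


inputs = WhiskInputs(subject="car.jpg", scene="city.jpg", style="watercolor.jpg")
prompt = build_prompt(inputs)
results = generate_images(prompt)
```

Because the text prompt sits between the two stages, it is the natural place for a user to step in and edit, which is what Whisk exposes when it lets you view the underlying prompts.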

You could, for example, use a photo of an urban landscape as the scene and a picture of a car as the subject, then add a watercolor style to see the results. When you click the button, you’ll receive a pair of images based on your inputs.

It’s simple to remix the images from here. You can add text-based details in the interface to alter the results. You can also drop in different images or roll the dice to get some inspiration. The feed displays new results in pairs, which makes it easy to generate ideas. You can refine images by revealing the text prompts and adding more detail.


Although Whisk is designed to eliminate text-based prompts from the process, Google offers the option to refine them because the results may not always match the source material.

Google explained in a blog post about the experimental tool that Whisk “captures the essence of your subject, not an exact copy.” Whisk is only as effective as Gemini’s analysis of the images you submit. That analysis is impressive, but it can’t get inside your head. You might expect Whisk to focus on one detail in an image when it actually focuses on another.

The post goes on to explain: “Since Whisk only extracts a few key features from your image it may generate images that are different from your expectations. The generated subject could have a different skin tone, hairstyle, or height. We know these features are important for your project, and Whisk might miss the mark. So we allow you to view and edit the underlying prompts anytime.”

Despite its shortcomings, Whisk is an interesting application of Google’s AI tools. It uses the same generative models you get when you chat with Gemini through its text interface. Whisk relies on image inputs instead of text, making it more intuitive and accessible for visual creators.

Citing early feedback from digital creators, Google refers to Whisk as “a type of creative tool” that is intended for “rapid exploration, not pixel perfect edits.”

How to try Google Whisk

Google Whisk is currently available only to users in the US. If you’re there, you can try it out in your web browser. The experimental tool is free to use. Google will use data collected from your use of Whisk to refine and develop future AI-based products.
