Google DeepMind’s Genie 3 is able to dynamically change the state of its virtual worlds

Google DeepMind released Genie 2 at the beginning of December

The Genie family of AI systems is what's known as a world model. These models generate frames as a user (a human or, more likely, an automated AI agent) moves through the environment the software is simulating. The output may look like footage from a video game, but DeepMind has primarily used Genie 2 to train other AI systems to become better at their intended tasks. On Tuesday, the lab announced its new Genie 3 model, which it believes is an even better system for training AI agents.

At first glance, the jump from Genie 2 to Genie 3 doesn't seem as dramatic as the one the model made last year. Genie 2 could generate 3D worlds and accurately reconstruct parts of an environment even after the user had left the scene. Prior world models were often weak on environmental consistency; Decart's Oasis system, for example, had difficulty remembering the layouts of the Minecraft levels it generated.

Genie 3's enhancements may seem modest by comparison, but Shlomi Fruchter and Jack Parker-Holder of DeepMind, speaking at a Google press conference held before today's announcement, said they are important steps on the road to artificial general intelligence. So what exactly does Genie 3 do better? It outputs footage at 720p instead of its predecessor's 360p, and it can sustain a consistent world for longer. Genie 2's theoretical limit was 60 seconds, though in practice the model would begin to hallucinate earlier than that. DeepMind claims Genie 3 can run for several minutes without producing artifacts.

DeepMind has also added a new capability to the model: "promptable world events." Genie 2 was interactive in the sense that a user or AI agent could input movement commands and the model would respond in the next frame it generated. Genie 3 does this in real time, and it also accepts text prompts that instruct it to change the state of the generated world. In one DeepMind demo, the model was told to insert a herd of deer into a scene of a person skiing down a mountain. DeepMind concedes the deer did not move in a realistic way, but this feature is what makes Genie 3 special.

As mentioned, the lab primarily envisions the model as a training and evaluation tool for AI agents. DeepMind claims Genie 3 can be used to train AI systems to handle "what if" situations that aren't covered in their pre-training. As an example, Fruchter said Genie 3 could be used to teach self-driving cars how to safely avoid pedestrians who step out in front of them.

(Image credit: Google DeepMind)

Despite the improvements DeepMind made to Genie, the lab admits there is still much to be done. The model struggles to render text and can't accurately generate real-world places. DeepMind believes that for Genie to be truly useful, the model must be able to sustain a simulated environment for hours, not minutes. Still, the lab believes Genie is already ready to have a real-world effect.

"We're already at the point where you wouldn't use [Genie] as your sole training environment, but you can certainly find things you wouldn't want agents to do, because if they act unsafely in some settings, even if those settings aren't perfect, it's still good to know," Parker-Holder said. "You can already see where this is going. It will get increasingly useful as the models get better."

Genie 3 will not be available to the public for the time being, though DeepMind says it is working to make the model available to more testers.

