Wednesday, December 18, 2024

Google’s Genie 2 AI tool can generate a playable 3D world from a “single prompt image”


Google’s AI tool Genie 2 is a “large-scale foundation world model” capable of generating “an endless variety of action-controllable, playable 3D environments” from a single image prompt.

Genie 2 can create different perspectives, such as first-person views, isometric views, or third-person driving videos, as well as “complex 3D visual scenes” with interactive objects like doors and explosive barrels.

Physics effects, including smoke, gravity, lighting, and reflections, can also be “rapidly” prototyped and played by a human or an “AI agent” using keyboard and mouse. According to a report detailing the tech, this lets artists and designers prototype environments quickly.

“Thanks to Genie 2’s out-of-distribution generalisation capabilities, concept art and drawings can be turned into fully interactive environments,” the report explained. “This enables artists and designers to prototype quickly, which can bootstrap the creative process for environment design, further accelerating research.

“While this research is still in its early stage with substantial room for improvement on both agent and environment generation capabilities, we believe Genie 2 is the path to solving a structural problem of training embodied agents safely while achieving the breadth and generality required to progress towards AGI.”

The full report, including examples, is available on Google’s DeepMind sub-site.

Earlier today, UK specialist media publisher Future signed a strategic partnership with OpenAI to use its ChatGPT tool across its sales, marketing, and editorial businesses.
