Eli Collins, a vice president of product management at Google DeepMind, first demoed generative AI video tools for the company’s board of directors back in 2022. Despite the model’s slow speed, high operating cost, and sometimes off-kilter outputs, he says it was an eye-opening moment for them to see fresh video clips generated from a random prompt.
Now, just a few years later, Google has announced plans for a tool inside of the YouTube app that will allow anyone to generate AI video clips, using the company’s Veo model, and directly post them as part of YouTube Shorts. “Looking forward to 2025, we’re going to let users create stand-alone video clips and shorts,” says Sarah Ali, a senior director of product management at YouTube. “They’re going to be able to generate six-second videos from an open text prompt.” Ali says the update could help creators hunting for footage to fill out a video or trying to envision something fantastical. She is adamant that the Veo AI tool is not meant to replace creativity, but augment it.
This isn’t the first time Google has introduced generative tools for YouTube, though this announcement will be the company’s most extensive AI video integration to date. Over the summer, Google launched an experimental tool, called Dream Screen, to generate AI backgrounds for videos. Ahead of next year’s full rollout of generated clips, Google will update that AI green-screen tool with the Veo model sometime in the next few months.
The sprawling tech company has shown off multiple AI video models in recent years, like Imagen and Lumiere, but is attempting to coalesce around a more unified vision with the Veo model. “Veo will be our model, by the way, going forward,” says Collins. “You shouldn’t expect five more models from us.” Yes, Google will likely release another video model eventually, but he expects to focus on Veo in the near future.
Google faces competition from multiple startups developing their own generative text-to-video tools. OpenAI’s Sora is the best-known competitor, but the AI video model, announced earlier in 2024, is not yet publicly available and is reserved for a small number of testers. As for tools that are widely available, AI startup Runway has released multiple versions of its video software, including a recent tool for adapting original videos into alternate-reality versions of the clip.
YouTube’s announcement comes as generative AI tools have grown even more contentious for creators, who sometimes view the current wave of AI as stealing from their work and attempting to undermine the creative process. Ali doesn’t see generative AI tools coming between creators and the authenticity of their relationship with viewers. “This really is about the audience and what they’re interested in—not necessarily about the tools,” she says. “But, if your audience is interested in how you made it, that will be open through the description.” Google plans to watermark every AI video generated for YouTube Shorts with SynthID, which embeds an imperceptible tag to help identify the video as synthetic, as well as include a “made with AI” disclaimer in the description.
Hustle-culture influencers already try to game the algorithm by using multiple third-party tools to automate the creative process and make money with minimal effort. Will next year’s Veo integration lead to a new avalanche of low-quality, spammy YouTube Shorts dominating user feeds? “I think our experience with recommending the right content to the right viewer works in this AI world of scale, because we’ve been doing it at this huge scale,” says Ali. She also points out that YouTube’s standard guidelines still apply no matter what tool is used to craft the video.
AI art often has a distinct aesthetic, which could be concerning for video creators who value individuality and want their content to feel unique. Collins hopes Google’s thumbprints aren’t all over the AI video outputs. “I don’t want people to look at this and say, ‘Oh, that’s the DeepMind model,’” he says. Getting the prompt to produce an AI output aligned with what the creator envisioned is a core goal, and eschewing overt aesthetics for Veo is critical to achieving wide-ranging adaptability.
“A big part of the journey is actually building something that’s useful to people, scalable, and deployable,” says Collins. “It’s not just a demo. It’s being used in a real product.” He believes putting generative AI tools right inside of the YouTube app will be transformational for creators, as well as DeepMind. “We’ve never really done a creator product,” he says. “And we certainly have never done it at this scale.”