Google is taking another step into text-to-video with Lumiere, a new AI model capable of creating surprisingly high-quality content.
The tech giant has certainly come a long way since the days of Imagen Video. The subjects of Lumiere's videos are no longer those nightmarish creatures with melting faces. Now things seem more realistic. Sea turtles look like sea turtles, the fur on the animals has the right texture, and the people in the AI clips have real smiles (for the most part). What's more, there's very little of the weird jerky motion seen in other text-to-video AIs. The movement is pretty much smooth like butter. Inbar Mosseri, head of the research team at Google Research, posted a video on her YouTube channel demonstrating Lumiere's capabilities.
Google has put a lot of effort into making Lumiere content look as realistic as possible. The development team accomplished this by implementing what it calls the Space-Time U-Net (STUNet) architecture. The details of STUNet are complex, but as Ars Technica explains, it allows Lumiere to understand where objects are in a video and how they move and change, and to render all of that motion at the same time, resulting in a seamless clip.
This is in contrast to other generative platforms that first create keyframes in clips and then fill in the gaps. Doing so creates the jerky movement that the technique is known for.
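To see why filling in gaps between keyframes can look jerky, here is a toy sketch. It is not Lumiere's code or any real video model: the "video" is just the position of one object following a sine curve, and the function names (`full_clip`, `keyframes_then_fill`, `max_jerkiness`) are made up for illustration. It compares computing every frame of the clip together against computing a few keyframes and linearly interpolating between them, assuming simple linear in-betweening.

```python
import math

def full_clip(num_frames):
    """Toy stand-in for generating the whole clip at once:
    the object's position is computed for every frame."""
    return [math.sin(2 * math.pi * i / (num_frames - 1)) for i in range(num_frames)]

def keyframes_then_fill(num_frames, stride):
    """Toy stand-in for the keyframe approach: keep only every
    `stride`-th frame, then fill the gaps by linear interpolation."""
    truth = full_clip(num_frames)
    frames = []
    for i in range(num_frames):
        left = (i // stride) * stride              # keyframe at or before frame i
        right = min(left + stride, num_frames - 1)  # next keyframe (clamped to the end)
        if right == left:
            frames.append(truth[left])
            continue
        w = (i - left) / (right - left)             # how far frame i is between keyframes
        frames.append((1 - w) * truth[left] + w * truth[right])
    return frames

def max_jerkiness(frames):
    """Largest sudden change in velocity between consecutive frames
    (the absolute second difference of the positions)."""
    return max(
        abs(frames[i + 1] - 2 * frames[i] + frames[i - 1])
        for i in range(1, len(frames) - 1)
    )

if __name__ == "__main__":
    print("whole clip:       ", max_jerkiness(full_clip(33)))
    print("keyframes + fill: ", max_jerkiness(keyframes_then_fill(33, 8)))
```

In this toy setup the interpolated clip changes speed abruptly at each keyframe, so its jerkiness score comes out several times higher than the clip computed in one pass. Real models are far more sophisticated, but the intuition is the same.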
Well equipped
In addition to text-to-video generation, Lumiere has several other features in its toolkit, including support for image and video inputs.
Users will be able to upload source images or videos to the AI so it can edit them according to their specifications. For example, you can upload an image of Girl with a Pearl Earring by Johannes Vermeer and turn it into a short clip where she smiles instead of staring blankly. Lumiere also has an ability called Cinemagraph, which can animate highlighted parts of images.
Google demonstrates this by selecting a butterfly sitting on a flower. Thanks to the AI, the resulting video shows the butterfly flapping its wings while the flowers surrounding it remain still.
Things get especially impressive when it comes to video. Video Inpainting, another feature, works similarly to Cinemagraph in that the AI can edit selected parts of a clip. A woman's green plaid dress can be transformed into shiny gold or black. Lumiere goes a step further by offering a video stylization mode for changing the look of a video's subjects. An ordinary car on the road can be transformed into a vehicle made entirely of wood or Lego bricks.
Still in the works
It is not known if there are plans to release Lumiere to the public or if Google intends to implement it as a new service.
Perhaps we could see the AI appearing on a future Pixel phone as an evolution of Magic Editor. If you're not familiar with it, Magic Editor uses “AI processing [to] intelligently” change areas or objects in photographs on the Pixel 8. Video Inpainting seems to us to be a natural progression of the technology.
For now, it looks like the team will keep it behind closed doors. As impressive as this AI may be, it still has its problems. Jerky animation still crops up in some clips. In others, people's limbs are somewhat deformed. If you want to learn more, you can find Google's research paper on Lumiere on Cornell University's arXiv site. Be warned: it's dense reading.
And be sure to check out TechRadar's roundup of the best AI generators of 2024.