Introducing Text-to-Video Generator, Tune-A-Video

January 13, 2023

Since the release of the text-to-image model DALL-E by OpenAI, the AI world has been working towards similar models, for example Midjourney and Imagen, to name a few. Soon came text-to-video models like Transframer, NUWA Infinity, CogVideo, and others. Even text-to-speech models like VALL-E were recently unveiled by Microsoft.

Last month, researchers from Show Lab, National University of Singapore, introduced a text-to-video (TTV) generator called Tune-A-Video to address the problem of One-Shot Video Generation, where only a single text-video pair is provided for training an open-domain text-to-video generator. With a customised Sparse-Causal Attention, Tune-A-Video extends spatial self-attention to the spatiotemporal domain using pretrained text-to-image (TTI) diffusion models.
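To make the idea of Sparse-Causal Attention more concrete, here is a minimal sketch in plain Python. It assumes, based on our reading of the paper, that each frame's queries attend only to tokens from the first frame and the immediately preceding frame. The function names, the list-of-lists tensor layout, and the single-head, no-batch setup are all simplifications made for illustration and are not taken from the authors' code.

```python
import math


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Standard scaled dot-product attention for a single head."""
    dim = len(q[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, v)


def sparse_causal_attention(frames, w_q, w_k, w_v):
    """Illustrative sparse-causal attention over a list of frames.

    frames: list of per-frame token matrices (tokens x dim).
    Assumption: frame i attends to frame 0 and frame i-1 only
    (for i = 0 the "previous" frame is frame 0 itself).
    """
    outputs = []
    for i, frame in enumerate(frames):
        previous = frames[max(i - 1, 0)]
        # Keys and values come from the first and the previous frame only.
        context = frames[0] + previous
        q = matmul(frame, w_q)
        k = matmul(context, w_k)
        v = matmul(context, w_v)
        outputs.append(attention(q, k, v))
    return outputs
```

Because every frame looks at just two other frames, the cost grows linearly with the number of frames rather than quadratically, as it would if each frame attended to all the others. In this sketch, changing a middle frame affects only that frame and the one right after it.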

Check out the unofficial implementation of Tune-A-Video here.

From a single training sample, the projection matrices in the attention block are updated to capture the relevant motion information. Tune-A-Video can create temporally coherent videos for various applications, including changing the subject or background, editing attributes, and transferring styles.

It was found that TTI models can produce images that match verb phrases well, and that extending TTI models to generate multiple images at once yields surprisingly strong content consistency.

Fine-Tuning: TTI models are extended to TTV models using pretrained TTI model weights. The text-video pair is then used for one-shot tuning to obtain a one-shot TTV model.

Inference: A modified text prompt is used to generate new videos.

After receiving a video and text pair as input, Tune-A-Video modifies the projection matrices in the attention blocks.
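One way to picture this kind of selective tuning is as a split of the model's parameters into a small trainable set and a frozen remainder. The sketch below is only an illustration: the parameter names and the matching patterns are hypothetical, and which projection matrices are actually updated is specified in the paper, not here.

```python
def split_parameters(param_names, trainable_patterns):
    """Split parameter names into (trainable, frozen) lists.

    A parameter is treated as trainable when its name contains any of the
    given patterns. Both the names and the patterns are placeholders.
    """
    trainable, frozen = [], []
    for name in param_names:
        if any(pattern in name for pattern in trainable_patterns):
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


# Hypothetical parameter names for one block of a diffusion model.
names = [
    "block0.attn.to_q.weight",
    "block0.attn.to_k.weight",
    "block0.attn.to_v.weight",
    "block0.conv.weight",
    "block0.norm.weight",
]

# Assumption for illustration: tune the attention projections, freeze the rest.
trainable, frozen = split_parameters(names, ("attn.to_q", "attn.to_k", "attn.to_v"))
```

Keeping most pretrained weights frozen is what makes tuning on a single text-video pair plausible: only a small part of the model has to adapt to the motion in that one video.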

Read the full paper here.
