OpenAI Sora: Creating Video from Text

Sora is OpenAI’s text-to-video generative AI model. It can generate videos in different resolutions and aspect ratios, and it can rework existing videos, quickly changing the picture, lighting, and shooting style, all from a text prompt.

With Sora, you can generate up to a minute of video at FHD+ resolution. The samples we’ve seen so far look promising, and OpenAI has published a selection of Sora’s generated videos.

Sora can generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.

Sora builds on past research in DALL·E and GPT models. It uses the recaptioning method from DALL·E 3, which involves generating highly descriptive captions for the visual training data.

Besides generating a video solely from a text prompt, the model can take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small details.






