Sora's Arrival: A New Era of Neo-Dadaism in Hollywood?

Published on March 27, 2024
Tags: Sora training, Hollywood film, GPU power

Sora brings another wave of awe to the world! The first batch of short films made by directors and artists in collaboration with Sora has been released. Sharing their first impressions, the creators praise Sora for turning impossible ideas into reality. Most astonishing of all is its ability to create completely surreal content.

Just days ago, rumors were swirling about OpenAI pitching Sora to Hollywood. Yesterday morning, the first wave of directors and artists granted access to Sora unveiled their short films. The pace has been nothing short of lightning fast!

Hollywood has seemingly transformed overnight into what some are dubbing Sorawood.

One observer remarked, "My initial takeaway is this: human creativity is paramount, and Sora's allure lies in its ability to infuse surrealism into reality." Are we on the cusp of a new era of neo-Dadaism?

How impactful are these latest Sora-produced short films? Let's dive in:

"Alien Species" Documentary: Enter the Flying Pig!

Sora Fly Pig

This could easily be hailed as an epic "teaser trailer" for an animal documentary. Sora has breathed life into a myriad of alien species previously unseen, all born purely from the imagination.

Cats with fish-like tails, giraffes sporting crane-like lower bodies, sharks sprouting octopus-like tentacles overnight, bees boasting horse heads—the creative possibilities seem endless.


At the helm of this creative endeavor is Don Allen III, a versatile artist, speaker, and consultant whose journey began at DreamWorks Animation. Having collaborated with numerous tech and entertainment giants, Don has delved into the realms of mixed reality, virtual reality, and AI applications.

"I have long been immersed in creating alien species within augmented reality. These imaginative combinations can now be prototyped with greater ease, and these 3D characters can be fully realized and placed into spatial computing environments."

Don emphasizes that Sora's extraordinary nature is its greatest asset—it's not bound by conventional physical laws or traditional modes of thinking. Through collaboration with Sora, Don's focus has shifted from "technical hurdles to pure creativity... ushering in a new realm of instant visualization and rapid prototyping."

Simultaneously, Don notes, "This allows me to devote more time and energy to more essential areas... and to the emotional depth I expect my characters to convey."

Diverse Elements in Combination

Let's take a look at another short film produced by the creative agency Native Foreign.

This video is a composition of various elements, featuring vintage city street scenes, a man falling in love with a woman at a bar, and a car floating on the ocean's surface.

Based in Los Angeles, California, Native Foreign is an Emmy-nominated creative agency renowned for its expertise in brand storytelling, motion graphics, title design, and advanced generative AI workflows.

Co-founder Nik Kleverov is harnessing Sora to visualize concepts and iterate on creative ideas quickly for brand partners. He believes that storytelling is no longer strictly constrained by budget limitations.

"As a creative who thrives on dynamic thinking, I feel that any idea can come to life when using Sora. It's liberating to see the possibilities unfold."

Cost Estimation for Sora Models

While the astonishing results of these tests are undeniable, the associated costs are prohibitively high.

A recent report from Factorial Funds has estimated the costs of training and inference for Sora models.

Highlights from the article include:

1. Training Requirements: Sora training demands significant computing resources, with an estimated need for 4211–10528 Nvidia H100 GPUs running for a month.

2. Inference Costs: One H100 GPU can generate approximately 5 minutes of video per hour.

Model Training Computation Estimation: Extrapolating from DiT to Sora

Although OpenAI's report gives only limited detail on Sora, the model can be regarded as an extension of DiTs (Diffusion Transformers) to video generation. According to the Diffusion Transformers paper, the DiT-XL model has 675M parameters and was trained with approximately 10^21 FLOPS, equivalent to roughly 0.4 Nvidia H100 GPUs running for a month (or one H100 running for 12 days).
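The conversion from a compute budget to GPU time can be sketched as follows. This is a rough check, not the report's own method: it assumes an H100 peak dense BF16 throughput of about 989 TFLOP/s, 100% utilization, and a 30-day month, none of which are stated in this article.

```python
# Assumption: ~989 TFLOP/s peak dense BF16 throughput for one H100 (SXM),
# with the GPU treated as running at 100% of that peak.
H100_PEAK_FLOPS_PER_SEC = 989e12
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def h100_days(total_flops: float) -> float:
    """Days a single H100 needs to execute `total_flops` at peak throughput."""
    return total_flops / H100_PEAK_FLOPS_PER_SEC / SECONDS_PER_DAY


dit_xl_flops = 1e21  # DiT-XL training compute quoted in the text
days = h100_days(dit_xl_flops)
months = days / DAYS_PER_MONTH
print(f"{days:.1f} H100-days, {months:.2f} H100-months")
```

Under these assumptions the result lands close to the 12 days (about 0.4 H100-months) quoted above. Real-world utilization is lower than peak, so actual training time would be longer.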

Calculation Multiplier: Assuming videos are encoded at 24fps, a 1-minute video contains 1440 frames. If Sora's spatial and temporal compression matches the 8x rate used in the Diffusion Transformers paper, those 1440 frames are represented by approximately 180 frames in latent space.

Therefore, compared with processing a single image with DiT, processing a 1-minute video requires at least 180 times the computation.
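The frame arithmetic above is easy to reproduce. The sketch below uses only the figures given in the text (24fps, 8x compression); the function name is illustrative.

```python
FPS = 24                 # frame rate assumed in the text
SECONDS_PER_MINUTE = 60
COMPRESSION = 8          # compression rate assumed in the text


def latent_frames(minutes: float, fps: int = FPS, compression: int = COMPRESSION) -> float:
    """Number of latent-space frames for a video of the given length."""
    raw_frames = minutes * SECONDS_PER_MINUTE * fps
    return raw_frames / compression


print(latent_frames(1))  # 1440 raw frames / 8 = 180.0
```

Since an image corresponds to a single frame, the value for a 1-minute video is also the video compute multiplier used in the estimate.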

Model Size and Dataset: Sora's parameter count is estimated to far exceed 675M. Assuming a model with 20B parameters, the computational requirement is about 30 times that of DiT. In addition, the dataset used to train Sora is assumed to be significantly larger than the one used for Diffusion Transformers, by a factor of 4 to 10.

Taken together, these factors underlie the report's training estimate of 4211–10528 Nvidia H100 GPUs running for a month.
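As a sanity check, the multipliers listed in this article can be combined directly. The sketch below is a naive product of the stated factors, not the report's actual calculation. It lands at roughly twice the quoted 4211–10528 range, which suggests the report applies further assumptions that this article does not spell out.

```python
DIT_XL_PARAMS = 675e6         # from the text
SORA_PARAMS_ASSUMED = 20e9    # assumption stated in the text
DIT_XL_H100_MONTHS = 0.4      # DiT-XL training cost from the text
VIDEO_MULTIPLIER = 180        # latent frames per 1-minute video
DATASET_MULTIPLIERS = (4, 10) # low and high dataset factors from the text

# Ratio of parameter counts, about 29.6 (rounded to 30 in the text).
size_multiplier = SORA_PARAMS_ASSUMED / DIT_XL_PARAMS


def naive_h100_months(dataset_multiplier: float) -> float:
    """Naive product of all multipliers listed in the article."""
    return (DIT_XL_H100_MONTHS * VIDEO_MULTIPLIER
            * size_multiplier * dataset_multiplier)


low, high = (naive_h100_months(m) for m in DATASET_MULTIPLIERS)
print(f"size multiplier: {size_multiplier:.1f}")
print(f"naive estimate: {low:,.0f} to {high:,.0f} H100-months")
```

The ratio between the high and low ends (2.5x) matches the quoted range exactly, so the dataset factor of 4 to 10 is what spans the range; only the absolute scale differs.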

Inference vs. Model Training Computation

Comparing inference and model training computation: model training is a one-time, large-scale computation, while inference, although much smaller per call, is invoked repeatedly as the model is widely applied.

The balance point is the moment when the cumulative computation spent on inference exceeds the computation required for training.

Based on the extrapolation from DiT to Sora, generating one minute of video with Sora costs approximately 708×10^15 FLOPS, which is equivalent to one Nvidia H100 GPU generating about 5 minutes of video per hour.
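The two inference figures can be cross-checked against each other. The sketch below reads the 708×10^15 figure as the cost of one minute of video (an interpretation that the arithmetic supports) and assumes an H100 peak dense BF16 throughput of about 989 TFLOP/s at full utilization, a value not given in this article.

```python
# Assumption: ~989 TFLOP/s peak dense BF16 throughput for one H100 (SXM).
H100_PEAK_FLOPS_PER_SEC = 989e12
SECONDS_PER_HOUR = 3600

# Figure from the text, interpreted as the cost of one minute of video.
FLOPS_PER_VIDEO_MINUTE = 708e15

flops_per_gpu_hour = H100_PEAK_FLOPS_PER_SEC * SECONDS_PER_HOUR
minutes_per_gpu_hour = flops_per_gpu_hour / FLOPS_PER_VIDEO_MINUTE
print(f"{minutes_per_gpu_hour:.2f} minutes of video per H100-hour")
```

Under these assumptions the result is very close to 5 minutes per hour, which is consistent with the figure in the text.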

After 15.3M to 38.1M minutes of video have been generated, cumulative inference computation will surpass model training computation. By comparison, an estimated 43M minutes of video are uploaded to YouTube per day, so Sora's balance point would be reached quickly in practical applications.
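The balance point can be approximated from the GPU-time figures alone: training costs a certain number of H100-months, and each H100-hour of inference yields about 5 minutes of video. The sketch below assumes a 30-day month; because the rate of 5 minutes per hour is rounded, the result comes out slightly below the 15.3M to 38.1M range quoted above.

```python
HOURS_PER_MONTH = 30 * 24             # assumption: a 30-day month
VIDEO_MINUTES_PER_GPU_HOUR = 5        # rounded inference rate from the text
TRAINING_H100_MONTHS = (4211, 10528)  # training estimate from the text


def break_even_minutes(h100_months: int) -> int:
    """Minutes of generated video whose inference cost equals training cost."""
    return h100_months * HOURS_PER_MONTH * VIDEO_MINUTES_PER_GPU_HOUR


for months in TRAINING_H100_MONTHS:
    minutes = break_even_minutes(months)
    print(f"{months} H100-months -> {minutes / 1e6:.1f}M minutes of video")
```

Past that volume of generated video, total inference compute exceeds what was spent on training.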

GPU Power Shortage

Sora demands immense GPU power. While these costs may be a drop in the bucket for top-tier Hollywood producers and directors, such as Tyler Perry, who scrapped an $800 million studio expansion plan due to Sora, they remain unaffordable for many aspiring creators and small companies seeking to nurture creative ideas.

This year has seen a shortage of high-end GPUs suitable for large-scale computation. Economic sanctions imposed by certain countries further restrict enterprises from selling high-end GPUs such as the Nvidia RTX 4090, A100, and H100 to China, Vietnam, Saudi Arabia, and others, which exacerbates the shortage.

However, this shortage has also spurred the rise of the GPU leasing and GPU server leasing industries.

Besides the Nvidia RTX 4090, A100, and H100, online GPU services offering the GeForce RTX 2060, RTX 4060, Nvidia A5000, V100, A40, and A6000 are also options for Sora training and Sora online services, and they are more cost-effective than the H100.

The Trend towards GPU Power for AI Model Training in the Film Industry

In conclusion, utilizing GPU power for models like Sora is becoming a significant trend in the film industry. This promises a future where more creative and cost-effective works will be produced. The spring of AI in the film industry is on the horizon!