Generative AI Video Models Face Significant Energy Challenges

Key Points
- Open‑source AI video models consume roughly 90 watt‑hours per clip.
- Video generation is about 30× more energy‑intensive than image generation.
- Video generation is about 2,000× more energy‑intensive than text generation.
- A ten‑second, 240 fps video requires the creation of 2,400 individual frames.
- Each video uses about as much energy as running a 65‑inch TV for about 37 minutes.
- The study was conducted on an Nvidia H100 SXM GPU, a common data‑center processor.
- Calls for AI companies to disclose detailed power‑usage metrics.
- Increasing AI video usage could strain electrical grids as demand rises.
A recent study measuring the power usage of open‑source generative AI video tools found that creating a single AI‑generated video consumes roughly 90 watt‑hours of electricity, far more than image or text generation. The research, conducted on an Nvidia H100 SXM GPU, showed video diffusion to be about thirty times costlier than image generation and about two thousand times costlier than text generation. These findings highlight the growing energy demands of AI video models and raise concerns about transparency and sustainability as the technology scales.
Energy Demands of AI Video Generation
Generative artificial intelligence has become a major driver of electricity consumption, especially as video‑creation models enter mainstream use. While text‑based AI queries already require notable compute resources, the shift to generating moving images multiplies the workload dramatically. Video generation involves producing hundreds of individual frames for each second of output, and thousands for a full clip, turning a simple request into a high‑intensity compute task.
Study Methodology and Findings
Researchers examined several open‑source video diffusion models using an Nvidia H100 SXM GPU, a high‑performance processor common in modern AI data centers. By varying factors such as video length, resolution, and denoising intensity, the team measured the electricity drawn for each configuration. For a typical ten‑second clip rendered at 240 frames per second, the model generated 2,400 separate images, a process that proved substantially more power‑hungry than text or image generation.
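The frame count quoted above is simply clip length multiplied by frame rate. A minimal sketch of that arithmetic follows; the function name `frame_count` is an illustrative choice, not something taken from the study.

```python
def frame_count(duration_s: float, fps: float) -> int:
    """Number of frames in a clip of the given length and frame rate."""
    return round(duration_s * fps)


if __name__ == "__main__":
    # Ten-second clip at 240 frames per second, as described in the article.
    print(frame_count(10, 240))  # 2400
```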
The study quantified the energy use as follows:
- One AI‑generated video consumed approximately 90 watt‑hours.
- Generating a single image required about 2.9 watt‑hours.
- Producing a text response used roughly 0.047 watt‑hours.
These figures make video diffusion roughly thirty times more costly than image generation (90 / 2.9 ≈ 31) and roughly two thousand times more costly than text generation (90 / 0.047 ≈ 1,915). To put the consumption in everyday terms, an energy‑efficient LED bulb draws 8–10 watts, while a typical 65‑inch television consumes around 146 watts. Generating a single clip therefore uses about as much energy as powering that television for about thirty‑seven minutes.
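The comparisons above can be reproduced from the reported figures. The sketch below is only a back‑of‑the‑envelope check, not the researchers' own tooling; the names `ENERGY_WH`, `ratio`, and `equivalent_minutes` are illustrative assumptions.

```python
# Figures reported in the article, in watt-hours per generation.
ENERGY_WH = {"video": 90.0, "image": 2.9, "text": 0.047}

TV_WATTS = 146.0  # typical 65-inch television, per the article


def ratio(numerator: str, denominator: str) -> float:
    """How many times more energy one task uses than another."""
    return ENERGY_WH[numerator] / ENERGY_WH[denominator]


def equivalent_minutes(energy_wh: float, device_watts: float) -> float:
    """Minutes a device drawing device_watts can run on energy_wh."""
    return energy_wh / device_watts * 60.0


if __name__ == "__main__":
    print(f"video vs image: {ratio('video', 'image'):.0f}x")  # 31x
    print(f"video vs text:  {ratio('video', 'text'):.0f}x")   # 1915x
    minutes = equivalent_minutes(ENERGY_WH["video"], TV_WATTS)
    print(f"TV minutes:     {minutes:.0f}")                   # 37
```

The rounded multipliers quoted in the text (30× and 2,000×) are approximations of these computed values.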
Broader Context and Industry Response
The findings arrive at a time when major AI providers are rolling out consumer‑facing video tools. Although the study focused on open‑source models and excluded high‑profile products such as OpenAI’s Sora and Google’s Veo 3, the energy implications likely extend to those platforms as well. As AI adoption accelerates, the demand on electrical grids and data‑center capacity grows in parallel, prompting industry leaders to invest heavily in new infrastructure.
Calls for greater transparency have intensified, with experts urging AI firms to disclose precise power‑usage metrics. Without clear data, users cannot make informed decisions about the environmental impact of their AI interactions. The research underscores the need for both more efficient model architectures and clearer reporting on energy consumption.
Implications for Users and Policymakers
For end users, the study suggests weighing whether AI‑generated video is actually necessary, especially when alternatives exist. Policymakers and energy regulators may also need to consider the cumulative effect of AI workloads on regional power supplies, particularly as AI‑driven services become ubiquitous.
Overall, the research paints a picture of a technology with impressive creative capabilities but a steep energy price tag. Addressing this challenge will require coordinated efforts across model developers, hardware manufacturers, and the broader AI ecosystem.