OpenAI Unveils GPT-5.4 with Native Computer Use and Expanded Context Window

Key Points
- OpenAI released GPT-5.4 in three configurations: standard, Thinking, and Pro.
- Benchmark tests show the model matching or exceeding professionals in many tasks.
- Native computer use enables the model to operate software and perform multi‑step workflows.
- Context window expanded to a 1‑million‑token limit for full‑document processing.
- New tool‑search system reduces token usage by retrieving tool definitions on demand.
- Safety evaluation (CoT Controllability) indicates low ability to hide reasoning.
- Launch occurs during heightened competition among frontier AI models.

OpenAI released GPT-5.4, a new frontier model offered in three configurations for general, reasoning-intensive, and high‑demand workloads. The model shows benchmark gains across professional tasks, introduces native computer use, and expands the context window to a 1‑million‑token limit. A redesigned tool‑search system reduces token usage, and a new safety evaluation tests chain‑of‑thought controllability. The launch positions GPT-5.4 as OpenAI’s most capable model for professional work while highlighting ongoing competition at the AI frontier.
Model Launch and Configurations
OpenAI announced GPT-5.4, describing it as the company’s most capable and efficient frontier model for professional work. The model is available in three versions: a standard release for general use, a "Thinking" variant designed for tasks that benefit from extended chain‑of‑thought reasoning, and a "Pro" version aimed at the highest‑demand workloads. The "Thinking" option is accessible to Plus, Team, and Pro subscribers, while the "Pro" tier is reserved for higher‑priced ChatGPT plans.
Benchmark Performance
According to OpenAI’s internal evaluations, GPT-5.4 matches or exceeds industry professionals in a majority of task comparisons, improving on previous releases. On a desktop‑navigation benchmark, the model’s success rate surpassed the reported human baseline. It also topped a professional‑task benchmark that assesses sustained workflows in fields such as investment banking and corporate law. OpenAI reports fewer factual errors and hallucinations than in earlier releases.
New Capabilities
The most significant addition is native computer use, allowing the model to operate software, navigate file systems, and execute multi‑step workflows without external agentic frameworks. This capability is built into the general‑purpose model, simplifying integration for developers. The API also supports a context window of up to 1 million tokens, more than double the previous limit, enabling full‑context processing of large documents, codebases, and financial records.
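To illustrate what a 1‑million‑token window makes practical, the sketch below checks whether a large document fits in context before submitting it in a single request. The `CONTEXT_LIMIT` constant reflects the reported window size; the ~4‑characters‑per‑token heuristic and the `fits_in_context` helper are illustrative assumptions, not part of any OpenAI API.

```python
# Rough feasibility check before submitting a large document in one request.
# Uses the common ~4-characters-per-token heuristic for English text; a real
# integration would use a tokenizer (e.g. tiktoken) for exact counts.

CONTEXT_LIMIT = 1_000_000  # reported 1-million-token window

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_context(document: str, reserved_for_output: int = 8_000) -> bool:
    """True if the document plus an output budget fits inside the window."""
    return estimate_tokens(document) + reserved_for_output <= CONTEXT_LIMIT

# A ~2-million-character filing (~500k tokens) fits in one request; a
# ~6-million-character corpus (~1.5M tokens) would still need chunking.
print(fits_in_context("x" * 2_000_000))  # True
print(fits_in_context("x" * 6_000_000))  # False
```

Under the previous, smaller limit, the first document would also have required chunking or retrieval; the larger window removes that step for many single‑document workloads.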
Efficiency Improvements
A redesigned tool‑search system lets the model retrieve tool definitions on demand, cutting token usage by nearly half in internal tests. This reduction translates to lower costs and faster responses for large‑scale agentic systems.
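The on‑demand retrieval idea can be sketched in a few lines: rather than attaching every tool schema to every request, a registry returns only the definitions relevant to the current query. The catalog, stopword list, and keyword matching below are a minimal sketch of the general pattern, not OpenAI's implementation.

```python
import json

# Minimal sketch of on-demand tool retrieval: keep the full catalog of tool
# definitions to one side and attach only the relevant schemas to each
# request, instead of serializing every schema into every prompt.

TOOL_CATALOG = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    },
    "search_files": {
        "description": "Search the local file system for matching files.",
        "parameters": {"pattern": "string"},
    },
    "send_email": {
        "description": "Send an email to a recipient.",
        "parameters": {"to": "string", "body": "string"},
    },
}

STOPWORDS = {"the", "a", "an", "is", "in", "to", "for", "of", "what"}

def retrieve_tools(query: str, catalog: dict = TOOL_CATALOG) -> dict:
    """Return only tools whose description shares a content word with the query."""
    words = set(query.lower().split()) - STOPWORDS
    return {
        name: spec
        for name, spec in catalog.items()
        if words & set(spec["description"].lower().split())
    }

# Only the weather tool is serialized into the prompt for this query,
# shrinking the token footprint versus sending the whole catalog.
selected = retrieve_tools("what is the weather in Oslo")
print(len(json.dumps(selected)) < len(json.dumps(TOOL_CATALOG)))  # True
```

A production system would match on embeddings or an index rather than shared keywords, but the token saving comes from the same place: the prompt carries a handful of schemas instead of the full catalog on every call.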
Safety Evaluation
OpenAI introduced an open‑source evaluation called CoT Controllability, which tests whether the model can deliberately obscure its reasoning to evade monitoring. The results suggest the model shows low ability to hide its chain‑of‑thought, which OpenAI frames as a positive safety signal.
Competitive Landscape
The release arrives amid intense competition from other frontier AI models, each leading in different benchmark categories. While GPT-5.4 leads on desktop computer use and professional knowledge‑work tasks, other models excel in coding or abstract reasoning. OpenAI’s rapid release cadence underscores its strategy of staying visible in a fast‑moving market.