OpenAI Introduces Faster, Lower-Cost GPT-5.4 Mini and Nano Models

Key Points
- OpenAI adds GPT-5.4 Mini and Nano models for faster, cheaper AI processing.
- Mini runs more than twice as fast as the full model with minimal performance loss.
- Nano focuses on simple classification and data‑extraction tasks at the lowest cost.
- Both retain text/image input, tool use, function calling, and a 400,000‑token context window.
- Pricing drops dramatically: Mini at $0.75/$4.50 per million tokens, Nano at $0.20/$1.25.
- Mini is available today via API, Codex, and ChatGPT; Nano is currently API‑only.
- Multi‑model workflows let developers pair larger models for planning with Mini/Nano for execution.
- Early feedback shows Mini can match or exceed competing models on several benchmarks.

OpenAI has launched two smaller versions of its latest GPT-5.4 model—Mini and Nano—designed for developers who prioritize speed and cost over maximum reasoning power. The Mini model runs more than twice as fast as the full model while staying close on key benchmarks, and the Nano model focuses on simple classification and data‑extraction tasks. Both models support text and image inputs, tool use, function calling, and a 400,000‑token context window; Mini is available today via the API, Codex, and ChatGPT, while Nano is API‑only. This tiered approach lets developers allocate cheaper models for routine work and reserve the full model for complex reasoning, reshaping how real‑time AI applications are built.

Background
OpenAI is expanding its model portfolio by offering scaled‑down versions of its flagship GPT-5.4 system. The move reflects a growing demand from developers for models that deliver rapid responses and lower operating costs without sacrificing core capabilities.

New Mini and Nano Models
The GPT-5.4 Mini model delivers more than twice the speed of the full GPT-5.4 while maintaining performance that is close to the larger version on major benchmarks. The Nano variant pushes efficiency further, targeting simpler tasks such as classification and data extraction where raw speed and cost are paramount.

Performance and Cost
Both Mini and Nano retain the full feature set, including text and image inputs, tool use, function calling, and a 400,000‑token context window. The Mini model scores 54.4 percent on SWE‑Bench Pro compared with 57.7 percent for the full model, and 72.1 percent on OSWorld‑Verified versus 75 percent for the larger version, showing a narrow performance gap. Cost reductions are dramatic: Mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, while Nano drops to $0.20 per million input tokens and $1.25 per million output tokens.
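To make the pricing concrete, here is a rough cost sketch. The per‑million‑token rates are taken from the announcement above, but the model identifiers and the daily traffic volumes are illustrative assumptions, not figures from OpenAI:

```python
# Published rates (USD per million tokens); model names are assumed identifiers.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def daily_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one day's traffic for the given model tier."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Hypothetical workload: 10M input tokens and 2M output tokens per day.
mini = daily_cost("gpt-5.4-mini", 10_000_000, 2_000_000)  # $16.50/day
nano = daily_cost("gpt-5.4-nano", 10_000_000, 2_000_000)  # $4.50/day
```

At this assumed volume, shifting extraction‑style traffic from Mini to Nano cuts the daily bill by more than two‑thirds, which is why the tiering matters for high‑volume workloads.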

Availability and Use Cases
The new models are live today. Mini is accessible through the API, Codex, and ChatGPT, with free and Go users reaching it via the “Thinking” option. Nano is currently limited to the API and is aimed at high‑volume workloads where cost control is critical. Developers can employ a multi‑model workflow, pairing a larger model for planning with Mini or Nano for execution, which mirrors real‑world app architectures that separate judgment from repetitive processing.
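The planning‑versus‑execution split described above can be sketched as a simple routing table. The task categories and model identifiers here are illustrative assumptions for the sketch, not an official API:

```python
# Illustrative multi-model routing: reserve the full model for judgment-heavy
# planning, and push repetitive execution work down to the cheaper tiers.
ROUTES = {
    "plan": "gpt-5.4",          # complex reasoning and task decomposition
    "execute": "gpt-5.4-mini",  # routine coding and background tasks
    "extract": "gpt-5.4-nano",  # high-volume classification / data extraction
}

def pick_model(task_kind: str) -> str:
    """Map a task category to the cheapest tier assumed capable of it."""
    # Unknown categories fall back to the full model rather than risk quality.
    return ROUTES.get(task_kind, "gpt-5.4")
```

In practice an application would call this router before each model request, so that a single user action can fan out into one planning call and many cheap execution calls.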

Implications for Developers
This tiered offering enables developers to shift routine coding and background tasks to cheaper tiers while reserving the full GPT‑5.4 for complex reasoning. Early feedback indicates that Mini can match or outperform competing models on several tasks at lower cost, and in some cases delivers stronger end‑to‑end results than the full model. The expanded suite empowers developers to balance speed, cost, and capability more effectively, shaping the next generation of real‑time AI features.