OpenAI Releases GPT-5.4
The new model introduces a 1-million-token context window, stronger reasoning benchmarks, and improved efficiency for enterprise and developer workloads.
OpenAI has launched GPT-5.4, positioning it as its most capable and efficient frontier model for professional tasks.
The release includes multiple variants designed for different workloads:
- GPT-5.4 (Standard) – balanced performance for general applications
- GPT-5.4 Thinking – optimized for deep reasoning tasks
- GPT-5.4 Pro – tuned for high performance and speed
The model is available through OpenAI’s API platform and is aimed at developers building enterprise-grade AI systems.
Massive Context Window for Complex Workflows
One of the headline upgrades is the context window, which can now reach 1 million tokens.
That capacity lets the model take in very large inputs within a single prompt.
Examples include:
- Entire code repositories
- Long legal documents
- Extensive financial datasets
- Large research reports
For professional workflows, the larger context window makes it easier to build multi-step AI systems that work across large, complex bodies of information.
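To make the 1-million-token figure concrete, here is a minimal Python sketch that estimates whether a set of documents would fit in a single prompt. It assumes a rough heuristic of about four characters per token, which is not the model's real tokenizer. The function names and the file-suffix filter are hypothetical and chosen only for illustration.

```python
from pathlib import Path

CONTEXT_WINDOW_TOKENS = 1_000_000  # context size reported in the article
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts, reserved_for_output: int = 0) -> bool:
    """Check whether the estimated prompt size plus reserved output fits the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS


def read_repo(root: str, suffixes=(".py", ".md")):
    """Read every file under root with one of the given suffixes (hypothetical helper)."""
    return [
        p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    ]
```

A call such as `fits_in_context(read_repo("path/to/repo"), reserved_for_output=20_000)` gives a quick yes-or-no answer. An exact count would need the tokenizer the model actually uses.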
Improved Efficiency and Benchmark Performance
OpenAI emphasized that GPT-5.4 is significantly more token-efficient, meaning it can solve the same tasks using fewer tokens than previous models.
The model also posted strong benchmark results across multiple categories.
Key performance highlights include:
- Record scores on computer-use benchmarks OSWorld-Verified and WebArena Verified
- 83% score on OpenAI’s GDPval test, which evaluates knowledge-work tasks
The model also ranked first on Mercor’s APEX-Agents benchmark, which measures professional abilities in fields such as law and finance.
Mercor CEO Brendan Foody said the model excels at producing long-horizon outputs, including:
- Slide decks
- Financial models
- Legal analysis
He said the system delivers these outputs faster and at lower cost than competing frontier models.
Reducing Hallucinations
OpenAI also highlighted improvements in factual reliability, a persistent challenge in large language models.
According to the company:
- GPT-5.4 is 33% less likely to make errors in individual claims compared with GPT-5.2
- Overall responses are 18% less likely to contain inaccuracies
Reducing hallucinations remains a key priority as AI systems increasingly handle professional decision-making tasks.
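The percentages above read most naturally as relative reductions, so the absolute change depends on the starting error rate, which the article does not report. The short Python sketch below works through that arithmetic under a purely hypothetical baseline. The 6% figure and the function name are assumptions for illustration, not numbers from OpenAI.

```python
def apply_relative_reduction(baseline_rate: float, relative_reduction: float) -> float:
    """Return the new rate after cutting baseline_rate by a relative fraction."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# Assumption: a made-up 6% per-claim error rate for the older model.
HYPOTHETICAL_BASELINE = 0.06

# 33% is the relative reduction in per-claim errors cited in the article.
new_rate = apply_relative_reduction(HYPOTHETICAL_BASELINE, 0.33)

if __name__ == "__main__":
    print(f"hypothetical baseline: {HYPOTHETICAL_BASELINE:.2%}")
    print(f"after a 33% relative reduction: {new_rate:.2%}")
```

Under this assumed baseline, a 33% relative reduction brings the per-claim error rate from 6% to roughly 4%, a drop of about two percentage points rather than 33.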
A New System for Tool Use
Alongside the model launch, OpenAI introduced a new API feature called Tool Search.
Previously, developers needed to include full tool definitions inside system prompts.
That approach created problems:
- Prompts became large and inefficient
- Token usage increased as more tools were added
With Tool Search, models can look up tool definitions dynamically when needed.
The result:
- Faster responses
- Lower token costs
- Better performance in complex AI agent systems
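The general idea behind looking up tool definitions on demand can be sketched in a few lines of Python. This is not OpenAI's Tool Search API: the catalog, the tool names, and the keyword-matching `search_tools` function are all hypothetical, and the sketch only illustrates why sending the matching definitions is cheaper than sending every definition up front.

```python
import json

# Hypothetical tool catalog; names and schemas are illustrative only.
TOOL_CATALOG = {
    "get_weather": {
        "description": "Look up the current weather for a city",
        "parameters": {"city": "string"},
    },
    "convert_currency": {
        "description": "Convert an amount between two currencies",
        "parameters": {"amount": "number", "from": "string", "to": "string"},
    },
    "search_contracts": {
        "description": "Search legal contracts by clause keyword",
        "parameters": {"keyword": "string"},
    },
}


def search_tools(query: str, limit: int = 1) -> dict:
    """Return the definitions whose name or description shares the most words with the query."""
    query_words = set(query.lower().split())
    scored = []
    for name, spec in TOOL_CATALOG.items():
        words = set(name.lower().split("_")) | set(spec["description"].lower().split())
        score = len(query_words & words)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by tool name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {name: TOOL_CATALOG[name] for _, name in scored[:limit]}


def definition_size(tools: dict) -> int:
    """Size in characters of the serialized definitions, a rough proxy for prompt cost."""
    return len(json.dumps(tools))


if __name__ == "__main__":
    needed = search_tools("weather in Paris")
    print("all tools up front:", definition_size(TOOL_CATALOG), "characters")
    print("looked up on demand:", definition_size(needed), "characters")
```

With only three tools the saving is small, but the up-front cost grows with every tool added to the catalog, while the on-demand cost depends only on what a given request needs. A production system would likely use a more robust retrieval method than word overlap.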
Safety Testing for AI Reasoning
OpenAI also introduced a new safety evaluation focused on chain-of-thought reasoning.
Chain-of-thought refers to the step-by-step reasoning explanations models produce when solving complex tasks.
Researchers have long worried that models could misrepresent their reasoning, intentionally or accidentally.
OpenAI’s testing suggests this risk is lower in the GPT-5.4 Thinking variant.
The results indicate the model is less able to hide its reasoning, meaning chain-of-thought monitoring remains an effective safety measure.
The Next Step in Frontier AI
GPT-5.4 arrives amid intensifying competition in frontier AI models, as companies race to build systems capable of professional-level knowledge work.
With improvements in context length, efficiency, and reasoning, OpenAI is positioning the model as a foundation for AI agents that can manage complex workflows across industries.
The real question now is not whether AI can assist professionals, but how many tasks it can eventually handle on its own.
TL;DR
OpenAI has released GPT-5.4, a new frontier model with standard, Pro, and Thinking versions. The model features a 1-million-token context window, improved reasoning benchmarks, better efficiency, and reduced hallucination rates, targeting enterprise-level AI applications.
AI Summary
- OpenAI launches GPT-5.4 with Pro and Thinking variants.
- Supports a 1-million-token context window via the API.
- Improved benchmarks in computer use and professional tasks.
- 33% fewer claim errors compared to GPT-5.2.
- Introduces Tool Search to improve AI tool-calling efficiency.