OpenAI launches GPT-4.1 models with massive token limits and faster performance
OpenAI has launched the GPT-4.1 model family, which includes the standard GPT-4.1, as well as the GPT-4.1 Mini and GPT-4.1 Nano.
The new models, which were revealed during a livestream today, are designed with code generation and software development tasks in mind.
OpenAI says the new models can handle up to one million tokens in a single context window. The previous GPT-4o model had a limit of 128,000 tokens, so this is a significant increase that’ll enable the new models to handle large codebases and extensive technical documents.
OpenAI is promising that the new GPT-4.1 model is 40% faster and 26% cheaper than its predecessor. It’ll cost $2 per million input tokens and $8 per million output tokens, with the Mini and Nano versions cheaper still.
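The per-token prices above translate into per-request costs in a straightforward way. Here's a minimal sketch of that arithmetic; the token counts in the example are illustrative assumptions, not real usage figures, and the prices are the standard GPT-4.1 rates quoted above.

```python
# Cost estimate based on the quoted GPT-4.1 prices:
# $2 per million input tokens, $8 per million output tokens.
INPUT_PRICE_PER_M = 2.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 8.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical example: a prompt that fills the 1M-token context window
# (e.g. a large codebase) and returns a 10,000-token response.
print(round(estimate_cost(1_000_000, 10_000), 2))  # 2.08
```

So even a request that uses the entire one-million-token window costs on the order of a couple of dollars at these rates, which is the cost advantage OpenAI is emphasizing.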
In OpenAI’s internal benchmarks, GPT-4.1 scored as high as 54.6% on SWE-bench Verified. That trails Google’s Gemini 2.5 Pro and Anthropic’s Claude 3.7 Sonnet on the same benchmark, but at a lower cost.
The new models excel at understanding repositories and generating clean code, OpenAI said, and during the livestream the company demonstrated the model building a flashcard app in real time.
OpenAI sees these models as a step towards creating an “agentic software engineer” that can handle the entire software development lifecycle from planning to deployment and maintenance.
OpenAI is planning to retire the older GPT-4 model by April 30 and the GPT-4.5 preview by July. The company is positioning GPT-4.1 as the new standard.
The new models do have some limitations: accuracy declines as the number of input tokens grows, and the company says the models tend to need more explicit prompts to achieve optimal performance.
OpenAI says that while GPT-4.1 is not a replacement for human engineers, it is a significant step forward in AI-assisted software development.