A surprise: a few days after reports that OpenAI was buying the hot coding startup Windsurf, OpenAI appears to be launching its own competitor, under the brand name Codex, as a research preview. It will compete against Windsurf, Cursor, and the growing list of AI coding tools offered by startups and large companies such as Microsoft and Amazon.
The new Codex, which shares its name with OpenAI’s earlier code completion model, is a cloud-based AI software engineering (SWE) agent. It is built on a fine-tuned version of OpenAI’s o3 reasoning model and can execute multiple development tasks in parallel.
It is available starting today to ChatGPT Enterprise, Team, and Pro users. Support for Plus and Edu is expected to follow soon.
Codex’s evolution – from model to autonomous AI coder
The release marks a significant advance in Codex’s development. The original Codex was released in 2021 as a model that translated natural language into code. It was available through OpenAI’s then-new application programming interface (API).
It was the engine that powered GitHub Copilot, a popular autocomplete-style coding assistant designed to work with IDEs such as Visual Studio Code. This initial iteration was trained on millions of lines of source code.
The early version had limitations. It was prone to syntax errors, insecure code suggestions, and biases inherited from its training data. Codex sometimes proposed code that looked correct on the surface but failed to run, and in some cases it made problematic associations.
In spite of these flaws, the product showed enough promise to make AI coding tools a rapidly expanding product category. According to an OpenAI spokesperson, the original model was deprecated in favor of a new set of products.
GitHub Copilot officially moved away from OpenAI’s Codex model in March 2023, adopting GPT-4 for its Copilot X upgrade to enable deeper IDE integration, chat capabilities, and more context-aware suggestions.
Agentic visions
The new Codex is far more advanced than its predecessor. It is built to act autonomously for longer periods of time: it can write features, fix bugs, answer codebase-specific questions, run tests, and propose pull requests. Each task runs in a secure, isolated cloud sandbox.
This design reflects OpenAI’s broader ambition to move beyond quick responses and into collaborative work. Josh Tobin, the leader of the Agents Research Team, stated during a recent presentation: “We define agents as AI systems which can work on your behalf over a long period of time and accomplish large chunks of tasks by interacting with the real world.” Codex fits this definition. “Our vision is for ChatGPT to become a virtual colleague, not just answering questions but also collaborating on a variety of tasks,” Tobin added.
OpenAI released figures showing that the new Codex-1 SWE agent outperforms OpenAI’s latest reasoning model on internal SWE tasks.
New capabilities, interface, and workflows
Codex tasks are initiated from a sidebar in ChatGPT, where users can ask the agent questions or assign it tasks.
Each request is processed in a separate environment that contains the user’s repository and is configured to mirror the user’s development setup. Codex logs all of its actions, summarizes its changes, and cites test results, making its work traceable. It can also be configured through AGENTS.md files, project-level guides that tell the agent how to navigate a codebase and run specific tests.
Embiricos explained that the team trained the model to read code and infer its style, down to details such as whether or not an Oxford comma is used.
Security and practical use
Codex executes tasks without internet access, relying only on the code the user provides. This design keeps operation secure and minimizes the potential for misuse. Embiricos said Codex is much more than a model API. “Because the model runs in a human-reviewed environment, we can give it a lot more freedom.”
OpenAI reports several early external use cases. Cisco is evaluating Codex to accelerate engineering work across all of its product lines. Temporal uses Codex to run background tasks such as debugging and writing tests. Superhuman uses Codex to improve test coverage and to let non-engineers suggest lightweight code modifications. Kodiak, a firm that specializes in autonomous vehicles, uses it to improve code quality and gain insight into unfamiliar parts of its stack.
OpenAI is also releasing updates to Codex CLI, its lightweight terminal agent designed for local development. The CLI now uses a smaller model, codex-mini-latest, that is optimized for low-latency editing and Q&A.
Pricing for codex-mini-latest is set at $1.50 per million input tokens and $6 per million output tokens, with a 75% caching discount. Codex itself is free to use for the duration of the rollout, but rate limits and options for on-demand pricing are planned.
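As a rough illustration of how those rates add up, here is a minimal Python sketch. The function name and parameters are hypothetical, and it assumes the 75% caching discount applies only to cached input tokens, which the pricing statement above does not spell out.

```python
# Rates quoted above, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 6.00
# Assumption: the 75% discount applies to cached input tokens only.
CACHE_DISCOUNT = 0.75


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request (hypothetical helper, not an OpenAI API)."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_input_tokens
    # Uncached input is billed at the full input rate.
    uncached_cost = uncached_tokens * INPUT_USD_PER_MILLION / 1_000_000
    # Cached input is billed at the input rate minus the discount.
    cached_cost = (cached_input_tokens * INPUT_USD_PER_MILLION
                   * (1 - CACHE_DISCOUNT) / 1_000_000)
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return uncached_cost + cached_cost + output_cost


# 1M input tokens (400k of them cached) plus 100k output tokens.
cost = estimate_cost_usd(1_000_000, 100_000, cached_input_tokens=400_000)
print(f"${cost:.2f}")  # prints $1.65
```

Under these assumptions, output tokens dominate the bill for generation-heavy work, while caching matters most for sessions that resend the same large context.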
Does this mean OpenAI is NOT buying Windsurf? 🤔
Codex’s release comes amid increased competition in the AI coding tool space, and it signals that OpenAI intends to build, rather than buy, its next phase of products.
According to recent data from SimilarWeb, traffic to developer-focused AI software has risen by 75% in the past 12 weeks. This highlights the growing demand for coding assistants as essential infrastructure rather than experimental add-ons.
According to reports from TechCrunch and Bloomberg, OpenAI held acquisition discussions with two fast-growing AI development tool startups, Cursor and Windsurf. Cursor is said to have walked away; Windsurf has reportedly agreed to be acquired by OpenAI for $3.5 billion, although neither OpenAI nor Windsurf has confirmed a deal.
In fact, Windsurf launched its own family of foundation models, SWE-1, yesterday. The models are designed to support the entire software engineering lifecycle, from debugging through to long-term project maintenance. SWE-1 models are reported to be custom-made and trained in-house, using a new sequential model tailored to real development workflows.
There may be many things going on behind the scenes, but the timing seems to indicate that the two companies are not aligning anytime soon. Windsurf launched its own coding foundation models, a departure from its strategy to date of using Llama models and letting users slot in OpenAI or Anthropic models, and OpenAI followed with the release of its own Windsurf rival.
On the other hand, it’s possible that OpenAI is using the fact that the new Codex AI SWE agent is currently in “research preview” as leverage to get Windsurf and Cursor to negotiate a deal. An OpenAI spokesperson told VentureBeat that the company had no information to share about a possible Windsurf purchase or reports of such a deal.
In either case, Embiricos frames Codex as much more than just a coding tool or assistant.
He said, “We are about to undergo a major shift in the way developers work with agents. We will no longer be partnering with them in real time but instead delegating all tasks.” He added: “The first experiments consisted of reasoning models with terminal access. The experience was magical. They started doing things for you.”
Built for teams of developers, not just solo developers
Codex was designed with professional software developers in mind. However, Embiricos stated that even product managers found it useful for suggesting or validating changes before bringing in human SWEs. This versatility reflects OpenAI’s strategy of building tools that enhance productivity across technical teams.
Trini summarized the larger ambition behind Codex. “This is a transformational change in how software developers interface with AI and computers in general.” It maximizes the potential of each individual.
OpenAI envisions Codex at the center of a new workflow in which engineers assign high-level tasks to agents and work with them asynchronously. The company is working toward deeper integrations with GitHub, ChatGPT desktop, issue trackers, and CI systems. The long-term objective is to combine real-time pairing and long-horizon task delegation into a seamless experience.
Coding underpins so many useful things in the economy, according to Josh Tobin, that accelerating it can be a powerful way to spread the benefits of AI throughout the world, including to OpenAI itself.
Whether or not OpenAI closes deals to acquire competitors, it is positioning its agents to lead the next chapter in developer productivity.