
Baseten takes on hyperscalers with new AI training platform that lets you own your model weights


Baseten, an AI infrastructure firm recently valued at $2.15 billion, is undertaking a transformative shift by expanding into comprehensive AI model training. This strategic move aims to empower enterprises to reduce their reliance on proprietary AI providers like OpenAI by offering a robust platform for fine-tuning open-source models.

Headquartered in San Francisco, Baseten unveiled its new infrastructure platform designed to simplify the complexities of training AI models. This solution eliminates the operational burdens of managing GPU clusters, orchestrating multi-node setups, and planning cloud capacity. CTO Amir Haghighat highlights that this expansion responds to persistent customer demand and a strategic goal to oversee the entire AI deployment lifecycle, moving beyond Baseten’s original focus on inference services.

“Customers repeatedly expressed frustration with existing workflows,” Haghighat shared. “One client described having to manually SSH into cloud instances over weekends to start fine-tuning jobs, only to find out days later that the process had failed.”

Learning from Past Challenges: Redefining AI Training Infrastructure

Baseten’s journey into training isn’t new. About two and a half years ago, the company launched Blueprints, an ambitious product aimed at automating model fine-tuning. However, it fell short because it abstracted too much, expecting users to intuitively select base models, data, and hyperparameters without sufficient guidance.

“Users lacked the expertise to make optimal choices, and when results were poor, they blamed the platform,” Haghighat explained. “We ended up providing consulting services to troubleshoot everything from dataset issues to model selection, which diverted us from our core mission.”

Recognizing this misstep, Baseten discontinued Blueprints and refocused on inference, vowing to re-enter training only when conditions were right. That moment arrived as the market evolved: most of Baseten’s inference revenue stemmed from models trained externally, and competitors’ restrictive terms locked customers into their ecosystems by preventing weight portability.

In contrast, Baseten’s new platform champions customer ownership of model weights, allowing users to download and migrate their fine-tuned models freely. The company bets that superior inference performance will naturally retain clients.

Advanced Multi-Cloud GPU Management and Rapid Job Scheduling

Baseten’s latest offering operates at a foundational infrastructure level, providing opinionated tools for reliability, observability, and seamless integration with its inference stack. Key features include multi-node training across GPU clusters such as NVIDIA A100s and H100s, automated checkpointing to safeguard against failures, and sub-minute job scheduling.
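As a rough illustration of the checkpointing idea, the sketch below shows a generic training loop that periodically saves checkpoints and resumes from the latest one after a failure. It uses plain PyTorch with a toy model and placeholder paths, and it is not Baseten's actual training API.

```python
# Illustrative only: a generic periodic-checkpointing pattern, not Baseten's
# training platform. Paths, intervals, and the model are placeholders.
import os
import torch

CKPT_DIR = "/tmp/ckpts"   # placeholder checkpoint location
SAVE_EVERY = 100          # save every 100 optimizer steps

def save_checkpoint(step, model, optimizer):
    os.makedirs(CKPT_DIR, exist_ok=True)
    torch.save(
        {"step": step, "model": model.state_dict(), "optim": optimizer.state_dict()},
        os.path.join(CKPT_DIR, f"step_{step}.pt"),
    )

def load_latest_checkpoint(model, optimizer):
    if not os.path.isdir(CKPT_DIR):
        return 0  # nothing to resume from; start at step 0
    ckpts = sorted(
        (f for f in os.listdir(CKPT_DIR) if f.endswith(".pt")),
        key=lambda f: int(f.split("_")[1].split(".")[0]),
    )
    if not ckpts:
        return 0
    state = torch.load(os.path.join(CKPT_DIR, ckpts[-1]))
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optim"])
    return state["step"]

model = torch.nn.Linear(16, 2)                      # stand-in for a real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
start_step = load_latest_checkpoint(model, optimizer)

for step in range(start_step, 1000):
    x = torch.randn(8, 16)                          # stand-in for a real batch
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % SAVE_EVERY == 0:
        save_checkpoint(step, model, optimizer)     # a crashed job resumes from here
```

In a managed platform, the same resume-from-latest logic runs automatically when a node fails, so a multi-day job does not restart from scratch.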

Central to this is Baseten’s Multi-Cloud Management (MCM) system, which dynamically allocates GPU resources across multiple cloud providers and regions. This flexibility enables cost savings and avoids the long-term contracts and capacity constraints typical of hyperscalers.

“Unlike hyperscalers that require multi-year commitments for GPU resources, we offer on-demand scaling without locking customers in,” Haghighat noted.

This multi-cloud agility also enhances resilience. For example, during a recent AWS outage, Baseten’s inference services remained uninterrupted because workloads were rerouted to alternative providers, a capability now extended to training jobs.

Additionally, Baseten’s observability tools deliver granular per-GPU metrics, detailed checkpoint tracking, and a revamped user interface that highlights infrastructure events. The company also launched an open-source “recipe book” featuring training protocols for models like Gemma, GPT OSS, and Qwen, accelerating users’ path to successful training.
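For a sense of what per-GPU metrics involve at the lowest level, the sketch below polls utilization and memory for each device via NVML using the pynvml package. It is illustrative only and is not Baseten's observability stack.

```python
# Illustrative only: polling per-GPU utilization and memory with NVML
# (the nvidia-ml-py / pynvml package); not Baseten's observability tooling.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # % GPU and memory activity
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)          # bytes used / total
        print(f"gpu{i}: util={util.gpu}% "
              f"mem={mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB")
finally:
    pynvml.nvmlShutdown()
```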

Customer Success Stories: Significant Cost Reductions and Performance Gains

Baseten’s platform has attracted AI-native companies developing specialized vertical solutions requiring tailored models.

Oxen, a dataset management and fine-tuning platform, exemplifies this partnership approach. CEO Greg Schoeninger remarked, “Platforms attempting to handle both hardware and software often falter. Partnering with Baseten for infrastructure was a clear win.”

Oxen built its customer experience atop Baseten’s infrastructure, automating GPU provisioning and job orchestration behind the scenes. One Oxen client, a startup organizing chaotic retail data, achieved an 84% reduction in inference costs, from $46,800 down to $7,530, by leveraging this integration.

Daniel Demillard, CEO of AlliumAI, added, “Training custom LoRAs has been powerful but cumbersome. With Oxen and Baseten, infrastructure headaches vanish. We scale training and deployment without worrying about CUDA or GPU selection.”
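As context for what "training custom LoRAs" entails, the sketch below attaches a LoRA adapter to a small open-source causal language model using Hugging Face PEFT. The model name and hyperparameters are placeholders, and this is not AlliumAI's, Oxen's, or Baseten's actual pipeline.

```python
# Illustrative only: adding a LoRA adapter to an open-source causal LM with
# Hugging Face PEFT. Model name and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-0.5B"   # placeholder small base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(
    r=8,                                   # low-rank dimension of the adapter
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attach adapters to attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()         # only the small adapter weights train
# ...train as usual; only the LoRA parameters receive gradients...
```

Because only the adapter weights are trained, a LoRA run fits on far less GPU memory than full fine-tuning, which is what makes it attractive for teams that do not manage their own clusters.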

Another early adopter, Parsed, focuses on reducing enterprise dependence on OpenAI by building domain-specific models excelling in healthcare, finance, and legal sectors. Parsed experienced 50% lower latency in transcription tasks, launched HIPAA-compliant EU deployments within 48 hours, and executed over 500 training jobs using Baseten’s platform.

Parsed’s co-founder Charles O’Neill emphasized, “Fast models matter, but models that improve continuously matter more. Baseten delivers both speed and infrastructure for ongoing enhancement.”

The Symbiotic Relationship Between Training and Inference

Parsed’s success underscores Baseten’s strategic insight: training and inference are deeply intertwined. Baseten’s own model performance team extensively uses the training platform to develop “draft models” for speculative decoding, a technique that accelerates inference by having a small model draft candidate tokens that the larger model then verifies.
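For readers unfamiliar with the technique, the sketch below shows the core draft-then-verify loop of speculative decoding using stand-in model functions. In a production system, the large model verifies all drafted tokens in a single batched forward pass, which is where the speedup comes from.

```python
# Illustrative only: the accept/reject loop of speculative decoding with a small
# draft model and a large target model. Both functions are stand-ins, not
# Baseten's implementation.
import random

def draft_next(context):
    """Cheap draft model: propose one next token (stub)."""
    return random.choice(["the", "a", "cat", "sat"])

def target_accepts(context, token):
    """Expensive target model: verify a proposed token (stub).
    A real system scores all drafted tokens in one batched forward pass."""
    return random.random() < 0.7

def speculative_step(context, k=4):
    """Draft k tokens, then keep the longest verified prefix."""
    drafted = []
    for _ in range(k):
        drafted.append(draft_next(context + drafted))
    accepted = []
    for tok in drafted:
        if target_accepts(context + accepted, tok):
            accepted.append(tok)   # verified: keep and continue
        else:
            break                  # first rejection ends the speculative run
    return accepted

print(speculative_step(["once", "upon"]))
```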

Recently, Baseten achieved over 650 tokens per second on OpenAI’s open-weight GPT OSS model, a 60% improvement, by training specialized small draft models to complement larger ones during inference.

“Training and inference are more connected than commonly perceived,” Haghighat said. “Our model performance team continuously trains these auxiliary models to optimize inference speed and quality.”

This integration allows Baseten to offer a seamless workflow: models trained on their platform can be deployed with a single click to inference endpoints optimized for that architecture, supporting chat completions and audio transcription directly from checkpoints.
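As an illustration of what such an endpoint looks like to a developer, the sketch below calls a fine-tuned model through an OpenAI-compatible chat completions API. The base URL, API key, and model name are placeholders rather than a specific Baseten deployment.

```python
# Illustrative only: querying a fine-tuned model via an OpenAI-compatible chat
# completions endpoint. The base_url, api_key, and model id are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-inference-endpoint/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="my-finetuned-model",                        # placeholder model id
    messages=[{"role": "user", "content": "Summarize this retail listing..."}],
)
print(resp.choices[0].message.content)
```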

This approach contrasts with vertically integrated competitors like Anthropic or Cohere, which bundle training and inference but with less architectural flexibility. Baseten’s focus on low-level infrastructure and performance tuning caters to enterprises running custom models at scale.

Open-Source Models and Fine-Tuning: The Future of Enterprise AI

Baseten’s strategy hinges on the rapid advancement of open-source AI models, which are closing the gap with proprietary systems and enabling widespread enterprise adoption through fine-tuning.

“Both closed and open-source models are improving rapidly,” Haghighat observed. “You don’t need open source to surpass closed models; as both improve, they unlock new use cases.”

He highlighted the rise of reinforcement learning and supervised fine-tuning techniques that allow companies to tailor open-source models to specific capabilities, matching or exceeding closed models in targeted tasks.

Baseten’s Model APIs, launched alongside its training platform, provide production-grade access to open-source models such as LLaMA 2 and Falcon, serving as an entry point for companies before they move to fine-tuning and deployment on Baseten’s infrastructure.

Despite progress, the market remains uncertain about which training methods will dominate. Baseten mitigates this by collaborating closely with select customers on cutting-edge techniques, aiming to develop user-friendly training products that avoid the pitfalls of overly restrictive platforms.

Future plans include expanding support for image, audio, and video fine-tuning, and integrating advanced methods like prefill-decode disaggregation to boost efficiency.

Competing in a Crowded AI Infrastructure Landscape

Baseten operates in a competitive environment with hyperscalers like AWS, Google Cloud, and Azure offering GPU compute, alongside specialized providers such as Lambda Labs, CoreWeave, and Together AI. Vertically integrated platforms like Hugging Face, Replicate, and Runway also vie for market share by bundling training, inference, and hosting.

Baseten differentiates itself through its MCM system for multi-cloud resource management, deep expertise in inference performance optimization, and a developer experience focused on production-ready deployments rather than experimentation.

Recent funding rounds provide the capital to advance both training and inference products. Key clients include companies specializing in transcription, customer service AI, and coding assistants, sectors where customized models and performance are critical.

Timing is a crucial advantage. The convergence of maturing open-source models, enterprise unease with proprietary AI dependence, and sophisticated fine-tuning techniques signals a lasting market shift.

“Closed models excel in many areas, but open models are catching up quickly through reinforcement learning and supervised fine-tuning,” Haghighat said. “This trend is palpable across industries.”

For enterprises transitioning from closed to open AI ecosystems, Baseten offers a compelling value proposition: infrastructure that simplifies fine-tuning’s complexities while optimizing for scalable, cost-effective, and reliable inference. By allowing customers to retain full ownership of their model weights, unlike competitors that use training as a lock-in mechanism, Baseten is betting that technical superiority will be enough to keep them.

Success will depend on balancing infrastructure flexibility with user accessibility, avoiding the trap of becoming consultants, and crafting abstractions that empower users without overwhelming them. Baseten’s willingness to discontinue Blueprints demonstrates a pragmatic approach that may prove decisive in a market where many providers overpromise and underdeliver.

“At our core, we’re an inference company,” Haghighat emphasized. “Training exists to serve inference.”

This clear focus, treating training as a means rather than an end, may be Baseten’s greatest strength. As AI deployment evolves from experimentation to production, companies mastering the full stack stand to capture outsized value, provided they avoid chasing technology without clear problems to solve.

For now, Baseten’s customers can finally avoid the frustration of manually managing training jobs over weekends, benefiting from infrastructure that simply makes the hardest parts vanish.
