Rising AI Investments in Asia Pacific Face Infrastructure Challenges
Despite increasing expenditures on artificial intelligence across the Asia Pacific region, many organizations find it difficult to extract meaningful returns from their AI initiatives. A significant factor behind this struggle is the inadequacy of existing AI infrastructure, which often fails to support inference workloads at the speed and scale required by practical applications. Industry analyses reveal that numerous AI projects fall short of their return on investment targets, even after substantial spending on generative AI technologies, primarily due to these infrastructural limitations.
This disparity highlights the critical role that AI infrastructure plays in determining system performance, operational costs, and the scalability of AI deployments in real-world scenarios.
Transforming AI Deployment with Edge-Based Inference Solutions
To tackle these challenges, Akamai has introduced Inference Cloud, developed in partnership with NVIDIA and powered by the Blackwell GPU architecture. The core idea is to move AI inference closer to end users rather than relying on distant, centralized data centers. This proximity reduces latency, lowers operational expenses, and makes AI services that depend on instantaneous decision-making more responsive.
Jay Jenkins, Akamai’s Chief Technology Officer for Cloud Computing, emphasizes that this shift is prompting enterprises to reconsider their AI deployment strategies, with inference workloads emerging as the primary bottleneck rather than model training.
Bridging the Gap Between AI Experimentation and Production
Jenkins points out that many organizations underestimate the complexity involved in moving from AI pilots to full-scale production. “The gap between experimentation and operational deployment is often wider than anticipated,” he explains. Despite strong enthusiasm for generative AI, obstacles such as escalating infrastructure costs, high latency, and difficulties in scaling inference workloads frequently hinder progress.
Currently, most enterprises depend on centralized cloud platforms and large GPU clusters. However, as AI usage expands, these infrastructures become prohibitively expensive, especially in countries distant from major cloud hubs. Latency issues arise when inference requires multiple processing steps over long network distances, degrading user experience and diminishing business value. Additionally, multi-cloud environments, complex data governance, and stringent compliance requirements further complicate scaling AI solutions.
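A rough, illustrative calculation shows why chained inference over long distances hurts responsiveness: each sequential model call pays the full network round trip, so the penalty compounds with every step. The figures in the sketch below are assumptions chosen for illustration, not measurements from Akamai or NVIDIA.

```python
# Back-of-envelope latency for a multi-step inference workflow.
# All figures are illustrative assumptions, not vendor measurements.

def perceived_latency_ms(round_trip_ms: float, compute_ms: float, steps: int) -> float:
    """Each sequential step pays one network round trip plus model compute time."""
    return steps * (round_trip_ms + compute_ms)

# Five chained model calls served from a distant regional hub vs. a nearby edge site.
distant = perceived_latency_ms(round_trip_ms=180, compute_ms=60, steps=5)  # 1200 ms
edge = perceived_latency_ms(round_trip_ms=15, compute_ms=60, steps=5)      # 375 ms

print(f"Distant hub: {distant:.0f} ms, nearby edge: {edge:.0f} ms")
```

Even with identical model compute time, the distant deployment spends most of the user's wait on the network, which is exactly the degradation Jenkins describes.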
The Growing Importance of Inference Over Training
In the Asia Pacific region, AI adoption is evolving from limited trials to widespread integration within applications and services. Jenkins notes that inference, the real-time application of AI models, now consumes the majority of computational resources, overshadowing the less frequent training phases. As organizations deploy language, vision, and multimodal AI models across diverse markets, the demand for rapid and dependable inference is accelerating beyond initial expectations.
These models must operate under varying linguistic, regulatory, and data conditions, often requiring instantaneous responses. Centralized infrastructures, originally designed for batch training, struggle to meet these real-time demands, creating a significant bottleneck.
Advantages of Edge Computing for AI Inference
Relocating inference tasks closer to users, devices, or autonomous agents can dramatically improve both performance and cost-efficiency. By minimizing the distance data travels, edge computing enables faster AI responses and reduces the expenses associated with transferring large datasets between central cloud facilities.
For example, physical AI systems such as autonomous drones, industrial robots, and smart city sensors require decision-making within milliseconds to function effectively. When inference is processed remotely, these systems experience delays that compromise their reliability and safety.
Moreover, Akamai’s analysis reveals that enterprises in countries like India and Vietnam achieve significant cost savings by running image-generation AI models at the edge rather than in centralized clouds. These savings stem from improved GPU utilization and reduced data egress fees.
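To illustrate where such savings can come from, the sketch below compares the two cost drivers named above: idle GPU capacity and data egress. Every rate, volume, and utilization figure in it is a hypothetical assumption for illustration, not a number from Akamai's analysis.

```python
# Illustrative monthly cost comparison for serving image-generation inference.
# All rates, request volumes, and utilization figures are hypothetical assumptions.

def monthly_cost(gpu_hour_rate: float, utilization: float,
                 requests: int, gpu_seconds_per_request: float,
                 egress_gb: float, egress_rate_per_gb: float) -> float:
    """GPU cost is billed on provisioned hours, so low utilization inflates it;
    egress cost scales with data moved out of the serving region."""
    gpu_hours_needed = requests * gpu_seconds_per_request / 3600
    billed_gpu_hours = gpu_hours_needed / utilization
    return billed_gpu_hours * gpu_hour_rate + egress_gb * egress_rate_per_gb

central = monthly_cost(gpu_hour_rate=2.5, utilization=0.35,
                       requests=2_000_000, gpu_seconds_per_request=1.2,
                       egress_gb=8_000, egress_rate_per_gb=0.09)
edge = monthly_cost(gpu_hour_rate=2.5, utilization=0.70,
                    requests=2_000_000, gpu_seconds_per_request=1.2,
                    egress_gb=500, egress_rate_per_gb=0.09)
print(f"Central: ${central:,.0f}/mo  Edge: ${edge:,.0f}/mo")
```

Under these assumed inputs, better GPU utilization and far less cross-region data movement roughly halve the monthly bill, which is the mechanism behind the savings Akamai describes.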
Industries Leading the Edge AI Inference Adoption
Edge inference is gaining traction particularly in sectors where latency directly impacts revenue, safety, or customer satisfaction. Retail and e-commerce are early adopters, as slow response times often lead to abandoned shopping sessions. Localized inference enhances personalized recommendations, search functionalities, and multimodal shopping experiences, thereby boosting engagement.
Financial services also benefit significantly from edge AI. Tasks such as fraud detection, payment authorization, and transaction risk scoring require rapid, sequential AI decisions. Processing inference near the data source not only accelerates these operations but also helps financial institutions comply with data sovereignty regulations.
Strategic Collaborations Between Cloud Providers and GPU Manufacturers
The surge in AI workloads has fostered closer partnerships between cloud service providers and GPU manufacturers. Akamai’s collaboration with NVIDIA exemplifies this trend, deploying GPUs, DPUs, and AI software across thousands of edge locations to create a distributed “AI delivery network.”
This decentralized approach enhances performance by distributing inference tasks across multiple sites rather than concentrating them in a few data centers. It also facilitates compliance with diverse regional data regulations, a challenge faced by nearly 50% of large organizations in Asia Pacific. Security features such as zero-trust access controls, data-aware routing, and fraud prevention are integral to these infrastructures.
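A minimal sketch of what "data-aware routing" can mean in practice is shown below: requests carrying regulated data are restricted to in-country sites, and only then is the lowest-latency eligible site chosen. The site names, countries, latencies, and routing rule are hypothetical assumptions, not a description of Akamai's implementation.

```python
# Hypothetical sketch of data-aware routing: filter edge sites by the data's
# residency requirement first, then pick the fastest remaining site.

from dataclasses import dataclass

@dataclass
class EdgeSite:
    name: str
    country: str
    rtt_ms: float  # measured round-trip time from the client (assumed values below)

def route(sites: list[EdgeSite], required_country: str | None) -> EdgeSite:
    """Return the lowest-latency site that satisfies the residency constraint."""
    eligible = [s for s in sites if required_country is None or s.country == required_country]
    if not eligible:
        raise RuntimeError(f"No site satisfies residency requirement: {required_country}")
    return min(eligible, key=lambda s: s.rtt_ms)

sites = [
    EdgeSite("jakarta-edge", "ID", 12.0),
    EdgeSite("singapore-hub", "SG", 28.0),
    EdgeSite("mumbai-edge", "IN", 95.0),
]
print(route(sites, required_country="ID").name)   # regulated data stays in Indonesia
print(route(sites, required_country=None).name)   # unregulated data takes the fastest path
```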
Preparing Infrastructure for Agentic AI and Automation
Agentic AI systems, which execute multiple sequential decisions autonomously, demand infrastructure capable of millisecond-level responsiveness. The region’s diverse connectivity, regulatory environments, and technological maturity present challenges, but also opportunities for flexible AI deployment strategies.
Research indicates that while most enterprises currently rely on public cloud platforms, a significant shift toward edge computing is anticipated by 2027. Future infrastructure must support in-country data residency, intelligent task routing to optimal locations, and resilience against network instability.
Key Considerations for Organizations Embracing Edge AI
As AI inference increasingly migrates to edge environments, companies must adopt new operational models. This includes managing a distributed AI lifecycle with frequent model updates across numerous sites, necessitating advanced orchestration tools and comprehensive monitoring of performance, costs, and errors.
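One way to picture that operational shift is a staged rollout loop that pushes a new model version to many sites in waves and checks health metrics before continuing. Everything in the sketch below, including the site names, wave size, error threshold, and the stub deploy and metrics helpers, is a hypothetical illustration rather than a description of any particular orchestration product.

```python
# Hypothetical sketch of a staged model rollout across many edge sites,
# with a health check between waves. The helpers are stubs standing in for
# whatever deployment and monitoring APIs an organization actually uses.

import time

SITES = [f"edge-{i:03d}" for i in range(1, 31)]   # 30 hypothetical sites
WAVE_SIZE = 10
MAX_ERROR_RATE = 0.02                              # abort threshold (assumption)

def deploy(site: str, model_version: str) -> None:
    print(f"deploying {model_version} to {site}")  # stand-in for a real deploy call

def error_rate(sites: list[str]) -> float:
    return 0.004                                   # stand-in for a real metrics query

def staged_rollout(model_version: str) -> None:
    """Push the new version wave by wave, halting if error rates spike."""
    for start in range(0, len(SITES), WAVE_SIZE):
        wave = SITES[start:start + WAVE_SIZE]
        for site in wave:
            deploy(site, model_version)
        time.sleep(1)                              # let metrics settle (shortened here)
        if error_rate(wave) > MAX_ERROR_RATE:
            raise RuntimeError(f"Rollout halted after wave starting at {wave[0]}")

staged_rollout("recommender-v7")
```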
Local data processing simplifies compliance with complex regional regulations, a pressing concern for half of the region’s large enterprises. However, expanding inference to the edge also heightens security demands. Organizations need robust protections for APIs, data pipelines, and defenses against fraud and automated attacks. Financial institutions, in particular, are already leveraging such security measures to safeguard their AI operations.
