SoftBank’s Infrinia AI Cloud OS for GPU cloud services


SoftBank, the Japanese multinational investment holding company, has launched Infrinia AI Cloud OS, a software stack custom-designed for AI data centres. Developed by the company's Infrinia team, the stack lets data centre operators deliver Kubernetes-as-a-service (KaaS) in multi-tenant settings and offer inference-as-a-service (Inf-aaS). As a result, customers can access LLMs via simple APIs that can be integrated directly into an operator's existing GPU cloud offerings.
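SoftBank has not published the Infrinia API itself, but inference-as-a-service offerings of this kind typically expose an OpenAI-compatible chat endpoint. As a purely illustrative sketch, the endpoint URL, model name, and payload shape below are assumptions, not the actual Infrinia interface:

```python
# Hypothetical sketch of an inference-as-a-service call. The URL path,
# model name, and payload shape are illustrative assumptions only;
# Infrinia's real API has not been published.
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Construct an OpenAI-style chat-completion request (no network call made here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# An operator's customer would point this at the GPU cloud's endpoint
# and send the request with urllib.request.urlopen(req).
req = build_chat_request("https://gpu-cloud.example.com", "example-llm", "Hello")
```

The appeal of this model is that the customer never sees the GPUs at all: the operator meters the managed endpoint, and the same request shape works regardless of which cluster serves it.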

Infrinia Cloud OS meets growing global demands

The software stack is expected to reduce total cost of ownership (TCO) and streamline day-to-day operations, particularly compared with internally developed, custom-built stacks. Ultimately, Infrinia AI Cloud OS promises to accelerate GPU cloud service deployments while supporting every stage of the AI lifecycle, from model training to real-time inference.

Initially, SoftBank plans to incorporate Infrinia Cloud OS into its existing GPU cloud offerings before deploying the software stack globally to overseas data centres and cloud platforms in the future.

Demand for GPU-powered AI has been increasing rapidly across many industries, from science and robotics to generative AI. As users' needs grow more complex, so do the demands placed on GPU cloud service providers.

Some users require fully managed systems with “abstracted GPU bare-metal servers”, while others need affordable AI inference without managing GPUs directly. Others seek more advanced setups in which AI model training is centralised and inference runs at the edge.

Infrinia AI Cloud OS has been designed to meet these challenges, maximising GPU performance while easing the management and deployment of GPU cloud services.

Infrinia Cloud OS’s capabilities

With its KaaS features, SoftBank’s latest software stack automates every layer of the underlying infrastructure, from low-level server settings through to storage, networking, and Kubernetes itself.

It can also reconfigure hardware connections and memory on demand, allowing GPU clusters to be created, resized, or removed quickly to suit different AI workloads. Automated node allocation, based on GPU interconnect proximity and NVIDIA NVLink domains, helps reduce delays and improves GPU-to-GPU bandwidth for large-scale, distributed workloads. Infrinia’s Inf-aaS component has been designed so users can deploy inference workloads easily, enabling faster and more scalable access to AI model inference through managed services.

By simplifying operational complexities and decreasing the TCO, Infrinia AI Cloud OS is positioned to accelerate the adoption of GPU-based AI infrastructure in different sectors worldwide.

(Image source: “SoftBank.” by MIKI Yoshihito. (#mikiyoshihito) is licensed under CC BY 2.0. )


