Full-stack AI cloud infrastructure for model training and deployment
Nebius operates a GPU-backed AI cloud platform targeting ML practitioners at startups, enterprises, and research institutions. The tech stack (PyTorch, Ray, Slurm, MLflow, NVIDIA H200 GPUs, and Kubernetes) is purpose-built for distributed training and inference workloads. With active hiring for 96 engineering and 30 operations roles across 17 countries, and with projects spanning hardware validation, data center buildout, and regional go-to-market expansion, Nebius is scaling its infrastructure-as-a-service offering for generative AI.
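Since MLflow appears in the stack, here is a minimal sketch of what experiment tracking on such a platform might look like; the tracking URI, experiment name, and hyperparameters are illustrative placeholders, not Nebius-specific values.

```python
import mlflow

# Hypothetical tracking endpoint; point this at your own MLflow server.
mlflow.set_tracking_uri("http://mlflow.example.internal:5000")
mlflow.set_experiment("h200-finetune-demo")

with mlflow.start_run(run_name="baseline"):
    # Record the hyperparameters of a training job.
    mlflow.log_params({"model": "demo-7b", "batch_size": 64, "lr": 3e-4})
    # In a real training loop, metrics would be logged per step or epoch.
    for epoch, loss in enumerate([2.1, 1.7, 1.4]):
        mlflow.log_metric("train_loss", loss, step=epoch)
```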
Notable leadership hires: Warehouse Operations Lead, Regional Sales Director, Director GTM M&A
Nebius provides cloud infrastructure optimized for AI model training and inference. The platform is built on NVIDIA GPUs (H200), orchestrated through Kubernetes and Slurm, and supports PyTorch and Ray for distributed workloads. Customers include startups, enterprises, and scientific institutions building and deploying generative AI applications. The company operates from Amsterdam with a distributed engineering and operations footprint across Europe, North America, the Middle East, and Asia-Pacific. Current focus areas include hardware reliability, regional scaling, inference platform expansion, and cost optimization.
Nebius deploys NVIDIA H200 GPUs as core infrastructure, orchestrated with Kubernetes and Slurm for distributed training jobs.
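As an illustration of the kind of distributed training job this setup targets, below is a minimal PyTorch DistributedDataParallel sketch. It assumes a launcher such as torchrun (or an equivalent srun wrapper under Slurm) exports the standard RANK, WORLD_SIZE, and LOCAL_RANK environment variables; it is a generic example, not Nebius-specific code.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_and_wrap(model: torch.nn.Module) -> DDP:
    # Read the process layout from env vars set by the launcher (e.g. torchrun).
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_rank = int(os.environ["LOCAL_RANK"])

    # NCCL is the usual backend for multi-GPU training on NVIDIA hardware.
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(local_rank)
    return DDP(model.cuda(local_rank), device_ids=[local_rank])

if __name__ == "__main__":
    ddp_model = setup_and_wrap(torch.nn.Linear(1024, 1024))
    # ... training loop elided ...
    dist.destroy_process_group()
```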
Kubernetes is a primary orchestration layer, paired with Slurm for workload scheduling and Ray for distributed ML job coordination.
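A short sketch of Ray's GPU-aware task scheduling, the kind of coordination described above: run_shard is a hypothetical placeholder task, and the example assumes GPU-backed workers are available. On a cluster (for example one deployed with KubeRay on Kubernetes) you would connect to the existing Ray instance rather than starting a local one.

```python
import ray

# Starts a local Ray instance; on a real cluster you would connect with
# ray.init(address="auto") instead.
ray.init()

@ray.remote(num_gpus=1)  # each task is scheduled onto a worker with one free GPU
def run_shard(shard_id: int) -> float:
    # Placeholder for per-GPU work, e.g. scoring one data shard.
    return float(shard_id) * 0.5

# Fan out one task per shard and gather the results.
futures = [run_shard.remote(i) for i in range(4)]
print(ray.get(futures))
```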