Real-time trading platform for GPU compute capacity
SF Compute operates a spot market for GPU compute, modeled after commodity exchanges. The stack (Kubernetes, SLURM, Rust, Go, vLLM, and deep networking primitives such as VXLAN, RoCEv2, and InfiniBand) reflects infrastructure-grade orchestration and multi-tenant cluster isolation. The hiring profile (10 engineers, half senior/principal) and the project focus on GPU cluster automation, SDN, and dynamic scaling suggest they're solving real-time capacity matching, not just building a trading UI. Pain points clustering around financing and de-risking indicate the market model itself is the core product: connecting buyers who can't commit long-term to sellers with spare capacity.
SF Compute is building a real-time trading venue for GPU compute, positioning compute as a liquid commodity traded between buyers and sellers. The company addresses a two-sided problem: cluster operators can't fully book their capacity, while buyers are forced into inflexible annual contracts. SF Compute's infrastructure spans global GPU cluster deployment, automated provisioning, and orchestration layers (Kubernetes, SLURM), paired with customer-facing trading and risk management tooling. The 11-person team is based in San Francisco and focused on core engineering and product work.
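The buyer/seller matching described above can be sketched as a minimal clearing routine. Everything here (the Bid/Ask fields and the greedy price-priority rule) is an illustrative assumption, not SF Compute's actual matching engine:

```python
from dataclasses import dataclass

@dataclass
class Ask:
    gpus: int         # seller: spare cluster capacity on offer
    price: float      # asking price, $/GPU-hour

@dataclass
class Bid:
    gpus: int         # buyer: short-term demand
    limit: float      # max acceptable $/GPU-hour

def clear(bids, asks):
    """Greedy matching sketch: cheapest asks fill the highest-limit bids first.
    Returns a list of (gpus_filled, price) fills."""
    asks = sorted(asks, key=lambda a: a.price)
    bids = sorted(bids, key=lambda b: -b.limit)
    fills = []
    for bid in bids:
        need = bid.gpus
        for ask in asks:
            if need == 0:
                break
            if ask.gpus == 0 or ask.price > bid.limit:
                continue
            qty = min(need, ask.gpus)
            fills.append((qty, ask.price))
            ask.gpus -= qty
            need -= qty
    return fills
```

For example, a bid for 8 GPUs with a $2.00/GPU-hr limit against asks of 4 GPUs at $1.50 and 8 GPUs at $1.80 fills as 4 @ $1.50 plus 4 @ $1.80.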
Kubernetes and SLURM for orchestration, vLLM for inference, Rust and Go for systems code, and networking infrastructure (VXLAN, RoCEv2, InfiniBand, BGP) for multi-cluster communication and tenant isolation.
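As one concrete illustration of the isolation layer: VXLAN gives each tenant its own overlay network by assigning a distinct 24-bit VXLAN Network Identifier (VNI), so tenants never share a broadcast domain. The allocation scheme below is a hypothetical sketch, not SF Compute's actual numbering:

```python
VNI_BITS = 24  # VNI width defined by the VXLAN spec (RFC 7348)

def tenant_vni(tenant_id: int, base: int = 10_000) -> int:
    """Map a tenant to a unique VXLAN VNI (hypothetical allocation scheme).
    Raises if the result falls outside the 24-bit VNI space."""
    vni = base + tenant_id
    if not 0 < vni < (1 << VNI_BITS):
        raise ValueError(f"VNI {vni} outside 24-bit range")
    return vni
```

The 24-bit space allows roughly 16.7 million overlays, which is why VXLAN (rather than 12-bit VLANs) is the usual choice for large multi-tenant fabrics.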
Global GPU cluster deployment, automated hardware provisioning, software-defined networking, dynamic inference scaling, and API/orchestration frameworks for cluster management.
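Dynamic inference scaling of the kind listed above is often a proportional rule: choose a replica count so that each inference replica serves roughly a target request rate, clamped to a band (the same shape as a Kubernetes HPA). The target and bounds here are made-up parameters for illustration:

```python
import math

def desired_replicas(req_per_s: float, target_per_replica: float,
                     min_replicas: int = 1, max_replicas: int = 64) -> int:
    """Scale replica count proportionally to observed request rate,
    clamped to [min_replicas, max_replicas]. Parameters are illustrative."""
    raw = math.ceil(req_per_s / target_per_replica)
    return max(min_replicas, min(max_replicas, raw))
```

For instance, 250 req/s against a 100 req/s-per-replica target yields 3 replicas; the clamp keeps the fleet from collapsing to zero or growing without bound.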