AI infrastructure platform for deploying and scaling large language models
Baseten operates an inference-focused AI infrastructure platform built on NVIDIA GPUs, Kubernetes, and performance optimization tooling (vLLM, TensorRT, CUDA). The stack reveals a systems-level engineering focus: adoption of InfiniBand and RDMA points to a move toward high-performance interconnects, while the project list spans frontier model APIs, billing platform integrations, and internal ML optimization, indicating product maturation beyond early-stage tooling. Hiring is engineering-heavy (55 of 102 roles) and skewed toward senior positions, paired with active GTM hiring, which suggests a shift from technical adoption to an enterprise sales motion.
Notable leadership hires: Head of Legal Operations
Baseten provides an inference platform designed to simplify deployment and scaling of large language models. The platform abstracts hardware provisioning and optimization, offering global availability with claimed 99.99% uptime. The company targets developers and machine learning teams who need to move models from experimentation to production without managing the underlying infrastructure. The core technical surface includes model APIs for frontier models, a proprietary inference stack, and billing integrations. The organization is headquartered in San Francisco, has 201–500 employees, and is currently hiring across engineering, sales, marketing, and product.
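To put the claimed 99.99% uptime in concrete terms, the short calculation below converts an availability percentage into a downtime budget. It is only an illustrative sketch: the function name `downtime_minutes` and the 365-day and 30-day periods are assumptions made here, not figures published by Baseten, and the claim itself is not verified.

```python
# Downtime budget implied by an availability target.
# 99.99 is the claimed figure quoted in the profile; the period lengths
# (365 days, 30 days) are assumptions chosen for illustration.

def downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Return the maximum downtime, in minutes, allowed over the period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_pct / 100)

yearly = downtime_minutes(99.99, 365)
monthly = downtime_minutes(99.99, 30)

print(f"per year:    {yearly:.1f} min")   # per year:    52.6 min
print(f"per 30 days: {monthly:.1f} min")  # per 30 days: 4.3 min
```

In other words, a 99.99% target leaves roughly 53 minutes of allowable downtime per year, or a little over 4 minutes in a 30-day month.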
Baseten's core stack includes NVIDIA CUDA, PyTorch, Kubernetes, vLLM, TensorRT, and InfiniBand. Recent adoptions: RDMA, Cursor, and Codex. Monitoring and observability run on Prometheus, Grafana, and Loki.
Baseten hires in the United States and Canada. Its headquarters is in San Francisco, CA.
Baseten's technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.
This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.