AI inference platform with global GPU infrastructure and 99.99% uptime
Baseten operates an inference-focused AI infrastructure platform built on Kubernetes, Docker, and the NVIDIA stack (CUDA, TensorRT, TensorRT-LLM). The tech stack points to a production-grade distributed system: Prometheus, Loki, and Grafana handle observability, and the active adoption of RDMA and InfiniBand signals investment in high-performance interconnects for multi-GPU workloads. Hiring is weighted heavily toward senior and mid-level engineers (78 open roles, 76 of them posted in the last month) and is paired with a deliberate build-out of customer-facing functions (sales, product, HR infrastructure), which indicates a transition from a purely engineering-driven product toward go-to-market maturity.
Baseten provides inference infrastructure for AI teams deploying large language models and other foundation models at scale. The platform combines a proprietary inference stack (built on TensorRT and other NVIDIA optimizations) with managed Kubernetes infrastructure across AWS and GCP, targeting global availability and high-uptime SLA commitments. Internally tracked pain points center on capacity management, ML workload reliability, and inference performance bottlenecks, the same core problems its customers face. The company has 51–200 employees, is based in San Francisco, and is actively building its sales pipeline and go-to-market strategy alongside core infrastructure development.
Core: NVIDIA (CUDA, TensorRT, TensorRT-LLM), Kubernetes, Docker, Python, Go.
Infrastructure: AWS, GCP, Terraform, CloudFormation, Pulumi.
Observability: Prometheus, Grafana, Loki, OpenTelemetry.
CI/CD: GitHub Actions, GitLab CI/CD, CircleCI, Jenkins.
Frontend: React, TypeScript, WebSockets.
Adopting: RDMA, InfiniBand, NVLink.
Core: Baseten inference stack, model APIs for frontier models, and distributed training infrastructure for large-scale foundation models.
Current focus: the fastest and most accurate Whisper transcription, new deployments, sales pipeline development, and go-to-market strategy for the fastest-growing customer segments.