
Baseten Tech Stack

AI inference platform with global GPU infrastructure and 99.99% uptime

Software Development · San Francisco, CA · 51–200 employees · Privately Held

Baseten operates an inference-focused AI infrastructure platform built on Kubernetes, Docker, and NVIDIA hardware and software (CUDA, TensorRT, TensorRT-LLM). The tech stack points to a production-grade distributed system: Prometheus, Loki, and Grafana for observability, while active adoption of RDMA and InfiniBand signals investment in high-performance interconnects for multi-GPU workloads. Hiring is heavily weighted toward senior and mid-level engineers (78 of 150 open roles are in engineering, with 75 posted in the last 30 days), paired with deliberate builds of customer-facing functions (sales, product, HR infrastructure), indicating a transition from a pure engineering product toward go-to-market maturity.

Tech Stack (79 technologies)

Adopting: Cursor, RDMA, InfiniBand, Claude, Codex, NVLink, NVIDIA

What Baseten Is Building

Challenges

  • Capacity management
  • Scalable infrastructure
  • Improving engineering productivity
  • Ensuring SOC 2 compliance
  • Technical friction
  • Reliability of ML workloads
  • Performance bottlenecks in ML inference
  • Deploying AI models at scale
  • Building foundation and processes for customer-facing teams
  • Scaling large language models

Active Projects

  • Baseten inference stack
  • Fastest, most accurate Whisper transcription
  • Developing sales pipeline
  • Model APIs for frontier models
  • New deployments
  • Building foundation and processes for customer-facing teams
  • Operational debugging issues
  • Annual bonus planning
  • Go-to-market strategy for fastest growing customer segment
  • Distributed training infrastructure for large-scale foundation models

Hiring Activity

Accelerating · 150 roles · 75 in 30d

Department

  • Engineering: 78
  • Sales: 18
  • HR: 10
  • Product: 8
  • Finance: 6
  • Marketing: 6
  • Data: 4
  • Ops: 3

Seniority

  • Senior: 78
  • Mid: 34
  • Manager: 14
  • Junior: 11
  • Lead: 5

About Baseten

Baseten provides inference infrastructure for AI teams deploying large language models and foundation models at scale. The platform combines a proprietary inference stack (leveraging TensorRT and NVIDIA optimizations) with managed Kubernetes infrastructure across AWS and GCP, targeting global availability and high SLA commitments. Tracked pain points center on capacity management, ML workload reliability, and performance bottlenecks in inference: the same core problems its customers face. The company has 51–200 employees, is based in San Francisco, and is actively building its sales pipeline and go-to-market strategy alongside core infrastructure development.

Headquarters: San Francisco, CA
Company Size: 51–200 employees
Hiring Markets: United States, Canada

Frequently Asked Questions

What is Baseten's tech stack?

Core: NVIDIA (CUDA, TensorRT, TensorRT-LLM), Kubernetes, Docker, Python, Go. Infrastructure: AWS, GCP, Terraform, CloudFormation, Pulumi. Observability: Prometheus, Grafana, Loki, OpenTelemetry. CI/CD: GitHub Actions, GitLab CI/CD, CircleCI, Jenkins. Frontend: React, TypeScript, WebSockets. Adopting: RDMA, InfiniBand, NVLink.
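As an illustration of how several of these pieces commonly fit together, here is a generic Kubernetes Deployment fragment that requests an NVIDIA GPU and exposes a metrics port for Prometheus scraping. This is a minimal sketch using conventional placeholder names and annotations, not Baseten's actual configuration.

```yaml
# Generic sketch (placeholder names, not Baseten's manifests):
# a GPU-backed model server with Prometheus scrape annotations.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-model-server        # placeholder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-model-server
  template:
    metadata:
      labels:
        app: example-model-server
      annotations:
        prometheus.io/scrape: "true"   # conventional scrape hint
        prometheus.io/port: "8000"
    spec:
      containers:
        - name: server
          image: registry.example.com/model-server:latest  # placeholder
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1   # schedule onto a GPU node
```

The `nvidia.com/gpu` resource name is the standard one registered by the NVIDIA device plugin; the `prometheus.io/*` annotations are a widely used convention for annotation-based service discovery.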

What is Baseten working on?

Core: Baseten inference stack, model APIs for frontier models, and distributed training infrastructure for large-scale foundation models. Current focus: fastest and most accurate Whisper transcription, new deployments, developing the sales pipeline, and go-to-market strategy for the fastest-growing customer segments.
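Model APIs of this kind typically expose an OpenAI-compatible HTTP interface. The sketch below assembles such a chat-completions request using only the standard library; the endpoint URL, model name, and bearer token are placeholders, not Baseten's actual values.

```python
# Hypothetical sketch of an OpenAI-compatible chat request.
# URL, model name, and API key are placeholders.
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Assemble an OpenAI-style chat-completions request (no network I/O)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer $API_KEY",  # placeholder token
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("https://api.example.com", "example-llm", "Hello")
print(req.full_url)  # https://api.example.com/v1/chat/completions
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would return a JSON body whose generated text sits under `choices[0].message.content` in the OpenAI-compatible schema.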
