echoloc

Together AI Tech Stack

AI infrastructure platform for inference, model training, and GPU compute

Software Development · San Francisco, California · 201–500 employees · Founded 2022 · Privately Held

Together AI operates a full-stack AI platform spanning inference engines, on-demand GPU clusters, and pre-training infrastructure. The tech stack reveals a systems-first engineering culture: heavy emphasis on performance-critical layers (CUDA, Triton, FlashAttention, vLLM, SGLang) alongside orchestration and data pipelines (Kubernetes, Airflow, dbt, ClickHouse). The company is scaling aggressively—28 engineering roles posted in the last 30 days—and pain points cluster around GPU utilization, latency optimization, and cost control, suggesting they're hitting limits on both their internal infrastructure and customer workload density.

Tech Stack: 158 technologies

Core Stack: Python, PyTorch, Go, Rust, Apache Kafka, ClickHouse, Terraform, Pulumi, ArgoCD, Kubernetes, Java, TypeScript, AWS, PostgreSQL, Apache Airflow, Codex, FlashAttention, CUDA, Triton, C/C++, vLLM, SGLang, AWS Kinesis, Redpanda, AWS CDK, TeamCity, Azure, GCP, Slurm, Kinesis (+120 more)
Adopting: Kubernetes, AWS EKS, Ceph, Lustre, Cartesia, Deepgram, FlashAttention

What Together AI Is Building

Challenges

  • Scaling data center infrastructure
  • Scaling hiring for core engineering functions
  • Handling millions of events daily
  • Optimizing latency
  • Optimizing storage costs
  • Lowering the cost of modern AI systems
  • Reducing inference latency and cost
  • Optimizing GPU utilization
  • Scaling to massive numbers of concurrent users
  • Scaling compute infrastructure

Active Projects

  • Integration of partner models into the serving stack
  • Runtime inference services
  • Design and operation of a medallion data warehouse stack
  • Airflow-orchestrated pipelines and dbt transformation projects
  • CUDA graph optimization for inference
  • Scalable infrastructure built with Ansible, Terraform, and Kubernetes
  • SLA monitoring system
  • Distributed GPU scheduling system
  • Customer-facing cloud platform services
  • Global management plane

Hiring Activity

Accelerating: 55 roles · 30 in 30d

Department

  • Engineering: 27
  • Finance: 6
  • Ops: 6
  • Data: 3
  • Sales: 3
  • HR: 2
  • Research: 2
  • Design: 1

Seniority

  • Senior: 28
  • Staff: 7
  • Mid: 6
  • Manager: 5
  • Director: 3
  • Junior: 2
  • Lead: 1

Notable leadership hires: Tax Director

About Together AI

Together AI builds cloud infrastructure purpose-built for AI workloads, targeting AI-native companies and SaaS platforms that deploy LLM and model-serving applications. The platform spans three layers: a high-performance inference engine optimized for throughput and latency, on-demand GPU cluster orchestration, and large-scale pre-training capacity. The company hires in three countries on three continents (United States, India, Netherlands), and its open roles are weighted toward senior and staff engineers, reflecting the infrastructure maturity required to manage distributed GPU fleets and meet the SLA demands of production AI services.

Headquarters: San Francisco, California
Company Size: 201–500 employees
Founded: 2022
Hiring Markets: United States, India, Netherlands

Frequently Asked Questions

What tech stack does Together AI use?

Core inference: CUDA, Triton, FlashAttention, vLLM, SGLang. Infrastructure: Kubernetes, Terraform, AWS (EC2, EKS, Kinesis), GCP, Azure. Data: ClickHouse, PostgreSQL, Airflow, dbt. Languages and frameworks: Python, PyTorch, Go, Rust, C/C++, TypeScript.

What is Together AI working on?

Distributed GPU scheduling, CUDA graph optimization for inference, medallion data warehouse design, Airflow-orchestrated data pipelines, SLA monitoring systems, customer-facing cloud platform services, and a global management plane.

How this profile is built

Together AI's technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.

This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.