Together AI operates a purpose-built GPU cloud for model training, fine-tuning, and inference, with a tech stack centered on PyTorch, Kubernetes, and custom inference engines (vLLM, SGLang, Triton). The list of technologies being adopted (RoCEv2, InfiniBand, Ceph) signals heavy infrastructure investment, while current projects focus on distributed GPU scheduling, multi-petabyte storage systems, and cost optimization, indicating the company is scaling toward larger, multi-region deployments and addressing the operational complexity of managing GPU fleets at cloud scale.
Together AI is an AI infrastructure company providing GPU cloud services to AI teams at enterprise SaaS companies and AI-native startups. Founded in 2022 and headquartered in San Francisco, the company operates a platform for training, fine-tuning, and serving large language models on optimized hardware clusters. The engineering-focused organization (35+ roles) is building distributed systems for GPU scheduling, storage at petabyte scale, and usage-based billing. Active hiring spans the United States, United Kingdom, Netherlands, and India, with seniority concentrated in senior and staff levels.
Core: PyTorch, Python, TypeScript, CUDA, Kubernetes. Inference: vLLM, SGLang, Triton. Infrastructure: AWS (EKS, Kinesis, Lake Formation), Terraform, Argo CD. Workflow and streaming: Apache Airflow, Kafka. Now adopting: RoCEv2, InfiniBand, and Ceph for high-speed networking and distributed storage.
Current projects include efficient inference engines, distributed GPU scheduling, multi-petabyte AI/ML storage systems built on Kubernetes-native operators, usage-based billing, global data center strategy, and cost optimization for large-scale model serving.
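The usage-based billing mentioned above typically reduces to metering GPU-seconds per accelerator type and pricing them at a per-second rate. A minimal sketch of that idea, assuming hypothetical rates and record names (nothing here reflects Together AI's actual pricing or internal APIs):

```python
from dataclasses import dataclass

# Hypothetical per-second USD rates by GPU type; illustrative only.
RATES_PER_SECOND = {
    "H100": 0.00122,
    "A100": 0.00068,
}

@dataclass
class UsageRecord:
    """One metered slice of GPU usage (hypothetical schema)."""
    gpu_type: str
    gpu_count: int
    seconds: int

def bill(records):
    """Aggregate metered GPU-seconds into per-type and total charges."""
    by_type = {}
    total = 0.0
    for r in records:
        cost = RATES_PER_SECOND[r.gpu_type] * r.gpu_count * r.seconds
        by_type[r.gpu_type] = by_type.get(r.gpu_type, 0.0) + cost
        total += cost
    return by_type, round(total, 2)
```

For example, `bill([UsageRecord("H100", 8, 3600)])` charges one hour on an 8-GPU node; a production system would add minimum-billing increments, discounts, and idempotent event ingestion on top of this core aggregation.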