Magic is building frontier-scale code models on massive GPU clusters, operating the full stack from kernel optimization to trillion-parameter training. The tech stack (CUDA, XLA, Triton, Ray, Kubernetes across GCP/AWS/Azure/OCI) reflects deep infrastructure work, and the project list (sandboxed execution, high-density compute, internet-scale data pipelines, inference optimization) points to a company building toward inference at scale, not just training. An engineering-dominated headcount and heavy investment in storage, caching, and fault tolerance signal that they are solving hard systems problems upstream of the model itself.
Magic develops frontier-scale code models designed as AI coworkers rather than code assistants. Founded in 2022 and based in San Francisco, the company operates as a small, engineering-focused team (reported at 2–10 employees, with 15 of 24 listed hires in engineering) building infrastructure for training and deploying very large models. The technical footprint spans model training on GPU supercomputer clusters, inference optimization for novel architectures, and supporting infrastructure: sandboxed execution environments, high-performance storage systems for long-context inference, and internet-scale data acquisition pipelines. The team is also investing in safety and robustness (red-teaming, fault detection, and synthetic dataset generation).
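To make the sandboxed-execution idea concrete, here is a minimal Python sketch that runs untrusted code in a child process under CPU and memory limits. The `run_sandboxed` function and its limits are hypothetical and POSIX-only; a production sandbox would add much stronger isolation (containers, gVisor, seccomp filters). Nothing here is Magic's actual system.

```python
import resource
import subprocess
import tempfile

def run_sandboxed(code: str, timeout_s: int = 5, mem_bytes: int = 256 * 1024 * 1024):
    """Run untrusted Python code in a child process with CPU, memory,
    and wall-clock limits. Returns (stdout, stderr, returncode)."""
    def limit_resources():
        # Applied in the child before exec; the parent is never constrained.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name

    # Raises subprocess.TimeoutExpired if the wall-clock backstop is hit.
    proc = subprocess.run(
        ["python3", "-I", path],     # -I: isolated mode, no user site-packages
        capture_output=True,
        text=True,
        timeout=timeout_s + 1,
        preexec_fn=limit_resources,  # POSIX only
    )
    return proc.stdout, proc.stderr, proc.returncode
```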
Magic uses CUDA, XLA, and Triton for GPU compute; Ray for distributed training; Kubernetes for orchestration; Terraform and Pulumi for infrastructure-as-code; GCP, AWS, Azure, and OCI for cloud resources; and Rust, C++, and Go for systems work.
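To ground the stack in something concrete, the sketch below is the canonical Triton vector-add kernel (adapted from Triton's own tutorial), representative of the block-parallel GPU kernel work this stack implies; it is illustrative only, not Magic's code.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)       # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```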
Magic is focused on frontier AI infrastructure: training trillion-parameter models on GPU clusters, optimizing inference throughput, building sandboxed execution and high-performance storage systems, acquiring internet-scale training data, and addressing fault tolerance and security in large-scale training pipelines.
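As a hedged sketch of what fault tolerance in a distributed training pipeline can look like with Ray, the hypothetical `TrainerShard` actor below opts into automatic restarts and task retries; a real pipeline would restore model and optimizer state from checkpoints when an actor comes back.

```python
import ray

ray.init()

# max_restarts: Ray recreates the actor process if it dies;
# max_task_retries: in-flight method calls are retried on the new process.
@ray.remote(max_restarts=3, max_task_retries=3)
class TrainerShard:
    def __init__(self, shard_id: int):
        self.shard_id = shard_id
        self.step = 0  # a real trainer would restore this from a checkpoint

    def train_steps(self, n: int) -> int:
        # Stand-in for a forward/backward loop over this shard's data.
        self.step += n
        return self.step

shards = [TrainerShard.remote(i) for i in range(4)]
progress = ray.get([s.train_steps.remote(100) for s in shards])
print(progress)  # -> [100, 100, 100, 100]
```

Note that restarts lose in-memory state, which is why large-scale training couples actor recovery with frequent checkpointing to durable storage.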