Real-time speech AI APIs optimized for inference efficiency
Subquadratic builds a speech-to-text platform designed around inference efficiency rather than raw scale, a deliberate counter to the industry's race toward ever-larger models. The tech stack is heavy on distributed computing and ML frameworks (Spark, Dask, Ray, PyTorch, TensorFlow) and production ML infrastructure (Kubernetes, Triton, TorchServe), reflecting a focus on low-latency, cost-efficient inference at scale. Active projects reveal a company wrestling with trillion-token data pipelines, edge inference, and real-time streaming stability, while pain points around GPU utilization and cold-start latency signal that they are optimizing the full inference stack, not just the model itself.
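For context on the cold-start pain point: a common mitigation on a Triton-based serving stack is to gate traffic on model readiness and fire a dummy warm-up request before routing real audio. A minimal sketch using Triton's Python HTTP client follows; the model name, tensor name, and input shape are hypothetical placeholders, not Subquadratic's actual configuration.

```python
import numpy as np
import tritonclient.http as httpclient  # pip install tritonclient[http]

TRITON_URL = "localhost:8000"
MODEL_NAME = "stt_acoustic"  # hypothetical model name

client = httpclient.InferenceServerClient(url=TRITON_URL)

# Gate on readiness so the first user request never hits a still-loading model.
if not client.is_model_ready(MODEL_NAME):
    # load_model requires the server to run with --model-control-mode=explicit.
    client.load_model(MODEL_NAME)

# One dummy inference warms CUDA kernels, memory pools, and autotuned paths,
# shaving cold-start latency off the first real request.
warmup = httpclient.InferInput("AUDIO", [1, 16000], "FP32")
warmup.set_data_from_numpy(np.zeros((1, 16000), dtype=np.float32))
client.infer(MODEL_NAME, inputs=[warmup])
```

Triton can also declare warm-up samples in the model configuration itself, which moves this step server-side; either approach addresses the same cold-start symptom.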
Subquadratic is an AI infrastructure company providing speech-to-text APIs designed for performance and cost efficiency. The platform targets developers building voice-native applications and voice agents, with a roadmap extending into text-to-speech, speech-to-speech, and multimodal interaction. The engineering- and data-heavy hiring mix (4 engineers, 2 data roles, plus leadership) matches the breadth of their active projects: trillion-token-scale data pipelines, distributed inference systems, real-time streaming APIs, and edge deployment. Based in Miami with 11–50 employees, the company is actively scaling infrastructure to support production voice workloads.
Tech stack: Python, PyTorch, TensorFlow, Apache Spark, Dask, Ray, Kubernetes, Triton, TorchServe, AWS, gRPC, WebSocket, OpenTelemetry, PostgreSQL, Redis, TypeScript, React, Node.js.
Active projects: trillion-token data pipelines, real-time speech-to-text and LLM token streaming APIs (sketched below), distributed inference optimization, edge deployment, synthetic data generation for pretraining, and systems for data versioning and reproducibility.
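To make the real-time streaming item concrete, here is a minimal sketch of what a client of a streaming speech-to-text WebSocket API typically looks like. The endpoint URL and message schema are invented for illustration; Subquadratic's actual API contract is not described here.

```python
import asyncio
import json

import websockets  # third-party: pip install websockets

# Hypothetical endpoint and message schema, for illustration only.
STT_URL = "wss://api.example.com/v1/stt/stream"

async def stream_transcription(audio_chunks):
    """Send raw audio frames upstream while printing transcripts as they arrive."""
    async with websockets.connect(STT_URL) as ws:

        async def send_audio():
            for chunk in audio_chunks:  # e.g. 20 ms PCM16 frames as bytes
                await ws.send(chunk)    # binary WebSocket frame
            # Hypothetical end-of-stream control message.
            await ws.send(json.dumps({"event": "end_of_stream"}))

        sender = asyncio.create_task(send_audio())
        try:
            async for message in ws:  # server pushes partial and final results
                result = json.loads(message)
                print(result.get("transcript", ""), flush=True)
                if result.get("is_final"):
                    break
        finally:
            await sender

# Usage: asyncio.run(stream_transcription(frames)), where frames yields audio bytes.
```

The full-duplex pattern (uploading audio while transcripts stream back) is what distinguishes this kind of API from batch transcription, and it is why streaming stability shows up as an engineering concern in its own right.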