Speech-to-text and AI voice analytics platform processing 300 years of transcription monthly
Speechmatics operates a production speech recognition and voice AI platform with deep infrastructure across AWS, GCP, and Azure, backed by TensorFlow and PyTorch. The tech stack and project list reveal active work on scaling GPU-bound ML models, optimizing real-time latency, and expanding into emotive text-to-speech, while pain points around data quality, incident frequency, and model reliability suggest a company hitting performance ceilings as volume grows. Engineering-heavy hiring (17 of 25 active roles) and a mid-to-senior seniority distribution point to consolidation of core capabilities rather than broad expansion.
Speechmatics is a UK-based speech technology company founded in 2006 that provides automatic speech recognition and voice AI analytics to enterprise customers worldwide. The platform transcribes and analyzes human speech across 50 languages in both real-time and batch modes, layering on downstream capabilities such as summarization, sentiment analysis, topic extraction, translation, and speaker diarization. Processing the equivalent of over 300 years of audio per month, the company operates distributed infrastructure across AWS, GCP, and Azure, with engineering teams in Cambridge and New York and across Canada, Serbia, and the United States.
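The headline throughput figure can be sanity-checked with quick arithmetic: 300 years of audio transcribed per month works out to an average of roughly 3,600 concurrent real-time streams. This is a back-of-envelope sketch; the 730.5-hour month and the uniform-load, real-time-rate assumptions are ours, not Speechmatics'.

```python
# Back-of-envelope: what does "300 years of transcription per month" imply?
# Assumptions (ours, for illustration): load is uniform, audio is processed
# at real-time rate on average, and a month is 1/12 of a Julian year.

HOURS_PER_YEAR = 365.25 * 24           # 8,766 hours in a Julian year
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730.5 wall-clock hours per month

audio_hours_per_month = 300 * HOURS_PER_YEAR  # 2,629,800 hours of audio
avg_concurrent_streams = audio_hours_per_month / HOURS_PER_MONTH

print(f"{audio_hours_per_month:,.0f} audio hours/month")
print(f"~{avg_concurrent_streams:,.0f} concurrent real-time streams on average")
# → 2,629,800 audio hours/month, ~3,600 concurrent real-time streams
```

Put differently, 300 years per month is 3,600 years of audio per calendar year, so the fleet runs at about 3,600x real-time in aggregate.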
Core infrastructure runs on AWS, GCP, and Azure with Kubernetes/Helm orchestration. ML models are built in TensorFlow and PyTorch. Data pipelines use Python, PostgreSQL, and SQL. Observability runs on Datadog and OpenTelemetry. CI/CD is handled through GitLab and ArgoCD.
Active projects include scaling data pipelines for AI models, advancing end-to-end speech models, optimizing real-time latency, developing emotive text-to-speech, and automating self-healing systems. Internally, the focus is on feature-deployment tooling and streamlining ML pipelines.
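Real-time latency work of the kind described above is typically tracked with tail percentiles (p95/p99) rather than averages, since a handful of slow utterances dominates perceived responsiveness. A minimal sketch using only Python's standard library; the sample data here is illustrative and not a Speechmatics figure:

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from per-utterance latency samples (milliseconds)."""
    # statistics.quantiles with n=100 returns the 99 percentile cut points,
    # so index 49 is p50, index 94 is p95, and index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

# Illustrative data: latencies of 1..1000 ms, one sample each.
samples = [float(ms) for ms in range(1, 1001)]
print(latency_percentiles(samples))
```

Tracking p99 alongside p50 makes regressions in the slow tail visible even when median latency is unchanged.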