echoloc

Snorkel AI Tech Stack

Data development platform for specialized AI production

Software Development · Redwood City, California · 51–200 employees · Founded 2019 · Privately Held

Snorkel AI builds infrastructure for generating and labeling training data at scale, enabling teams to move AI from research to production faster. The stack shows a heavy ML and data engineering foundation (PyTorch, TensorFlow, Pandas, scikit-learn) paired with AWS and GCP cloud infrastructure and emerging CI/CD adoption (CircleCI, Buildkite), which suggests an internal focus on automating data pipelines and model workflows. Active hiring across engineering, data, and operations, combined with ongoing projects around synthetic data generation, data pipelines, and internal recipe standardization, points to a company scaling its own data production capacity while building that capability into its product.

Tech Stack (50 technologies)

Core Stack: React, TypeScript, Python, NumPy, scikit-learn, Pandas, PyTorch, TensorFlow, AWS, Tableau, Power BI, Looker, Salesforce, Kubernetes, Gong, Zapier, n8n, Okta, CrowdStrike, Coupa, Ariba, QuickBooks Online, GCP, TPU, Notion, Slurm, Outreach, Apollo, Jamf Pro, Cisco Meraki, and 19 more
Adopting: CircleCI, Auth0, Buildkite

What Snorkel AI Is Building

Challenges

  • Scaling data delivery
  • Unblocking projects
  • Improving cash collections
  • Reducing bottlenecks in data production
  • Reducing ramp times
  • Improving win rates
  • Infrastructure blockers
  • Standardizing bespoke solutions into internal recipes
  • Lead routing
  • Matching efficiency

Active Projects

  • Internal GTM data governance framework
  • AI data pipelines
  • Synthetic data generation for data development
  • End-to-end AI workflows
  • Competency-based onboarding program
  • Knowledge base in Notion
  • Enablement infrastructure development
  • Synthetic dataset generation
  • ABM campaigns
  • Sales enablement portfolio

Hiring Activity

Accelerating · 120 roles · 75 in the last 30 days

Department

Engineering: 25
Data: 22
Ops: 22
Research: 10
Sales: 10
Finance: 8
Product: 8
Marketing: 7

Seniority

Senior: 40
Manager: 26
Mid: 18
Staff: 15
Lead: 10
Director: 9
Principal: 2

Notable leadership hires: Sales Director, Strategic AI Lead

About Snorkel AI

Snorkel AI develops a data development platform for enterprises, labs, and government agencies building specialized AI models. The company originated from research at Stanford AI Lab and focuses on programmatic labeling and weak supervision techniques to accelerate training data creation. It has 51–200 employees across engineering, data science, research, sales, and operations, based at its Redwood City headquarters. Current work includes synthetic data generation, end-to-end AI workflows, data governance frameworks, and sales enablement, which indicates simultaneous investment in product maturity, GTM scaling, and internal operational infrastructure.

Headquarters: Redwood City, California
Company Size: 51–200 employees
Founded: 2019
Hiring Markets: United States

Frequently Asked Questions

What is Snorkel AI's tech stack?

The core ML stack includes PyTorch, TensorFlow, Pandas, NumPy, and scikit-learn. Cloud infrastructure runs on AWS and GCP with Kubernetes orchestration. Data pipeline tooling includes dbt-adjacent patterns. Analytics uses Tableau, Power BI, and Looker. The company is also adopting CircleCI and Buildkite for CI/CD.

What is Snorkel AI working on?

Active projects include synthetic data generation, AI data pipelines, end-to-end AI workflows, data governance frameworks, and sales enablement infrastructure. Internally, the focus is on standardizing bespoke solutions and reducing bottlenecks in data production.

How this profile is built

Snorkel AI's technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.

This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.