Sieve builds video training data at scale for large AI labs and Fortune 100 companies. The tech stack (PyTorch, Go, Rust, GCP, AWS, and Kubernetes with Argo CD, Helm, and Kustomize) reflects a systems-heavy engineering org optimizing cost and throughput for petabyte-scale video processing. Active projects center on ML+ETL pipeline orchestration and video understanding. Hiring is accelerating across engineering roles (6 open positions), which suggests the company is scaling data infrastructure faster than data science or go-to-market.
Sieve operates an AI research lab focused exclusively on video data, combining exabyte-scale infrastructure, video understanding techniques, and diverse data sources to produce training datasets for video modeling. The company serves frontier AI research labs, Fortune 100 enterprises, and generative AI startups. Founded in 2022 and based in San Francisco, Sieve has 11–50 employees, and its recent hiring has focused on engineering. Core operational challenges center on cost-effective processing of petabyte-scale video, optimizing compute scheduling, and scaling data pipelines while maintaining delivery quality.
Core stack: Python, PyTorch, Go, Rust, C++. Infrastructure: GCP, AWS, Kubernetes (Argo CD, Helm, Kustomize), Terraform, Cloudflare. Observability: Prometheus, OpenTelemetry, VictoriaMetrics. Frontend: React, Next.js, TypeScript.
Active projects include ML+ETL pipeline orchestration for large video data, video understanding pipelines, internal CI/CD tooling, ML filter development, and a video collection platform. The company is also building out its recruiting function.