Multimodal lakehouse for ingesting, curating, and training on unstructured data at scale
Ocular AI is a YC W24 startup building infrastructure to manage and train custom models on video, image, and audio data. The tech stack points to a product built on PyTorch and TensorFlow, backed by PostgreSQL and AWS, with a Next.js and React frontend and PostHog for product analytics. The hiring profile (mostly senior engineers, plus two sales roles and one designer) and the project list (annotation platform, data pipelines, model training infrastructure, LLM fine-tuning) suggest a team focused on shipping a complete data-to-model pipeline rather than point solutions.
Ocular AI builds Foundry, an AI-native multimodal lakehouse designed to ingest, catalog, search, and annotate unstructured video, image, and audio data, then train custom models on top. The platform includes intelligent multimodal search (natural-language queries over video/image datasets), annotation tools with AI data agents for auto-labeling, full data versioning and lineage, and integrated GPU-powered model training. The company targets teams building computer vision systems, robotics perception models, and domain-specific generative AI. Current pain points center on go-to-market: building an early sales engine, growing the top of the funnel, and closing larger enterprise deals, which is typical for a pre-product-market-fit infrastructure startup.
The stack spans PostgreSQL and AWS for backend infrastructure; PyTorch, TensorFlow, and OpenCV for ML; Next.js and React for frontend; PostHog for analytics; and Salesforce and HubSpot for CRM.
Core projects include the Foundry annotation platform, data pipelines for model evaluation, model training infrastructure for LoRAs, LLM fine-tuning, and a multimodal collaborative canvas for data curation and labeling.
The technology stack, projects, and hiring signals for Ocular AI (YC W24) are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.
This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.