AI infrastructure and biology foundation models for disease research
Biohub is a research-heavy nonprofit that combines petabyte-scale compute infrastructure with applied AI for biological discovery. The tech stack points to a production ML operation: PyTorch, TensorFlow, Hugging Face, Ray, Kubernetes, GPU compute, and Databricks sit alongside specialized biology tools (10x Genomics, FIB-SEM, Opentrons). Active hiring skews toward senior research and engineering roles, with ongoing work on biology foundation models, multi-agent systems for scientific tasks, and spatiotemporal multi-omics platforms. Together these signals suggest the organization has moved past proof-of-concept into systems that integrate AI inference with wet-lab automation at scale.
Notable leadership hires: Immune Cell Reprogramming Lead
Biohub is a nonprofit research organization in Redwood City focused on advancing AI-powered biology to accelerate disease understanding and treatment. It combines computational infrastructure with experimental capabilities, developing biology foundation models, multi-omics platforms, and automated protocols for cell reprogramming and infectious disease research. The organization has 201–500 employees across research, engineering, and data teams, and is actively hiring in the United States and Canada. Its core challenges center on managing infrastructure complexity (petabyte-scale data pipelines, GPU-native data loading, distributed compute reliability) while integrating AI models with laboratory automation.
Biohub uses Python, C++, MATLAB, PyTorch, TensorFlow, Kubernetes, Ray, Databricks, Delta Lake, 10x Genomics, FIB-SEM microscopy, Opentrons automation, AWS, and specialized HPC infrastructure (Slurm, InfiniBand, GPUs).
Projects include biology foundation models, multi-agent systems for scientific discovery, spatiotemporal multi-omics platforms, immune cell reprogramming for age-related diseases, reinforcement learning for biological discovery, and advanced light-sheet microscopy for zebrafish imaging.
Biohub's technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.
This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.