Managed data lakehouse platform for Apache Hudi, Iceberg, and Delta Lake
Onehouse operates a managed cloud data lakehouse built on open table formats (Hudi, Iceberg, Delta Lake), positioned as an alternative to proprietary warehouses and their lock-in. The tech stack splits into two layers: Kubernetes, Kafka, and Spark for data pipelines, and React and TypeScript for the control plane. Heavy investment in Hudi engine optimization and Presto/Trino integration suggests the company is betting on query interoperability across warehouse engines. Hiring skews heavily toward senior and staff engineers alongside a smaller sales team, which is typical of infrastructure companies in an early adoption phase.
Notable leadership hires: Chief of Staff
Onehouse builds a fully managed data lakehouse platform on Apache Hudi, Apache Iceberg, and Delta Lake, deployed in customer VPCs. The company targets enterprises seeking to consolidate analytics, reporting, data science, and ML pipelines without vendor lock-in. The platform includes high-throughput ingestion pipelines, table optimization, ACID transactions, time travel, and native support for Presto, Trino, and Snowflake as downstream query engines. Core operational challenges center on scaling both the product and the underlying open-source ecosystem (Hudi roadmap and feature development appear across projects), as well as designing concurrency control and incremental indexing systems for distributed workloads.
Apache Hudi, Presto, Trino, Spark, Kafka, Kubernetes, Java, gRPC, React, TypeScript, Snowflake, AWS, GCP, Azure, dbt, Airflow, Parquet, Terraform, and CloudFormation.
The primary focus areas are optimizing the Apache Hudi engine, advancing Presto/Trino integration, designing concurrency control, and building next-generation data infrastructure. Open-source community engagement and deal execution are also active work streams.
Onehouse's technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.
This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.