Data and AI platform spanning clouds, data centers, and edge environments
Cloudera operates a data and AI platform designed to run across hybrid and multi-cloud infrastructure: AWS, Azure, GCP, and on-premises. The tech stack shows a company deeply embedded in open-source big-data infrastructure (Kafka, Spark, Flink, Hadoop ecosystem) while actively adopting container orchestration (Kubernetes, Helm, Rancher) and infrastructure-as-code tooling (Terraform, OpenTofu, Crossplane), which indicates a strategic shift toward cloud-native deployments. Engineering dominates hiring (165 roles), and active projects center on next-generation ML/AI platforms, cloud-native SaaS design, and low-latency inference services. These align with pain points around accelerating ML-to-production workflows and optimizing cost at scale.
Notable leadership hires: Technical Lead, Partner Sales Lead, Cloud FinOps Lead, Engineering Lead, Account Director
Cloudera provides a data and AI platform for enterprises operating across multiple infrastructure environments: public clouds (AWS, Azure, GCP), on-premises data centers, and edge deployments. The platform handles structured and unstructured data at scale, underpinned by open-source components (Apache Kafka, Spark, Flink, Hive, HBase) layered with proprietary features for ML/AI workflows, security (Kerberos, LDAP), and analytics (Trino, Impala). The company sells to large organizations in finance, healthcare, government, and technology sectors. With 1,001–5,000 employees, Cloudera operates a distributed workforce across 18+ countries, with significant engineering presence in Hungary, India, and Canada. Active product development focuses on cloud-native architecture, low-latency ML inference, and storage modernization (Apache Ozone adoption signals movement away from HDFS limitations).
Core: Apache Kafka, Spark, Flink, Hive, HBase, Hadoop/HDFS. Cloud platforms: AWS, Azure, GCP, OpenShift. Orchestration: Kubernetes, Docker. Infrastructure: Terraform, OpenTofu, Helm, Rancher. Languages: Java, Python, R, Scala, JavaScript. Analytics: Trino, Impala. Security: Kerberos, LDAP.
Active projects: next-generation AI and ML platforms, cloud-native SaaS design and scaling, low-latency foundation model services, Apache Ozone storage features, Hive ecosystem enhancements, and a user-driven data analytics application. Pain points addressed: ML-to-production acceleration, HDFS limitations, cost optimization, and vulnerability detection speed.