Private-company data platform covering 130M+ businesses with ML-driven enrichment
Veridion operates a data enrichment platform focused on private-company intelligence, built on a distributed architecture (Spark, Cassandra, Kafka, Kubernetes) designed to handle petabyte-scale ingestion and processing. The tech stack and project list point to an organization scaling data extraction and normalization at the infrastructure level (web scraping via Puppeteer/Playwright, NLP model training, and API resilience) while working through coverage gaps and the difficulty of extracting data from unstructured sources. The mix of engineering and data hiring (11 of 17 roles) suggests they treat data pipeline maturity as a core competitive moat.
Veridion is a Romanian data intelligence company founded in 2019, providing business enrichment datasets covering over 130 million private companies. The platform serves procurement, insurance, underwriting, and market intelligence teams with real-time classification and supplier-sourcing data. Built on Apache Spark, Cassandra, and Kafka, the product ingests and normalizes company data from web sources using NLP and machine learning, then exposes it via client-facing APIs. The team operates from Bucharest with 11–50 employees, hiring primarily in Romania.
Veridion's core stack includes Apache Spark for distributed processing, Cassandra for distributed data storage, Kafka for streaming, Kubernetes for orchestration, Elasticsearch for search, and PostgreSQL for transactional data. Processing code is written in Python, Scala, and Java, and web extraction uses Puppeteer/Playwright.
Veridion's enrichment data covers 130M+ private companies globally. The platform extracts and normalizes company information from unstructured web sources using NLP and machine learning to support procurement, underwriting, and market intelligence use cases.