AI data services for multilingual model training and fine-tuning
DATAmundi provides data annotation, collection, and curation services for AI model development, with a focus on multilingual datasets. The tech stack combines general-purpose languages (Python, Java, C/C++) with storage optimization technologies (SSD, NVMe, ASIC), which suggests investment in large annotation pipelines and validation workflows. Active projects span LLM evaluation and training content, virtual assistants, and demand forecasting tools, indicating customers across NLP and generative AI use cases.
Notable leadership hires: Quality Lead
DATAmundi is an AI data services company founded in 2012, headquartered in Westborough, Massachusetts, serving enterprises, AI startups, and research organizations. The core offering is the creation, annotation, and management of multilingual training data, paired with domain expertise, quality assurance, and compliance frameworks. The company operates a global contributor network and serves customers building custom AI models across industries. Teams are distributed across the US, Argentina, Canada, Poland, Portugal, Spain, and India.
DATAmundi provides AI data services, including data annotation, collection, labeling, and curation, with a specialization in multilingual datasets. Services include LLM fine-tuning support, data validation workflows, and quality management for enterprises building custom AI models.
DATAmundi is headquartered in Westborough, Massachusetts, and operates across seven countries: the United States, Argentina, Canada, Poland, Portugal, Spain, and India.
The primary stack includes Python, Java, C/C++, JavaScript, SQL, and Git. The infrastructure emphasizes storage optimization with SSD, NVMe, and ASIC technologies to support large-scale data pipeline validation and annotation workflows.