Appen operates a data annotation and labeling platform serving AI model training, with a sprawling global workforce (hiring across 25+ countries) and a data-heavy tech stack built on AWS, Databricks, and Redshift. The hiring mix is heavily skewed toward junior data roles (70 of 116 open positions), paired with acute pain points around speech recognition accuracy, data governance, and compliance—a pattern typical of companies scaling crowdsourced annotation while managing quality at volume.
Notable leadership hires: Team Lead
Appen provides human-annotated training datasets for AI model development, covering supervised fine-tuning, reinforcement learning from human feedback (RLHF), and model evaluation. The platform combines an annotation and labeling interface with access to a global contributor network spanning over 200 countries. Core service areas include transcription, translation, computer vision labeling, and RLHF for large language models. The company operates at scale, managing multilingual datasets, voice projects, and specialized evaluation workflows, and serves enterprise clients building generative AI and LLM applications. Founded in 1996 and headquartered in Kirkland, Washington, Appen is a public company with 501–1,000 employees.
Appen uses AWS (Redshift, Glue, Athena, Kinesis, EMR, RDS), Databricks with Delta Lake and Unity Catalog, Apache Spark, Python, and Scala, alongside translation and localization tools including Trados, memoQ, and Smartcat. Power BI provides analytics on top of these data pipelines.
Active projects include multilingual dataset creation, transcription and voice projects (Jigglypuff), machine translation evaluation, video labeling, Japanese-to-Thai translation, and voice assistant understanding—spanning generative AI, LLM fine-tuning, and speech recognition workflows.