Scribd, Inc. Tech Stack

Digital library platform with four content verticals and AI-powered metadata infrastructure

Software Development · San Francisco, California · 201–500 employees · Founded 2007 · Privately Held

Scribd operates a multi-product content ecosystem (ebooks, audiobooks, presentations, and digital comics) serving a global audience. The tech stack points to a data- and ML-heavy engineering organization: Python, Scala, Spark, Databricks, and Delta Lake form the backbone, paired with SageMaker for model serving. Active projects cluster around generative AI metadata enrichment, content extraction for LLM training, and user acquisition funnel optimization, which suggests a shift toward AI-assisted curation and discovery rather than pure content hosting. Engineering accounts for 136 of the tracked roles, far more than any other department, consistent with infrastructure-intensive platform scaling.

Tech Stack: 32 technologies

What Scribd, Inc. Is Building

Challenges

  • Improving test reliability
  • Metadata extraction at scale
  • Reducing signup friction
  • Metadata inconsistencies at scale
  • Content format variability
  • Upload flow improvements
  • Scaling paid social spend
  • Improving LTV:CAC
  • Landing page friction
  • Optimizing user acquisition

Active Projects

  • Generative AI metadata enrichment platform
  • Frontend experiences for user acquisition and activation funnels
  • A/B testing optimization
  • Data-informed personalization features
  • Upload flow improvements
  • Content quality validation
  • Content extraction for ML/LLM
  • Structured testing plans
  • CRO testing roadmap
  • Attribution modeling

Hiring Activity

Accelerating · 220 roles · 220 in 30d

Department

Engineering: 136
Marketing: 45
Design: 19
Data: 14
Product: 8

Seniority

Mid: 111
Senior: 81
Staff: 23
Lead: 7
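
The counts above can be turned into percentage shares with a few lines of Python. This is a minimal sketch: the dictionary names and the `shares` helper are illustrative, and the figures are copied from the listing above. Both breakdowns sum to 222, slightly above the 220-role headline; the profile describes its figures as estimates.

```python
# Role counts copied from the Hiring Activity listing; names are illustrative.
by_department = {"Engineering": 136, "Marketing": 45, "Design": 19, "Data": 14, "Product": 8}
by_seniority = {"Mid": 111, "Senior": 81, "Staff": 23, "Lead": 7}


def shares(counts):
    """Return each bucket's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


department_shares = shares(by_department)
seniority_shares = shares(by_seniority)

for name, pct in department_shares.items():
    print(f"{name}: {pct}%")
```

Computing shares against each breakdown's own total, rather than the headline figure, keeps the percentages summing to roughly 100 even when the totals disagree.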

About Scribd, Inc.

Scribd, Inc. operates four digital content products: Scribd (ebooks and audiobooks), SlideShare (presentation sharing), Everand (curated digital collections), and Fable (digital comics). The company serves a global audience through subscription and free-tier models, with content sourced from publishers, independent authors, and user-generated uploads. The platform handles heterogeneous content formats (PDF, EPUB, video, presentations) and relies on AWS infrastructure (Lambda, ECS, SQS, ElastiCache) to manage scale. The organization is primarily engineering-driven, with a secondary focus on marketing and design. It is headquartered in San Francisco and hires across North America.

Headquarters: San Francisco, California
Company Size: 201–500 employees
Founded: 2007
Hiring Markets: United States, Canada, Mexico

Frequently Asked Questions

What is Scribd's tech stack?

Python, Scala, and Ruby on Rails for application logic; Apache Spark, Databricks, and Delta Lake for data processing; AWS services (Lambda, ECS, SQS, ElastiCache) for compute, messaging, and caching; SageMaker for ML model serving; and GraphQL for the API layer.

What is Scribd working on?

Generative AI metadata enrichment, content extraction for ML/LLM training, A/B testing optimization, user acquisition funnel improvements, and data-informed personalization features.

How this profile is built

Scribd, Inc.'s technology stack, projects, and hiring signals are inferred from public hiring and company data (career pages, public listings, and company web presence), then clustered and de-duplicated. Figures are estimates that refresh over time.

This is not an official vendor or customer list. It is a technology-adoption signal inferred from public data, intended for B2B research.
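
The profile does not say how mentions are clustered and de-duplicated. As a rough illustration only, the sketch below collapses spelling variants of technology names and counts how many postings mention each one. The alias table, the `normalize` and `count_mentions` helpers, and the sample postings are all hypothetical and are not the site's actual pipeline.

```python
from collections import Counter

# Hypothetical alias table mapping spelling variants to one canonical name.
ALIASES = {
    "apache spark": "Spark",
    "spark": "Spark",
    "pyspark": "Spark",
    "delta lake": "Delta Lake",
    "deltalake": "Delta Lake",
    "aws sagemaker": "SageMaker",
    "sagemaker": "SageMaker",
}


def normalize(mention):
    """Map a raw mention to its canonical name, or return it trimmed if unknown."""
    key = " ".join(mention.lower().split())
    return ALIASES.get(key, mention.strip())


def count_mentions(postings):
    """Count postings per technology; variants within one posting count once."""
    counts = Counter()
    for mentions in postings:
        counts.update({normalize(m) for m in mentions})
    return counts


# Made-up sample postings, for illustration only.
sample_postings = [
    ["Apache Spark", "PySpark", "Delta Lake"],
    ["spark", "SageMaker "],
    ["DeltaLake", "AWS SageMaker"],
]
mention_counts = count_mentions(sample_postings)
```

Building a set per posting before counting is the de-duplication step: a posting that names both "Apache Spark" and "PySpark" contributes one Spark signal, not two.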