Custom silicon and compiler stack for AI inference at scale
Persimmons designs full-stack AI inference hardware and software, from custom silicon (ASICs, chiplets) through compilers (LLVM, MLIR, XLA, IREE) to communication libraries (NCCL, MPI), for datacenter and edge deployment. The engineering-heavy org (10 of 11 roles) skews senior, with active projects spanning chiplet verification, compiler optimization, and multi-node communication, a signal that the company is still solving internal scaling bottlenecks (communication across thousands of nodes, timing closure, ASIC verification complexity) rather than shipping a product.
Founded in 2023, Persimmons builds custom inference silicon and the software stack required to deploy generative AI workloads efficiently across edge devices and large-scale HPC clusters. The company operates from San Jose with a small, senior engineering team. Its approach spans hardware design (ASIC and chiplet architecture), compiler infrastructure (LLVM/MLIR-based optimization), and communication protocols for distributed inference, targeting both latency optimization at the silicon level and scalability across thousands of compute nodes in datacenter environments.
LLVM, MLIR, XLA, IREE, PyTorch, TensorFlow, JAX, Halide, C++, Python, SystemVerilog, Verilog, UVM, NCCL, ROCm, MPI, ASIC design tools (Cadence, Synopsys), and PCIe. No active adoptions or replacements recorded.
Chiplet design and verification, AI model compiler optimization, novel scheduling algorithms, and communication libraries for multi-node AI clusters. Internal challenges include scaling communication across thousands of nodes, timing closure in ASIC design, and runtime latency optimization for modern AI workloads.
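To make the multi-node communication challenge concrete, here is a minimal in-process sketch of the ring all-reduce pattern, the kind of collective that libraries such as NCCL and MPI implement for distributed inference and training. This is an illustrative simulation, not Persimmons code: the function name and data layout are assumptions, and real implementations overlap the sends with computation on accelerator links.

```python
def ring_allreduce(node_buffers):
    """Sum-reduce equal-length vectors across N simulated nodes.

    Each node's buffer is split into N chunks. In the reduce-scatter
    phase, chunks circulate around the ring and accumulate, so each
    node ends up holding one fully reduced chunk. In the all-gather
    phase, those reduced chunks circulate again until every node holds
    the complete sum. Per-node traffic is ~2*(N-1)/N of the data size,
    roughly independent of N, which is why the pattern scales to
    thousands of nodes.
    """
    n = len(node_buffers)
    size = len(node_buffers[0])
    assert size % n == 0, "buffer must split evenly into n chunks"
    chunk = size // n
    bufs = [list(b) for b in node_buffers]  # copy; inputs stay intact

    # Reduce-scatter: in step s, node r sends chunk (r - s) mod n to
    # node r+1, which adds it into its own copy of that chunk.
    for step in range(n - 1):
        msgs = []
        for r in range(n):  # snapshot all sends before applying them
            c = (r - step) % n
            msgs.append((r, c, bufs[r][c * chunk:(c + 1) * chunk]))
        for r, c, data in msgs:
            dst = (r + 1) % n
            for i, v in enumerate(data):
                bufs[dst][c * chunk + i] += v

    # All-gather: node r now holds the fully reduced chunk (r + 1) mod n;
    # circulate the reduced chunks, overwriting, until all nodes have all.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            c = (r + 1 - step) % n
            msgs.append((r, c, bufs[r][c * chunk:(c + 1) * chunk]))
        for r, c, data in msgs:
            dst = (r + 1) % n
            bufs[dst][c * chunk:(c + 1) * chunk] = data
    return bufs
```

Staging each step's messages before applying them mimics the synchronous exchanges of a real ring; the bandwidth-optimal chunking is the same idea behind NCCL's ring algorithm and MPI_Allreduce ring variants.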