AI inference platform for cost-efficient LLM deployment at scale
ElastixAI is building a systems-level inference platform from hardware primitives up: the stack spans RTL design, LLVM/XLA compilers, and deployment frameworks (vLLM, SGLang, TensorRT-LLM, DeepSpeed). The hiring pattern (nearly all senior engineers, no sales or product roles yet) and the project focus on kernel decomposition and hardware roadmaps suggest the company is still in a research-to-production phase, attacking inference efficiency and throughput as the core constraints.
ElastixAI, founded in 2025, is a Seattle-based startup building next-generation inference infrastructure for generative AI workloads. The platform targets cost and efficiency as the primary levers: the team is working on LLM operation decomposition, performance-power-area (PPA) trade-off analysis, and custom hardware roadmaps to cut inference latency and relieve throughput bottlenecks. Current product work centers on core engine architecture and RTL verification, positioning the company to serve infrastructure teams at hyperscalers and enterprises running large language models at scale.
Core stack spans hardware design (Verilog, SystemVerilog), compilers (MLIR, LLVM, XLA), ML frameworks (PyTorch, TensorFlow, JAX), and inference engines (vLLM, SGLang, TensorRT-LLM, DeepSpeed). Deployment targets AWS, GCP, and Azure via Kubernetes and Docker.
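As a concrete reference point for the serving layer, here is a minimal sketch of batched offline inference with vLLM, one of the engines named in the stack. The model name, prompts, and sampling settings are illustrative assumptions, not details from ElastixAI's deployment.

```python
# Minimal vLLM offline-inference sketch (illustrative; not ElastixAI code).
from vllm import LLM, SamplingParams

prompts = [
    "Explain KV-cache paging in one sentence.",
    "Why does batching improve GPU utilization?",
]
# Sampling settings are arbitrary placeholders.
sampling_params = SamplingParams(temperature=0.7, max_tokens=64)

# LLM() loads the model weights and allocates paged KV-cache blocks.
# The model identifier here is an assumption for the example.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# generate() schedules the prompts with continuous batching and returns
# one RequestOutput per prompt.
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
```

Engines like vLLM, SGLang, and TensorRT-LLM compete on exactly the throughput and latency bottlenecks the profile describes, which is why they sit at the top of the stack above the compiler and hardware layers.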
Core projects include AI inference engine development, RTL design and verification, LLM operation decomposition into kernel primitives (sketched below), and hardware roadmap design. The primary engineering pain points are inference efficiency, throughput, and latency.
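To make "operation decomposition into kernel primitives" concrete, the sketch below breaks one common LLM operation, scaled dot-product attention, into the individual primitives (matmul, scale, softmax, matmul) that a compiler or custom accelerator would schedule and analyze for PPA trade-offs. This is a generic PyTorch illustration under that assumption, not ElastixAI's actual decomposition.

```python
# Illustrative decomposition of attention into kernel primitives
# (generic sketch; not ElastixAI code).
import math
import torch

def attention_as_primitives(q, k, v):
    """q, k, v: (batch, heads, seq, head_dim) tensors."""
    # Primitive 1: batched matmul Q @ K^T
    scores = torch.matmul(q, k.transpose(-2, -1))
    # Primitive 2: elementwise scale by 1/sqrt(head_dim)
    scores = scores / math.sqrt(q.size(-1))
    # Primitive 3: row-wise softmax
    probs = torch.softmax(scores, dim=-1)
    # Primitive 4: batched matmul with V
    return torch.matmul(probs, v)

q = k = v = torch.randn(1, 8, 128, 64)
out = attention_as_primitives(q, k, v)
assert out.shape == (1, 8, 128, 64)
```

Each primitive can then be costed separately (FLOPs, memory traffic, power) when evaluating hardware roadmap options, which is the kind of analysis the project list implies.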