Compiler platform for deploying AI models across edge hardware
Roofline is building a compiler-first platform for edge AI deployment on top of MLIR and LLVM, with deep support for PyTorch, TensorFlow, and ONNX. The tech stack and active projects point to a compiler-heavy engineering org focused on optimization primitives (vector instructions, tiling strategies, quantization flows) rather than a higher-level inference framework, positioning the company as infrastructure for hardware diversity (RISC-V, CUDA, Vulkan, NPU) rather than a single-target solution. Early-stage hiring (intern-heavy, five engineers total) and open-source partnerships suggest a research-to-production arc.
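To make "quantization flows" concrete, here is a minimal sketch of symmetric int8 post-training quantization, one of the standard optimization primitives in this space. The formula is textbook, not Roofline's specific implementation, and the function names are illustrative.

```python
# Symmetric int8 quantization sketch (standard technique, not Roofline's API).
def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale factor."""
    scale = max(abs(v) for v in values) / 127.0  # largest magnitude maps to ±127
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02]
q, s = quantize_int8(weights)
# q == [50, -127, 2], s ≈ 0.01
```

In a compiler such flows typically run per-tensor or per-channel over model weights, trading a small accuracy loss for 4x smaller memory footprint and faster integer arithmetic on edge hardware.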
Roofline solves fragmentation in edge AI deployment by abstracting away hardware complexity. Users import trained models (PyTorch, TensorFlow, JAX, ONNX) and compile them once to run across CPUs, GPUs, NPUs, and custom accelerators with a single Python interface. The company is based in Cologne, Germany, and focuses on compiler infrastructure, quantization flows, and heterogeneous execution orchestration. Current engineering efforts center on improving compiler reliability, building benchmarking infrastructure to validate accuracy, and developing novel optimization techniques like their Blockbuster tiling strategy.
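The compile-once, deploy-anywhere workflow described above can be sketched as follows. This is a hypothetical mock, not Roofline's actual Python interface: `import_and_compile`, `CompiledModel`, and `deploy` are illustrative names, and framework detection by file extension is a deliberate simplification.

```python
# Hypothetical sketch of a compile-once workflow; all names are illustrative,
# not Roofline's real API.
from dataclasses import dataclass, field

@dataclass
class CompiledModel:
    """A hardware-agnostic artifact produced once from a framework model."""
    source_format: str
    targets: list = field(default_factory=list)

    def deploy(self, target: str) -> str:
        # One artifact, many backends: deployment only selects a target.
        self.targets.append(target)
        return f"{self.source_format} model running on {target}"

def import_and_compile(model_path: str) -> CompiledModel:
    # Framework detection by file extension (simplified for illustration).
    formats = {".pt": "PyTorch", ".pb": "TensorFlow", ".onnx": "ONNX"}
    ext = model_path[model_path.rfind("."):]
    return CompiledModel(source_format=formats.get(ext, "unknown"))

artifact = import_and_compile("resnet50.onnx")
print(artifact.deploy("cuda"))       # ONNX model running on cuda
print(artifact.deploy("riscv-npu"))  # ONNX model running on riscv-npu
```

The point of the sketch is the shape of the abstraction: the expensive import/compile step happens once, and the resulting artifact is retargeted to different hardware without touching the original framework model.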
Supported frameworks: PyTorch, TensorFlow, JAX, and ONNX. Models are imported into the compiler, then deployed across target hardware via a unified Python interface.
Supported hardware: CPUs, GPUs (CUDA, Vulkan), NPUs, and RISC-V architectures. The platform abstracts hardware diversity through MLIR and LLVM-based compilation.
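How an MLIR-based compiler abstracts this hardware diversity can be illustrated with per-target lowering pipelines. The dialect names below (`linalg`, `vector`, `gpu`, `nvvm`, `spirv`, `llvm`) are real MLIR dialects, but this particular target-to-pipeline mapping is an assumption for illustration, not Roofline's documented design.

```python
# Hypothetical per-target lowering table; dialect names echo real MLIR
# dialects, but the mapping itself is an illustrative assumption.
LOWERING_PIPELINES = {
    "cpu":    ["linalg", "vector", "llvm"],   # LLVM codegen for CPUs
    "cuda":   ["linalg", "gpu", "nvvm"],      # NVVM path for CUDA GPUs
    "vulkan": ["linalg", "gpu", "spirv"],     # SPIR-V path for Vulkan
    "riscv":  ["linalg", "vector", "llvm"],   # RISC-V via the LLVM backend
}

def lowering_plan(target: str) -> str:
    """Return the dialect-to-dialect lowering chain for one hardware target."""
    stages = LOWERING_PIPELINES.get(target)
    if stages is None:
        raise ValueError(f"unsupported target: {target}")
    return " -> ".join(stages)

print(lowering_plan("cuda"))  # linalg -> gpu -> nvvm
```

The design benefit is that frontends share the upper dialects (e.g. `linalg`), so adding a new accelerator means writing one new lowering path rather than reworking every framework importer.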