AI inference hardware and silicon optimization for edge deployment
Taalas designs AI-optimized silicon and inference serving infrastructure, with a tech stack spanning hardware description languages (Verilog, VHDL), ML frameworks (PyTorch, vLLM), and container orchestration (Kubernetes, Traefik). The company is actively building inference serving clusters purpose-built for its own AI model chips, with LoRA swapping to support multi-tenant serving, indicating a vertical-integration strategy from silicon design through model deployment.
Taalas is a Toronto-based semiconductor company focused on silicon optimization for AI inference workloads. The team operates at small scale (11–50 employees), with hiring concentrated in Canada and weighted toward engineering roles. Core work spans hardware design (Verilog/VHDL), ML model optimization (PyTorch/CUDA), and infrastructure-as-code for inference serving. Current priorities include building and adapting inference servers to interface with the company's proprietary AI model chips, establishing lab automation and regression testing pipelines, and reducing internal operational friction around purchase order tracking and expense management.
Taalas is developing inference serving clusters optimized for its AI model chips, implementing LoRA swapping for multi-tenant environments, automating lab test and characterization workflows, and building supporting infrastructure (Kubernetes/Traefik-based orchestration, regression testing automation).
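Taalas' implementation of LoRA swapping is not public. As a rough sketch of the idea behind multi-tenant LoRA serving, an inference server keeps a bounded number of per-tenant adapters resident (device memory is the scarce resource) and evicts the least recently used adapter when a new tenant's request arrives. All names below (`LoRAAdapterCache`, `load_fn`) are hypothetical, not Taalas code:

```python
from collections import OrderedDict

class LoRAAdapterCache:
    """LRU cache mapping tenant IDs to loaded LoRA adapters.

    Hypothetical sketch: `capacity` stands in for the number of
    adapters that fit in accelerator memory; the least recently
    used adapter is evicted when a new tenant's adapter is loaded.
    """

    def __init__(self, capacity, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn          # e.g. reads adapter weights from storage
        self._cache = OrderedDict()     # tenant_id -> adapter handle
        self.evictions = 0

    def get(self, tenant_id):
        if tenant_id in self._cache:
            # Cache hit: mark this adapter as most recently used.
            self._cache.move_to_end(tenant_id)
            return self._cache[tenant_id]
        if len(self._cache) >= self.capacity:
            # Cache full: evict the least recently used adapter.
            self._cache.popitem(last=False)
            self.evictions += 1
        adapter = self.load_fn(tenant_id)
        self._cache[tenant_id] = adapter
        return adapter
```

In a real serving stack this role is typically played by the inference engine itself (vLLM, listed in the stack above, ships multi-LoRA support), with the cache policy deciding which tenants pay the adapter-load latency on a request.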
Hardware design: Verilog, VHDL
ML/inference: PyTorch, vLLM, CUDA
Infrastructure: Kubernetes, Traefik, Prometheus, Grafana
Languages: C/C++, Python, Perl
Cloud/OS: Azure, Linux, macOS