
Senior ML Systems Engineer — C++/Rust
A high-performance computing startup is building cutting-edge machine learning infrastructure that reimagines how models are composed, optimized, and deployed across diverse hardware targets. The team is developing a powerful intermediate representation and infrastructure stack — an “LLVM for Neural Networks” — designed to enable deep optimization, modularity, and cross-framework portability for modern ML workloads.
We’re looking for ML systems engineers who thrive at the intersection of hardware-aware compilation and low-level systems engineering. This role is ideal for someone with a strong background in compiler internals, C++ or Rust development, and performance optimization. You’ll work across the stack, from intermediate representation design to backend code generation, to unlock efficiency on modern accelerators including GPUs, TPUs, and NPUs.
📍 Ankara / On-site
Responsibilities
Design and optimize backend code generation paths targeting CPUs, GPUs, TPUs, and custom accelerators
Extend the intermediate representation to support hardware-specific features
Contribute to a high-performance ML infrastructure stack using C++ and/or Rust
Implement optimization passes for operator fusion, memory layout planning, and static/dynamic scheduling (see the sketch after this list)
Integrate with low-level runtime and execution engines (e.g., CUDA, ROCm, XLA)
Collaborate with upstream open-source communities (e.g., MLIR, TVM, Halide)
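To give candidates a concrete flavor of the pass work mentioned above, here is a minimal, self-contained Rust sketch of an operator-fusion rewrite over a toy IR. The op set, field names, and single-pass structure are illustrative assumptions for this posting, not the team's actual codebase or API.

```rust
// Toy IR: a linear sequence of ops over numbered SSA-style values.
// All names and types here are illustrative, not the team's real IR.
#[derive(Debug, Clone)]
enum Op {
    MatMul { out: u32, lhs: u32, rhs: u32 },
    Add { out: u32, lhs: u32, rhs: u32 },
    // A fused kernel: out = (lhs x rhs) + bias, computed in one pass.
    FusedMatMulAdd { out: u32, lhs: u32, rhs: u32, bias: u32 },
}

/// Fuse a MatMul immediately followed by an Add that consumes its result.
/// A production pass would also verify that the intermediate value has no
/// other uses before rewriting; this sketch assumes it is dead after the Add.
fn fuse_matmul_add(ops: &[Op]) -> Vec<Op> {
    let mut fused = Vec::with_capacity(ops.len());
    let mut i = 0;
    while i < ops.len() {
        match (&ops[i], ops.get(i + 1)) {
            (
                Op::MatMul { out: mm_out, lhs, rhs },
                Some(Op::Add { out, lhs: a_lhs, rhs: a_rhs }),
            ) if a_lhs == mm_out || a_rhs == mm_out => {
                // The Add operand that is NOT the MatMul result becomes the bias.
                let bias = if a_lhs == mm_out { *a_rhs } else { *a_lhs };
                fused.push(Op::FusedMatMulAdd {
                    out: *out,
                    lhs: *lhs,
                    rhs: *rhs,
                    bias,
                });
                i += 2; // both original ops are consumed by the fused op
            }
            _ => {
                fused.push(ops[i].clone());
                i += 1;
            }
        }
    }
    fused
}

fn main() {
    // %2 = matmul %0, %1 ; %4 = add %2, %3  -->  %4 = fused(%0, %1) + %3
    let prog = vec![
        Op::MatMul { out: 2, lhs: 0, rhs: 1 },
        Op::Add { out: 4, lhs: 2, rhs: 3 },
    ];
    println!("{:?}", fuse_matmul_add(&prog));
}
```

Real passes of this kind additionally track use counts, handle non-adjacent producers and consumers, and respect hardware constraints when deciding whether a fused kernel is profitable.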
Requirements
3–5+ years of experience in systems programming, compiler development, or ML infrastructure
Strong proficiency in C++ and/or Rust for performance-critical development
In-depth knowledge of compiler architecture, IR transformations, and code generation
Familiarity with ML compilers and IR frameworks (e.g., LLVM, MLIR, TVM)
Solid understanding of hardware architecture and parallel computing models
Experience with numerical computation or symbolic graph-based systems
Nice to Have
Experience optimizing for GPUs or ML accelerators (e.g., Tensor Cores, NPUs)
Contributions to open-source ML infrastructure or compiler stacks
Background in custom runtime scheduling or memory management systems
Exposure to embedded or edge ML environments (e.g., low-power inference, mobile devices)