1d ago
Senior ML Engineer
Poland
$160k+ / year (est.)
full-time · senior · Remote · software
Tech Stack
About This Role
You'll join the Kimchi team to optimize LLM inference performance, directly improving customer p99 latency and company margins. You'll own the technical direction of inference optimization, tuning kernels, quantization, and scheduling. This is a high-impact, high-autonomy role where your work on KV cache utilization and throughput has immediate bottom-line effects.
What You'll Do
- Push throughput via batching, speculative decoding, and kernel tuning on vLLM, SGLang, and TensorRT-LLM.
- Attack latency by profiling and fixing actual bottlenecks (compute, memory, scheduling, networking).
- Optimize KV cache with paged attention, prefix caching, eviction policies, and quantized KV.
- Quantize weights and activations (INT8, INT4, FP8) while measuring quality on real workloads.
- Scale inference across nodes with distributed topologies and network-aware placement.
Requirements
- 5+ years building ML inference or training infrastructure at production scale.
- Strong Python skills with production services experience.
- Hands-on experience with vLLM, SGLang, or TensorRT-LLM and understanding of inference engine performance.
- Fluency with quantization tradeoffs, including measuring quality regressions.
Nice to Have
- Experience with distributed systems (collective communication, sharding, multi-GPU setups).
- Bias toward measurement and instrumentation to distinguish real wins from artifacts.
- Self-direction and excitement about a wide mandate.
Benefits & Perks
- Competitive salary and equity options.
- Flexible remote-first global environment.
- Learning budget for conferences and courses.
- Annual hackathon and team-building budget.
- Equipment budget and extra days off.
Hiring Process
Estimated timeline: 3-5 weeks
- 1. Screening call with recruiter · 30 min
- 2. Hiring manager interview · 45 min
- 3. Technical interview (system design) · 60 min
- 4. Live coding · 60 min
- 5. Culture check interview with an executive · 45 min
This description was AI-summarized.