L3 Support Engineer
Israel
Full-time, Senior, Cloud Computing
Description
You will build and operate the L3 Support Line for Nebius, conducting deep investigations into server, firmware, and Linux-level issues across datacenters. You will drive cross-site pattern detection, collaborate with R&D and vendors to deliver permanent fixes, and create runbooks to elevate frontline support teams.
Requirements
- Deep expertise in Linux diagnostics and server hardware
- Experience with firmware (BIOS/BMC) troubleshooting and validation
- Ability to perform root cause analysis and cross-site pattern detection
- Strong collaboration skills with R&D and vendor teams
- Experience creating technical documentation and runbooks
Responsibilities
- Lead root cause analysis for GPU failures, firmware issues, Linux-level faults, and HW/SW interactions
- Detect recurring patterns across datacenters and convert findings into durable fixes
- Own technical workstreams during high-severity incidents
- Build evidence packs and drive escalations with ODM and R&D for component and platform-level resolutions
- Support firmware validation and rollout planning, and create scalable runbooks and troubleshooting guides