1d ago
Reliability Operations Engineer
Penang, Malaysia
$80k-$100k / year
full-time · mid
About This Role
You'll support the operational reliability of robotic and cloud systems by handling Tier 2 escalations and performing technical investigations. You'll work closely with senior engineers and SREs to refine runbooks and strengthen system health. This role contributes to incident response with clear communication and timely escalation.
What You'll Do
- Lead incident investigations during daytime hours.
- Respond to escalations from Tier 1 support using runbooks.
- Update runbooks and operational documentation.
- Run existing automations and enhance tooling and scripts.
- Use observability tools to interpret metrics and logs.
Requirements
- 2–4 years of experience in Reliability Operations or SRE.
- Experience with Tier 1 or Tier 2 investigations.
- Proficiency with Linux for system diagnostics.
- Familiarity with GCP or other cloud platforms.
Nice to Have
- Experience with high-severity incident response.
- Exposure to robot fleets or IoT systems.
- Scripting ability to improve workflows.
Hiring Process
Estimated timeline: 2-4 weeks · AI estimate
1. Recruiter call · 30 min
2. Technical interview · 60 min
3. On-site interview · 120 min