1d ago

Reliability Operations Engineer

Penang, Malaysia

$80k-$100k / year

full-time · mid

💼 About This Role

You'll support the operational reliability of robotic and cloud systems by handling Tier 2 escalations and performing technical investigations. You'll work closely with senior engineers and SREs to refine runbooks and strengthen system health. This role contributes to incident response with clear communication and timely escalation.

🎯 What You'll Do

  • Lead incident investigations during daytime hours.
  • Respond to escalations from Tier 1 support using runbooks.
  • Update runbooks and operational documentation.
  • Run existing automations and enhance tooling and scripts.
  • Use observability tools to interpret metrics and logs.

📋 Requirements

  • 2–4 years of experience in Reliability Operations or SRE.
  • Experience with Tier 1 or Tier 2 investigations.
  • Proficiency with Linux for system diagnostics.
  • Familiarity with GCP or other cloud platforms.

✨ Nice to Have

  • Experience with high-severity incident response.
  • Exposure to robot fleets or IoT systems.
  • Scripting ability to improve workflows.

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter call · 30 min
  2. Technical interview · 60 min
  3. On-site interview · 120 min