HPC Production Engineer

Sydney
Full-time · Senior · Financial Services

Description

You will design, implement, and maintain high-performance compute and storage systems, collaborate with researchers to optimize their use of HPC infrastructure, and provide operational support as part of a global team. The role also involves building tooling for software deployment, monitoring system performance, and participating in large coordinated maintenance operations.

Requirements

  • 5+ years of professional HPC experience with parallel filesystems (e.g., Lustre, GPFS)
  • 5+ years of Linux systems administration
  • Proficiency in at least one programming/scripting language (Go, Python, C)
  • Experience with distributed systems design and debugging
  • Experience with configuration management tools (SaltStack, Ansible, Puppet)

Responsibilities

  • Design, implement, maintain, and support HPC compute and storage systems
  • Implement performance monitoring and fault monitoring systems
  • Build tooling to compile, package, install, and upgrade software and OS at scale
  • Collaborate with researchers to optimize their use of HPC infrastructure
  • Provide operational support on a rotating basis