HPC Production Engineer
Sydney
full-time, senior, Financial Services
Description
You will design, implement, and maintain high-performance compute and storage systems, collaborate with researchers to optimize their use of HPC infrastructure, and provide operational support as part of a global team. The role also involves building tooling for software deployment, monitoring system performance, and participating in large, coordinated maintenance operations.
Requirements
- 5+ years professional experience in HPC with parallel filesystems (e.g., Lustre, GPFS)
- 5+ years Linux systems administration
- Proficiency in at least one programming/scripting language (Go, Python, C)
- Experience with distributed systems design and debugging
- Experience with configuration management tools (SaltStack, Ansible, Puppet)
Responsibilities
- Design, implement, maintain, and support HPC compute and storage systems
- Implement performance monitoring and fault monitoring systems
- Build tooling to compile, package, install, and upgrade software and OS at scale
- Collaborate with researchers to optimize their use of HPC infrastructure
- Provide operational support on a rotating basis