Operations Engineer, Fleet Reliability at Careers - Europe | CoreWeave — CareerPair

5h ago

Operations Engineer, Fleet Reliability

Poland

full-time, mid, cloud computing

Description

You will drive server nodes through provisioning and validation, troubleshooting hardware and software issues to maximize the uptime of high-performance supercomputing clusters. This role involves configuring and maintaining large-scale GPU clusters, working shifts from 7 am to 9 pm, and participating in on-call rotations. Onboarding training at the US headquarters is required within the first month.

Requirements

2+ years of experience in data center or on-prem infrastructure
Strong Linux system administration and networking knowledge
Ability to troubleshoot hardware and software issues
Bachelor's degree or equivalent experience
Ability to travel to the US on short notice (ESTA or B-1 visa)

Responsibilities

Provision and validate batches of server nodes
Troubleshoot node and cluster issues efficiently
Configure and maintain large-scale GPU clusters
Perform system maintenance tasks reliably
Participate in on-call rotations including after-hours and weekends

Careers - Europe | CoreWeave

Get ready to build with us. Learn about CoreWeave career opportunities in the EU.
