1h ago
Staff Site Reliability Engineer - Data Center
Boston, MA
$165,750-$224,450 / year
full-timesenior RemoteHealthcare AI
Tech Stack
Description
You will design, build, and operate PathAI's hybrid cloud/on-prem environment to support AI-powered pathology. Implement SRE best practices, automate infrastructure, and improve reliability for critical production systems that accelerate drug development and patient diagnosis.
Requirements
- 8+ years of relevant experience
- Automation with scripting, Ansible, Python/GoLang
- Experience with observability tools (Datadog/Grafana/Prometheus)
- Infrastructure as Code (Terraform/Cloudformation)
- Administer physical hardware (iDRAC/IPMI/Nvidia UFM/Juniper Systems)
Responsibilities
- Implement SRE best practices focusing on users, monitoring, and automation
- Engineer infrastructure patterns for AWS with security, reliability, and scalability
- Design, build, and operate datacenter for Machine Learning team
- Integrate on-premises datacenter with cloud infrastructure for seamless hybrid cloud
- Participate in on-call rotations and incident response
0 views 0 saves 0 applications