6h ago

Site Reliability Engineer

Lisbon

$55k-$68k / year

full-time · mid · Remote · software · Visa Sponsor

💼 About This Role

You'll ensure the reliability, availability, and scalability of systems at a fast-growing AI company. You'll implement automation, monitoring, and performance optimization strategies to minimize downtime and improve resilience. This onsite role in Lisbon includes relocation support.

🎯 What You'll Do

  • Design scalable, reliable, and fault-tolerant systems
  • Develop observability tools (Prometheus, Grafana, Datadog, ELK)
  • Automate infrastructure provisioning and incident response with IaC
  • Optimize system performance and incident response workflows

📋 Requirements

  • 4+ years in SRE, DevOps, or Systems Engineering
  • Strong knowledge of cloud platforms (AWS, Azure, GCP)
  • Experience with observability tools (Prometheus, Grafana, ELK, Datadog)
  • Proficiency in Infrastructure as Code (Terraform, CloudFormation)

✨ Nice to Have

  • Hands-on experience with containerization and orchestration
  • Knowledge of security best practices and compliance
  • Experience with incident management and root cause analysis

🎁 Benefits & Perks

  • 🍎 Apple hardware ecosystem for work
  • 💰 Annual Bonus
  • 🏥 Top-tier Health and Life Insurance
  • 🚌 Transportation Budget
  • 💳 Coverflex benefits package

📨 Hiring Process

Estimated timeline: 2-4 weeks · AI estimate

  1. Recruiter Screen · 30 min
  2. Technical Interview · 60 min
  3. System Design Interview · 60 min
  4. Hiring Manager Interview · 45 min
  5. Reference Check · 15 min

🚩 Heads Up

  • Requirement for no AI assistance in application may deter some candidates
  • Vague company description without specific product details