6 days ago

DevOps - Platform Engineer

United States
Full-time, Senior, Remote, Healthcare Technology

Description

You'll be a core contributor to our infrastructure, owning the systems that keep August Health fast, secure, and resilient as we scale. This is a high-autonomy, high-impact role: you'll work closely with our engineering team to shape how we build, deploy, and operate software, with real influence over architecture decisions and engineering culture.

Requirements

  • Strong hands-on experience with AWS — particularly EKS, Cognito, Aurora, RDS, Lambda, and VPC; you can make smart tradeoff decisions across services and know when to reach for each
  • Proficiency with Kubernetes in production — you've operated clusters at scale and know how to debug when things go wrong
  • Experience with infrastructure as code, ideally Pulumi or a similar tool (Terraform, CDK)
  • Comfort with GitHub Actions or similar CI/CD systems — you've built and optimized pipelines, not just used them
  • A security-minded approach — you think about least privilege, secrets management, and compliance by default; experience working toward or maintaining SOC 2 and/or HIPAA compliance is important, not just a nice-to-have
  • Solid observability experience — you're comfortable with Prometheus, have instrumented backend services before, and can look at an existing metrics setup and form a point of view on what's missing or misleading
  • Familiarity with data pipeline infrastructure, including tools like Snowflake and Apache NiFi
  • Strong communication skills — you can explain infrastructure decisions to non-infrastructure engineers, and you write good documentation
  • Self-direction — you can identify what needs doing, prioritize well, and drive projects to completion without heavy oversight

Responsibilities

  • Infrastructure as code — managing and evolving our AWS infrastructure using Pulumi, with a focus on reliability, cost efficiency, and maintainability
  • Kubernetes platform — operating and improving our K8s clusters: workload scheduling, resource management, networking, and observability
  • CI/CD pipelines — owning and optimizing our GitHub Actions workflows to keep builds fast, feedback tight, and deployments safe
  • Security & compliance — hardening our infrastructure posture, supporting audit readiness, and implementing controls that meet the requirements of operating in healthcare
  • Data pipeline infrastructure — supporting the reliable operation of our data engineering workflows
  • LLM tooling — deploying and maintaining prompt tracing, evaluation, and observability tools as we integrate AI capabilities into our product
  • Network & access — managing secure, zero-trust connectivity via Tailscale across our distributed infrastructure
  • Disaster recovery & incident response — designing, documenting, and regularly testing DR/IR processes so we're always ready