about 3 hours ago
Staff Software Engineer - Platform & Infrastructure
Remote - USA
$210,400-$302,500 / year
Full-time, Senior, Remote, Cybersecurity
Description
You will lead foundational efforts on the Platform Infrastructure team, building and evolving core systems (compute, orchestration, data platform) that power Abnormal Security's AI-driven cybercrime prevention. Your work will shape a self-service infrastructure platform, drive operational excellence, and partner with ML teams to enable AI-native development at scale.
Requirements
- Proven experience building and scaling data-intensive, distributed backend systems in high-growth environments.
- 5+ years as a Senior/Staff engineer building platforms, tools, or infrastructure that increase engineering velocity and reliability.
- Strong track record as a change agent: reshaping infrastructure strategy and shipping self-service platform offerings in startup settings.
- Depth in at least two of: Compute (EC2, autoscaling, container runtimes, networking, security), Orchestration (Kubernetes/EKS, controllers, scheduling, multi-cluster), Data Platform (Kafka, Spark, S3, PostgreSQL, DynamoDB, Redis, etc.).
- Hands-on experience with Python, Golang, Terraform, PostgreSQL, Kafka, Redis, OpenSearch, AWS, and Kubernetes.
- Strong IaC, observability, and SRE fundamentals (SLOs, error budgets, incident management, capacity planning).
Responsibilities
- Shape core areas of Platform Infrastructure: compute, orchestration, data platform.
- Design and drive the platform architecture roadmap to support the AI/ML portfolio.
- Partner with product ML teams on their workflows to enable a platform-first, self-service operating model.
- Raise the operational excellence bar (SLOs, availability, incident response, on-call hygiene).
- Act as technical lead: define quarterly roadmaps, mentor engineers, land cross-team initiatives.
- Champion AI-native software development, guiding teams on architecture, data gravity, feature stores, model interfaces, and evaluation pipelines.
- Own cost-conscious engineering: optimize design and operations for performance, reliability, and spend.
- Instill platform product practices: crisp APIs, docs, SLAs/SLOs, telemetry, paved paths.