Senior Site Reliability Engineer (Linux)
The Senior Site Reliability Engineer is a role for the motivated coder/hacker/engineer who wants to solve problems at the root cause, in an elegant and sustainable way. You will be an instrumental part of our TechOps team, which exists to build and support the foundational tools that our product teams use to build products our customers love and trust. We care deeply about our delivery pipeline being simple, reliable, consistent, and fast. You will be successful in this role if you have a deep love for automation, building scalable systems, embracing new technologies, and sharing with teammates.
The Senior Site Reliability Engineer reports to the Manager, Site Reliability Engineering.
In your day-to-day, you will:
Monitor system activity 24x7 as part of an on-call rotation
Support all Daxko software offerings and integrated third-party tools
Collaborate on cases escalated to TechOps Support and build long-term, automated solutions for recurring cases
Identify and resolve technical debt items to make other engineers more efficient
Coordinate with agile development teams, DBAs, implementation, and support to ensure the production environment is healthy and stable
Identify repetitive tasks and automate them (spinning up new environments, deployments, etc.)
Build, support, and administer all aspects of Daxko's continuous product delivery pipeline
Work with core components such as load balancers, firewalls, etc.
Make it painless for product teams to develop, test, deploy, and monitor by providing clear, documented frameworks around our operational systems
Execute our disaster recovery plan, ensuring it is up to date and thoroughly tested
Mentor team members as a subject-matter expert
Troubleshoot failed system jobs and services, working with core development teams as needed to ensure operational stability and efficiency