Site Reliability Engineer II

Location: Burlington, Vermont

Type: Full Time

Education: Bachelor's Degree

Experience: 3 - 5 Years

Cox Automotive is currently looking for a Software Engineer II, Site Reliability Engineer to join our team.

About the Team

The Software Engineer II, Site Reliability Engineer will be part of the Consumer & Marketing Solutions Site Reliability Engineering (SRE) team. The SRE team is an innovative team devoted to providing automated solutions and services for Kelley Blue Book and Autotrader at Cox Automotive to measure, evaluate, and plan for visible, reliable application delivery and maintenance.

About the Position

As a member of the SRE team, you will bring a collaborative style to leading efforts that raise the maturity of engineering practices across all agile teams delivering our products. The tools and use cases are diverse, and our challenge is to increase development velocity by optimizing various parts of the pipeline and to increase application stability. Much of our software development focuses on optimizing existing systems by measuring elasticity and saturation, building infrastructure through Infrastructure as Code (IaC), and eliminating or reducing toil through automation. We also look to instill core SRE practices in the engineering teams, including measuring SLIs/SLOs, increasing visibility and observability through monitoring tools, guiding chaos engineering efforts to improve overall resiliency, and leading Gameday and Production Readiness reviews across all engineering disciplines. We are looking for engineers who are passionate about automation and about owning best practices grounded in SRE principles to build scalable and highly reliable applications.

If you love to figure out how all the pieces are put together, and if automation and building tools to monitor and manage your applications sound interesting to you, we want to talk to you!

As a Software Engineer II, Site Reliability Engineer at Cox Automotive, you will:

  • Have a Bachelor’s degree in a related discipline and 3+ years of experience in a related field. The candidate could also have a different combination, such as a Master’s degree and 2 years of experience, or 7 years of experience in a related field.
  • Have a natural tendency to avoid toil and want to automate it away
  • Automate anything and everything! (testing, deploying, monitoring, etc.)
  • Take a complex and possibly not well-defined problem and come up with a technically reasonable solution
  • Take ownership of processes or solutions that can be shared across teams globally
  • Build and roll out solutions to be consumed by multiple teams
  • Have innate curiosity about how things work
  • Design and assist in the authoring of software tools that reliably manage application delivery & performance
  • Design and assist in the setup and maintenance of application monitoring and alerting
  • Engage with product/capability engineering teams to ensure best practices are implemented
  • Improve predictability and reliability of software releases, workflows, and operating software
  • Reduce complexity and streamline delivery by participating in the creation of and promoting the use of reusable code, tooling, systems, and solution patterns
  • Reduce application deployment windows by leading engineering teams towards a Continuous Deployment environment
  • Reduce mean time to recovery (MTTR) by helping to troubleshoot, monitor, alert, and automate recovery
  • Facilitate Gamedays and Production Readiness reviews to continue increasing resiliency in our applications

Qualifications:

  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug, optimize code, and automate routine tasks
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of drive
  • Experience designing, deploying, and supporting solutions in AWS
  • Understanding of Linux/Windows operating systems
  • Experience with Python or PowerShell or related scripting languages
  • Experience with configuration management systems (Spinnaker, Chef, Puppet, or Ansible)
  • Experience rolling out highly available, mission-critical applications
  • Experience with version control systems (Git or SVN) and branching strategies
  • Experience with cloud computing platforms (Amazon AWS, Kubernetes, Heroku, etc.)
  • Experience with continuous integration tools (Jenkins, GitHub Actions, CircleCI, TeamCity, etc.) and artifact repositories (Artifactory or Nexus)
  • Experience with database server infrastructure (RDS, Aurora, DynamoDB, MySQL, Postgres, etc.)
  • Experience with agile development, continuous integration, and automated testing
  • Experience with Infrastructure as Code (Terraform or CloudFormation) for managing infrastructure and environments at scale
  • Excellent written communication, problem-solving, and process management skills
  • Desire to work in a fast-paced, evolving, growing, dynamic environment

© 2024 Vermont Technology Alliance