Dealer.com, a Cox Automotive brand, is seeking a Senior Site Reliability Engineer (Sr. Software Engineer) to join our Vehicle and Website Delivery team in Burlington, VT. This position is remote/work from home and can be located anywhere within the United States.
At Cox Automotive, we offer the chance to take a leading role in the digital revolution of the automotive industry. Our Site Reliability Engineers are energetic influencers who build world-class solutions that benefit the auto-buying public, dealers, and manufacturers. Collaboration is woven into the fabric of everything we do. At Cox Automotive, you’ll be immersed in an environment that values your teamwork and collaborative problem-solving skills while still nurturing your individualism.
Our Site Reliability Engineers (SREs) help bridge the gap between developers and IT Operations. They use their systems expertise to support architecture, create and maintain build/orchestration systems, and provide guidance on best practices. As we migrate our footprint to the cloud, we are looking for an experienced SRE to join our team of skilled engineers and drive the conversation around reliability through monitoring, automation, and SLOs, with a focus on writing code.
Our Vehicle and Website Delivery platform supports over 13,000 individual vehicle shopper websites, and our SREs work closely with engineering teams to use technology wisely. We are looking for exceptionally ambitious, communicative, hands-on individuals who are comfortable collaborating within the Agile methodology as part of a cross-functional team, have experience working in fast-paced environments, and have the passion and skills to take our product offerings to the next level.
Who you are:
- BS degree in Computer Science, related technical field, or equivalent practical experience.
- Experience writing code in Java, Go, Shell, Python, or a similar language.
- Knowledge of debugging and optimizing distributed systems with complex real-time interactions.
- Strong interest in SRE topics such as SLOs, resilience, scaling, and performance.
What we look for:
- Experience with AWS cloud services and their offerings.
- Experience with Akamai Edge Platform and accelerating web applications via a CDN.
- Experience with software operations, monitoring, and incident response.
- Comfortable on the Linux command line and troubleshooting network issues.
- Understanding of Java application tuning and memory optimization.
- Distributed systems architectures, microservices, and high availability.
- Configuration management and infrastructure-as-code tools such as Puppet, Terraform, Terragrunt, and Ansible.
- CI/CD with Jenkins, Spinnaker, and/or Argo.
- Container orchestration using Docker. Kubernetes a plus.
- Experience designing, writing, and/or troubleshooting software and systems in a distributed, internet-scale Linux environment.
What you'll do:
- Build software and systems to manage platform infrastructure and applications.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Provide operational support and engineering for multiple large and small distributed software applications.
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
- Partner with development teams to improve services through rigorous testing and release procedures.
- Participate in system design consulting, platform management, and capacity planning.
- Create sustainable systems and services through automation and uplifts.
- Balance feature development speed and reliability with well-defined service level objectives.