• Site Reliability Specialist III
Description:
Someone once said, "If you do it once, great. If you do it twice, frown. If you do it three times, automate it." We are firm believers in automation, and depend on it to achieve amazingly scalable solutions. To enable this automation, we depend on our Site Reliability Specialists to identify, dissect, and automate processes found throughout the product lifecycle. Working in tandem with other Software Developers and DevOps Specialists, our Site Reliability Specialists are key technical players who make up an Integrated Solutions & Services Delivery Team.

Responsibilities:
- Automate deployment of elastic, highly available, and fault tolerant systems and services within a commercial cloud service provider ecosystem.
- Migrate legacy on-premises services leveraging established and well-accepted design patterns.
- Adhere to DevOps practices, using a suite of standard tools.
- Improve and build upon existing automation tooling for provisioning of systems and software.
- Collaborate with fellow teammates on complex problems, and recommend improvements.
- Assist in deconstruction, definition, and interpretation of requirements supporting customer projects.

- Be awesome to work with!
- Enjoy working in the public sector. You have a passion for education, training, designing, and building cutting-edge cloud computing systems for the world's leading Armed Forces and Federal services communities.
- Have an understanding of standards and compliance requirements that impact the public sector.
- Have varied experience working as a solutions architect, systems or network administrator, or software developer.
- Enjoy making use of your existing skills, but also developing new ones.
- Relish diving deep into the details surrounding a challenge (possibly writing code) and generally doing what it takes to support the customers' mission.
- Possess good speaking and presentation skills for use in both formal settings and whiteboarding sessions with other seasoned developers.
- Consistently demonstrate a thorough understanding of best practices when operating in a remote computing environment.
- Travel an average of 25% of the time.

- Familiarity with the Amazon Web Services and Microsoft Azure ecosystems.
- Familiarity with Site Reliability Engineering concepts and principles.
- Strong understanding of the Red Hat Enterprise Linux or Microsoft Windows Server platforms.
- Significant experience in automating tasks using scripting languages such as Bash, PowerShell, or Perl.
- Practical experience using automation engines such as Puppet, Ansible, Chef, or Salt to implement orchestration on either the Microsoft Windows operating system or Linux-based distributions.
- Some degree of experience working with one or more common programming languages such as Python, Ruby, or Java.
- Experience using issue tracking tools such as Jira, GitHub, Redmine, Bugzilla, Team Foundation Server, or similar.
- Self-motivated and able to work independently when necessary, but also able to work with others on team projects.
- Understanding of Agile software development, with experience using either Kanban- or Scrum-based scheduling.
- Understanding of test-driven development.
- Confidence in configuring and maintaining a variety of frameworks and applications such as Node.js, Ruby on Rails, .NET, Apache, Nginx, IIS, PostgreSQL, MySQL, MSSQL, or similar.

- Past experience working with Continuous Integration pipelines is a plus.
- A general understanding of data structures and algorithms, and familiarity with the principles of software engineering.
- Industry-recognized certification as a Cloud Solutions Architect.