Manager - Service Reliability Engineering (SRE)

Fannie Mae
Reston, VA
Jan 26, 2022
Jan 28, 2022
Full Time
Company Description
At Fannie Mae, futures are made. The inspiring work we do makes an affordable home a reality and a difference in the lives of Americans. Every day offers compelling opportunities to modernize the nation's housing finance system while being part of an inclusive team using new, emerging technologies. Here, you will help lead our industry forward, enhance your technical expertise, and make your career.

Job Description
As a valued leader on our team, you will manage the work of a team designing, producing, testing, or implementing software, technology, or processes, as well as create and maintain IT architecture, large-scale data stores, and cloud-based systems. You will apply your expertise in software and systems engineering to ensure that both our internally critical and externally visible systems meet the appropriate performance needs of our users.

In this role, you will be expected to:
strategize portfolio / program reliability by working with cross-functional IT organizations and build roadmaps to drive reliability into the product;
enable the enterprise to standardize and adopt application reliability metrics and improve application health;
socialize and plan Chaos Engineering gamedays for the applications in scope;
serve as a change agent for driving service prioritization;
coach and guide direct reports as well as customers, creating a culture of continuous learning among team members;
standardize communication within and across teams.

THE IMPACT YOU WILL MAKE
The Service Reliability Engineering (SRE) Manager role will offer you the flexibility to make each day your own, while working alongside people who care so that you can deliver on the following responsibilities:
Determine the needs of diverse and complex customer groups requiring applied understanding of strategic issues of importance to the function/initiative.
Plan and direct the work of the team as they design and develop software solutions to meet needs.
Ensure teams use a process-driven approach in designing solutions.
Oversee the implementation of new software technology.
Oversee the effective and efficient maintenance of existing software.

Qualifications
THE EXPERIENCE YOU BRING TO THE TEAM

Required Experience
6+ years of relevant professional experience;
Excellent verbal and written communication skills with experience presenting information and/or ideas to an audience in a way that is engaging and easy to understand;
Experience collaborating cross-functionally on availability / performance issues in order to identify root cause, determine areas for improvement, and drive those actions to closure through effective solutions;
Extensive knowledge of principles, advanced techniques, and theories to suggest and implement solutions on a specific project, program, or product;
Collective capabilities for leadership, including leading teams, giving feedback, facilitating meetings, coaching & mentoring;
Influencing skills to include negotiation, persuasion of others, meeting facilitation, and conflict resolution;
Skilled in deriving business insight for the purposes of advising stakeholders and project team members, designing business models, interpreting customer and market insights, forecasting, benchmarking, etc.;
Adept at managing project plans, resources, and people to ensure successful project completion in an Agile / Scrum environment in order to facilitate the design / development of performance engineering and resiliency methodologies through collaboration with engineering and product teams to implement shift-left techniques on test design & automation;
Experience supporting teams in the writing of Performance and Chaos Engineering strategies and scripts with a strong emphasis on automated deployment, infrastructure automation solutions, and continuous integration & delivery processes;
Ability to identify gaps in the code from a non-functional viewpoint and experience assisting developers to fix the code and promote relevant reliability pattern implementations;
Skilled in establishing and maintaining the overall health, availability, performance, resiliency, and capacity of technology products with specific experience in performance engineering and validations using JMeter, LoadRunner, etc.;
Skilled in cloud technologies and cloud computing to include Amazon Web Services (AWS) offerings, development, and networking platforms;
Experience defining, measuring, and improving Reliability Metrics (SLO/SLI), Observability (Monitoring, Logging-Tracing solutions), Operations Processes (Incident, Problem Management), and Operations Toil Reduction through Automation;
Experience designing, building, and implementing necessary dashboards from application and infrastructure health perspectives using tools such as Splunk, Dynatrace, Datadog, etc. to provide a single-pane view of all critical business and operational information to relevant stakeholders;
Experience in architecting solutions for the design and implementation of applications in the cloud;
Experience in activities like architecture reviews, code reviews, creating platforms and frameworks, capacity planning, etc.;
Experience designing and developing highly available systems that utilize load balancing, horizontal scalability, and high availability;
Strong understanding and knowledge of Java / J2EE technologies and frameworks including UI / JavaScript frameworks, Spring Boot / Spring Cloud Frameworks, REST, Microservices, server-side frameworks;
Knowledge of cloud technologies and containerization using Docker & Kubernetes;
Excellent understanding and demonstrated experience in the use of DevOps / CI/CD tools like Jenkins, Jules and automated deployment tools;
Automation experience with Blue Prism, Selenium, or Ansible playbooks, and with programming or scripting languages like Java, Perl, Python, or PowerShell;
Experience in implementing resiliency design patterns, frameworks, and validations.

Desired Experience
Bachelor's Degree or Equivalent;
Candidates located in or around the Plano, TX area;
Relevant certifications such as AWS Certified Solutions Architect, AWS Certified SysOps Administrator, Splunk Certified Developer, Dynatrace, Sun Certified Java Programmer, etc.

Additional Information
Job REF ID: REF2773A
The future is what you make it to be. Discover compelling opportunities at Fannie Mae.
Fannie Mae is an Equal Opportunity Employer, which means we are committed to fostering a diverse and inclusive workplace. All qualified applicants will receive consideration for employment without regard to race, religion, national origin, gender, gender identity, sexual orientation, personal appearance, protected veteran status, disability, age, or other legally protected status.
For individuals with disabilities who would like to request an accommodation in the application process, email us at
