Manager Technology and Site Reliability Engineering (Remote eligible)

Capital One
Richmond, Virginia
Jul 21, 2022
Aug 19, 2022
Full Time
Locations: VA - Richmond, United States of America, Richmond, Virginia

We are looking for an experienced Manager, Technology with an operational and/or site reliability engineering background and a passion for providing superior system availability and customer experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As a Manager, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while using your expertise in the delivery and support of critical services.


  • Provide call leadership to mitigate critical incidents.
  • Increase operational efficiencies to proactively reduce and mitigate production incidents.
  • Lead a team of experienced support engineers to meet or exceed expectations on incident SLAs.
  • Ability to understand and work across the full technology stack of systems in the assigned domain.
  • Lead a high performing team of support engineers across several geographical locations to provide 24x7 support for systems with an ever-watchful eye on their availability, latency, performance, and capacity.
  • Collaborate with other tech leads and support teams to ensure integrated end-to-end availability, reliability, and performance.
  • Define support strategies for systems in the Cloud (AWS).
  • Influence resiliency and scalability in production environments in Amazon Web Services and other cloud platforms.
  • Identify and drive resolution on monitoring and alerting gaps.
  • Lead a team to design, write and deliver technical and process automation to improve the availability, scalability, latency, and efficiency of Capital One's services.
  • Solve problems relating to mission-critical services and build automation to prevent problem recurrence, with the goal of automated response to all non-exceptional service conditions.
  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
  • Experience using monitoring solutions such as New Relic, Splunk, and/or Datadog to reduce outage detection time.
  • Identify and remediate risk to critical and non-critical system KPIs.
  • Familiarity with application architectures and networking.
  • Familiarity with DevOps and/or Site Reliability Engineering concepts and principles.
  • Familiarity with automation of routine maintenance tasks and common issues.
  • Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.
  • Networking: knowledge and understanding of network theory, including protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.

Capital One will consider hiring remote candidates for this position.

Basic Qualifications:
  • At least 3 years of experience in managing production support teams.
  • At least 1 year of experience in AWS cloud services configuration and administration.
  • At least 1 year of experience in RESTful web and API services support and deployment.

Preferred Qualifications:
  • Bachelor's or Master's degree in Computer Science or a related field.
  • 2+ years of experience in AWS and a current Associate-level AWS certification (Solutions Architect, Developer, or SysOps).
  • 1+ years of experience with scripting language(s) such as Python to debug, optimize code, and automate routine tasks.
  • 1+ years of experience with Splunk, Datadog, or New Relic monitoring and alerting.

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.

No agencies please. Capital One is an Equal Opportunity Employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, physical and mental disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status prohibited by applicable national, federal, state or local law. Capital One promotes a drug-free workplace. Capital One will consider for employment qualified applicants with a criminal history in a manner consistent with the requirements of applicable laws regarding criminal background inquiries, including, to the extent applicable, Article 23-A of the New York Correction Law; San Francisco, California Police Code Article 49, Sections 4901-4920; New York City's Fair Chance Act; Philadelphia's Fair Criminal Records Screening Act; and other applicable federal, state, and local laws and regulations regarding criminal background inquiries.

If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation, please contact Capital One Recruiting at 1-800-304-9102 or via email at . All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodations.

For technical support or questions about Capital One's recruiting process, please send an email to

Capital One does not provide, endorse, or guarantee, and is not liable for, third-party products, services, educational tools, or other information available through this site.

Capital One Financial is made up of several different entities. Please note that any position posted in Canada is for Capital One Canada, any position posted in the United Kingdom is for Capital One Europe and any position posted in the Philippines is for Capital One Philippines Service Corp. (COPSSC).