Platform Engineer

Capital One
McLean, VA
May 18, 2019
May 21, 2019
Engineer, IT, QA Engineer
Full Time
McLean 2 (19052), United States of America, McLean, Virginia

At Capital One, we're building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.

Platform Engineer

Capital One Machine Learning Tech is working to build a world-class machine learning platform that helps us accelerate the development of innovative machine learning models that can run in environments supporting millions of transactions a minute. As part of this journey, we need to bring in top-level talent with experience building rock-solid platforms. We are looking for DevOps and site reliability engineers with experience working with Kubernetes at scale. We're looking for people who have experience working with the Kubernetes platform itself, as well as people who have experience automating onboarding to the platform and working within it. This team is responsible for building the solid foundation that our applications will be built on.

What you'll be doing:

Work alongside a team of highly qualified engineers to continually improve our managed Kubernetes platform.

You will "automate all the things" around this platform. Think one-touch (or zero-touch) releases, visualization of service health, and automatic service restoration.

Work with partner teams to satisfy all security and regulatory requirements for a well-managed platform.

We run an active/active platform that can balance traffic geographically across the US. We need to know what is going on in this platform.

Work together with our partner teams to automate onboarding to and offboarding from our platform.

Basic Qualifications:

Bachelor's Degree or Military Experience
At least 1 year of experience working with a Kubernetes platform environment in the public cloud
At least 3 years of experience working with Linux
At least 1 year of experience scaling Kubernetes workloads on AWS, including using auto-scaling groups, pod autoscaling, and disruption budgets
At least 1 year of experience identifying and automating repetitive operational tasks at all stages of the platform management lifecycle
At least 1 year of experience with infrastructure automation technologies

Preferred Qualifications:

Experience with cutting-edge technologies built on top of Kubernetes, such as Argo, Skaffold, Kubeflow, Seldon, Istio, Linkerd, and Knative
Experience with logging, monitoring, and tracing in a Kubernetes environment
Experience working with either Python or Go
Experience with heterogeneous hardware environments for delivering machine learning solutions (GPUs, TPUs, FPGAs, etc.)
Experience with serverless computing in Kubernetes
Experience working in a locked-down environment and an understanding of what goes into deploying infrastructure in such an environment

At this time, Capital One will NOT sponsor a new applicant for employment authorization for this position.