Senior Machine Learning DevOps and Data Engineer

Kinetica DB
Arlington, VA
Apr 17, 2018
Apr 18, 2018
Full Time
Company Description

When extreme data requires companies to act with unprecedented agility, Kinetica powers business in motion. Kinetica is the instant insight engine for the Extreme Data Economy. Across healthcare, energy, telecommunications, retail, and financial services, enterprises using new technologies such as connected devices, wearables, mobility, and robotics can leverage Kinetica for the machine learning, deep learning, and advanced location-based analytics that power new services. Kinetica's accelerated parallel computing brings thousands of GPU cores to bear on the unpredictability and complexity that result from extreme data. For more information and trial downloads, visit our website or follow us on LinkedIn and Twitter.

Job Description

Kinetica seeks an experienced Data Engineer (with DevOps experience) for a Machine Learning Systems Engineering role, building a robust commercial product that handles a variety of cross-industry workloads. We want people who go the extra mile to consider varied modes of usage and to design products that work under different usage scenarios. We're building a commercial product for distributed Machine Learning training and inference that sits alongside a distributed GPU-accelerated database, and you should be excited about the scale and technical complexity of such a setup. We want people who realize that success is both a personal and a team effort, and that winning technical products are the same -- they are complex but well-integrated machines that use the best available technologies, packed into a cohesive product experience that functions at scale. We want curious tinkerers, but we also want people who can drive a product to the finish line.
Job Responsibilities

Research and own commercial and open source packages for machine learning, container management, and resource management to find the ideal stacks for the required product features
Develop and own systematic, automated data pipelines (batch and streaming) that feed automated machine learning workflows
Work from an internal product systems engineering perspective to achieve goals (using APIs, not GUIs)
Ensure all functionality is callable from and integrable with an internal Python/C++ stack
Research products and keep abreast of marketplace tools, offerings, and possibilities for machine learning automation, deployment, and management
Work collaboratively with a close-knit team to design and develop a release-quality commercial product
Work smoothly with our broader engineering group to ensure products fit into the company's product lineup
Work iteratively to hone proofs of concept for new product features and steadily merge development into the overall product
Bring things to a close -- not just exploring, but getting things to the finish line
Keep attuned to the marketplace and spot opportunities to expand functionality in response to new technical capabilities as they arise

Qualifications

Five to seven years of work experience in a DevOps and/or automated Data Engineering role
Bachelor's degree in Computer Science, Data Science, Operations Research, Statistics, Math, Physics, or equivalent
Experience working with highly complex technical ecosystems (resource managers, containers, streaming systems, and automated testing -- especially Docker, Kubernetes, Mesos, Kafka, and Flink)
Experience working in an environment that is highly automated for correctness and consistency (automated builds, continuous integration, automated testing, containers -- e.g., Git, Jenkins)
Enthusiasm for technology and product development
Technology community participation (e.g., Stack Overflow, Kaggle, open source contributions/projects)
Experience with distributed systems
Interest in Machine Learning and Data Science
Excitement about a small company with close team interactions and a fast-moving culture
Willingness to roll with the punches, as required in highly competitive markets that demand constant product and feature enhancements
Proficiency working in Linux environments
Experience with databases
Comfort with Python and other languages
Experience with at least one open source machine learning package (sklearn, TensorFlow, Caffe2, Torch, etc.). The more experience the better. Relevant portfolios are a big plus.

Preferred Qualifications

Graduate degree in Computer Science, Data Science, ORIE, Statistics, Math, Physics, or equivalent
Understanding of the data science ecosystem -- commercial and open source
Experience with data science tools at a high-tech startup or a technology/data science consultancy; academic experience is a valid substitute (e.g., Ph.D., fellowship)
Strong communication skills, as demonstrated by personal projects, technical blog postings, volunteer activities, etc.
Participation in hackathons; wins at hackathons or high placings on Kaggle leaderboards are even more impressive
Experience working with computational libraries (e.g., NumPy, Pandas, Spark)

To be considered

Applicants are encouraged to share online project/code portfolios and any demonstrations of community participation (e.g., public GitHub profiles, public Stack Overflow profiles, public Kaggle profiles, technical blog postings).

Additional Information

All your information will be kept confidential according to EEO guidelines.
