IT Principal Engineer, High Performance Computing

Location
College Park, MD
Posted
Apr 12, 2017
Closes
May 12, 2017
Ref
Position #106459
Function
Engineer, IT
Hours
Full Time

Position Description

The High-Performance Computing (HPC) Systems Engineer is in charge of the management, maintenance, and operation of the University of Maryland’s HPC systems. Responsibilities include planning, design, engineering, and project support for HPC hardware and software; designing and managing petabyte-scale data storage, with uses ranging from collaborative software development environments to multi-terabyte scientific datasets; establishing strategic relationships with vendors; and collaborating with peers across the University System of Maryland (USM) and the State of Maryland. Additionally, in conjunction with the Director of Research Computing, the incumbent will be responsible for assessing and understanding campus research needs and collaborating with campus units on technology direction. Lastly, the HPC Systems Engineer will oversee the institution of appropriate controls, processes, and best practices that ensure compliance with Federal, State, and USM audit requirements.

Minimum Qualifications

  1. Bachelor’s degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field
  2. Five to seven years of progressive experience in HPC and hands-on systems engineering experience in enterprise infrastructure
  3. Five or more years administering UNIX/Linux systems used in scientific computation
  4. Familiarity with OpenMP, MPI, SLURM, version control (SVN/git), and automation/configuration management (Chef, Puppet, Ansible) tools
  5. Knowledge of HPC systems, scalable/parallel architectures, and Unix/Linux operating systems
  6. Knowledge of advanced data storage technologies and high-speed networking
  7. Working knowledge of one or more high-level programming languages such as C, C++, or FORTRAN
  8. Expert knowledge of one or more scripting languages such as Bash, Perl, or Python
  9. Skill in the installation and configuration of operating systems and application software
  10. Comprehensive knowledge of Linux operating system internals and multiple high-performance computing architectures
  11. Knowledge of advanced problem resolution procedures, testing and evaluation methods, programming tools, and system network security
  12. Ability to assist the Director of Research Computing in gathering user requirements for planning and designing computer systems
  13. Ability to define proper methods and procedures for the integration, testing, and installation of system modifications
  14. Ability to work under pressure and meet firm deadlines
  15. Ability to analyze complex problems, interpret operational needs, and develop integrated, creative solutions
  16. Ability to represent the University in various forums at the local, state, regional, and national levels
  17. Ability to make complex technical design decisions involving software or hardware implementation strategies
  18. Ability to monitor system usage and performance statistics and to understand the impacts of operating system tuning parameters
  19. Effective verbal and written communication capabilities, and the ability to lead large-scale/enterprise-wide projects with a high degree of proficiency in project management techniques, including time estimates, resource allocation, project status updates, and management of critical-path and milestone deadlines
  20. Ability to work independently and in matrixed team environments, and to collaborate easily with members of the campus community, peers, and senior leadership
  21. Ability to work collaboratively with faculty, researchers, technical staff, administrators, and constituents outside of the University, and to represent the University at local, state, regional, and national forums

Preferences

  1. Experience working in a large research university environment
  2. Demonstrated successful experience leading and managing large-scale infrastructure projects and developing a technical vision for the oversight of a major technology/service unit
  3. Experience with Amazon Web Services (AWS) or other public cloud infrastructures

Posting Date: 04/12/2017

Closing Date: 05/12/2017

Best Consideration Date: 05/05/2017

Please apply at: https://ejobs.umd.edu/postings/50829

EOE/AA