
This job has expired

SQL and ETL Data Engineer

Employer
Socially Determined
Location
Washington, DC
Closing date
Nov 18, 2019
About the Role

We seek an experienced SQL and ETL Data Engineer to help design, implement, and operate a scalable, automated, high-performance data collection and ingestion service in our environment. Using AWS services and integrated tools, the ideal candidate will apply best-practice techniques for acquiring, organizing, and preparing raw CSV and JSON healthcare data to be loaded into our data model. The SQL and ETL Data Engineer will be fluent in SQL and in writing scripts to create and update complex schemas, tables, and relationships in AWS Aurora PostgreSQL and other RDBMSs. They will also understand MongoDB JSON databases and how to organize data effectively in JSON documents for high scalability and performance. The ideal candidate will have geospatial experience, specifically with PostGIS and TIGER.

What you'll do:

- Help lead the data architecture and data model design supporting our data science, data analytics, and data visualization functions.
- Acquire and organize raw healthcare data in CSV, TXT, JSON, and other formats from our clients, using AWS S3 buckets in a data lake organizational model.
- Create automated ETL processes to transform raw CSV, TXT, JSON, and other raw data formats into a consistent data model supporting data science, data analytics, reporting, and data visualization functions.
- Design and implement serverless ETL using AWS Glue, Lambda, or other AWS-native services.
- Work in a DevOps model using Jenkins and AWS CI/CD services.
- Perform geospatial manipulation and calculations using PostGIS, TIGER, and other geospatial technologies.
- Create scripts for creating schemas, databases, tables, views, and other constructs in PostgreSQL.
- Create scripts for automating permissions and roles in PostgreSQL.
- Configure security logging.
- Design and configure an AWS DocumentDB (MongoDB-compatible) database for JSON document storage.
- Design the process for converting relational data to appropriate JSON documents.
- Maintain an exceptional level of quality, efficiency, and effectiveness in the deliverables of each product sprint.

What you'll need:

- 2+ years of direct experience creating and implementing SQL and ETL scripts and processes, especially in a secure AWS environment
- 2+ years of experience with PostgreSQL administration, including security and performance management and logging
- Experience performing PostGIS geospatial data manipulation and calculations
- Experience configuring MongoDB and JSON document sets
- Experience with data file formats such as Avro and Parquet is a plus
- Experience working in fast-moving startup environments with rapidly evolving priorities
- Programming skills in Python are a plus
- Healthcare-specific experience working with clinical and claims data is a differentiator
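To give a flavor of the relational-to-JSON-document conversion described in the responsibilities above, here is a minimal sketch in Python using only the standard library. All field names (patient_id, claim_id, amount) are invented for illustration and do not reflect Socially Determined's actual data model:

```python
import csv
import io
import json

def rows_to_documents(csv_text):
    """Group flat, relational-style CSV claim rows into one nested JSON
    document per patient, as a document store like DocumentDB would expect.

    The patient_id, claim_id, and amount columns are hypothetical.
    """
    docs = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # One document per patient; claims nest inside it as an array.
        doc = docs.setdefault(row["patient_id"], {
            "patient_id": row["patient_id"],
            "claims": [],
        })
        doc["claims"].append({
            "claim_id": row["claim_id"],
            "amount": float(row["amount"]),
        })
    return [json.dumps(d) for d in docs.values()]

raw = "patient_id,claim_id,amount\np1,c1,100.50\np1,c2,20.00\np2,c3,75.25\n"
documents = rows_to_documents(raw)
```

In a real pipeline this transform would run inside an AWS Glue or Lambda job reading from S3 rather than an in-memory string; the nesting step is the part that turns row-oriented relational data into self-contained documents.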
