Data Scientist - Big Data
The Washington Post is looking for passionate Data Scientists to join our Big Data Analytics team. The Washington Post has huge volumes of activity data and related business data from millions of customers. We are building an integrated Big Data Platform that stores all aspects of customer profiles and activities (a 360-degree view of customers), content and its metadata, and business data. Data Scientists will use data from the platform to design and build systems that apply machine learning, statistical modeling, NLP (Natural Language Processing), data visualization, and other data science techniques to provide personalized content and experiences for customers, generate insights, improve advertising strategies, automate processes, and help the newsroom and other business units make data-driven decisions. This role is equal parts scientist, statistician, and software developer.
• As part of a project team, contribute to solving business problems by framing the problems, determining the intended approach and quantitative methods, evaluating the analytical solutions, and deploying them to production
• Evaluate commercial and open source techniques in Machine Learning, Data Mining, NLP, and Analytics
• Build scalable, high-performance Machine Learning and Data Mining algorithms
• Rapidly test hypotheses by developing prototypes and running offline experiments and online A/B tests
• Generate actionable insights by analyzing large amounts of data with analytical rigor, using predictive modeling, cluster analysis, temporal analysis, social network analysis, and other techniques
• Generate reports and visualizations to provide insight into the workings of the production system
• Document projects, including the business objective, data gathering and processing, leading approaches, the final algorithm, a detailed set of results, and analytical metrics. Develop materials to explain project findings, and present those findings to business audiences
Experience and Qualifications:
• Bachelor's degree in Computer Science, Statistics or Mathematics is required. Master's degree preferred with emphasis on data mining, NLP or machine learning.
• Ability to work as part of a team under close supervision.
• 3-5 years of experience with NLP systems, predictive modeling, recommendations/personalization, and working with Big Data (Hadoop, MongoDB, Spark, Cassandra, Splunk)
• Programming experience in Java; scripting languages such as Perl or Python; Hive and Pig
• Experience with tools such as R, Weka, SPSS, or SAS