Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
September 14, 2017

Editor's Picks

A Message From This Week's Sponsor

Introducing the Data Jobs Board

Looking for your next great data job? Tired of having to wade through irrelevant job listings thanks to the murky definition of “analyst”? We've been there.
At Mode, we want to give you everything you need to be a great analyst, whether it be an online community of like-minded data people, learning resources like SQL School and Udacity courses, or powerful, collaborative software for SQL and Python analysis. Now, we're taking the headache out of finding great data jobs.
Introducing the Data Jobs Board: a curated list of the best jobs for data analysts, data scientists and data engineers.

Data Science Articles & Videos

  • Communicating Uncertainty When Lives Are on the Line
    Showing when and where natural disasters like hurricanes are going to cause damage is not just a question of aesthetics – it is literally a matter of life and death. The traditional way hurricane forecasts are shown has a number of problems, but are the alternatives actually better?...
  • Detecting Malicious Requests with Keras & Tensorflow
    So what if you could use the power of Google’s Tensorflow engine to decide on whether a given request is considered malicious? Well that was the question I was looking to answer while participating in Slalom’s recent AI hackathon. The following post outlines the technical details of a PoC for a security monitoring application which was built with the help of a couple other Slalomites...
  • Predicting Portland Home Prices
    For my final project at Metis, I wanted to choose something that enabled me to incorporate all that I had learned during the past three months. Predicting Portland home prices allowed me to do this because I was able to incorporate various web scraping techniques, natural language processing on text, deep learning models on images, and gradient boosting into tackling the problem...
  • Discovering Causal Signals in Images
    This paper establishes the existence of observable footprints that reveal the “causal dispositions” of the object categories appearing in collections of images...
  • Sketchy Data Visualization in Semiotic
    When I open-sourced Semiotic, I expected to get some pushback on its support for hand-drawn “sketchy” rendering in marks. I also expected some questions as to how it and its accompanying “painty” mode are implemented. Instead, except for a couple friendly jibes, most of the response to Semiotic has been on its focus on information design. But I wanted to make sure to highlight the sketchy functionality nonetheless...

Jobs

  • Data Visualization & Web Application Engineer - Insight Engines - San Francisco, CA
    Design, develop, and deploy data visualizations powered by natural language. Insight Engines creates natural language search tools for cyber security investigations, which requires producing relevant data visualizations over large complex datasets in order to make sense of results. We are looking for a skilled data visualization and web application engineer who can craft visualizations that help security analysts navigate and understand cyber security log data. You will be joining a small enthusiastic team, and you will take ownership of the frontend user experience...

Training & Resources

  • TensorLy, now also with PyTorch backend
    Since TensorLy was refactored to support backends, it is fairly easy to add new backends, so as a proof of concept I put together a PyTorch backend. There are most likely a few optimisations to do and some things could be done better but all the tests pass. Here is a quick demonstration...
  • Data-driven Advice for Applying Machine Learning to Bioinformatics Problems
    As the bioinformatics field grows, it must keep pace not only with new data but with new algorithms. Here we contribute a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers. We present a number of statistical and visual comparisons of algorithm performance and quantify the effect of model selection and algorithm tuning for each algorithm and dataset...
  • Scalable Machine Learning (Part 1)
    Anaconda is interested in scaling the scientific Python ecosystem. My current focus is on out-of-core, parallel, and distributed machine learning. This series of posts will introduce those concepts, explore what we have available today, and track the community's efforts to push the boundaries...


Looking to hire a Data Scientist? Find an awesome one among our readers! Email us for details on how to post your job :) - All the best, Hannah & Sebastian
