Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
November 2, 2017

Editor's Picks

  • What causes wildfires in the US?
    Recent events were my motivation for this project, where I aim to create a classification model to predict the cause of a wildfire given its features, and create a tool in the form of a Flask application that could help authorities determine the cause of a fire when reasons are unknown...

A Message From This Week's Sponsor

Get a Data Science Job With Proven Career Placement

Springboard offers you your own data science expert and career coach with the first online course to offer you a data science job or your money back. They'll help you tailor a personalized skills training and career search strategy that will get you into a data science career. Springboard graduates have been placed at Ford, Verizon, Nielsen, Kaiser Permanente, and the Federal Reserve.

Data Science Articles & Videos

  • 2017 - The State of Data Science & Machine Learning
    This year, for the first time, we conducted an industry-wide survey to establish a comprehensive view of the state of data science and machine learning. We received over 16,000 responses and learned a ton about who is working with data, what’s happening at the cutting edge of machine learning across industries, and how new data scientists can best break into the field...
  • O.K. Computer - Tell Me What This Smells Like
    Over the years, biologists who specialize in the psychophysics of smell have continued to work away at the problem. Earlier this year, Vosshall and her collaborators published a new take on it, this time using computer algorithms...
  • From Data to Deployment – Full Stack Data Science
    In this talk, we walked through an actual Indeed data science full-stack model building process: labeling data, performing analysis, generating features, building the model, validating the model, building infrastructure, deploying the model, and monitoring the solution. We discussed how these techniques are applicable across a broad set of domains...
  • How do CNNs Deal with Position Differences?
    An engineer who’s learning about using convolutional neural networks for image classification just asked me an interesting question: how does a model know how to recognize objects in different positions in an image? Since this actually requires quite a lot of explanation, I decided to write up my notes here in case they help some other people too...
  • Simpson's Paradox Explained
    Is Simpson's Paradox a paradox because people implicitly assume all comparisons are "all else equal" comparisons?...
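
For readers who have not met Simpson's Paradox before, here is a small worked example in Python. The counts are hypothetical, invented purely for illustration, and are not taken from the linked article. Option A has the higher success rate inside each stratum, yet option B has the higher pooled rate. The two options were tried on very different mixes of easy and hard cases, so the pooled comparison is not an "all else equal" comparison.

```python
from fractions import Fraction

# Hypothetical (successes, trials) counts, invented for illustration only.
counts = {
    "A": {"easy": (9, 10), "hard": (30, 90)},
    "B": {"easy": (80, 90), "hard": (2, 10)},
}


def stratum_rates(option):
    """Success rate of one option within each stratum, as exact fractions."""
    return {
        stratum: Fraction(successes, trials)
        for stratum, (successes, trials) in counts[option].items()
    }


def pooled_rate(option):
    """Success rate of one option with the strata merged together."""
    successes = sum(s for s, _ in counts[option].values())
    trials = sum(t for _, t in counts[option].values())
    return Fraction(successes, trials)


for stratum in ("easy", "hard"):
    a = stratum_rates("A")[stratum]
    b = stratum_rates("B")[stratum]
    print(f"{stratum}: A={float(a):.3f}  B={float(b):.3f}")

print(f"pooled: A={float(pooled_rate('A')):.3f}  B={float(pooled_rate('B')):.3f}")
```

Most of A's trials fall in the hard stratum and most of B's fall in the easy one, which is what flips the pooled comparison.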


Jobs

  • Senior Data Scientist - Datadog - New York, Paris
    At Datadog, we’re on a mission to build the best monitoring platform in the world. We operate at high scale—trillions of data points per day—and high availability, providing always-on alerting, visualization, and tracing for our customers' infrastructure and applications around the globe.
    Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way. We need you to design and build machine learning-powered products that help our customers learn from their data and make better decisions in real-time.
    You will have a fantastic team of data engineers to support you, a collaborative environment to encourage your work, and the best technologies for performing data science at high scale in your toolkit...

Training & Resources

  • Eager Execution: An imperative, define-by-run interface to TensorFlow
    Today, we introduce eager execution for TensorFlow. Eager execution is an imperative, define-by-run interface where operations are executed immediately as they are called from Python. This makes it easier to get started with TensorFlow, and can make research and development more intuitive...
  • The often-overlooked random forest kernel
    Here, we’ll discuss a type of kernel called the random forest kernel, which takes advantage of a pre-trained random forest in order to provide a custom-tailored kernel...
  • Bounter -- Counter for large datasets
    Bounter is a Python library, written in C, for extremely fast probabilistic counting of item frequencies in massive datasets, using only a small fixed memory footprint...
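
To give a feel for what "probabilistic counting in a small fixed memory footprint" can mean, here is a minimal pure-Python sketch of a Count-Min Sketch, one standard technique for this kind of counting. It is an illustration only: the class name, parameters, and hashing scheme are assumptions made for this example, and it is neither Bounter's API nor its C implementation.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch (hypothetical class, not Bounter's API)."""

    def __init__(self, width=2048, depth=4):
        # Memory is fixed at width * depth counters, however many items arrive.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row stands in for `depth` independent hash functions.
        data = item.encode("utf-8")
        for row in range(self.depth):
            salt = row.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest available upper bound on the true count.
        return min(self.table[row][col] for row, col in self._cells(item))


sketch = CountMinSketch()
for word in "a b a c a b".split():
    sketch.add(word)

print(sketch.estimate("a"), sketch.estimate("b"), sketch.estimate("c"))
```

The estimates never fall below the true counts but may exceed them when items collide. Widening the table lowers that error at the cost of more memory.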

