Data Science Weekly Newsletter - Issue 236

Issue #236

May 31, 2018

Editor Picks
 
  • Mortality in Puerto Rico after Hurricane Maria
    Quantifying the effect of natural disasters on society is critical for recovery of public health services and infrastructure. The death toll can be difficult to assess in the aftermath of a major disaster. In September 2017, Hurricane Maria caused massive infrastructural damage to Puerto Rico, but its effect on mortality remains contentious. The official death count is 64...
  • Why you need to improve your training data, and how to do it
    There are lots of good reasons why researchers are so fixated on model architectures, but it does mean that there are very few resources available to guide people who are focused on deploying machine learning in production. To address that, my talk at the conference was on “the unreasonable effectiveness of training data”, and I want to expand on that a bit in this blog post, explaining why data is so important along with some practical tips on improving it...
 
 

A Message from this week's Sponsor:

 

 
Clark University: Transform Data Into Something Meaningful

Business Analytics at Clark University will give you the skills employers demand by teaching you how to synthesize data into powerful information. Whether you enroll in a full- or part-time master’s or accelerated certificate program, you will be equipped to make informed decisions and improve organizational performance.

You don’t need a background in statistics or science to succeed here. We offer:
  • Blended curriculum
  • Career-ready courses
  • Affordable excellence
Move your career forward in one of the fields with the largest demand.
 
Learn more at clarku.edu/analytics.
   
 

Data Science Articles & Videos

 
  • Reproducibility in ML: Why It Matters and How to Achieve It
    Reproducing results across machine learning experiments is painstaking work, and in some cases, even impossible. In this post, we detail why reproducibility matters, what exactly makes it so hard, and what we at Determined AI are doing about it...
  • StackNN
    A PyTorch implementation of differentiable stacks for use in neural networks...
  • Create beautiful test-driven data visualisations with D3.js
    Creating D3 visualisations with Test Driven Development is one way of producing code that is easy to extend, refactor and change. In this post, we will go through the process of creating a heatmap chart using a Test Driven approach. The result is a great looking chart with a complete set of tests that has code good enough to be used in a production environment. Let’s get going!...
  • Advice For Applying To Data Science Jobs
    This post is a collection of my thoughts and recommendations for people interested in applying to data science jobs in the US. Many of these principles also apply to tech jobs in general...
 
 

Jobs

  • Data Scientist - SQAD - NYC
    SQAD LLC is on the cutting edge of digital and traditional media cost measurement and forecasting. Recognized as an industry pioneer, SQAD’s data and systems serve some of the biggest brands in media. SQAD provides reliable media cost data to advertising agencies, media buying companies, advertisers, television and radio stations, cable companies, program syndicators and Internet publishers.

    What you’ll do: You will be part of a talented Dev-Ops team that is responsible for designing and developing cloud-based data solutions utilizing best-of-breed open source tools and data technologies...


 

Training & Resources

 
  • datasheets
    datasheets is a library for interfacing with Google Sheets, including reading data from, writing data to, and modifying the formatting of Google Sheets. It is built on top of Google's google-api-python-client and oauth2client libraries using the Google Drive v3 and Google Sheets v4 REST APIs...
  • Introducing Machine Learning Practica
    Today, we’re sharing this interactive course with you on Learn with Google AI, Google’s online hub for educational resources on machine learning. First, you’ll walk through the basics of how image classification works, learning the building blocks of convolutional neural networks (CNNs). Then you’ll build a CNN from scratch, learn how to prevent overfitting, and leverage pretrained models for feature extraction and fine-tuning...

 

Books

 

  • Test-Driven Machine Learning

    The book begins with an introduction to test-driven machine learning and quantifying model quality. From there, you will test a neural network, predict values with regression, and build upon regression techniques with logistic regression...


    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
 
 
P.S. Want to reach our audience / fellow readers? Consider sponsoring: grab a spot now; first come, first served!

All the best,
Hannah & Sebastian
 