Data Science Weekly Newsletter - Issue 123

Issue #123

March 31, 2016

Editor Picks
 
  • Debunking Narrative Fallacies with Empirically-Justified Explanations
    Of all our many talents – bipedalism, opposable thumbs, etc. – one of humanity’s most remarkable traits is our tendency to infer meaning from what happens around us. We understand the world through stories, and this is such a fundamental part of our nature that it is almost impossible to stop ourselves from inventing very reasonable-sounding explanations for what we see. A lot of these stories are intuitive and a lot of them might be right (seasonality is real in many businesses!), but we’re not good at knowing when our stories are trustworthy and when they aren’t...
  • One Genius' Lonely Crusade to Teach a Computer Common Sense
    Over July 4th weekend in 1981, several hundred game nerds gathered at a banquet hall in San Mateo, California. Doug Lenat, then a 29-year-old computer science professor at nearby Stanford University, was among the players. But he didn’t compete alone. He entered the tournament alongside Eurisko, the artificially intelligent system he built as part of his academic research...
 
 

Data Science Articles & Videos

 
  • Building a High-Throughput Data Science Machine
    Scaling is hard. Scaling data science is extra hard. What does it take to run a sophisticated data science organization? What are some of the things that need to be on your mind as you scale to a repeatable, high-throughput data science machine?...
  • Machines Just Got Better at Lip Reading
    Bear and her colleague Richard Harvey have come up with a new lip-reading algorithm that improves a computer’s ability to differentiate between sounds—such as ‘p’, ‘b,’ and ‘m’—that all look similar on lips...
  • Can I Hug That?
    A classifier that tells you whether what's in an image is huggable...
  • ggplot2 and Joy Division
    A while ago I had a great time answering a question on Stack Overflow that asked about recreating a plot from a FiveThirtyEight article in ggplot2. You can see the original and my attempt below...
  • Bar Charts with Brains
    But there are things we can do to bar charts to smarten them up while preserving their familiarity. The idea behind glasseye is to develop d3 charts for the presentation rather than the exploration of data. These are two very different activities, and their confusion is behind many dull or incomprehensible presentations. For one thing, we can give them a layer of intelligence that will help the user make better decisions. Here is our version of a bar chart that helps make sense of some multiple choice survey data...
  • CrowdSignals Aims to Create a Marketplace for Smartphone Sensor Data
    Words and pictures, culled from across the web, have been the digital grist for remarkable gains in computing tasks like image recognition and speech translation. But another huge data resource — sensor data from smartphones — lags behind as a fuel source for major research advances...
 
 

Jobs

 
  • Data Scientist - Washington Post - Washington D.C.

    The Washington Post is looking for passionate data scientists to join our Big Data Analytics team. The Washington Post has huge volumes of activity data and related business data from millions of customers. We are building an integrated Big Data Platform that stores all aspects of customer profiles and activities (a 360-degree view of customers), content and its metadata, and business data. The data scientist will use data from the platform to design and build systems that apply machine learning, statistical modeling, NLP (Natural Language Processing), data visualization and other data science techniques to provide personalized content and experiences for customers, generate insights, improve advertisement strategies, automate processes, and help newsrooms and other business units make data-driven decisions. This role is equal parts scientist, statistician & software developer...
 
 

Training & Resources

 
  • Missing data visualization module for Python
    Messy datasets? Missing values? missingno provides a flexible and easy-to-use missing data matrix (nullity matrix?) visualization that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset...
  • Saddles Again
    Thanks to Rong for the very nice blog post describing critical points of nonconvex functions and how to avoid them. I’d like to follow up on his post to highlight a fact that is not widely appreciated in nonlinear optimization...
 
 

Books

 

  • AI for Humans, Volume 3: Deep Learning and Neural Networks

    Demonstrates neural networks in a variety of real-world tasks such as image recognition and data science...

    "The content is easy to digest and not heavy on the math. Great primer to get used to concepts before diving deeper..."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
 
 
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian
 