Data Science Weekly Newsletter - Issue 112

January 14, 2016

Editor Picks
 
  • AI Algorithm Identifies Humorous Pictures
    It’s easy to imagine that humor will be one of the last bastions that separates humans from machines. Computers, the thinking goes, cannot possibly develop a sense of humor until they can grasp the subtleties of our rich social and cultural settings. And even the most powerful AI machines are surely a long way from that. That thinking may soon have to change...
  • AMA Data Scientist: Jake Porway of DataKind - QUESTIONS ANSWERED
    DataKind’s founder and executive director Jake Porway did his first ever Reddit AMA on January 13. It was a terrifically candid discussion of what it takes to apply data science for social good. (Hint - much more than good intentions.) He and the DataKind team answered all kinds of questions that you might find useful!...
 
 

Data Science Articles & Videos

 
  • Deep Grammar: Grammar Checking Using Deep Learning
    Deep Grammar is a grammar checker built on top of deep learning: it uses deep learning to learn a model of language, and then uses that model to check text for errors in three steps...
  • A 'Brief' History of Neural Nets and Deep Learning, Part 1
    This is the first part of ‘A Brief History of Neural Nets and Deep Learning’. In this part, we shall cover the birth of neural nets with the Perceptron in 1958, the AI Winter of the 70s, and neural nets’ return to popularity with backpropagation in 1986...
  • Experiments with style transfer
    Since the original Artistic Style Transfer paper and Justin Johnson’s subsequent Torch implementation of the algorithm were released, I’ve been experimenting with various other ways to use the algorithm. Here’s a quick dump of the results of my experiments...
  • Understanding the Pseudo-Truth as an Optimal Approximation
    One of the things that sets statistics apart from the rest of applied mathematics is an interest in the problems introduced by sampling: how can we learn about a model if we’re given only a finite and potentially noisy sample of data? Although frequently important, the issues introduced by sampling can be a distraction when the core difficulties you face would persist even with access to an infinite supply of noiseless data...
  • Implicit Recommender Systems: Biased Matrix Factorization
    In today's post, we will explain an algorithm for fitting matrix factorization models for recommender systems that goes by the name Alternating Least Squares (there are others, for example based on stochastic gradient descent). We will go through the basic ALS algorithm, as well as how one can modify it to incorporate user and item biases...
  • Implications of use of multiple controls in an A/B test
    This post will examine the idea of using two control buckets in order to guard against Type I and Type II errors. We will demonstrate that this causes significant problems, and that creating a single large control is a superior and unbiased way to achieve the same goal using the same amount of data...
  • Why SLAM Matters, the Future of Real-Time SLAM, and Deep Learning vs. SLAM
    Today's post contains a brief introduction to SLAM (Simultaneous Localization and Mapping), a detailed description of what happened at my Future of Real-Time SLAM workshop at ICCV (the International Conference on Computer Vision), with summaries of all 7 talks, and some take-home messages from the deep-learning-focused panel discussion at the end of the session...
  • Seven things I learned at my first data science hackathon
    I love hackathons. I can learn more in a day or two of hard work with friends than I could in six months studying on my own. So when I heard about the first Social Data Science Hackathon in the Twin Cities, I was one of the first to sign up. Here are a few of the most important lessons I learned from the experience...
 
 
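For readers curious what the ALS-with-biases idea from the matrix factorization entry above looks like in practice, here is a minimal explicit-feedback sketch in Python/NumPy. The toy ratings matrix, the function name `als_biased`, and the hyperparameters are illustrative assumptions on our part, not code from the post (which covers the implicit-feedback setting in more detail):

```python
import numpy as np

def als_biased(R, mask, k=2, lam=0.1, iters=20):
    """Alternating least squares with user/item biases on an
    explicit-feedback ratings matrix R (mask marks observed entries)."""
    n_users, n_items = R.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    V = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    bu = np.zeros(n_users)                          # user biases
    bi = np.zeros(n_items)                          # item biases
    mu = R[mask].mean()                             # global mean rating
    for _ in range(iters):
        # Solve for each user's factors and bias with items held fixed.
        for u in range(n_users):
            obs = mask[u]
            if not obs.any():
                continue
            A = V[obs]
            r = R[u, obs] - mu - bi[obs]
            # Append a column of ones so the bias is solved jointly.
            A1 = np.hstack([A, np.ones((A.shape[0], 1))])
            w = np.linalg.solve(A1.T @ A1 + lam * np.eye(k + 1), A1.T @ r)
            U[u], bu[u] = w[:k], w[k]
        # Symmetric update for each item with users held fixed.
        for i in range(n_items):
            obs = mask[:, i]
            if not obs.any():
                continue
            A = U[obs]
            r = R[obs, i] - mu - bu[obs]
            A1 = np.hstack([A, np.ones((A.shape[0], 1))])
            w = np.linalg.solve(A1.T @ A1 + lam * np.eye(k + 1), A1.T @ r)
            V[i], bi[i] = w[:k], w[k]
    # Predicted ratings: global mean + biases + factor interaction.
    return mu + bu[:, None] + bi[None, :] + U @ V.T

# Tiny toy example: 4 users x 4 items, zeros mean "unobserved".
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)
pred = als_biased(R, R > 0)
```

The alternation works because, with one side fixed, each user's (or item's) subproblem is an ordinary regularized least-squares solve; folding the bias in as an extra constant feature keeps each solve a single `np.linalg.solve` call.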

Jobs

 
  • Data Scientist - Graphiq - Santa Barbara, CA

    As a Data Scientist on Graphiq’s data team, you will eat, breathe, and sleep data. You will be responsible for building full-fledged data products that help our product managers provide users with deeper insights and tell a more compelling story with our data. You will research, design, and implement robust methods for statistical analysis that can be used to help understand our billions of data points. You will build scalable solutions for analyzing our large and very connected knowledge graph...

    Learn more about the role and get some terrific advice for your Data Science resume in our interview with the Hiring Manager, Nick Larusso...
 
 

Training & Resources

 
  • Getting Started with Markov Chains
    In this post, we’ll explore some basic properties of discrete time Markov chains using the functions provided by the markovchain package supplemented with standard R functions and a few functions from other contributed packages...
 
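The post above works in R with the markovchain package; as a language-neutral companion, here is a minimal discrete-time Markov chain sketch in Python/NumPy. The two-state "weather" transition matrix and the function names are made up for illustration and are not taken from the post:

```python
import numpy as np

# Row-stochastic transition matrix for a toy two-state weather chain:
# row i gives the probabilities of moving from state i to each state.
states = ["sunny", "rainy"]
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

def stationary(P):
    """Stationary distribution: the left eigenvector of P for
    eigenvalue 1, found here by iterating the chain to convergence."""
    pi = np.ones(P.shape[0]) / P.shape[0]
    for _ in range(1000):
        pi = pi @ P
    return pi / pi.sum()

def simulate(P, start=0, steps=10, seed=0):
    """Sample a path by drawing each next state from the current row."""
    rng = np.random.default_rng(seed)
    path = [start]
    for _ in range(steps):
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

pi = stationary(P)   # ≈ [0.833, 0.167] for this matrix
```

The same two objects — a transition matrix and its stationary distribution — are what the markovchain package's `steadyStates` function computes on the R side.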
 

Books

 

  • Superforecasting: The Art and Science of Prediction

    Interesting take on prediction, drawing on decades of research and the results of a massive, government-funded forecasting tournament (The Good Judgment Project) involving tens of thousands of ordinary people...

    "Superforecasting is the rare book that is both scholarly and engaging. The lessons are scientific, compelling, and enormously practical..."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
 
 
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian
 
Sign up to receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe. No spam — we keep your email safe and do not share it.