Data Science Weekly Newsletter - Issue 79

Issue #79

May 28, 2015

Editor Picks
 
  • The Unreasonable Effectiveness of Recurrent Neural Networks
    There's something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Fast forward about a year: I'm training RNNs all the time and I've witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me. This post is about sharing some of that magic with you...
  • I Let IBM’s Robot Chef Tell Me What to Cook for a Week
    Now, Chef Watson, developed alongside Bon Appetit magazine and several of the world’s finest flavor-profilers, has been launched in beta, enabling you to mash recipes according to ingredients of your own choosing and receive taste-matching advice which, reportedly, can’t fail. While some of the world’s foremost tech luminaries and conspiracy theorists are a bit skeptical about the wisdom of A.I., if it’s going to be used at all, allowing it to tell you what to make out of a fridge full of unloved leftovers seems like an inoffensive enough place to start. I decided to put it to the test...
  • Google working on technology that counts calories in food photos
    Google research scientist Kevin Murphy dropped a knowledge bomb on a crowd full of data scientists at the RE.WORK Deep Learning Summit Tuesday. While working on Google’s image recognition programs — algorithms that can analyze a photo and precisely identify items — he thought of a unique application: Counting calories by analyzing photographs of food...
 
 

Data Science Articles & Videos

 
  • 6 Tricks I Learned From The OTTO Kaggle Challenge
    Here are a few things I learned from the OTTO Group Kaggle competition. I had the chance to team up with great Kaggle Master Xavier Conort, and the French community as a whole has been very active...
  • ConvnetJS demo: Image "Painting"
    This demo treats the pixels of an image as a learning problem: it takes the (x,y) position on a grid and learns to predict the color at that point using regression to (r,g,b). It's a bit like compression, since the image information is encoded in the weights of the network, but almost certainly not of a practical kind :)...
  • Introducing ShArc: Shot Arc Analysis
    In this post, I begin my own foray into the NBA's big data world, but with a different focus. While it's common knowledge that SportVU data provides location in two dimensions for every player on the court, what may not be widely appreciated is that the ball itself is tracked in all three dimensions. When developing EPV with his students, Kirk Goldsberry code-named the work "XY Hoops". Consider the work below an "XYZ Hoops" project of sorts...
  • How Predictable is the English Premier League?
    This weekend the English Premier League season will conclude with little fanfare. Bar one relegation place, the league positions have already been determined. In fact, these positions were, for the most part, decided weeks ago. The element of uncertainty seems to have been reduced this season...
  • Why I Left My Data Science Master's Program
    I just completed the second of two finals to end the first semester of Berkeley's MIDS program--a new data science program created by the School of Information at UC Berkeley. It was disappointingly easy and expensive ($13k per semester for 5 semesters for an online program). The level of comprehension required to do well was about that of a Coursera course. And this is not to say that Coursera is easy; it isn't if you really dig your heels in. There is a higher level of accountability that comes with a structured program, but the incremental learning that came with the structure didn't make the degree worth it. I'm dropping the program today...
  • Algocracy
    Ever since the development of computers in the mid–20th century, algorithms have been used to increase business efficiency. Today, algorithms run our world...
  • An NPR Reporter Raced A Machine To Write A News Story. Who Won?
    Even the most creative jobs have parts that are pretty routine — tasks that, at least in theory, can be done by a machine. Take, for example, being a reporter. A company called Automated Insights created a program called WordSmith that generates simple news stories based on things like sporting events and financial news. The stories are published on Yahoo! and via the Associated Press, among other outlets. We wanted to know: How would NPR's best stack up against the machine?...
 
 

Jobs

 
  • Data Engineer - Blue Apron - NYC

    We’re looking for a Data Engineer to join our team to revolutionize the food chain through data. You will empower the organization through the ingestion, organization, manipulation, and delivery of fast, reliable, and consistent data from across our complex organization. You will help the Data Science team infer food preferences, create complex demand prediction models, quantitatively characterize recipe quality, build critical analytics to power a massive logistics operation, and identify innovative new data products that will change the face of the food industry...
 
 

Training & Resources

 
  • Top 10 data mining algorithms in plain English
    Today, I’m going to explain in plain English the top 10 most influential data mining algorithms as voted on by 3 separate panels in this survey paper. Once you know what they are, how they work, what they do and where you can find them, my hope is you’ll have this blog post as a springboard to learn even more about data mining...
  • Mean Shift Clustering
    Mean shift clustering is one of my favorite algorithms. It’s a simple and flexible clustering technique that has several nice advantages over other approaches. In this post I’ll provide an overview of mean shift and discuss some of its strengths and weaknesses...
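
For readers new to mean shift, the core idea can be sketched in a few lines: each point is moved repeatedly to the mean of the data points within a fixed radius, until it stops moving. The flat-kernel version below is an illustration only and is not taken from the linked post; the function names, the bandwidth of 1.0, and the sample data are all assumptions made for this sketch.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Shift a copy of every point toward a local density peak (flat kernel)."""
    modes = [tuple(p) for p in points]
    for _ in range(max_iter):
        moved = 0.0
        new_modes = []
        for m in modes:
            # Neighbours are the original points within `bandwidth` of the current position.
            near = [p for p in points if math.dist(m, p) <= bandwidth]
            # The new position is the coordinate-wise mean of those neighbours.
            centre = tuple(sum(c) / len(near) for c in zip(*near))
            moved = max(moved, math.dist(m, centre))
            new_modes.append(centre)
        modes = new_modes
        if moved < tol:  # stop once no position moved noticeably
            break
    return modes


def group_modes(modes, bandwidth):
    """Label each point by merging modes that lie closer than bandwidth / 2."""
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels


# Hypothetical sample data: two small, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centres, labels = group_modes(mean_shift(points, bandwidth=1.0), bandwidth=1.0)
```

With this sample data the six points collapse onto two modes, one per group. Note that the number of clusters is never specified up front; it falls out of the bandwidth, which is the main parameter to tune.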
 
 

Books

 

  • Python: Learn Python in One Day and Learn It Well

    Clear theory and a project to work through at the end...

    "I am a novice to programming and decided to learn Python as I'm told it is one of the easiest languages to learn. I read a few books on Python and this is definitely one of the best. The author is able to explain difficult concepts clearly, and the project at the end definitely helped my learning..."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
 
 
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian
 