Data Science Weekly Newsletter - Issue 194

Aug 10 2017

Editor Picks
 
  • A computer was asked to predict which start-ups would be successful.
    The results were astonishing

    In 2009, Ira Sager of Businessweek magazine set a challenge for Quid AI's CEO Bob Goodson: programme a computer to pick 50 unheard-of companies that are set to rock the world. Nearly eight years later, the magazine revisited the list to see how “Goodson plus the machine” had performed. The results surprised even Goodson: Evernote, Spotify, Etsy, Zynga, Palantir, Cloudera, OPOWER – the list goes on...
  • Dots vs. polygons: How I choose the right visualization
    When I start designing a map I consider: How do I want the viewer to read the information on my map? Do I want them to see how a measurement varies across a geographic area at a glance? Do I want to show the level of variability within a specific region? Or do I want to indicate busy pockets of activity or the relative volume/density within an area?...
  • PyTorch vs TensorFlow — spotting the difference
    In this post I want to explore some of the key similarities and differences between two popular deep learning frameworks: PyTorch and TensorFlow. Why those two and not the others? There are many deep learning frameworks, and many of them are viable tools; I chose those two simply because I was interested in comparing them specifically...
 
 


Data Science Articles & Videos

 
  • Transitioning entirely to neural machine translation
    Creating seamless, highly accurate translation experiences for the 2 billion people who use Facebook is difficult. We need to account for context, slang, typos, abbreviations, and intent simultaneously. To continue improving the quality of our translations, we recently switched from using phrase-based machine translation models to neural networks to power all of our backend translation systems, which account for more than 2,000 translation directions and 4.5 billion translations each day...
  • An Algorithm Trained on Emoji Knows When You’re Being Sarcastic on Twitter
    Detecting the sentiment of social-media posts is already useful for tracking attitudes toward brands and products, and for identifying signals that might indicate trends in the financial markets. But more accurately discerning the meaning of tweets and comments could help computers automatically spot and quash abuse and hate speech online. A deeper understanding of Twitter should also help academics understand how information and influence flow through the network. What’s more, as machines become smarter, the ability to sense emotion could become an important feature of human-to-machine communication...
  • Whiz Kid Invents an AI System to Diagnose Her Grandfather's Eye Disease
    Kopparapu and her team—including her 15-year-old brother, Neeyanth, and her high school classmate Justin Zhang—trained an artificial intelligence system to recognize signs of diabetic retinopathy in photos of eyes and offer a preliminary diagnosis. She presented the system last month...
  • Exploring the census income dataset using bubble plot
    When exploring a data set, we look at the connection between different features in the data and between the features and the target. This can give us a lot of insights about how we should formulate the problem, the required preprocessing (missing values, normalization), which algorithm we should use to build our model, whether we should segment our data and build different models for different subsets of our dataset, etc...
 
 

Jobs

 
  • Data Scientist - BuzzFeed - New York City, USA

    BuzzFeed’s data science team is diverse, coming from varying backgrounds, experiences, and skill sets. The team uses data-driven methods to power decisions, inform strategy, build robust data products, and identify opportunities for innovation across the company. We are true hybrids - software engineers, statisticians, mathematicians, domain experts and analysts - who specialize in translating questions into methodical approaches, experiments, and products. We think deeply about the limitations of data, and communicate our output coherently...
 
 

Training & Resources

 
  • hipsteR: re-educating people who learned R before it was cool
    I was an early adopter of R, having first learned S (yay!) and then S-plus (yuck!). But at times my knowledge of R seems stuck in 2001. I keep finding out about “new” R functions (like replicate, which was new in 2003). This is a tutorial for people like me, or people who were taught by people like me...
  • Tidyverse
    Welcome to the new and improved tidyverse website. We are working hard to make tidyverse.org the place to go to learn the tidyverse and to keep up to date with it as it evolves...
  • Diamond Part 1
    We are excited to announce Diamond, an open-source Python solver for certain kinds of generalized linear models. This post covers the mathematics used by Diamond. The sister post covers the specifics of Diamond. If you just want to use the package, check out the GitHub page...
 
 

Books

 

  • The Book of R: A First Course in Programming and Statistics

    "The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"...


    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
 
 
P.S. Want to be a Data Scientist? We've put together a comprehensive guide to help get you started. Check it out here! :) - All the best, Hannah & Sebastian
 