Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
July 21, 2016

Editor's Picks

  • Creating a Beer Recommendation Engine
    Beer is one of my passions. I’m an award-winning homebrewer. I’ve judged beer competitions. I’m an active member in my local homebrewing clubs. I’ve reviewed just under 1000 unique beers on Untappd. I have several floorplan ideas for the taproom of the craft brewery I will definitely someday own. I’m drinking a beer while I’m writing this post. My goal is to create a recommendation engine for beer that is actually useful...
  • Central Limit Theorem – Interactive Data Visualization Explanation
    This is an attempt to visually explain the core concepts of the Central Limit Theorem. By providing a variety of interactive components, this page seeks to provide an intuitive understanding of one of the foundational theories behind inferential statistics. It draws inspiration from other visual explanations...
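
The idea behind the Central Limit Theorem item above can also be checked numerically. The snippet below is a minimal sketch, not taken from the linked page: it assumes an Exponential(1) source distribution, and the helper names `sample_means` and `skewness` are hypothetical. It draws many samples of size n, averages each one, and reports how the spread and skewness of those averages shrink as n grows.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Return `trials` sample means, each over n draws from Exponential(1).

    Exponential(1) is an assumption chosen because it is clearly skewed.
    """
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]


def skewness(xs):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return statistics.fmean([((x - mean) / std) ** 3 for x in xs])


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (1, 5, 30, 200):
    means = sample_means(n, 5000, rng)
    print(
        f"n={n:>3}  mean={statistics.fmean(means):.3f}  "
        f"std={statistics.stdev(means):.3f}  skew={skewness(means):.3f}"
    )
```

Under these assumptions the standard deviation of the sample means should fall roughly as 1/sqrt(n), and the skewness should drift toward 0, which is the bell-shaped behavior the theorem describes.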

A Message From This Week's Sponsor

  • Where science and policy change the world. And You.

    Apply your knowledge & skills to federal policy via the AAAS Science & Technology Policy Fellowships. A year-long professional development opportunity for doctoral level data scientists to serve in the federal government in Washington, D.C.
    STPF fosters a career-enhancing network of science leaders who understand policymaking & contribute to society...

Data Science Articles & Videos

  • What we’ve learned about brands in London from 5 million Instagram posts
    Like any modern fashion mecca and large financial center, London is big on Instagram, so it’s not surprising that it is the most Instagrammed city in Great Britain and the second in the world, after New York and followed by Paris. What do Londoners and guests of the city Instagram about? What places do they like the most? Where do they feel miserable?...
  • Data Science at Zymergen
    Zymergen is an SF Bay Area startup that uses software, robotics, and advanced genetic engineering techniques to make industrial microbes more effective at producing particular chemicals, or even to create brand new compounds. I recently joined Zymergen to manage the Core Infrastructure and Data Science teams...
  • A Review of Travel Chatbots
    Travel search is one of the most common use cases for chatbots. We reviewed the 5 main travel bots on Facebook Messenger... Some bots understand complex text input; others require information to be provided one piece at a time in a ping-pong conversation...
  • Probabilistic Filters By Example
    Probabilistic filters are high-speed, space-efficient data structures that support set-membership tests with a one-sided error. These filters can claim that a given entry is definitely not represented in a set of entries, or might be represented in the set. That is, negative responses are conclusive, whereas positive responses incur a small false positive probability (FPP). Below is a side-by-side simulation of the inner workings of Cuckoo and Bloom filters....
  • Virtual Worlds as Proxy for Multi-Object Tracking Analysis
    Modern computer vision algorithms typically require expensive data acquisition and accurate manual labeling. In this work, we instead leverage the recent progress in computer graphics to generate fully labeled, dynamic, and photo-realistic proxy virtual worlds...
  • Understanding Bias: A Pre-requisite For Trustworthy Results
    It turns out that it’s shockingly easy to do some very reasonable things with data (aggregate, slice, average, etc.), and come out with answers that have 2000% error! In this post, I want to show why that’s the case using some very simple, intuitive pictures...
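
The one-sided error described in the probabilistic filters item above can be seen in a few lines of code. This is a minimal Bloom filter sketch, not the simulation from the linked article: the class name `BloomFilter`, its parameters, and the choice of salted SHA-256 as the hash family are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus k salted hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False is conclusive; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter(num_bits=1024, num_hashes=3)
for word in ("cuckoo", "bloom", "filter"):
    bf.add(word)

print(bf.might_contain("bloom"))     # True: added items are never reported missing
print(bf.might_contain("quotient"))  # usually False; True would be a false positive
```

Adding an item only ever sets bits, so a lookup that finds any of its bits unset proves the item was never added. Bits set by other items are what produce the occasional false positive.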


Jobs

  • Senior Data Scientist - British Geological Survey - Keyworth, UK
    The British Geological Survey is one of the world's leading and forward-thinking geological science institutes. A vacancy has arisen for a Senior Data Scientist in Keyworth, Nottingham.
    Starting salary is £35,222 pa to £38,254+ pa depending on qualifications and experience. 
    To apply, please go to and submit your CV and covering letter.  Applicants who would like to receive this advert in an alternative format (e.g. large print, Braille, audio or hard copy), or who are unable to apply online should telephone 01793 867003.
    Closing date is 24 July 2016.

Training & Resources

  • Tuning a scikit-learn estimator with skopt
    Tuning the hyper-parameters of a machine learning model is often carried out using an exhaustive exploration of (a subset of) the space of all hyper-parameter configurations (e.g., using sklearn.model_selection.GridSearchCV), which often results in a very time-consuming operation. In this notebook, we illustrate how skopt can be used to tune hyper-parameters using sequential model-based optimisation, hopefully resulting in equivalent or better solutions, but in fewer evaluations...

