Data Science Weekly Newsletter

Issue 200

September 21, 2017

Editor's Picks

Machine-Vision Drones Monitor Animals in the African Savanna
Managing wild animals in remote areas requires accurate estimates of their numbers. Machine-vision drones can help...

europilot: A toolkit for controlling Euro Truck Simulator 2 with Python to develop self-driving algorithms
Europilot is an open-source project that leverages the popular Euro Truck Simulator 2 (ETS2) to develop self-driving algorithms. Think of europilot as a bridge between the game environment and your favorite deep-learning framework, such as Keras or TensorFlow. With europilot, you can capture the game screen input and programmatically control the truck inside the simulator...

Introducing: Unity Machine Learning Agents
As the world’s most popular creation engine, Unity is at the crossroads between machine learning and gaming. It is critical to our mission to enable machine learning researchers with the most powerful training scenarios, and for us to give back to the gaming community by enabling them to utilize the latest machine learning technologies. As the first step in this endeavor, we are excited to introduce Unity Machine Learning Agents...

A Message From This Week's Sponsor

Are you data curious? An aspiring data scientist?

On 9/27, join Metis for a free, online event featuring 25+ incredible speakers from the data science field. Speakers will demystify data science and discuss the training, tools, and career path to one of the world's hottest jobs. Every registrant gets access to bonus material from some of the industry's most influential thought-leaders. Secure your spot today! #DemistifyDS

Data Science Articles & Videos

Predicting NFL Plays with the xgboost Decision Tree Algorithm
In all levels of football, on-field trends are typically discerned exclusively through voluminous film study of opponent history, and decisions are made using anecdotal evidence and gut instinct. In isolation, these methods are highly inefficient and prone to human error. Enter the play predictor: a tool that aims to enhance in-game NFL decision making by predicting, in real time and with high accuracy, the type of play the opposing team will run...

Data Security for Data Scientists
Ten practical tips for protecting your data (and more importantly, everyone else’s!)...

Optical Effects in User Interfaces (for True Nerds)
How to make optically balanced icons, correct shapes alignment, and perfect corner rounding. Over 50 pics included...

Kullback-Leibler Divergence Explained
In this post we're going to take a look at a way of comparing two probability distributions called Kullback-Leibler Divergence (often shortened to just KL divergence). Very often in Probability and Statistics we'll replace observed data or complex distributions with a simpler, approximating distribution. KL Divergence helps us to measure just how much information we lose when we choose an approximation...
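
For two discrete distributions p and q over the same outcomes, KL divergence is the sum of p(x) * log(p(x) / q(x)). The sketch below is a minimal illustration of that formula, not code from the linked post: the helper name kl_divergence and the example probabilities are made up for this example.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(p || q) in nats for two discrete distributions.

    Hypothetical helper for illustration; p and q are sequences of
    probabilities over the same outcomes, each summing to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # outcomes with p(x) = 0 contribute nothing by convention
        if q_i == 0:
            return math.inf  # q rules out an outcome that p allows
        total += p_i * math.log(p_i / q_i)
    return total


# Made-up "observed" distribution and a simpler uniform approximation.
observed = [0.5, 0.3, 0.2]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(kl_divergence(observed, uniform))  # information lost approximating observed by uniform
print(kl_divergence(uniform, observed))  # a different value: KL divergence is not symmetric
```

Because the two directions give different values, KL divergence is not a distance in the usual sense; which distribution plays the role of p depends on which one is being approximated.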

Learning to Optimize with Reinforcement Learning
Yet, there is a paradox in the current paradigm: the algorithms that power machine learning are still designed manually. This raises a natural question: can we learn these algorithms instead? This could open up exciting possibilities: we could find new algorithms that perform better than manually designed algorithms, which could in turn improve learning capability...

3D Face Reconstruction from a Single Image
This is an online demo of our paper Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression...

Supporting Hypothesis
In September, Stripe is supporting the development of Hypothesis, an open-source testing library for Python created by David MacIver. Hypothesis is the only project we’ve found that provides effective tooling for testing code for machine learning, a domain in which testing and correctness are notoriously difficult...

Visualizing Distributions
Many charting taxonomies include distributions, but they only present a few options. Let’s remedy that with a post on the many. We’ll use a single (completely fake) data set so we can easily compare how each chart type displays the same data...

Jobs

Data Analyst - Glossier - New York, NY
Glossier is looking for a Senior Data Analyst to take our data practice to the next level. You will work closely with our Head of Data to provide data-driven insights to teams across the organization in order to inform strategic decision-making. You will take a leading role in shaping our Data practices, and you will use your insights to scope projects, propose approaches, and help to drive them to completion. If you enjoy finding the signal in the noise, bringing order and structure to inefficiencies, and know the mean time and standard deviation of your commute, then please apply...

Training & Resources

Tutorial: Implementing scikit-learn’s Random Forest Classifier for the 1st time
This tutorial walks you through implementing scikit-learn’s Random Forest Classifier on the Iris training set. It demonstrates the use of a few other functions from scikit-learn such as train_test_split and classification_report...
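
The workflow this tutorial describes can be sketched in a few lines. This is a minimal illustration that assumes scikit-learn is installed; the split size, random seeds, and number of trees are arbitrary choices for the example and are not taken from the tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Load the Iris data set bundled with scikit-learn.
iris = load_iris()

# Hold out a quarter of the samples for evaluation (assumed split, fixed seed).
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

# Fit a random forest; 100 trees is an illustrative setting.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Report per-class precision, recall, and F1 on the held-out samples.
predictions = model.predict(X_test)
print(classification_report(y_test, predictions, target_names=iris.target_names))
```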

Creating a Bar Chart
The very basics of how to create a bar chart or stacked bar chart with labels and an axis in Semiotic...

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particular order) Sergey Edunov, Myle Ott, and Sam Gross. The toolkit implements the fully convolutional model described in Convolutional Sequence to Sequence Learning and features multi-GPU training on a single machine as well as fast beam search generation on both CPU and GPU. We provide pre-trained models for English to French and English to German translation...

Books

Keeping Up with the Quants: Your Guide to Understanding and Using Analytics
"Perfect for professionals who want a better grounding in understanding and applying data results to business goals..."
For a detailed list of books covering Data Science, Machine Learning, AI, and associated programming languages, check out our resources page...

Reminder: if you enjoyed the first 200 newsletters and want many more, please make a donation to help keep us going :) - All the best, Hannah & Sebastian
