Data Science Weekly Newsletter - Issue 333

Issue #333

Apr 9 2020

Editor Picks
 
  • Data Science: Reality Doesn't Meet Expectations
    I had high hopes about the potential impact of being a Data Scientist...My expectations did not meet reality...Below are the seven most common (and at times flagrant) ways that data science has failed to meet expectations in industry. Throughout each section, I’ll propose solutions to these shortcomings: 1.) People don’t know what “data science” does, 2.) Data science leadership is sorely lacking, 3.) Data science can’t always be built to specs, 4.) You’re likely the only “data person”, 5.) Your impact is tough to measure; data doesn’t always translate to value, 6.) Data & infrastructure have serious quality problems, and 7.) Data work can be profoundly unethical. Moral courage required...
  • Monitoring Machine Learning Models in Production
    This comprehensive guide aims, at the very least, to make you aware of where the complexity in monitoring machine learning models in production comes from and why it matters, and to provide a practical starting point for implementing your own ML monitoring solutions...
  • Time Series Forecasting Best Practices & Examples
    This repository provides examples and best practice guidelines for building forecasting solutions. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in forecasting algorithms to build solutions and operationalize them. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around processing and featurizing the data, optimizing and evaluating models, and scaling up to the cloud...The examples and best practices are provided as Python Jupyter notebooks and R markdown files and a library of utility functions...
 
 

A Message from this week's Sponsor:

 

 
Data scientists are in demand on Vettery

Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.
 

 

Data Science Articles & Videos

 
  • An Overview of Early Vision in InceptionV1
    A guided tour of the first five layers of InceptionV1, taxonomized into “neuron groups.”...By limiting ourselves to early vision, this article “only” considers the first 1,056 neurons of InceptionV1. But our experience is that a thousand neurons is more than enough to be disorienting when one begins studying a model. Our hope is that this article will help readers avoid this disorientation by providing some structure and handholds for thinking about them...
  • Learning Agile Robotic Locomotion Skills by Imitating Animals
    Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics. While manually-designed controllers have been able to emulate many complex behaviors, building such controllers involves a time-consuming and difficult development process, often requiring substantial expertise in the nuances of each skill...In this work, we present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals. We show that by leveraging reference motion data, a single learning-based approach is able to automatically synthesize controllers for a diverse repertoire of behaviors for legged robots...
  • How to build a platform data scientists will love
    New data science and machine learning platforms are popping up almost every week. That’s because vendors are building tools to optimize the data science workflow...They are not, however, seeing mass adoption yet. Most data scientists still prefer to work locally with small data sets and free open source tools...So how can you design a data science platform that will make them change their ways? In a nutshell, you need to match the efficiency of local development but also address its weaknesses, such as collaboration and reporting...That’s why the following elements are the most important in designing a data science platform...
  • On the Link Between Polynomials and Optimization
    There's a fascinating link between minimization of quadratic functions and polynomials. It is a link that goes deep and allows us to phrase optimization problems in the language of polynomials and vice versa. Using this connection, we can tap into centuries of research in the theory of polynomials and shed new light on old problems...This post deals with a connection between optimization algorithms and polynomials...A popular class of algorithms for this problem are gradient-based methods...
  • Why It’s So Freaking Hard To Make A Good COVID-19 Model
    Using a mathematical model to predict the future is valuable for experts, even if there are vast gulfs between possible outcomes. But it’s not always easy to make sense of the results and how they change over time, and that confusion can hurt both your brain and your heart. That’s why we want to talk about what goes into a model of a pandemic. Hopefully, understanding the uncertainty can help you get the most out of all the numbers flying around...
  • 3 Reasons Why We Are Far From Achieving Artificial General Intelligence
    How far are we from achieving Artificial General Intelligence? We answer this through the study of three limitations of current machine learning...It happened again. Last week, as I was explaining my job to someone, they interrupted me and said "So you're building Skynet". I felt like I had to show them this meme, which I thought described my current situation pretty well...
  • Are there specific things to keep in mind when working on a classifying set with only 8% positives ? [Reddit Discussion]
    I am working on my thesis and gathered a dataset from the company I am working with. The goal of the project is to predict customer churn...In total I managed to get 1,600 rows of customers that did not churn and 140 of those who did...Because only 8% of the whole set are positives, I started to wonder if I have to be careful or take some necessary steps to make sure the classifying goes well...Does anyone have experience with a similar scenario and could give me some hints?...
  • A conversation with Kevin Scott, author of “Reprogramming the American Dream”
    In his new book, “Reprogramming the American Dream,” Kevin Scott, Microsoft’s chief technology officer, looks at how he went from a childhood in rural Virginia to being a leader in the field of AI – and why he thinks there is ample opportunity for people from all walks of life to take advantage of AI to achieve the American dream...We recently had the opportunity to talk to him about his life, book and career...
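
    A quick aside on the Reddit churn question above: the arithmetic behind the 8% figure shows why plain accuracy is a misleading yardstick for a dataset like this. The sketch below is our own illustration, not taken from the thread. It assumes the poster's counts are 1,600 non-churners and 140 churners, and it uses the common "balanced" class-weight heuristic, n_samples / (n_classes * class_count), which is the formula scikit-learn applies for class_weight="balanced". All variable names are hypothetical.

```python
# Counts assumed from the Reddit post: 1,600 non-churners, 140 churners.
n_negative = 1600
n_positive = 140
n_total = n_negative + n_positive

# Share of positives (churners) in the whole set.
positive_rate = n_positive / n_total

# A trivial baseline that always predicts "no churn" looks accurate
# but never finds a single churner.
baseline_accuracy = n_negative / n_total
baseline_recall = 0 / n_positive

# "Balanced" class weights: n_samples / (n_classes * class_count).
# Each class then contributes the same total weight to the loss.
n_classes = 2
weight_negative = n_total / (n_classes * n_negative)
weight_positive = n_total / (n_classes * n_positive)

print(f"positive rate:      {positive_rate:.3f}")
print(f"baseline accuracy:  {baseline_accuracy:.3f}")
print(f"baseline recall:    {baseline_recall:.3f}")
print(f"weight (no churn):  {weight_negative:.3f}")
print(f"weight (churn):     {weight_positive:.3f}")
```

    The takeaway is that a model can score above 90% accuracy here while being useless for the task, so metrics such as recall, precision, or a precision-recall curve, together with stratified train/test splits, are usually a better starting point.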
 
 

Conference*

 

 
The Premier Machine Learning Conference

5 days, 8 tracks, 160 speakers and over 150 exciting sessions

Join Machine Learning Week 2020, May 31 – June 4, Las Vegas! It brings together five co-located events: PAW Business, PAW Financial, PAW Industry 4.0, PAW Healthcare, Deep Learning World. This event is where to meet the who’s who and keep up on the latest techniques, making it the leading machine learning event that excites and unites. You can expect top-class experts from world-famous companies such as Google, Microsoft, Lyft, Verizon, Visa and LinkedIn!

Secure your ticket now!

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
 

 

Jobs

 
  • Head of Data Science - Tessian - London, United Kingdom

    Our mission is to secure the Human Layer. This involves deploying near real-time machine learning models at massive scale to some of the world’s largest organisations to keep their most sensitive data private and secure. To do this, we're looking for an inspiring Head of Data Science ready to lead and grow our Data Science team, who is excited about the opportunities and challenges that come with building and deploying real-time production models.

    Find out more about life as a Tessian Engineer...

        Want to post a job here? Email us for details >> team@datascienceweekly.org
 

 

Training & Resources

 
  • PyTorch Developer Conference 2019 [28 Videos]
    Watch the full set of talks from the 2019 PyTorch Developer Conference. Deep dive on PyTorch 1.3 and new tools and libraries including PyTorch Mobile, CrypTen, Captum, Detectron2 and more. Hear from AI researchers and engineers from leading organizations in academia and industry on how they’re using PyTorch for both cutting edge research and production...
  • A practical approach to Tree Pruning using sklearn
    As we have already discussed in the regression tree post, a simple tree prediction can lead to a model which overfits the data and produces bad results with the test data. Tree Pruning is the way to reduce overfitting by creating smaller trees...Tree Pruning isn’t only used for regression trees. We also make use of it in classification trees...As the word itself suggests, the process involves cutting the tree into smaller parts...We can do pruning in two ways...
  • Introduction to TensorFlow Lite for Microcontrollers
    Overview of the TensorFlow Lite Micro framework for embedded machine learning, including a discussion of the design tradeoffs around choosing a machine learning library and practical exercises to try out in the browser...
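
    For readers curious about the tree pruning item above, here is a minimal sketch of one way to prune a tree in scikit-learn: cost-complexity (post-)pruning via the ccp_alpha parameter. It is our own illustration and not necessarily the approach the linked post takes. The dataset, the split, and the choice of the middle alpha are assumptions made purely for demonstration; in practice alpha would normally be chosen by cross-validation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Example dataset bundled with scikit-learn (an assumption for this sketch).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fully grown tree: no pruning, prone to overfitting.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Candidate alphas along the cost-complexity pruning path.
path = full_tree.cost_complexity_pruning_path(X_train, y_train)
alphas = path.ccp_alphas

# Pick the middle alpha purely for illustration.
alpha = alphas[len(alphas) // 2]
pruned_tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
pruned_tree.fit(X_train, y_train)

print("nodes (full):  ", full_tree.tree_.node_count)
print("nodes (pruned):", pruned_tree.tree_.node_count)
print("test accuracy (full):  ", full_tree.score(X_test, y_test))
print("test accuracy (pruned):", pruned_tree.score(X_test, y_test))
```

    Larger alphas remove more of the tree. Pre-pruning options such as max_depth or min_samples_leaf limit growth up front instead of cutting the tree back afterwards.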

 

Books

 

  • Data Science in Production: Building Scalable Model Pipelines with Python

    This book provides a hands-on approach to scaling up Python code to work in distributed environments in order to build robust pipelines. Readers will learn how to set up machine learning models as web endpoints, serverless functions, and streaming pipelines using multiple cloud environments. It is intended for analytics practitioners with hands-on experience with Python libraries such as Pandas and scikit-learn, and will focus on scaling up prototype models to production...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
 