Data Science Weekly Newsletter
June 28, 2018

Editor's Picks

  • Self-Supervised Tracking via Video Colorization
    In “Tracking Emerges by Colorizing Videos”, we introduce a convolutional network that colorizes grayscale videos, but is constrained to copy colors from a single reference frame. In doing so, the network learns to visually track objects automatically without supervision. Importantly, although the model was never trained explicitly for tracking, it can follow multiple objects, track through occlusions, and remain robust over deformations without requiring any labeled training data...

A Message From This Week's Sponsor

Clark University: Transform Data Into Something Meaningful

Business Analytics at Clark University will give you the skills employers demand by teaching you how to synthesize data into powerful information. Whether you enroll in a full- or part-time master’s or accelerated certificate program, you will be equipped to make informed decisions and improve organizational performance.
You don’t need a background in statistics or science to succeed here. We offer:
  • Blended curriculum
  • Career-ready courses
  • Affordable excellence
Move your career forward in one of the fields with the largest demand. Learn more at

Data Science Articles & Videos

  • Everyone Poops
    Here in San Francisco, human waste is a growing issue, both for the people who run into it and for the people who have no other option than to relieve themselves on public streets. This is a multi-faceted problem with many potential solutions, and one best addressed by social scientists. However, I think there is a place for data science in this conversation...
  • Baseball Pitch Recommendation: a look into the data science process.
    I’ll walk you through a model I created to recommend pitches to the Cubs in games against the Cardinals, and the steps I took to get there. (Technically, this model could help any team, or any talented pitcher, quite frankly, when throwing against Cardinals players, but my model is dedicated to my Cubs)...
  • Attentive GAN for Raindrop Removal from A Single Image
    Raindrops adhered to a glass window or camera lens can severely hamper the visibility of a background scene and degrade an image considerably. In this paper, we address the problem by visually removing raindrops, and thus transforming a raindrop degraded image into a clean one...
  • Add Constrained Optimization To Your Toolbelt
    At Stitch Fix, whenever we serve a customer, we must choose the right day to style that client’s fix, the right stylist for that client’s particular aesthetic, the right warehouse for that client’s shipping address. But stylists and warehouses are in high demand, with many clients competing for their time and attention. Constrained optimization helps us get work to stylists and warehouses in a manner that is fair and efficient, and gives our clients the best possible experience...
  • Data Dictionary: a how to and best practices
    A data dictionary is a list of key terms and metrics with definitions: a business glossary. While it sounds simple, almost trivial, its ability to align the business and remove confusion can be profound. In fact, a data dictionary is possibly one of the most valuable artifacts that a data team can deliver to the business...
  • Travel Time Optimization With Machine Learning And Genetic Algorithm
    What is the relationship between machine learning and optimization? — On the one hand, mathematical optimization is used in machine learning during model training, when we are trying to minimize the cost of errors between our model and our data points. On the other hand, what happens when machine learning is used to solve optimization problems?...


Jobs

eCommerce Data Science & Machine Learning Analyst - PepsiCo - NYC
Have a strong opinion about TensorFlow lacking an autoregressive dynamic network? So do we!
PepsiCo’s eCommerce Data Science and Analytics group is a team of data scientists, technology specialists, and business innovators who operate within eCommerce to build industry-leading systems and solutions. By focusing on machine learning and automation, the Data Science & Analytics group is pushing the bounds of possibility for PepsiCo and its strategic partners...

Training & Resources

  • The Hitchhiker’s Guide to Hyperparameter Tuning
    If you go around and ask people how they tune their models, their most likely answer will be “just write a script that does it for you”. Well, that’s easier said than done... Apparently, there are a few things you should keep in mind when implementing such a script. Here, at Taboola, we implemented a hyperparameter tuning script. Let me share with you the things we learned along the way...


Books

  • Text Mining with R: A Tidy Approach
    Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.