Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
September 24, 2020

Editor's Picks

  • The Economics of AI Today
    Economists have been studying the relationship between technological change, productivity and employment since the beginning of the discipline with Adam Smith’s pin factory. It should therefore not come as a surprise that AI systems able to behave appropriately in a growing number of situations - from driving cars to detecting tumours in medical scans - have caught their attention...In September 2017, a group of distinguished economists gathered in Toronto to set out a research agenda for the Economics of Artificial Intelligence (AI). They covered questions such as what is economically unique about AI, what will be its impacts, and what are the right policies to spread its benefits...Last September I had the privilege of attending the third edition of this conference in Toronto and witnessing first-hand how the Economics of AI agenda has evolved. Here, I outline the key themes of the conference and relevant papers at four levels...
  • An Opinionated Guide to ML Research
    In this essay, I provide some advice to up-and-coming researchers in machine learning (ML), based on my experience doing research and advising others. The advice covers how to choose problems and organize your time...I originally wrote this guide back in December 2017 for the OpenAI Fellows program...

A Message From This Week's Sponsor

Data scientists are in demand on Vettery

Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.

Data Science Articles & Videos

  • How we scaled AI Dungeon 2 to support over 1,000,000 users
    Back in March 2019, I built a hackathon project called AI Dungeon. The project was a classic text adventure game, with a twist. The text of the story, and the potential actions you were presented, were all generated with machine learning...
  • Deep reinforcement learning for supply chain and price optimization
    Supply chain and price management were among the first areas of enterprise operations that adopted data science and combinatorial optimization methods and have a long history of using these techniques with great success. Although a wide range of traditional optimization methods are available for inventory and price management applications, deep reinforcement learning has the potential to substantially improve the optimization capabilities for these and other types of enterprise operations due to impressive recent advances in the development of generic self-learning algorithms for optimal control. In this article, we explore how deep reinforcement learning methods can be applied in several basic supply chain and price management scenarios...
  • Learning to See Transparent Objects
    Enabling machines to better sense transparent surfaces would not only improve safety, but could also open up a range of new interactions in unstructured applications — from robots handling kitchenware or sorting plastics for recycling, to navigating indoor environments or generating AR visualizations on glass tabletops...To address this problem, we teamed up with researchers from Synthesis AI and Columbia University to develop ClearGrasp, a machine learning algorithm that is capable of estimating accurate 3D data of transparent objects from RGB-D images...
  • Quantifying Independently Reproducible Machine Learning
    How reproducible is the latest ML research, and can we begin to quantify what impacts its reproducibility? This question served as motivation for my NeurIPS 2019 paper. Based on a combination of masochism and stubbornness, over the past eight years I have attempted to implement various ML algorithms from scratch. This has resulted in an ML library called JSAT. My investigation into reproducible ML has also relied on personal notes and records hosted on Mendeley and GitHub. With these data, and clearly no instinct for preserving my own sanity, I set out to quantify and verify reproducibility...
  • First steps with ESP32 and TensorFlow Lite for Microcontrollers
    I have no doubt tiny edge devices will take a meaningful place in our lives soon. Since Moore’s Law is applicable to such devices, we are spectators to the maturing of mobile, embedded, wearable and implantable (augmenting) electronic devices with enough computational power to use AI...some hands-on exercises with TensorFlow Lite for Microcontrollers. I prefer an approach with gradually increasing complexity...
  • Growing Neural Cellular Automata
    Imagine if we could design systems of the same plasticity and robustness as biological life: structures and machines that could grow and repair themselves. Such technology would transform the current efforts in regenerative medicine, where scientists and clinicians seek to discover the inputs or stimuli that could cause cells in the body to build structures on demand as needed. To help crack the puzzle of the morphogenetic code, and also exploit the insights of biology to create self-repairing systems in real life, we try to replicate some of the desired properties in an in silico experiment...
  • A new model and dataset for long-range memory
    This blog introduces a new long-range memory model, the Compressive Transformer, alongside a new benchmark for book-level language modelling, PG19. We provide the conceptual tools needed to understand this new research in the context of recent developments in memory models and language modelling...


Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.
The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!


  • Director of Data Science - Komodo Health - NYC

    Komodo Health is addressing the global burden of disease through the development of the world’s most actionable map of healthcare data. As a fast-growing startup that has already partnered with multiple Fortune 500 companies, we have very ambitious goals that have been designed with career development in mind.
    We are looking for a Director of Data Science to play a critical role in the success of our growing Data Science team. You will lead a group of data scientists and data engineers within the Data Science team that is involved in all aspects of building data products...

        Want to post a job here? Email us for details >>

Training & Resources

  • A comprehensive guide to downloading stock prices in Python
    The goal of this short article is to show how easy it is to download stock prices (and stock-related data) in Python. In this article I present two approaches, both using Yahoo Finance as the data source. There are many alternatives out there (Quandl, Intrinio, AlphaVantage, Tiingo, IEX Cloud, etc.), but Yahoo Finance can be considered the most popular as it is the easiest one to access (free and no registration required)...
  • Yoshua Bengio has started a blog
    I often write comments and posts on social media but these tend to be only temporarily visible, so I thought I needed a place to couch some of my thoughts that would be more permanent and easier to find. This blog is intended to cover both research questions and broader questions — often about our society — which occupy my thoughts...
  • kNN classification using Neighbourhood Components Analysis
    While reading related work...I stumbled upon a reference to a classic paper from 2004 called Neighbourhood Components Analysis (NCA). After giving it a read, I was instantly charmed by its simplicity and elegance. Long story short, NCA allows you to learn a linear transformation of your data that maximizes k-nearest neighbours performance. By forcing the transformation to be low-rank, NCA will perform dimensionality reduction, leading to vastly reduced storage sizes and search times for kNN! NCA is a very useful algorithm to have in your toolkit – just like PCA – but it’s very rarely mentioned in the wild. In fact, I couldn’t find any tutorial or reference outside of academic papers. This post is an attempt to rectify this...
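
    To make the NCA idea above concrete, here is a minimal sketch of the quantity NCA tries to maximize: the expected number of points that are classified correctly when each point picks a neighbour at random, with closer neighbours (after the linear transformation) being more likely. This is not the code from the linked post. The function name, the toy data and the two candidate matrices are all made up for illustration, and the sketch only scores a given transformation; it does not perform the gradient-based optimization that finds one.

```python
import math


def nca_objective(points, labels, A):
    """Score a linear map A (a list of rows) by the expected number of points
    whose randomly chosen neighbour shares their label (hypothetical helper)."""
    # Project every point: z = A x
    projected = [[sum(a * x for a, x in zip(row, p)) for row in A] for p in points]

    total = 0.0
    for i, zi in enumerate(projected):
        # Unnormalized neighbour weights: exp(-squared distance); a point never picks itself
        weights = []
        for j, zj in enumerate(projected):
            if i == j:
                weights.append(0.0)
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(zi, zj))
                weights.append(math.exp(-d2))
        norm = sum(weights)
        if norm == 0.0:
            continue  # all neighbours infinitely far away; contributes nothing
        # Probability that point i picks a neighbour with the same label
        same = sum(w for w, lab in zip(weights, labels) if lab == labels[i])
        total += same / norm
    return total


# Toy data (made up): feature 0 separates the classes, feature 1 is noise
points = [(0.0, 0.0), (0.2, 3.0), (3.0, 0.1), (3.2, 3.1)]
labels = [0, 0, 1, 1]

keep_informative = [[1.0, 0.0]]  # rank-1 map onto the informative feature
keep_noise = [[0.0, 1.0]]        # rank-1 map onto the noisy feature

score_informative = nca_objective(points, labels, keep_informative)
score_noise = nca_objective(points, labels, keep_noise)
print(score_informative, score_noise)
```

    Both candidate matrices reduce the data from two dimensions to one, which mirrors the low-rank dimensionality reduction described above. The map that keeps the informative feature scores far higher, and a full NCA implementation would search for such a map by following the gradient of this score.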


  • Data Science in Production: Building Scalable Model Pipelines with Python
    This book provides a hands-on approach to scaling up Python code to work in distributed environments in order to build robust pipelines. Readers will learn how to set up machine learning models as web endpoints, serverless functions, and streaming pipelines using multiple cloud environments. It is intended for analytics practitioners with hands-on experience with Python libraries such as Pandas and scikit-learn, and will focus on scaling up prototype models to production...
    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page


    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
