Data Science Weekly Newsletter - Issue 383

Issue #383

Mar 25 2021

Editor Picks
  • AI and Drug Discovery: Attacking the Right Problems
    I’ve been meaning to write some more about artificial intelligence, machine learning, and drug discovery, and this paper (open access) by Andreas Bender is an excellent starting point. I’m going to be talking in fairly general terms here, but for practitioners in the field, I can recommend this review of the 2020 literature by Pat Walters, which will take you through a number of important topics and where they seem to be heading...
  • My Love / Hate Relationship With Jupyter
    Jupyter notebooks are the absolute worst thing ever. That is, until I imagine trying to do data exploration and manipulation in some other platform and then they become the best thing ever... As a Data Scientist, I haven't found something that fits my needs better than the Jupyter notebook. As a Machine Learning Engineer - I want those notebooks nowhere near my production systems. Let's talk about why...

A Message from this week's Sponsor:


Get to know our data science Ambassadors, fueled by Z Workstations and Nvidia

From exclusive webinars to the latest research, find everything you need to power your world-changing insights at The Edge. Drop by and meet our Ambassadors to learn what the brightest minds in the field are saying about the future of our industry.

Take me there.


Data Science Articles & Videos

  • GeoGuessing with Deep Learning
    GeoGuessr is a geographic discovery game. You are dropped into a random Google Street View and tasked with pointing out your location on a map. You can look around, zoom, and follow the car’s path through the local streets...I once read that machine learning is currently capable of doing anything a human can do in under one second. Recognize a face, pick out some text from an image, swerve to avoid another car. This got me thinking, and thinking led me to a paper called Geolocation Estimation of Photos using a Hierarchical Model and Scene Classification by Eric Müller-Budack, Kader Pustu-Iren, and Ralph Ewerth...
  • Investigating Microsoft’s Transformation under Satya Nadella
    The blog post walks through how I used NLP tools in Python to derive interesting observations about the changes in Microsoft’s philosophy and strategy under CEO Satya Nadella, and how he was able to lead a successful turnaround at the company...
  • A Tricksy Look at Outfitting: Kitsune
    Ever since getting involved in e-commerce data science, one of the things I have been thinking about has been automating outfit creation. The idea of outfitting is that, for a given seed product, we should be able to return a set of complementary products that go along with it to create an outfit. For me, outfitting is an interesting problem area because it isn’t that well-defined or mapped out, and in these spaces where there are no real right answers or accepted methods, it means I can satisfy my thirst for adventure and shenanigans....
  • Exploring Predicting Saves for Pitchers in MLB
    I woke up on a cold Chicago January morning to the exciting news that the White Sox had recently signed a new player, Liam Hendriks. For an up and coming team, adding an All-Star to anchor their bullpen was an exciting prospect...could I predict how well he would do for the White Sox in the coming season? And second, is there a way to predict saves for closers generally, even for pitchers whose careers change trajectory as much as Hendriks’ had?...
  • Proper Name Detection
    Detecting names in a user message is a common challenge when designing a virtual assistant. It’s also an issue that is more complicated than many people initially think...
  • Defining What You Climb: “Am I a 5.11 climber yet?”
    An analysis of the difference between men and women when they say they climb ‘5.something’...Every once in a while I’ll meet a climber and they’ll ask me, “What grade do you climb?” Instantly my mind has flown off into calculation mode...So how do other people decide what grade they climb? Is it once they’ve climbed one of that grade? Five? A dozen? And does this differ by gender?...I decided to go about answering this question by using data science...
  • AI names colors much as humans do
    When two artificial neural networks are tasked with creating a way to communicate with each other about what colors they see, they develop systems that balance complexity and accuracy much as people do...
  • iMAP: Implicit Mapping and Positioning in Real-Time
    We show for the first time that a multilayer perceptron (MLP) can serve as the only scene representation in a real-time SLAM system for a handheld RGB-D camera. Our network is trained in live operation without prior data, building a dense, scene-specific implicit 3D model of occupancy and colour which is also immediately used for tracking...
  • The heartfelt story of me building a League of Legends win interpreter for hard-stuck Silver II players
    League of Legends is a 5v5 team-based game where each player can select a champion (out of 154 options) and role (top, jungle, mid, adc/bot, support/bot). The game’s competitive scene is vibrant and global, with a *definitely not toxic* ranked match-making option where players can compete amongst themselves and climb the ELO ladder to rank up. I wanted to see if a machine learning model could answer the common question players might have about their matches: “what did I do this game that helped or hurt my chances of winning, and how much did each thing that happened matter?”...
  • Leveraging Machine Learning for Game Development
    Today, we present an approach that leverages machine learning (ML) to adjust game balance by training models to serve as play-testers, and demonstrate this approach on the digital card game prototype Chimera ... By running millions of simulations using trained agents to collect data, this ML-based game testing approach enables game designers to more efficiently make a game more fun, balanced, and aligned with their original vision...



Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.

The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more ...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



Jobs

  • Data Scientist - HelloFresh - Chicago, IL or New York, NY

    Embedded in the NYC Tech Hub, we are building a cross-functional team of data scientists, analysts and engineers with the mission to bring the modeling and analytical capabilities of our marketing organization to the next level.

    As a Data Scientist, you will support the analytic needs of our Growth organization comprising Technology, Digital Product and Marketing. You will play a pivotal role in helping us continue to succeed as the leading global meal kit provider. This role will solve challenging problems using vast repositories of customer data to provide detailed and actionable insights; core responsibilities include the development and automation of Marketing BI tools, predictive modeling, professional-grade dashboarding and reporting for some of our most critical initiatives and enhancing and facilitating the information extraction process...

        Want to post a job here? Email us for details >>


Training & Resources

  • R vs. Python vs. Julia: How easy it is to write efficient code?
    If you are a Data Scientist, chances are that you program in either Python or R. But there is a new kid on the block named Julia that promises C-like performance without compromising the way Data Scientists write code and interact with data...In this post...we will solve a very simple problem where built-in implementations are available and where programming the algorithm from scratch is straightforward. The goal is to understand our options when we need to write efficient code...
  • Jupyter Notebook Tips and Improvements
    Jupyter notebooks are a common but powerful tool in data science and machine learning. They allow for fast prototyping, EDA, and modeling in notebooks where code, documentation, plots, and more can be viewed all at once. That being said, they do not make their most useful features very obvious, so in this post I will go over a few ways to make the Jupyter notebook experience even better...
  • Data Science in Julia for Hackers
    One of the first things to note about this book is that it is not an academic textbook...What we want to deliver is a mathematical and computational methodology to face concrete Data Science problems, that is, applying theory and science to real-world problems involving data. The relationship between theory and practice is complex. Considering them as a whole can take us much farther. These pages may offer the theorist a way to think about problematic situations in a more down to earth manner, and to the practitioner, stimulation to go beyond the mere application of programming libraries and tools...



Books

  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

    P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)

    All the best,
    Hannah & Sebastian