Data Science Weekly Newsletter

Issue 392

May 27, 2021

Editor's Picks

Object Detection from 9 FPS to 650 FPS in 6 Steps
Making code run fast on GPUs requires a very different approach to making code run fast on CPUs because the hardware architecture is fundamentally different...Machine learning engineers of all kinds should care about squeezing performance from their models and hardware — not just for production purposes, but also for research and training. In research as in development, a fast iteration loop leads to faster improvement...This article is a practical deep dive into making a specific deep learning model (Nvidia’s SSD300) run fast on a powerful GPU server, but the general principles apply to all GPU programming...

BMW writes code of ethics for AI in collaboration with the EU
BMW has decided to take a first step in the right direction and, even though AI today is far from what people imagine based on...Hollywood movies, the German brand decided to write an ethics code for the future. Right now, the Bavarian brand is already relying on some form of AI, to generate added value for customers, products, employees and processes. And while the applications may change over time, BMW says its focus will always be on people...the BMW Group has worked out seven basic principles covering the use of AI within the company...

Stanford Machine Learning Systems Seminar Series
In this seminar series, we want to take a look at the frontier of machine learning systems, and how machine learning changes the modern programming stack. Our goal is to help curate a curriculum of awesome work in ML systems to help drive research focus to interesting questions...Starting in Fall 2020, we’ll be livestreaming each talk in this seminar series Thursdays 3-4 PT on YouTube, and taking questions from the live chat, and videos of the talks will be available on YouTube afterwards as well...

A Message From This Week's Sponsor

Data scientists are in demand on Vettery

Get discovered by one of the thousands of hiring managers using Vettery to grow their companies’ data science teams. Here’s how it works: create a profile, name your salary, and connect with hiring managers from startups to Fortune 500 companies.
Get started - it’s completely free for job-seekers!

Data Science Articles & Videos

Shrinking the ‘data desert’: Inside efforts to make AI systems more inclusive of people with disabilities
Until recently, there hasn’t been enough relevant data to train machine learning algorithms to tackle...personalized object recognition for people with vision disabilities. That’s why City, University of London, a Microsoft AI for Accessibility grantee, has launched the Object Recognition for Blind Image Training (ORBIT) research project to create a public dataset from scratch, using videos submitted by people who are blind or have low vision...

Reinforcement learning is supervised learning on optimized data
The two most common perspectives on Reinforcement learning (RL) are optimization and dynamic programming...In this blog post we discuss a mental model for RL, based on the idea that RL can be viewed as doing supervised learning on the “good data”. What makes RL challenging is that, unless you’re doing imitation learning, actually acquiring that “good data” is quite challenging. Therefore, RL might be viewed as a joint optimization problem over both the policy and the data. Seen from this supervised learning perspective, many RL algorithms can be viewed as alternating between finding good data and doing supervised learning on that data...

What Color Is This, Part 2: The Computational Parts
We [Stitch Fix] have to find the colors of our clothes from their images, as we described in our last post about color. This will help us in many ways, including deciding what styles to buy and which clients to send them to. We described the hybrid human-computer approach, but only went into depth about the human part: translating our images into a hierarchy of colors. In this post, we’ll get into depth about the computational part: our current computer vision algorithm, some of our process in coming up with that algorithm, and ideas for what we’ll do next...

Understand TensorFlow by mimicking its API from scratch
The goal of this post is to build an intuition and understanding for how deep learning libraries work under the hood, specifically TensorFlow. To achieve this goal, we will mimic its API and implement its core building blocks from scratch. This has the neat little side effect that, by the end of this post, you will be able to use TensorFlow with confidence, because you’ll have a deep conceptual understanding of the inner workings. You will also gain further understanding of things like variables, tensors, sessions or operations...

Learning Adaptive Language Interfaces through Decomposition
Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings. We introduce a neural semantic parsing system that learns new high-level abstractions through decomposition: users interactively teach the system by breaking down high-level utterances describing novel behavior into low-level steps that it can understand...

Deep Learning for Procedural Content Generation
Procedural content generation in video games has a long history. Existing procedural content generation methods, such as search-based, solver-based, rule-based and grammar-based methods have been applied to various content types such as levels, maps, character models, and textures. A research field centered on content generation in games has existed for more than a decade...This article surveys the various deep learning methods that have been applied to generate game content directly or indirectly, discusses deep learning methods that could be used for content generation purposes but are rarely used today, and envisages some limitations and potential future directions of deep learning for procedural content generation...

Nemo: Data discovery at Facebook
Finding the right information can be hard for several reasons. The problem might be discovery — the relevant table might have an obscure or nondescript name, or different teams might have constructed overlapping data sets. Or, the problem could be one of confidence — the dashboard someone is looking at might have been superseded by another source six months ago...Many companies, such as Airbnb, Lyft, Netflix, and Uber, have built their own custom solutions for this challenge. For us [FB], it was important to make the data discovery process simple and fast. Funneling everything through data experts to locate the necessary data each time we need to make a decision was not scalable. So we built Nemo, an internal data discovery engine. Nemo allows engineers to quickly discover the information they need, with high confidence in the accuracy of the results...

Fast reinforcement learning through the composition of behaviours
A major limitation in RL is that current methods require vast amounts of training experience. For example, in order to learn how to play a single Atari game, an RL agent typically consumes an amount of data corresponding to several weeks of uninterrupted playing. A study led by researchers at MIT and Harvard indicated that in some cases, humans are able to reach the same performance level in just fifteen minutes of play...One possible reason for this discrepancy is that, unlike humans, RL agents usually learn a new task from scratch. We would like our agents to leverage knowledge acquired in previous tasks to learn a new task more quickly, in the same way that a cook will have an easier time learning a new recipe than someone who has never prepared a dish before. In an article recently published in the Proceedings of the National Academy of Sciences (PNAS), we describe a framework aimed at endowing our RL agents with this ability...

Deep Learning vs Puzzle Games
Is deep learning better suited to solving Flow Free than good old brute force techniques?...Deep learning problems nowadays mostly reduce to deciding which algorithm to use. I started off with A* search. Even if it isn’t deep learning per se, it gives a good idea of the inherent complexity of the problem, and gives us a chance to try out a few heuristics a more advanced algorithm could figure out on its own...
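
For readers who have not met A* before, the following is a minimal sketch of the algorithm on a small grid, assuming unit step costs, 4-connected moves, and a Manhattan-distance heuristic. It is a generic shortest-path example, not the linked article's Flow Free solver; the function name astar and the grid encoding (0 for free, 1 for wall) are illustrative assumptions.

```python
import heapq


def astar(grid, start, goal):
    """Return the length of the shortest 4-connected path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates the cost on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Heap entries are (estimated total cost, cost so far, cell).
    frontier = [(heuristic(start), 0, start)]
    best_cost = {start: 0}

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale entry: a cheaper route to this cell was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None


# 0 = free cell, 1 = wall. The only gap in the second row is at column 2.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(astar(grid, (0, 0), (3, 0)))
```

The heuristic is what separates A* from plain uniform-cost search: cells that look closer to the goal are expanded first, while the cost-so-far term keeps the result optimal.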

Training

Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.
The course is broken down into three guides:

Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!

Click here to learn more
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Data Scientist - Associated Press (AP) - New York, NY

The Associated Press is the essential global news network, delivering fast, unbiased news from every corner of the world to all media platforms and formats. Founded in 1846, AP today is the largest and most trusted source of independent news and information. On any given day more than half the world's population sees news from AP.
The Associated Press seeks a Data Science Manager based in New York, NY. The Data Science Manager will help manage data analysis, data science and data engineering solutions supporting business intelligence, news search, content enrichment and metadata services. We are a small focused team within Metadata Technology working closely with various departments and functions across the organization to design and build solutions with data, analytics and machine learning methods...

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

Structural Time Series
Time series data is ubiquitous, and many methods of processing and modeling data over time have been developed. As with any data science project, there is no one method to rule them all, and the most appropriate approach ought to depend on the data in question, the goals of the modeler, and the time and resources available...In this report, we will focus on approaches that are more suited to...a single, noisy time series, with many missing points and no additional information. In particular, we will investigate structural time series, which are especially useful in cases where the time series exhibits some periodic patterns...

Indoor Mapping Data Format [Apple]
Indoor Mapping Data Format (referenced throughout this document as "IMDF") provides a generalized, yet comprehensive model for any indoor location, providing a basis for orientation, navigation and discovery. In this release there are also detailed instructions for modeling the spaces of an airport, a shopping mall, and a train station...This release also has an extension model which enables a venue, organization, or even an industry to create valid features and validations not available in the current specification for private or public use...

DIY Alexa with the ESP32 and Microphone Board
This tutorial will guide you through the process of creating your own DIY Alexa using the ESP32 and Wit.ai...

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.

P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
