Data Science Weekly Newsletter

Issue 375

January 28, 2021


Editor's Picks


Image GPT
We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and image classification accuracy, we show that our best generative model also contains features competitive with top convolutional nets in the unsupervised setting...

Using GitHub Actions for MLOps & Data Science
Machine Learning Operations (or MLOps) enables Data Scientists to work in a more collaborative fashion, by providing testing, lineage, versioning, and historical information in an automated way. Because the landscape of MLOps is nascent, data scientists are often forced to implement these tools from scratch...we [GitHub] have created a series of GitHub Actions that integrate parts of the data science and machine learning workflow with a software development workflow. Furthermore, we provide components and examples that automate common tasks...

A Council of Citizens Should Regulate Algorithms
Despite their ubiquity in society, no real structure exists to regulate algorithms' use. We rely on journalists or civil society organizations to serendipitously report when things have gone wrong. In the meantime, the use of algorithms spreads to every corner of our lives and many agencies of our government. In the post-Covid-19 world, the problem is bound to reach colossal proportions...


A Message From This Week's Sponsor


Data scientists are in demand on Vettery

Vettery is an online hiring marketplace that's changing the way people hire and get hired. Ready for a bold career move? Make a free profile, name your salary, and connect with hiring managers from top employers today.


Data Science Articles & Videos


AI Researchers propose framework to measure AI’s social and environmental impact
Researchers at the Montreal AI Ethics Institute, McGill University, Carnegie Mellon, and Microsoft propose a four-pillar framework called SECure designed to quantify the environmental and social impact of AI. Through techniques like compute-efficient machine learning, federated learning, and data sovereignty, the coauthors assert scientists and practitioners have the power to cut contributions to the carbon footprint while restoring trust in historically opaque systems...

Gated Linear Networks
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn feature representations in favor of rapid online learning...

Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization
We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles, clothing, as well as their variations and deformations can be digitized in a unified way...

SynSin: End-to-end View Synthesis from a Single Image
View synthesis allows for the generation of new views of a scene given one or more images. This is challenging; it requires comprehensively understanding the 3D scene from images. As a result, current methods typically use multiple images, train on ground-truth depth, or are limited to synthetic data. We propose a novel end-to-end model for this task using a single image at test time; it is trained on real images without any ground-truth 3D information. To this end, we introduce a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view...

Tutorial on Evolutionary Computation and Games [Video]
We - Julian Togelius, Sebastian Risi, and Georgios N. Yannakakis - give an overview of how to use evolutionary computation in and for games. This covers methods for playing games, generating game content, and modeling players. We present both classic methods and some selected very recent papers...

Evaluation of COVID-19 Models
Here we present an evaluation of models from the COVID-19 Forecast Hub. These models are submitted weekly to the CDC COVID-19 Forecasting page to help inform public health decision-making...While a model's future projections can be useful, it is also important to take into account the model's historical performance in a transparent, rigorous, and non-biased manner. This is the goal of this project...

ICML 2020: Comprehensive analysis of authors, organizations, and countries
ICML is one of the most important conferences in Machine Learning and therefore it’s interesting to see who publishes at this conference. So I looked at the accepted papers for ICML 2020 and analyzed authors, organizations, and countries that participated this year. The conference will take place virtually from 13th to 18th July in 2020...

Building an AI-Powered Searchable Video Archive
In this post, I’ll show you how to build an AI-powered, searchable video archive using machine learning and Google Cloud–no experience required...

Undress or fail: Instagram’s algorithm strong-arms users into showing skin
An exclusive investigation reveals that Instagram prioritizes photos of scantily-clad men and women, shaping the behavior of content creators and the worldview of 140 million Europeans in what remains a blind spot of EU regulations...


Survey


Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.
The course is broken down into three guides:

Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!

Click here to learn more
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!


Jobs


Data Scientist - Amazon Demand Forecasting - New York

The Amazon Demand Forecasting team seeks a Data Scientist with strong analytical and communication skills to join our team. We develop sophisticated algorithms that involve learning from large amounts of data, such as prices, promotions, similar products, and a product's attributes, in order to forecast the demand of over 190 million products world-wide. These forecasts are used to automatically order more than $200 million worth of inventory weekly, establish labor plans for tens of thousands of employees, and predict the company's financial performance. The work is complex and important to Amazon. With better forecasts we drive down supply chain costs, enabling the offer of lower prices and better in-stock selection for our customers...

Want to post a job here? Email us for details >> team@datascienceweekly.org


Training & Resources


Machine Learning Reddit Advanced Courses List

We have a PhD-level or Advanced courses thread in the sidebar, but it's three years old now. There were two other 7-8 month old threads (1, 2), but they don't have many quality responses either...So, can we have a new one here?...To reiterate - CS231n, CS229, ones from Udemy etc. are not advanced...Advanced ML/DL/RL, attempts at building a theory of DL, optimization theory, advanced applications etc. are some examples of what I believe should belong here, much like the original sidebar post...

TensorFlow, Keras and deep learning, without a PhD

In this codelab, you will learn how to build and train a neural network that recognises handwritten digits [Using TensorFlow 2.2]. Along the way, as you enhance your neural network to achieve 99% accuracy, you will also discover the tools of the trade that deep learning professionals use to train their models efficiently...

Getting started in NLP/ML research

Over the years, a number of friends have asked me about how they can get started in NLP and ML research. While no expert in CS curriculum design, I made this transition myself coming out of college (I majored in ECE, with focus in computer hardware and high power energy systems). This guide is a summary of how to get started in NLP/ML research. Following this guide will not make you an expert — that would require a formal education and years of practice. Rather, this aims to help you acquire enough experience to quickly pursue work in this area (e.g. working with a research lab)...


Books


Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
"A book that tries to cover multiple databases is a risky endeavor; a book that also provides hands-on work with each is even riskier, but if implemented well it leads to a great package. I loved the specific exercises the authors covered. A must read for all big data architects who don’t shy away from coding..."...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.

P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them on board :)

All the best,
Hannah & Sebastian
