Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
December 24, 2020

Editor's Picks

  • Can AI Become Conscious?
    At the Allen Institute for Brain Science in Seattle, a large-scale effort is underway to understand how the 86 billion neurons in the human brain are connected. The aim is to produce a map of all the connections: the connectome. Scientists at the Institute are now reconstructing one cubic millimeter of a mouse brain, the most complex ever reconstructed. Mapping how exactly the brain is wired will help us to understand how healthy brains function, and what goes wrong in diseased brains...

A Message From This Week's Sponsor

State of Machine Learning - Survey

What is the state of machine learning across different industries? Reply to the questionnaire to participate in this joint effort and see the data in real time after giving your answers.
The questionnaire finds out what the respondents are trying to achieve in the near future and what their biggest obstacles are today. The organizer of the questionnaire is Valohai, the MLOps platform, and the results are open to everyone.

Data Science Articles & Videos

  • Full Autopilot in GTA Using TensorFlow
    First off, let me give you an introduction to openpilot: openpilot is an open source self-driving car software developed by...To explain the whole flow all together: image -> Ubuntu laptop -> predictions with the model -> converting all the long and lateral control output -> sending it over my local network with zmq to my gaming PC -> gaming PC is emulating the Xbox controller inputs -> driving in GTA!...
  • ALMa: Active Learning (Data) Manager
    Active Learning is a popular technique to reduce annotation costs by using AI to decide what to label next. The subtle bookkeeping involved in keeping track of what has been labeled is tedious and error-prone. We need to train our learner on Labeled data, but sample new examples to label from Unlabeled data. As we label data, it moves around between the two subsets of our Dataset, and managing the bookkeeping is a chore that should be abstracted away...Today we're happy to open-source ALMa, the Active Learning Manager that abstracts away the bookkeeping. Whereas most implementations modify the data array in place, ALMa maintains views of Labeled and Unlabeled subsets of the original Dataset...
  • "This Word Does Not Exist" Project
    I've been working on "This Word Does Not Exist". In it, I "learned the dictionary" and trained a GPT-2 language model over the Oxford English Dictionary. Sampling from it, you get realistic-sounding words with fake definitions and example usage...On the website, I've also made it so you can prime the algorithm with a word, and force it to come up with an example...The project allows people to train a variant of GPT-2 that makes up words, definitions and examples from scratch...
  • How to read deep learning papers? [Reddit Discussion]
    I've been reading some deep learning papers and it seems like a lot of the choice of architecture is wishy-washy stuff that we just have to "accept" for some reason. I know that explaining a deep network is almost impossible, but that makes it really difficult to decide whether a given network outperforms another, especially when considering artificial data sets...In short, everything seems like it's a question of throwing more layers at the problem until you reach some sort of solution that you can claim is "better" than others', except since nothing is standardized, even for a particular type of problem, it can result in widely varying results and reproducibility...How do you manage this?...
  • A Commit History of BERT and its Forks
    I recently came across an interesting thread on Twitter discussing a hypothetical scenario where research papers are published on GitHub and subsequent papers are diffs over the original paper. Information overload has been a real problem in ML with so many new papers coming every month...This post is a fun experiment showcasing what the commit history could look like for the BERT paper and some of its subsequent variants...
  • Farewell, TensorFlow
    Friday was my last day working on TensorFlow at Google. The past five years were a lot of fun, and I feel incredibly lucky to have been in the war room at 6 AM when we launched the project into the world. Since then, it’s been inspiring to see how people have used TensorFlow, from all the folks asking questions on Stack Overflow, through the machine learning and systems research that built on our work, to the huge software ecosystem that has grown up around it...Before I move on, I wanted to reminisce about what I enjoyed most about working at Google...
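
A note on the ALMa item above: the idea of keeping Labeled and Unlabeled views over one untouched Dataset can be sketched in a few lines. This is only an illustration of the bookkeeping pattern, not ALMa's actual API; the class name LabelBookkeeper and all of its methods are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple


class LabelBookkeeper:
    """Hypothetical helper (not ALMa's API): tracks which items are labeled
    by index, so the original data is never copied around or modified."""

    def __init__(self, data: Sequence[Any]) -> None:
        self.data = data
        self.labels: Dict[int, Any] = {}  # index into data -> label

    def label(self, index: int, label: Any) -> None:
        """Record a label; the item moves from the unlabeled to the labeled view."""
        if not 0 <= index < len(self.data):
            raise IndexError(f"no item at index {index}")
        self.labels[index] = label

    def unlabeled_indices(self) -> List[int]:
        """Indices that can still be sampled for annotation."""
        return [i for i in range(len(self.data)) if i not in self.labels]

    def unlabeled_view(self) -> List[Any]:
        """Items without a label, in original order."""
        return [self.data[i] for i in self.unlabeled_indices()]

    def labeled_view(self) -> List[Tuple[Any, Any]]:
        """(item, label) pairs to train the learner on, in original order."""
        return [(self.data[i], self.labels[i]) for i in sorted(self.labels)]


# Usage: label one example and read both views back.
pool = LabelBookkeeper(["good movie", "bad movie", "fine movie"])
pool.label(1, "negative")
print(pool.labeled_view())
print(pool.unlabeled_view())
```

Because both views are derived from a single index-to-label mapping, an item cannot end up in both subsets or in neither.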


Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.
The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!


  • Data Scientist - Amazon Demand Forecasting - New York

    The Amazon Demand Forecasting team seeks a Data Scientist with strong analytical and communication skills to join our team. We develop sophisticated algorithms that involve learning from large amounts of data, such as prices, promotions, similar products, and a product's attributes, in order to forecast the demand of over 190 million products world-wide. These forecasts are used to automatically order more than $200 million worth of inventory weekly, establish labor plans for tens of thousands of employees, and predict the company's financial performance. The work is complex and important to Amazon. With better forecasts we drive down supply chain costs, enabling the offer of lower prices and better in-stock selection for our customers...

        Want to post a job here? Email us for details >>

Training & Resources

  • A 2020 Vision of Linear Algebra
    These six brief videos, recorded in 2020, contain ideas and suggestions from Professor Strang about the recommended order of topics in teaching and learning linear algebra. The first topic is called A New Way to Start Linear Algebra. The key point is to start right in with the columns of a matrix A and the multiplication Ax that combines those columns...
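
The column view of Ax mentioned above can be made concrete with a small sketch: the product is built by scaling each column of A by the matching entry of x and adding the results. The function name matvec_by_columns and the example numbers are made up for illustration and do not come from the videos.

```python
from typing import List


def matvec_by_columns(A: List[List[float]], x: List[float]) -> List[float]:
    """Compute Ax as x[0]*(column 0 of A) + x[1]*(column 1 of A) + ..."""
    n_rows = len(A)
    result = [0.0] * n_rows
    for j, xj in enumerate(x):
        # Add xj times column j of A to the running combination.
        for i in range(n_rows):
            result[i] += xj * A[i][j]
    return result


# A has columns (1, 2, 3) and (4, 5, 6); x = (2, -1).
A = [[1, 4],
     [2, 5],
     [3, 6]]
x = [2, -1]

# Ax = 2*(1, 2, 3) - 1*(4, 5, 6) = (-2, -1, 0)
b = matvec_by_columns(A, x)
print(b)
```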


40% off at Manning

Do more with your data!
If you're looking to make your data skills stand out, then be sure to check out Manning's range of books and video courses. They're offering 40% off everything in their catalog, so there's no better time to learn something new...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
  P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them on board :)

All the best,
Hannah & Sebastian
