Data Science Weekly Newsletter - Issue 370

Issue #370

Dec 24 2020

Editor Picks
  • NeRF Explosion 2020
    Besides the COVID-19 pandemic and political upheaval in the US, 2020 was also the year in which neural volume rendering exploded onto the scene, triggered by the impressive NeRF paper by Mildenhall et al. This blog post is my way of getting up to speed in a fascinating and very young field and sharing my journey with you...
  • Taking Questions from the Late Justice Ginsburg: Fine-Tuning Billion+ Parameter Transformers Using Model Parallelism
    We’ll never know what Justice Ginsburg might have asked had she completed the current term of the Court. However, in memory of the Justice, we can use her words from decades of oral arguments to get a sense of some of the questions she might have posed. In this example, we will create a persona-based dialogue model of Justice Ginsburg using the largest versions of gpt2 and t5, fine-tuning models of 1.5 billion and 11 billion parameters respectively in just a few hours using model parallelism...

A Message from this week's Sponsor:


Data scientists are in demand on Vettery

Vettery is an online platform that connects you with thousands of actively hiring startups and Fortune 500 companies. Create a free profile, name your salary, and get discovered by hiring managers looking to grow their teams.

Get started - it’s completely free for job-seekers!


Data Science Articles & Videos

  • DeepMind's AI agent MuZero could turbocharge YouTube
    DeepMind's latest AI program can attain "superhuman performance" in tasks without needing to be given the rules. Unlike its predecessors, it had to work out the rules of each task for itself. It is already being put to practical use to find a new way to encode videos, which could slash YouTube's costs...
  • Homemade Machine Learning
    The purpose of this repository is not to implement machine learning algorithms by using 3rd party library one-liners but rather to practice implementing these algorithms from scratch and get a better understanding of the mathematics behind each algorithm. That's why all algorithm implementations are called "homemade" and are not intended to be used for production...
  • Optimization is as hard as approximation
    You asked for an efficient algorithm for non-convex optimization for Christmas? It won’t be possible unless you have a lot of smoothness. See why in this month's blog post...
  • AlphaFold 2 & Equivariance
    A few weeks ago, in the latest CASP competition for protein structure prediction (CASP14), DeepMind’s AlphaFold 2 outperformed all its competitors by an unprecedented margin. In this blog post, we aim to shed light on one of the important building blocks that distinguishes AlphaFold 2 from the other approaches and likely contributed to its success: an equivariant structure prediction module...
  • Extracting Training Data from Large Language Models
    It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model. We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data...



Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.

The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more ...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!



  • Data Scientist - Apple Pay Analytics - NYC

    You will play a key role in improving the Apple Pay product experience. As a member of the analytics team you will be supporting a product function. You will partner with business owners, understand goals, craft KPIs and measure ongoing performance. You will initially engage with the product and engineering teams in ensuring that we have the appropriate instrumentation in place to deliver on these metrics. You will subsequently use advanced statistical, ML and analytical techniques to analyze product performance and identify key insights that inform product improvements and business strategy. The role requires a high degree of independence, ownership and collaboration, working cross-functionally across all levels of a highly matrixed organization...

        Want to post a job here? Email us for details >>


Training & Resources




  • Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

    Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

    P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian