Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
December 30, 2021

Editor's Picks

  • Graph ML in 2022: Where Are We Now?
    It’s been quite a year for Graph ML — thousands of papers, numerous conferences and workshops… How do we catch up with so many cool things happening around? Well, we are puzzled as well and decided to present a structured look at Graph ML highlighting trends and major advancements...
  • Papers with Code 2021 : A Year in Review
    Papers with Code indexes various machine learning artifacts — papers, code, results — to facilitate discovery and comparison. Using this data we can get a sense of what the ML community found useful and interesting this year. Below we summarize the top trending papers, libraries and datasets for 2021 on Papers with Code...

A Message From This Week's Sponsor

High-quality data labeling, consistently
Edge cases are the most common challenges that ML teams face when training their AI models, making it difficult to reach 95+% accuracy. This becomes more complex once you need to scale and start working with 3rd party data labeling solutions. The evaluation metrics that we use to measure the quality of labeled data - Intersection over Union (IoU) and F1 score - have allowed us to make swift adjustments on the go and continuously improve the quality of our labeling standards. To find out more and start exploring our end-to-end data labeling service, speak to the team at Supahands today.

Data Science Articles & Videos

  • The Statistical Complexity of Interactive Decision Making
    A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret...The main result of this work provides a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning...
  • Visual Tutorial NER Chunking Token Classification
    Why is there so much talk about Named Entity Recognition in a competition that requires us to detect and classify spans of texts? It looks like most of the top public notebooks used a NER approach to solve the Feedback Prize challenge. In this tutorial, I’d like to provide a beginner-friendly, visual explanation to this approach...
  • tree-math: mathematical operations for JAX pytrees
    tree-math makes it easy to implement numerical algorithms that work on JAX pytrees, such as iterative methods for optimization and equation solving. It does so by providing a wrapper class tree_math.Vector that defines array operations such as infix arithmetic and dot-products on pytrees as if they were vectors...
  • Bits Of Deep Learning Podcast Episode #1 with Eric Jang
    A conversation with Eric Jang on the Present and Future of Robotics...a) Introduction, b) Eric's Background, c) The “Just Ask For Generalization” recipe, d) Meta-Learning vs Supervised Learning in Robotics, e) The compute concern around GPT like models, f) How to scale to real-world robots? The Importance of simulators, g) End-to-end in robotics. Pros and Cons, h) How and who will solve robotics?, i) Self-supervised learning: What is it and why is it useful., j) Biggest obstacles in robotics?, k) General-purpose robots and Tesla Bot, and l) Conclusion...
  • AI Accelerators — Part I: Intro
    I will give a high-level overview of accelerators for artificial intelligence applications — what they are, and how they became so popular. As discussed in later posts, accelerators stem from a broader concept rather than just a particular type of system or implementation. They are also not purely hardware-driven, and in fact — much of the AI accelerator industry’s focus has been around building robust and sophisticated software libraries and compiler toolchains...
  • Generative Modeling by Estimating Gradients of the Data Distribution
    This blog post focuses on a promising new direction for generative modeling. We can learn score functions (gradients of log probability density functions) on a large number of noise-perturbed data distributions, then generate samples with Langevin-type sampling. The resulting generative models, often called score-based generative models, have several important advantages over existing model families: GAN-level sample quality without adversarial training, flexible model architectures, exact log-likelihood computation, and inverse problem solving without re-training models. In this blog post, we will show you in more detail the intuition, basic concepts, and potential applications of score-based generative models...
  • Engineering Trade-Offs in Automatic Differentiation: from TensorFlow and PyTorch to Jax and Julia
    To understand the differences between automatic differentiation libraries, let's talk about the engineering trade-offs that were made. I would personally say that none of these libraries are "better" than another, they simply all make engineering trade-offs based on the domains and use cases they were aiming to satisfy. The easiest way to describe these trade-offs is to follow the evolution and see how each new library tweaked the trade-offs made of the previous...
  • CPPE-5: Medical Personal Protective Equipment Dataset
    We present a new challenging dataset, CPPE-5 (Medical Personal Protective Equipment), with the goal to allow the study of subordinate categorization of medical personal protective equipments, which is not possible with other popular data sets that focus on broad level categories (such as PASCAL VOC, ImageNet, Microsoft COCO, OpenImages, etc). To make it easy for models trained on this dataset to be used in practical scenarios in complex scenes, our dataset mainly contains images that show complex scenes with several objects in each scene in their natural context...


Free Course: Natural Language Processing (NLP) for Semantic Search
Learn how to build semantic search applications by making machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more. Brought to you by Pinecone. Start reading now.

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!


Training & Resources


P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)

All the best,
Hannah & Sebastian
