Data Science Weekly Newsletter
November 18, 2021

Editor's Picks

  • AI helps scientists spy on chimp behavior in the wild
    Chimpanzees in West Africa have a clever trick to get at the tasty kernels inside oil palm nuts. They carefully select a flat rock to act as an anvil and place a nut on top. Then, using another stone as a hammer, they pound away until the nut’s hard exterior cracks with a crunch...Until now, scientists eager to learn more about this tool use could spend weeks combing through hours of raw footage to find the relevant recordings. But a new AI system out today can do the grunt work for them, automatically finding and identifying the right clips in footage captured from the wild...

A Message From This Week's Sponsor

Kickstart Your New Career with a Data Science & Analytics Bootcamp

Bootcamps are starting soon! Don’t miss your chance to join a Data Scientist-led, online Metis bootcamp with career support until you’re hired. Ready to take your data science or analytics career to the next level? Learn more about the Metis Online Data Science & Analytics Bootcamps.

Data Science Articles & Videos

  • Dynamic Pricing Competition
    The Dynamic Pricing Competition is a Reinforcement Learning challenge that brings together people from academia and industry to compete with smart algorithms. Are you ready to outsmart the competition? Join now and win real prize money! The competition starts again in November and runs until the end of December. We've set up three different challenges, and the winner of each challenge receives a cash prize...
  • Growing a Career in NLP with Primer’s Amy Heineike
    How do you build a career in one of the most promising corners of tech?...Amy Heineike’s answer might not be what you’d expect. At this point in her career, she’s VP of Engineering (EMEA) and one of the original team members at Primer, a company at the forefront of the NLP revolution. But she didn’t exactly take a direct path to get there...In our first episode of Technical Women, Amy talks to hosts Natalie Vais and Renee Shah about how her winding path led her to build completely new language processing models...
  • Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces
    Highly realistic human faces generated by generative adversarial networks (GANs) have been used as profile images for fake social media accounts and are visually challenging to discern from real ones. In this work, we show that GAN-generated faces can be exposed via irregular pupil shapes. This phenomenon is caused by the lack of physiological constraints in the GAN models. We demonstrate that such artifacts exist widely in high-quality GAN-generated faces and further describe an automatic method to extract the pupils from two eyes and analyze their shapes for exposing the GAN-generated faces...
  • Interview: Open-Source Analytical Computing (pandas, Apache Arrow) — with Wes McKinney
    Wes McKinney joins us to discuss the history and philosophy of pandas and Apache Arrow as well as his continued work in open source tools...In this episode you will learn: • History of pandas [5:18] • The trends of R and Python [21:20] • Python for Data Analysis [23:43] • pandas updates and community [27:53] • Apache Arrow [39:38] • Voltron Data [53:03] • Origin of Wes’s project names [1:05:56] • Wes’s favorite tools [1:07:30] • Audience Q&A [1:13:18]...
  • Survey of Deep Learning Methods for Inverse Problems
    In this paper we investigate a variety of deep learning strategies for solving inverse problems. We classify existing deep learning solutions for inverse problems into three categories of Direct Mapping, Data Consistency Optimizer, and Deep Regularizer. We choose a sample of each inverse problem type, so as to compare the robustness of the three categories, and report a statistical analysis of their differences. We perform extensive experiments on the classic problem of linear regression and three well-known inverse problems in computer vision, namely image denoising, 3D human face inverse rendering, and object tracking, selected as representative prototypes for each class of inverse problems...
  • The EU and the US: two different approaches to AI governance
    In this new blog, we compare the EU and US approaches to AI governance and consider the implications for future collaboration...In our recent paper, published in Science and Engineering Ethics, we take a deep dive into the AI governance approaches of the EU and US, considering the progress that each has made and assessing the likelihood of transatlantic cooperation in the future. Perhaps unsurprisingly, we find that the EU and US have been taking highly divergent strategies to maximise the opportunities and minimise the risks of AI...
  • MLOps Anti-Patterns
    The Data Exchange Podcast: Nikhil Muralidhar on lessons learned from developing and deploying machine learning models at scale...He is the lead author of an excellent survey paper entitled “Using AntiPatterns to avoid MLOps Mistakes”. Nikhil and his co-authors provide a vocabulary of anti-patterns encountered in ML pipelines, with a focus on the financial services industry. In addition, they make several recommendations for documenting and managing MLOps at an enterprise scale...


Create AI-powered search and recommendation apps with Pinecone

Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines state-of-the-art vector search libraries, advanced features such as filtering, and distributed infrastructure to provide high performance and reliability at any scale. Get started now, it's free!
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!


Training & Resources

  • PCA: Beyond Dimensionality Reduction
    Many beginner data scientists first encounter PCA as a tool for dimensionality reduction: when we have a wide dataset with many variables, we can use PCA to transform the data into as many components as we want, reducing it before making predictions...That is true, and it is a good technique. But in this post I want to show you another good use of PCA: checking how the features vary together...
  • Complete Machine Learning pipeline for NLP tasks
    This article has everything one needs to convert a Proof of Concept into a full-blown ML product, including a reference implementation of a simplified pipeline as well as hints on what to do next. The particular problem the system solves can be described as follows: extracting the names of companies from incoming emails and recording them. This may sound like a superficial problem, but it should work to demonstrate how an ML system can be put into production...
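
For readers curious about the PCA item above, here is a minimal sketch of what "checking how the features vary together" can look like. It is not taken from the linked post: the synthetic features `a`, `b`, `c` and the helper `pca_loadings` are assumptions made up for illustration, and PCA is computed with a plain NumPy SVD rather than any particular library's PCA class.

```python
import numpy as np

def pca_loadings(X):
    """Return (explained_variance_ratio, loadings) for the columns of X.

    Hypothetical helper for illustration only.
    """
    # Standardize so each feature contributes on the same scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return explained, Vt

# Synthetic data (an assumption): b closely tracks a, c is independent.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)
c = rng.normal(size=n)
X = np.column_stack([a, b, c])

explained, loadings = pca_loadings(X)
for i, (ratio, row) in enumerate(zip(explained, loadings), start=1):
    print(f"PC{i}: {ratio:.2f} of variance, loadings {np.round(row, 2)}")
```

Features with large loadings of the same sign on one component tend to move together. In this sketch the first component should load heavily on `a` and `b` and only lightly on `c`.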


P.S., Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
