Data Science Weekly Newsletter

Issue 417

November 18, 2021

Editor's Picks

To Be Energy-Efficient, Brains Predict Their Perceptions
Results from neural networks support the idea that brains are “prediction machines” — and that they work that way to conserve energy...

Scientific Visualization: Python + Matplotlib
An open-access book on scientific visualization using Python and Matplotlib...

AI helps scientists spy on chimp behavior in the wild
Chimpanzees in West Africa have a clever trick to get at the tasty kernels inside oil palm nuts. They carefully select a flat rock to act as an anvil and place a nut on top. Then, using another stone as a hammer, they pound away until the nut’s hard exterior cracks with a crunch...Until now, scientists eager to learn more about this tool use could spend weeks combing through hours of raw footage to find the relevant recordings. But a new AI system out today can do the grunt work for them, automatically finding and identifying the right clips in footage captured from the wild...

A Message from this week's Sponsor:

Kickstart Your New Career with a Data Science & Analytics Bootcamp
Don’t miss your chance to join a Data Scientist-led, online Metis bootcamp, plus get career support until you’re hired. Bootcamps are starting soon! Ready to take your data science or analytics career to the next level? Learn more about the Metis Online Data Science & Analytics Bootcamps.

Data Science Articles & Videos

Dynamic Pricing Competition
The Dynamic Pricing Competition is a Reinforcement Learning challenge that brings together people from academia and industry to compete with smart algorithms. Are you ready to outsmart the competition? Join now and win real prize money! The competition starts again in November and runs until the end of December. We've set up three different challenges, and the winner of each challenge receives a cash prize...

Growing a Career in NLP with Primer’s Amy Heineike
How do you build a career in one of the most promising corners of tech?...Amy Heineike’s answer might not be what you’d expect. At this point in her career, she’s VP of Engineering (EMEA) and one of the original team members at Primer, a company at the forefront of the NLP revolution. But she didn’t exactly take a direct path to get there...In our first episode of Technical Women, Amy talks to hosts Natalie Vais and Renee Shah about how her winding path led her to build completely new language processing models...

Clarity and Aesthetics in Visualization: Attention, Contrast, Grouping
Three concepts from visual perception that can help you think better about clarity and aesthetics...

Eyes Tell All: Irregular Pupil Shapes Reveal GAN-generated Faces
Highly realistic human faces generated by generative adversarial networks (GANs) have been used as profile images for fake social media accounts and are visually challenging to discern from real ones. In this work, we show that GAN-generated faces can be exposed via irregular pupil shapes. This phenomenon is caused by the lack of physiological constraints in the GAN models. We demonstrate that such artifacts exist widely in high-quality GAN-generated faces and further describe an automatic method to extract the pupils from both eyes and analyze their shapes to expose GAN-generated faces...

“The power to surveil, control, and punish”: The dystopian danger of a mandatory biometric database in Mexico
Digital rights activist Luis Fernando García on the weaponization of personal data and why international agencies are pushing countries in the Global South to collect their citizens’ information...

Interview: Open-Source Analytical Computing (pandas, Apache Arrow) — with Wes McKinney
Wes McKinney joins us to discuss the history and philosophy of pandas and Apache Arrow as well as his continued work in open source tools...In this episode you will learn: • History of pandas [5:18] • The trends of R and Python [21:20] • Python for Data Analysis [23:43] • pandas updates and community [27:53] • Apache Arrow [39:38] • Voltron Data [53:03] • Origin of Wes’s project names [1:05:56] • Wes’s favorite tools [1:07:30] • Audience Q&A [1:13:18]...

Survey of Deep Learning Methods for Inverse Problems
In this paper we investigate a variety of deep learning strategies for solving inverse problems. We classify existing deep learning solutions for inverse problems into three categories of Direct Mapping, Data Consistency Optimizer, and Deep Regularizer. We choose a sample of each inverse problem type, so as to compare the robustness of the three categories, and report a statistical analysis of their differences. We perform extensive experiments on the classic problem of linear regression and three well-known inverse problems in computer vision, namely image denoising, 3D human face inverse rendering, and object tracking, selected as representative prototypes for each class of inverse problems...

The EU and the US: two different approaches to AI governance
In this new blog, we compare the EU and US approaches to AI governance and consider the implications for future collaboration...In our recent paper, published in Science and Engineering Ethics, we take a deep dive into the AI governance approaches of the EU and US, considering the progress that each has made and assessing the likelihood of transatlantic cooperation in the future. Perhaps unsurprisingly, we find that the EU and US have been taking highly divergent strategies to maximise the opportunities and minimise the risks of AI...

Graph Neural Networks through the lens of Differential Geometry and Algebraic Topology
Differential geometry and algebraic topology are not encountered very frequently in mainstream machine learning. In this series of posts, I show how tools from these fields can be used to reinterpret Graph Neural Networks and address some of their common plights in a principled way...

MLOps Anti-Patterns
The Data Exchange Podcast: Nikhil Muralidhar on lessons learned from developing and deploying machine learning models at scale...He is the lead author of an excellent survey paper entitled “Using AntiPatterns to avoid MLOps Mistakes”. Nikhil and his co-authors provide a vocabulary of anti-patterns encountered in ML pipelines, with a focus on the financial services industry. In addition, they make several recommendations for documenting and managing MLOps at an enterprise scale...


Tools

Create AI-powered search and recommendation apps with Pinecone
Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines state-of-the-art vector search libraries, advanced features such as filtering, and distributed infrastructure to provide high performance and reliability at any scale. Get started now - it's free!
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Entry Level Data Scientist: 2022 - IBM - Multiple Locations
As a Data Scientist at IBM, you will help transform our clients’ data into tangible business value by analyzing information, communicating outcomes and collaborating on product development. Work with Best in Class open source and visual tools, along with the most flexible and scalable deployment options. Whether it’s investigating patient trends or weather patterns, you will work to solve real world problems for the industries transforming how we live.

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

PCA: Beyond Dimensionality Reduction
Many beginner data scientists first meet PCA as a tool for dimensionality reduction: when we have a wide dataset with many variables, we can use PCA to transform the data into as many components as we want, reducing it before making predictions...That is true, and a good technique. But in this post I want to show you another good use of PCA: checking how the features vary together...
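
For readers who want to try the idea, here is a minimal sketch, not taken from the linked post. It assumes synthetic data with three made-up features (x, y that tracks x, and an unrelated z) and finds the first principal component by power iteration on the correlation matrix, using only the standard library. In practice a library such as scikit-learn would do this work; the function and variable names below are hypothetical.

```python
import math
import random


def first_principal_component(rows, n_iter=200):
    """Return (loadings, explained_ratio) for the first PC of standardized data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(d)
    ]
    # Standardize each feature so the loadings are comparable across features.
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    # Correlation matrix of the standardized features.
    corr = [
        [sum(row[a] * row[b] for row in z) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeated multiplication converges to the dominant eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the correlation matrix has trace d.
    eigenvalue = sum(
        v[a] * sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    return v, eigenvalue / d


# Synthetic data (an assumption for illustration): y follows x, z is independent.
rng = random.Random(0)
data = []
for _ in range(2000):
    x = rng.gauss(0, 1)
    y = 2 * x + rng.gauss(0, 0.5)
    z_feature = rng.gauss(0, 1)
    data.append([x, y, z_feature])

loadings, ratio = first_principal_component(data)
print("loadings (x, y, z):", [round(v, 3) for v in loadings])
print("share of variance explained:", round(ratio, 3))
```

Features with large loadings of the same sign on a component move together. Here x and y should dominate the first component while z contributes little.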

Monte Carlo Methods or Why it's a Bad Idea to Go to the Casino
A Monte Carlo method is a fairly simple way to get an answer to a task without having to analyze it mathematically. The solution is to simulate the task and see what happens. And this is best done with a computer program...
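
As a rough illustration of "simulate the task and see what happens", the sketch below is not from the linked article. It assumes a simple even-money bet on red in European roulette (37 pockets, 18 of them winning) and estimates the average profit per spin by simulation. The function name and the pocket labeling are hypothetical.

```python
import random


def simulate_red_bet(n_spins, bet=1.0, seed=0):
    """Estimate the average profit per spin of always betting on red."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_spins):
        pocket = rng.randrange(37)  # pockets 0..36, each equally likely
        # Assumption: pockets 1..18 stand in for the 18 red pockets.
        # Real wheels color them differently, but the win probability is the same.
        total += bet if 1 <= pocket <= 18 else -bet
    return total / n_spins


estimate = simulate_red_bet(200_000)
exact = (18 - 19) / 37  # 18 winning pockets, 19 losing ones
print("simulated average profit per unit bet:", round(estimate, 4))
print("exact expected profit per unit bet:", round(exact, 4))
```

The simulated average should settle close to the exact value of -1/37, a small but steady loss on every spin, which is the point of the article's title.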

Complete Machine Learning pipeline for NLP tasks
This article has everything one needs to convert a Proof of Concept to a full-blown ML product, including a reference implementation of a simplified pipeline as well as hints on what to do next. The particular problem the system solves can be described as follows: extracting names of companies from incoming emails and recording them. This may sound like a superficial problem, but it should work to demonstrate how an ML system can be put into production...

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page...

P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :)

All the best,
Hannah & Sebastian