Data Science Weekly Newsletter

Issue 389

May 6, 2021

Editor's Picks

Four communication techniques for solving technical problems
All data and engineering teams face a constant inflow of organizational, technical, and interpersonal problems, and the ability of your team to have business impact will depend largely on how effectively it can move towards optimal solutions to those problems. In this article, I discuss four communication techniques that improve the ability of a team to solve problems...

C3.ai COVID-19 Grand Challenge
On Sept. 15, C3.ai launched the C3.ai COVID-19 Grand Challenge, an international public data science competition calling for projects that help combat the COVID-19 pandemic. C3.ai will be awarding $200,000 in prize money to the seven top teams and will be publishing the winning solutions. This competition is an exciting opportunity to have an impact on the local, national, and even global community...

Introducing Bean Machine - A Declarative Probabilistic Programming Language
We are all used to two basic kinds of programming: produce an effect and compute a result. The important thing to understand is that Bean Machine is firmly in the “compute a result” camp. In our PPL (Probabilistic Programming Language) the goal of the programmer is to declaratively describe a model of how the world works, then input some observations of the real world in the context of the model, and have the program produce posterior distributions of what the real world is probably like, given those observations. It is a language for writing statistical model simulations...


A Message From This Week's Sponsor

Ray Summit: Learn about the latest trends in ML & Computing

What do Ant Financial, AWS, JP Morgan & Facebook have in common? They all use Ray to scale their machine learning because of its flexibility & scalability. Don’t miss a chance to hear from experts like Michael Jordan, Oriol Vinyals, and Ion Stoica on the latest trends in computing and why Ray is becoming the dominant framework for scaling ML & AI applications at next week’s Ray Summit.
Register now to join the livestream or watch the sessions on-demand. It's free.

Data Science Articles & Videos

Dynabench: Rethinking the way we benchmark AI
Benchmarks — from MNIST to ImageNet to GLUE — have played a hugely important role in driving progress in AI research...However, benchmarks have been saturating faster and faster...While it took the research community about 18 years to achieve human-level performance on MNIST and about six years to surpass humans on ImageNet, it took only about a year to beat humans on the GLUE benchmark for language understanding...introducing a novel platform called Dynabench, which puts humans and state-of-the-art AI models “in the loop” together and measures how often models make mistakes when humans attempt to fool them. And by adapting to a model’s responses, Dynabench can challenge it in ways that a static test can’t...

Ethical Machine Learning in Health Care
The use of machine learning (ML) in health care raises numerous ethical concerns, especially as models can amplify existing health inequities. Here, we outline ethical considerations for equitable ML in the advancement of health care. Specifically, we frame ethics of ML in health care through the lens of social justice. We describe ongoing efforts and outline challenges in a proposed pipeline of ethical ML in health, ranging from problem selection to post-deployment considerations. We close by summarizing recommendations to address these challenges...

A Pragmatic Approach to Live Collaboration
At Hex, we're all about making data workflows more collaborative. Our product allows users to connect to data, build analyses with Python and SQL, and turn them into interactive apps anyone can use...The backing "Logic View" of a Hex project is powered by a notebook-style interface, similar in spirit to products like Mathematica or Jupyter. From early on, we wanted to support live multi-user editing in this Logic View so users can review or assist each other with their work...Our team evaluated several options, and wound up pursuing a pragmatic approach which we were able to implement for our entire application in less than six weeks. We are excited to share some details for others who might be thinking through similar decisions...

Using machine learning to modernize medical triage and monitoring systems
In this episode of the Data Exchange I speak with Kira Radinsky, Chairwoman & Chief Technology Officer at Diagnostic Robotics, a startup using AI to build a medical-grade triage and clinical-predictions platform. She is also a visiting Professor at Technion – Israel Institute of Technology. Kira has extensive experience using data science and machine learning in a variety of settings, and she was one of the pioneers in using alternative data sources to augment forecasting models...

Outer Join - Remote jobs in data science
Outer Join is a new job board for remote work in data science, analytics, and engineering. Job seekers can easily filter remote jobs and teams by role and sign up for email digests to be notified regularly of new openings in the industry. To kick off its launch, Outer Join is offering free job listings for a limited time to employers hiring for remote roles in data science. There is still much to come, so feedback from job seekers and employers alike is requested to help shape the future of the platform...

Mask On — A Push for Social Media Change
The goal of my project was to build a mask detection classifier that can be integrated with Instagram to encourage mask-wearing habits and trends...I will break down my process of collecting the datasets and creating a neural network to classify the images. Later on, I will show a demo of a potential integration idea with Instagram using my model classifier...

Introducing TensorFlow Recommenders
We have a recommendation for that!...Introducing TensorFlow Recommenders, an open-source package that makes building, evaluating, and serving recommender models easy. Find recommendations for movies, restaurants, and much more!...

Recommender Systems are a Joke - Unsupervised Learning with Stand-Up Comedy
My main objective for this project was to develop a Flask application that provided more nuanced recommendations of comedy specials than what’s currently available on mainstream streaming platforms...Luckily, I was able to quickly find a website (Scraps from the Loft) with several hundred comedy transcripts...

The Essential Landscape of Enterprise AI Companies
Plenty of enterprise companies use combinations of automated data science, machine learning, and modern deep learning approaches for tasks like data preparation, predictive analytics, and process automation. Many are well-established players with deep domain expertise and product functionality. Others are hot new startups applying artificial intelligence to new problems. We cover a mix of both...To help you identify the best tools for your business, we’ve shortlisted the most promising companies below based on research papers, case studies, customer testimonials, and industry assessments...


Training

Join FourthBrain’s Machine Learning Engineer Program backed by Andrew Ng’s AI Fund

FourthBrain is a part-time, online program to train Machine Learning Engineers. Designed for students with software skills who want to transition to machine learning, our program combines the best in flexibility and accountability with weekly live sessions led by an instructor. You'll learn with a project-based curriculum, and will focus on the skills that are in demand by employers, including core ML and DL concepts, and how to scale and deploy production-ready models. Submit your 10-minute application by September 28 for our next cohort.
*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!

Jobs

Data Scientist - JetBlue - Long Island, NY

The Data Scientist applies machine learning and statistical techniques to help solve JetBlue’s most complex commercial and operational challenges. The Data Scientist will be responsible for exploring and creating compelling visualizations of new datasets, identifying key features and engineering new ones to be used in modeling, and discovering the modeling approaches that deliver the best results based on appropriate evaluation metrics...

Want to post a job here? Email us for details >> team@datascienceweekly.org

Training & Resources

Neural Architecture Search
Neural Architecture Search (NAS) automates network architecture engineering. It aims to learn a network topology that can achieve best performance on a certain task. By dissecting the methods for NAS into three components: search space, search algorithm and child model evolution strategy, this post reviews many interesting ideas for better, faster and more cost-efficient automatic neural architecture search...

GANs for Good - A GANs Specialization Virtual Expert Panel
[Free Event: 30 Sep 2020]
An online event hosted by DeepLearning.AI featuring distinguished GAN experts sharing their thoughts on current GAN trends and applications...

The True Impact of Baselines in Policy Gradient Methods
I have been working with policy gradient (PG) methods for quite some time now and I thought I should continue sharing our findings here. In June, we put out a paper on how we could see PG methods from an operator's perspective. I even wrote a blog post on this. Today I’m going to talk about a paper we put out last month about the role of baselines in PG methods...

Books

Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits
Integrate scikit-learn with various tools such as NumPy, pandas, imbalanced-learn, and scikit-surprise and use it to solve real-world machine learning problems...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.

P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them on board :) All the best, Hannah & Sebastian