Data Science Weekly Newsletter

Blended curriculum
Career-ready courses
Affordable excellence

Move your career forward in one of the fields with the largest demand. Learn more at clarku.edu/analytics

Data Science Articles & Videos

Everyone Poops
Here in San Francisco, human waste is a growing issue, both for the people who run into it and for the people who have no other option than to relieve themselves on public streets. This is a multi-faceted problem with many potential solutions, and one best addressed by social scientists. However, I think there is a place for data science in this conversation...

Major or Minor? Classifying the Mode of a Song
For a while now, I have wanted to work with music in machine learning in one capacity or another. The day of reckoning has come...

Baseball Pitch Recommendation: a look into the data science process.
I’ll walk you through a model I created to recommend pitches to the Cubs in games against the Cardinals, and the steps I took to get there. (Technically, this model could help any team, or quite frankly any talented pitcher, when throwing against Cardinals players, but my model is dedicated to my Cubs)...

Attentive GAN for Raindrop Removal from A Single Image
Raindrops adhered to a glass window or camera lens can severely hamper the visibility of a background scene and degrade an image considerably. In this paper, we address the problem by visually removing raindrops, and thus transforming a raindrop degraded image into a clean one...

Add Constrained Optimization To Your Toolbelt
At Stitch Fix, whenever we serve a customer, we must choose the right day to style that client’s fix, the right stylist for that client’s particular aesthetic, the right warehouse for that client’s shipping address. But stylists and warehouses are in high demand, with many clients competing for their time and attention. Constrained optimization helps us get work to stylists and warehouses in a manner that is fair and efficient, and gives our clients the best possible experience...

Data Dictionary: a how to and best practices
A data dictionary is a list of key terms and metrics with definitions: a business glossary. While it sounds simple, almost trivial, its ability to align the business and remove confusion can be profound. In fact, a data dictionary is possibly one of the most valuable artifacts that a data team can deliver to the business...

How Can Neural Network Similarity Help Us Understand Training and Generalization?
In our most recent collaboration with Google Brain, we measure the similarity between neural network representations to provide insights into generalisation and the training dynamics of RNNs...

Travel Time Optimization With Machine Learning And Genetic Algorithm
What is the relationship between machine learning and optimization? On the one hand, mathematical optimization is used in machine learning during model training, when we are trying to minimize the cost of errors between our model and our data points. On the other hand, what happens when machine learning is used to solve optimization problems?...

Jobs

eCommerce Data Science & Machine Learning Analyst - PepsiCo - NYC
Have a strong opinion about TensorFlow lacking an autoregressive dynamic network? So do we!
PepsiCo’s eCommerce Data Science and Analytics group is a team of data scientists, technology specialists, and business innovators who operate within eCommerce to build industry-leading systems and solutions. By focusing on machine learning and automation, the Data Science & Analytics group is pushing the bounds of possibility for PepsiCo and its strategic partners...

Training & Resources

Turn A List Of PyTorch Tensors Into One Tensor
Learn how to turn a list of PyTorch tensors into one tensor, via a screencast video and full tutorial transcript...

The Hitchhiker’s Guide to Hyperparameter Tuning
If you go around and ask people how they tune their models, their most likely answer will be “just write a script that does it for you”. Well, that’s easier said than done... Apparently, there are a few things you should keep in mind when implementing such a script. Here, at Taboola, we implemented a hyperparameter tuning script. Let me share with you the things we learned along the way...

Using fastText and Comet.ml to classify relationships in Knowledge Graphs
In this post, we will examine how a simple model, fastText, learns to represent entities in a subset of the FB15K knowledge graph, by classifying the relationship between pairs of entities in the graph...

Books

Text Mining with R: A Tidy Approach
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective...

For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
