Data Science Weekly Newsletter
September 8, 2016
A Technical Primer On Causality
What does “causality” mean, and how can you represent it mathematically? How can you encode causal assumptions, and what bearing do they have on data analysis? These types of questions are at the core of the practice of data science, but deep knowledge about them is surprisingly uncommon...
The Palettes of Earth
Take a satellite image, and extract the pixels into a uniform 3-D color space. Then run a clustering algorithm on those pixels, to extract a number of clusters. The centroids of those clusters then make a representative palette of the image...
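The recipe in this blurb can be sketched in a few lines. This is a minimal illustration, not the article's own code: it assumes plain k-means as the clustering algorithm and clusters directly in RGB, whereas a perceptually uniform space such as CIE Lab would be closer to what the blurb describes. The function name `extract_palette` and its parameters are hypothetical.

```python
import numpy as np

def extract_palette(image, n_colors=5, n_iters=20, seed=0):
    """Return n_colors centroid colors from an (H, W, 3) image array.

    Assumption: plain k-means in the image's own color space (e.g. RGB).
    """
    # Flatten the image into a list of 3-D color points.
    pixels = np.asarray(image, dtype=float).reshape(-1, 3)
    rng = np.random.default_rng(seed)
    # Use randomly chosen pixels as the initial centroids.
    centroids = pixels[rng.choice(len(pixels), size=n_colors, replace=False)]
    for _ in range(n_iters):
        # Assign each pixel to its nearest centroid (squared Euclidean distance).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the pixels assigned to it.
        for k in range(n_colors):
            members = pixels[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return centroids
```

Calling `extract_palette(img, n_colors=5)` on an image array returns five colors, one per row. The distance matrix holds one entry per pixel per centroid, so a large satellite image would likely need to be downsampled first.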
Deep Neural Networks for YouTube Recommendations
YouTube represents one of the largest scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning...
Join Yhat cofounder and CTO Greg Lamp & Rodeo Product Manager Colin Ristig for a live product tour of Yhat's open-source Python IDE, Rodeo, and enterprise model deployment platform, ScienceOps. Greg and Colin will walk through a demo of both products using a beer recommender algorithm and web app as an example. The webinar will take place on Wednesday, September 21 at 2 PM EST.
Get your invite to the Yhat webinar today!
Data Science Articles & Videos
Artificial Intelligence Swarms Silicon Valley on Wings and Wheels
For more than a decade, Silicon Valley’s technology investors and entrepreneurs obsessed over social media and mobile apps that helped people do things like find new friends, fetch a ride home or crowdsource a review of a product or a movie. Now Silicon Valley has found its next shiny new thing. And it does not have a “Like” button...
How a Japanese Cucumber Farmer is using Deep Learning and TensorFlow
It’s not hyperbole to say that use cases for machine learning and deep learning are only limited by our imaginations. About one year ago, a former embedded systems designer from the Japanese automobile industry named Makoto Koike started helping out at his parents’ cucumber farm, and was amazed by the amount of work it takes to sort cucumbers by size, shape, color and other attributes...
Experimentation in a Ridesharing Marketplace
Technology companies strive to make data-driven product decisions — and Lyft is no exception. Because of that, online experimentation, or A/B testing, has become ubiquitous. The way it’s bandied about, you’d be excused for thinking that online experimentation is a completely solved problem. In this post, we’ll illustrate why that’s far from the case for systems — like a ridesharing marketplace — that evolve according to network dynamics. As we’ll see, naively partitioning users into treatment and control groups can bias the effect estimates you care about...
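One way to see how naive partitioning can bias an estimate is a toy model in which treated and control users compete for the same fixed pool of drivers. The sketch below is an illustration under stated assumptions, not the model from the Lyft post: the market size, the supply, and the request propensities are all hypothetical, and every quantity is an expected value rather than a random draw.

```python
def rides_per_user(propensities, supply):
    """Expected completed rides per user when all requests share a fixed supply."""
    demand = sum(propensities)
    # Fraction of requests that can be matched to a driver.
    fill_rate = min(1.0, supply / demand)
    return [p * fill_rate for p in propensities]

def mean(values):
    return sum(values) / len(values)

# Hypothetical market: 1000 users, 100 available rides.
n_users, supply = 1000, 100
# Hypothetical request propensities; the treatment raises demand.
p_control, p_treated = 0.2, 0.3

# Ground truth: compare a fully treated market with a fully untreated one.
all_control = mean(rides_per_user([p_control] * n_users, supply))
all_treated = mean(rides_per_user([p_treated] * n_users, supply))
true_effect = all_treated - all_control

# Naive A/B test: half the users are treated, but everyone shares the supply.
half = n_users // 2
mixed = rides_per_user([p_treated] * half + [p_control] * half, supply)
naive_effect = mean(mixed[:half]) - mean(mixed[half:])

print(f"true effect:  {true_effect:.3f} rides per user")
print(f"naive effect: {naive_effect:.3f} rides per user")
```

Under these assumed numbers supply is the bottleneck, so treating everyone leaves rides per user essentially unchanged, while the naive comparison reports a gain of about 0.04 rides per user. That gain comes entirely from treated users taking rides that control users would otherwise have completed.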
A Decomposable Attention Model for Natural Language Inference
We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information...
Craigslist and U.S. Rental Housing Markets
The UC Berkeley Urban Analytics Lab collected, validated, and analyzed 11 million Craigslist rental listings to discover fine-grained patterns across metropolitan housing markets in the United States. I’ll summarize our findings below and explain the methodology at the bottom...
Hierarchical Multiscale Recurrent Neural Networks
Learning both hierarchical and temporal representation has been among the long-standing challenges of recurrent neural networks. Multiscale recurrent neural networks have been considered as a promising approach to resolve this issue, yet there has been a lack of empirical evidence showing that this type of models can actually capture the temporal dependencies by discovering the latent hierarchical structure of the sequence. In this paper, we propose a novel multiscale approach, called the hierarchical multiscale recurrent neural networks...
A Survival Guide to a PhD
Now that my PhD has come to an end I wanted to compile a similar retrospective document in hopes that it might be helpful to some. Unlike the undergraduate guide, this one was much more difficult to write because there is significantly more variation in how one can traverse the PhD experience. Therefore, many things are likely contentious and a good fraction will be specific to what I’m familiar with (Computer Science / Machine Learning / Computer Vision research). But disclaimers are boring, let’s get to it!...
Data Scientist - Blue Owl - San Francisco
A million people a year die in car collisions around the world. That number should be zero. You can help us create a new insurance company that uses the latest technology and data science methods to save lives by preventing car collisions before they happen. The field is rich with data and we will be pushing the boundaries of what is possible...
Python Pocket Reference
Updated for both Python 3.4 and 2.7, this convenient pocket guide is the perfect on-the-job quick reference. You’ll find concise, need-to-know information on Python types and statements, special method names, built-in functions and exceptions, commonly used standard library modules, and other prominent Python tools...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page...