Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
January 3, 2019

Editor's Picks

  • Album de Statistique Graphique
    The Album de Statistique Graphique is a set of annual publications of data visualizations in France in the late 1800s. I first heard about them from Michael Friendly a decade ago and have always been on the lookout to find them. Over the course of my thesis I did find a couple of copies in research libraries, but the particular libraries required signing agreements that I would not share the photos (why do libraries do this?). Now, finally, they are online, easily accessible, in high-quality scans courtesy of David Rumsey (thank you!). And they are amazing!...
  • Surnames and Social Mobility
    To what extent do parental characteristics explain child social outcomes? Typically, parent-child correlations in socioeconomic measures are in the range 0.2-0.6. Surname evidence suggests, however, that the intergenerational correlation of overall status is much higher. This paper shows, using educational status in England 1170-2012 as an example, that the true underlying correlation of social status is in the range 0.75-0.85...

A Message From This Week's Sponsor

Up to 50% Off Part-Time Data Science Courses

Leading data science training provider, Metis, is now offering reduced tuition on all part-time, online courses. Get up to 50% off of Introduction to Data Science and 40% off of Beginner Python and Math for Data Science.
View Courses

Data Science Articles & Videos

  • Dynamic Planning Networks
    We introduce Dynamic Planning Networks (DPN), a novel architecture for deep reinforcement learning, that combines model-based and model-free aspects for online planning. Our architecture learns to dynamically construct plans using a learned state-transition model by selecting and traversing between simulated states and actions to maximize valuable information before acting...
  • CycleGAN, a Master of Steganography
    CycleGAN [Zhu et al., 2017] is one recent successful approach to learn a transformation between two image distributions. In a series of experiments, we demonstrate an intriguing property of the model: CycleGAN learns to “hide” information about a source image into the images it generates in a nearly imperceptible, high-frequency signal...
  • "Modern" C++ Lamentations
    This will be a long wall of text, and kinda random! My main points are: 1. C++ compile times are important, 2. Non-optimized build performance is important, 3. Cognitive load is important. I don’t expand much on this here, but if a programming language or a library makes me feel stupid, then I’m less likely to use it or like it. C++ does that a lot :) ...
  • Evolved Radio and its Implications for Modelling Evolution of Novel Sensors
    This paper describes an evolvable hardware experiment that resulted in a network of transistors sensing and utilising the radio waves emanating from nearby PCs. We argue that this evolved ‘radio’ is only the second device ever whose sensors were constructed in a way that in key aspects is analogous to that found in nature. We highlight the advantages and disadvantages of this approach and show why it is practically impossible to implement a similar process in simulation....
  • Probable more likely than probably
    What kind of probability are people talking about when they say something is "highly likely" or has "almost no chance"? The chart below, created by Reddit user zonination, visualizes the responses of 46 other Reddit users to "What probability would you assign to the phrase:" for various statements of probability. Each set of responses has been converted to a kernel density estimate and presented as a joyplot using R.
  • Machine Learning Classification Methods and Factor Investing
    In this piece, we’ll first review machine learning for classification, a problem which may be less familiar to investors, but fundamental to machine learning professionals. Next, we’ll apply classification to the classic value/momentum factors (spoiler: the results are pretty good)...


  • Data Scientist, Retention - Disney Streaming Services - NYC
    The Data Scientist is a critical position within DSS and in the Data organization who specializes in applying machine learning methods to meet optimization, personalization, recommendations and efficiency-related challenges, in close collaboration with engineering and business partners. In this role, you will build and apply machine learning techniques and modern statistics to data, both to augment decision-making and to significantly improve operational processes through automation. You will collaborate across teams to define problems and develop automated solutions with the Data, Product and Engineering teams to be built into our products...

Training & Resources

  • Modern Deep Learning Techniques Applied to NLP
    This project contains an overview of recent trends in deep learning based natural language processing (NLP). It covers the theoretical descriptions and implementation details behind deep learning models, such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), and reinforcement learning, used to solve various NLP tasks and applications. The overview also contains a summary of state of the art results for NLP tasks such as machine translation, question answering, and dialogue systems...


  • Data Science from Scratch: First Principles with Python
    "It does three things superbly: covers the basic low level tools of a data scientist (the "from scratch" part), gives a great overview of useful Python programming examples for those new to Python, and gives an amazingly succinct yet high level overview of the mathematics and statistics required for data science..."
    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
