Data Science Weekly Newsletter - Issue 252


Sept 20 2018

Editor Picks
  • Pattern to the Seemingly Random Distribution of Prime Numbers Discovered
    Often known as “the building blocks of mathematics,” prime numbers have fascinated mathematicians for centuries due to their highly unpredictable and seemingly random nature. However, a team of researchers at Princeton University has recently discovered a strange pattern in the primes’ chaos. Their novel modelling techniques revealed a surprising similarity between primes and certain naturally occurring crystalline materials...

A Message from this week's Sponsor:


  • Mode Studio: SQL, Python, R, & charts in one platform
    No more jumping between applications. Mode Studio is the analytics toolkit that brings everything together, and gets out of the way. Explore data in our SQL editor, and pass results to integrated Python or R notebooks for deeper exploration and visualization. You can also layer charts over results quickly with built-in visualization tools, and sharing is easy—just send the report URL to teammates when you're ready...


Data Science Articles & Videos

  • Patent analysis using the Google Patents Public Datasets on BigQuery
    Google Patents Public Datasets is a collection of compatible BigQuery database tables from government, research and private companies for conducting statistical analysis of patent data. This is a great starting point if you need to do technical document comparison in ML...
  • Deep Reinforcement Learning Doesn't Work Yet
    Deep reinforcement learning is surrounded by mountains and mountains of hype. And for good reasons! Reinforcement learning is an incredibly general paradigm, and in principle, a robust and performant RL system should be great at everything. Merging this paradigm with the empirical power of deep learning is an obvious fit. Deep RL is one of the closest things that looks anything like AGI, and that’s the kind of dream that fuels billions of dollars of funding. Unfortunately, it doesn’t really work yet...
  • Deep learning made easier with transfer learning
    In this article, we’re going to look at transfer learning. Rather than developing an entirely customized solution to your problem, transfer learning allows you to transfer knowledge from related problems to help solve your custom problem more easily. By transferring that knowledge, you are taking advantage of the expensive resources that were used to acquire it - training data, hardware, researchers - without incurring the cost. Let’s see how and when this approach is effective...
  • A foundation for scikit-learn at Inria
    This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to secure employment for some existing core contributors, and to hire more people on the team. The goal is to help sustain quality (more frequent releases?) and to tackle some ambitious features...


Jobs

  • Entry Level Data Scientist - IBM - Multiple locations
    Entry-Level Data Scientists extract knowledge or insights from structured or unstructured data. They draw upon the practice of data analysis, using predictive analytics, data mining, pattern recognition, data modeling, machine learning and various statistical methods in order to solve large scale optimization problems and to understand the meaning behind vast data sets.

    Entry-Level Data Scientists are in demand across IBM's growth areas. You'll be matched and deployed to a team in a strategic business, based on your offered location and fit...


Training & Resources

  • Tabular Data in Scikit-Learn and Dask-ML
    Scikit-Learn 0.20.0 will contain some nice new features for working with tabular data. This blogpost will introduce those improvements with a small demo. We'll then see how Dask-ML was able to piggyback on the work done by scikit-learn to offer a version that works well with Dask Arrays and DataFrames...




Books

  • Data Visualization with Python and JavaScript: Scrape, Clean, Explore & Transform Your Data

    Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries—including Scrapy, Matplotlib, Pandas, Flask, and D3—for crafting engaging, browser-based visualizations...

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.

    P.S. Want to reach our audience / fellow readers? Consider sponsoring - grab a spot now; first come, first served!

    All the best,
    Hannah & Sebastian