Receive the Data Science Weekly Newsletter every Thursday

Easy to unsubscribe at any time. Your e-mail address is safe.

Data Science Weekly Newsletter
May 23, 2019

Editor's Picks

  • Inside Facebook's New Robotics Lab, Where AI and Machines Friend One Another

    But, like a hare zigzagging back and forth to avoid a falcon, this robot’s seeming madness is in fact a special brand of cleverness, one that Facebook thinks holds the key not only for better robots, but for developing better artificial intelligence. This robot, you see, is teaching itself to explore the world. And that, Facebook says, could one day lead to intelligent machines like telepresence robots...

A Message From This Week's Sponsor

Become a Data Analyst with Thinkful

The Data Analytics program is for people who are starting from the very beginning. Learn how to scrape, collect and analyze data, use SQL and Tableau, and get an introduction to Python. We'll get you a job within six months of graduating or you'll get your tuition back.

Data Science Articles & Videos

  • Grocery bills can predict diabetes rates by neighborhood
    Dietary habits are notoriously difficult to monitor. Now data scientists have analyzed sales figures from London’s biggest grocer to link eating patterns with local rates of high blood pressure, high cholesterol, and high blood sugar...
  • Maintainable ETLs: Tips for Making Your Pipelines Easier to Support and Extend
    Core to any data science project is…wait for it…data! Preparing data in a reliable and reproducible way is a fundamental part of the process. If you’re training a model, calculating analytics, or just combining data from multiple sources and loading them into another system, you’ll need to build a data processing or ETL pipeline...
  • PaperRobot: Incremental Draft Generation of Scientific Ideas
    We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from future work to generate a title for a follow-on paper...
  • Cross-lingual Language Model Pretraining
    Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining...
  • Data-Efficient Image Recognition with Contrastive Predictive Coding
    Large scale deep learning excels when labeled images are abundant, yet data-efficient learning remains a longstanding challenge. While biological vision is thought to leverage vast amounts of unlabeled data to solve classification problems with limited supervision, computer vision has so far not succeeded in this 'semi-supervised' regime. Our work tackles this challenge with Contrastive Predictive Coding, an unsupervised objective which extracts stable structure from still images...


Big Data and AI Toronto 2019

Big Data and AI Toronto is a 2-in-1 learning experience engineered to address the greatest business challenges technology leaders are facing today.
During 2 days of case studies, demos and panels, attendees will engage with global thought-leaders in Big Data and AI, including experts from Uber, Bloomberg and SAS!
Register for your free expo pass and join 5000 attendees, 150 speakers, and 90 exhibiting brands on June 12-13 at the Metro Toronto Convention Centre.
Stay up-to-date on the newest speakers and program highlights by subscribing to the Big Data and AI Toronto newsletter.

Want to post here? Email us for details.


  • Data Engineer / Data Scientist - Validate Health - Chicago

    Interested in being part of a small founding team, so you can see your direct impact on improving the healthcare industry? Want to be one of the rockstars building an innovative product from the ground up?
    Validate Health is an early-stage healthcare analytics company on a mission to improve accessibility to healthcare by enabling medical organizations to operate on stable and sustainable financial models.
    This position is a versatile combination of Data Engineer and Data Scientist roles. You’ll get to play a key role in shaping the delivery of powerful data-driven products that enable sustainable value-based healthcare models...

Training & Resources

  • Eureka: Mixmatch — A holistic approach to semi-supervised learning
    Semi-supervised learning (SSL) is a form of supervised learning where we have a lot of extra information about the input data (X). SSL aims to utilize this extra information about the input data distribution to make predictions on unlabeled data using only a minuscule amount of labeled data. Or, you can look at SSL as unsupervised learning with certain constraints on clustering: points having the same label should go to the same cluster, and so on. Some of the hyper-parameters in unsupervised learning, such as the number of clusters, could be inferred from this extra data. Let us now see what the assumptions in SSL are...


40% off at Manning

Do more with your data!
If you're looking to make your data skills stand out, then be sure to check out Manning's range of books and video courses. They're offering 40% off everything in their catalog, so there's no better time to learn something new...
For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page
P.S. Want to reach our audience / fellow readers? Consider sponsoring. Grab a spot now; first come, first served! All the best, Hannah & Sebastian
