Data Science Weekly Newsletter - Issue 31

Issue #31

June 26, 2014

Editor Picks

  • Extreme Learning Machines With Julia

    There is a concept known as the Liquid State Machine, and the relatively better-known Echo State Network, which is used for training Recurrent Neural Nets. Both are based on reservoir computing. Along the lines of reservoir computing, and very similar in concept, is the topic of this post: the Extreme Learning Machine...
  • Square's Machine Learning Infrastructure and Applications

    In this talk, Dr. Rong Yan (Director of Data Science and Infrastructure, Square) gives a high-level overview of data applications at Square, followed by a deep dive on how machine learning is used in our industry-leading fraud detection models...

Data Science Articles & Videos

  • Natural Language Processing in Investigative Journalism
    Journalists frequently have far too many documents to read manually, whether it's a 10,000-page response to a Freedom of Information request or 250,000 leaked diplomatic cables. We've spent the last three years applying NLP and visualization techniques to this problem, building a system called Overview, which has now been used by journalists all over the world. In this talk I'll show you exactly how Overview's language processing pipeline works.
  • Machine Learning Isn't Kaggle Competitions
    Doing Kaggle problems is fun! It means you can focus on machine learning algorithm nerdery and get better at that. But it’s pretty far removed from my job...where I do (among other things) machine learning! What gives?
  • A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews
    Sarcasm is a sophisticated form of speech act widely used in online communities. Automatic recognition of sarcasm is, however, a novel task. Sarcasm recognition could contribute to the performance of review summarization and ranking systems. This paper presents SASI, a novel Semi-supervised Algorithm for Sarcasm Identification that recognizes sarcastic sentences in product review...
  • Neural Networks and Deep Learning Book
    Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you the core concepts behind neural networks and deep learning...
  • Association Rule Mining to Find Unbeatable Strategy in a Tic Tac Toe Game
    Here is a short overview of what I am going to discuss today: a) the goal of our task, b) what an association rule is and the main objective of association rule mining, c) the Apriori algorithm, d) a description of our data, and e) finally, some results after we extract association rules from the data...
  • Rapid User Testing with Mechanical Turk
    How We Supercharged Our User Testing Using Mechanical Turk, Google Forms and Usability Hub...We’ve been experimenting with different methods for getting rapid user feedback and we’d like to share some of our explorations...We’ll start by designing the test, followed by distributing the test, and we’ll finish with organizing the data...
  • Probabilistic Models of Cognition
    In this book, we explore the probabilistic approach to cognitive science, which models learning and reasoning as inference in complex probabilistic models. In particular, we examine how a broad range of empirical phenomena in cognitive science (including intuitive physics, concept learning, causal reasoning, social cognition, and language understanding) can be modeled using a functional probabilistic programming language called Church...


  • Insight Data Engineering Fellows

    Today we are announcing the opening of applications for the September 2014 session of the Insight Data Engineering Fellows Program. Insight is a free, full-time, six-week program based in Silicon Valley that helps engineers and computer scientists transition to a career in big data engineering. Data engineers from Facebook, LinkedIn, Twitter, Yelp, Square, Microsoft, Intuit, AT&T, Climate Corporation, Beats Music, Jawbone, RelateIQ, and Airbnb will be mentoring and hiring from the program. Additionally, community leaders from open-source projects such as Apache Storm, Apache Spark, and Apache Cassandra will be mentoring as well...

Training & Resources

  • Introduction to Deep Learning on Hadoop
    As the data world undergoes its Cambrian explosion phase, our data tools need to become more advanced to keep pace. Deep Learning has emerged as a key tool in the non-linear arms race of machine learning. In this session we will take a look at how we parallelize Deep Belief Networks in Deep Learning on Hadoop’s next-generation YARN framework with Iterative Reduce. We’ll also look at some real-world examples of processing data with Deep Learning, such as image classification and natural language processing...
  • AI on the Web
    This page links to 868 pages around the web with information on Artificial Intelligence. Some of the links will pop up additional information when you move the mouse over them. Links in Bold* followed by a star are especially useful and interesting sites...



  • Naked Statistics: Stripping the Dread from the Data

    Interesting take on the importance of statistics...

    "While a great measure of the book’s appeal comes from Mr. Wheelan’s fluent style—a natural comedian, he is truly the Dave Barry of the coin toss set—the rest comes from his multiple real world examples illustrating exactly why even the most reluctant mathophobe is well advised to achieve a personal understanding of the statistical underpinnings of life " - New York Times

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them onboard :)