Data Science Weekly Newsletter - Issue 28

Issue #28

June 5, 2014

Editor Picks

 
  • Machine Learning as a Service:
    Making Sentiment Predictions in Realtime with ZMQ and NLTK

    I am a Machine Learning (ML) and Natural Language Processing enthusiast. For my university dissertation I created a realtime sentiment analysis classifier for Twitter. My talk is about the experience and the lessons learned... showing how easy it can be to build an ML SaaS by using some of the amazing libraries such as NLTK, ZMQ and MrJob that have helped me...
  • Realtime Predictive Analytics using scikit-learn & RabbitMQ

    Scikit-learn is an awesome tool allowing developers with little or no machine learning knowledge to predict the future! But once you’ve trained a scikit-learn algorithm, what now? In this talk, I describe how to deploy a predictive model in a production environment using scikit-learn and RabbitMQ. You’ll see a realtime content classification system to demonstrate this design...
 
 

Data Science Articles & Videos

 
  • A Growing Number of Applications are being built with Spark
    The number of companies that are using (or plan to use) Spark in production has exploded over the last year. The surge in popularity of the Apache Spark ecosystem stems from the maturation of its individual open source components and the growing community of users...
  • Convolutional Network Demo from 1993, featuring Yann LeCun
    This is a demo of "LeNet 1", the first convolutional network that could recognize handwritten digits with good speed and accuracy. It was developed between 1988 and 1993 in the Adaptive System Research Department, headed by Larry Jackel, at Bell Labs in Holmdel, NJ...
  • On the Importance of Text Analysis for Stock Price Prediction
    We investigate the importance of text analysis for stock price prediction. In particular, we introduce a system that forecasts companies’ stock price changes (UP, DOWN, STAY) in response to financial events reported in 8-K documents. Our results indicate that using text boosts prediction accuracy over 10% (relative) over a strong baseline that incorporates many financially-rooted features...
  • Yann LeCun's answers from the Reddit AMA
    On May 15th Yann LeCun answered “ask me anything” questions on Reddit. We hand-picked some of his thoughts and grouped them by topic for your enjoyment...
  • Bandits for Recommendation Systems
    In this blog post, we will discuss the bandit problem and how it relates to online recommender systems. Then, we'll cover some classic algorithms and see how well they do in simulation...
  • Statistical Language Wars: The Infograph
    A feature all programming communities have in common is the numerous debates about why their programming language of choice is better, more advanced, faster, holier etc. In today’s data science community, it seems like these discussions are omnipresent with advocates of SAS, SPSS, R, Python, Julia, etc. battling and challenging each other on every online medium...
  • Everything You Wanted to Know about the Kernel Trick
    The goal of this writeup is to provide a high-level introduction to the "Kernel Trick" commonly used in classification algorithms such as Support Vector Machines (SVM) and Logistic Regression. My target audience are those who have had some basic experience with machine learning, yet are looking for an alternative introduction to kernel methods...
 
 

Jobs

 
  • Senior Data Scientist - The Weather Company, Madison WI

    Are you interested in applying machine learning or data mining on problems that truly improve people’s lives? We’re looking for a mathematician/data scientist eager to tackle unique challenges in the realm of predicting weather’s impact on business. You will work on a skilled team of passionate data scientists and meteorologists. Examples of projects you may encounter would be anything from predicting the electricity output of a solar park in Arizona, to predicting how much ice cream is going to be sold next week in Chicago...
 
 

Training & Resources

 
  • Deep Learning

    Draft of a book on Deep Learning by Yoshua Bengio, Ian Goodfellow, and Aaron Courville...
 
 

Books

 

  • Outlier Detection for Temporal Data

    Just released!...

    "Outlier Detection for Temporal Data covers topics in temporal outlier detection, which have applications in numerous fields. It starts with the basic topics then moves on to state of the art techniques in the field."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
 
 
P.S. Did you enjoy the newsletter? Do you have friends/colleagues who might like it too? If so, please forward it along - we would love to have them onboard :)
 