Data Science Weekly Newsletter - Issue 36

Issue #36

July 31, 2014

Editor Picks

 
  • SparklingPandas

    SparklingPandas aims to make it easy to use the distributed computing power of PySpark to scale your data analysis with Pandas...
 
 

Data Science Articles & Videos

 
  • How to Use a Decision Tree to Trade Bank of America Stock
    In our last article we went through a basic example of building a machine-learning algorithm to predict the direction of Apple stock. Now we’ll explore how you can actually use these algorithms to help you come up with your own strategy...
  • Social Media & Machine Learning tell Stores where to locate: Dmytro Karamshuk Interview

    We recently caught up with Dmytro Karamshuk, Researcher in Computer Science and Engineering at King's College London - investigating data mining, complex networks, human mobility and mobile networks. We were keen to learn more about his background, how human mobility modeling has evolved and what his research has uncovered in terms of applying machine learning to social media data to determine optimal retail store placement...
  • Big Public Data to Predict Crowd Behavior: Nathan Kallus Interview
    We recently caught up with Nathan Kallus, PhD Candidate at the Operations Research Center at MIT. We were keen to learn more about his background, his research into data-driven decision making and the recent work he's done using big public data to predict crowd behavior - especially as it relates to social unrest...
  • Sentiment Analysis on Movie Reviews
    I've been playing with this problem the open-source way, putting all my code on GitHub. The other day I got lucky and reached 2nd place with a score of 0.65, and I thought it would be nice to share what I did with everybody...
  • Scaled Inference Wants To Be The Google Brain For Everyone
    Google Brain, an artificial intelligence and machine learning project at Google, has been used to power services like Android’s speech recognition system and photo search on Google+. Now, two of the most longstanding machine learning engineers, one of whom worked on Google Brain, have left the search giant to start a new company...
  • Six Steps To Take Before Pursuing Education To Get A Data Science Job
    Do you find yourself thinking I'm a bit of an unlikely candidate to do Data Science because I don't have a PhD in Computer Science or Mathematics? Do you feel like almost all of the companies hiring Data Scientists are looking for people with advanced degrees, so you find yourself wondering if you should go through 5-6 years in a PhD program? As you look at the "Data Science Venn Diagram" or the "Becoming a Data Scientist – Curriculum via Metromap" do you find yourself wondering if you have what it takes to make it?...
 
 

Jobs

 
  • Data Scientist, Analytics - Facebook - New York

    We’re looking for data scientists with a passion for social media to work on our core products and help drive informed business decisions for Facebook. You will enjoy working with one of the richest data sets in the world, cutting edge technology, and the ability to see your insights turned into real products on a regular basis. The perfect candidate will have a background in computer science or a related technical field, will have experience working with large data stores, and will have some experience building software...
 
 

Training & Resources

 
  • One Hundred Million Creative Commons Flickr Images for Research
    Today, we are announcing the Flickr Creative Commons dataset as part of Yahoo Webscope’s datasets for researchers. The dataset, we believe, is one of the largest public multimedia datasets that has ever been released—99.3 million images and 0.7 million videos, all from Flickr and all under Creative Commons licensing...
 
 

Books

 

  • Probabilistic Approaches to Recommendations

    Just released!...

    "This book starts with a brief summary of the recommendation problem and its challenges and a review of some widely used techniques. Next, we introduce and discuss probabilistic approaches for modeling preference data. We focus our attention on methods based on latent factors, such as mixture models, probabilistic matrix factorization, and topic models, for explicit and implicit preference data. These methods represent a significant advance in the research and technology of recommendation. The resulting models allow us to identify complex patterns in preference data, which can be exploited to predict future purchases effectively. ..."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages, check out our resources page.
 
 
P.S. Enjoyed the newsletter and want to show your support? Buy us a coffee to keep us energized ;) - All the best, Hannah & Sebastian

 
Sign up to receive the Data Science Weekly Newsletter every Thursday
