Data Science Weekly Newsletter - Issue 111

Issue #111

January 7, 2016

Editor Picks
 
  • Why too much evidence can be a bad thing
    Under ancient Jewish law, if a suspect on trial was unanimously found guilty by all judges, then the suspect was acquitted. This reasoning sounds counterintuitive, but the legislators of the time had noticed that unanimous agreement often indicates the presence of systemic error...
  • AMA Data Scientist: Jake Porway of DataKind
    Kick off 2016 with DataKind’s founder and executive director Jake Porway for his first-ever Reddit AMA on January 13. Join in for a candid discussion of what it takes to apply data science for social good. (Hint: much more than good intentions.) Hope to see you on /r/DataScience!...
 
 

A Message from this week's Sponsor:

 

  • Build real-time apps.
    Syncano. Database. Backend. Middleware. Real-time. Support. Start for free!
     

 

Data Science Articles & Videos

 
  • Analyzing networks of characters in 'Love Actually'
    Every Christmas Eve, my family watches Love Actually. Even on the eighth or ninth viewing, it’s impressive what an intricate network of characters it builds. This got me wondering how we could visualize the connections quantitatively, based on how often characters share scenes. So last night, while my family was watching the movie, I loaded up RStudio, downloaded a transcript, and started analyzing...
  • The Myth Of AI
    A Conversation With Jaron Lanier - Computer Scientist; Musician; Author of Who Owns the Future?...
  • Attention And Memory In Deep Learning And NLP
    A recent trend in Deep Learning is Attention Mechanisms. In an interview, Ilya Sutskever, now the research director of OpenAI, mentioned that Attention Mechanisms are one of the most exciting advancements, and that they are here to stay. That sounds exciting. But what are Attention Mechanisms?...
  • Winning The Bias-Variance Tradeoff
    Machine learning is a strange mix of math and weird heuristics. When I started studying machine learning, I was SO FRUSTRATED. Everything was “well it works in practice” and so little of it was math. I was a pure math major at the time, so arguments like “well it works in practice” made me REALLY MAD... The bias-variance decomposition is a small piece of math that actually explains why some things in machine learning work!...
  • Machine Learning for Artists
    This spring I will be teaching a course at NYU’s Interactive Telecommunications Program (ITP) called “Machine Learning for Artists.” Since the subject is fairly uncommon outside of the realm of scientific research, I thought it would be helpful to outline my motivations for offering this class...
 
 

Jobs

 
  • Data Scientist - Johns Hopkins Health System - Glen Burnie, MD

    Upon joining Johns Hopkins Health System, you become part of a diverse organization dedicated to its patients, their families, and the community we serve, as well as to our employees. The Data Scientist is responsible for monitoring data quality, analyzing data using statistical tools such as R, SAS, SPSS, WEKA, and RapidMiner, presenting data in charts, graphs, and tables, and leveraging relational databases to collect data. Through innovation, the Data Scientist will find meaningful patterns, trends, and relationships by evaluating large amounts of healthcare data and will interpret and explain the findings to the organization...
 
 

Training & Resources

 
  • bayes.js: A Small Library For Doing MCMC In The Browser
    Bayesian data analysis is cool, Markov chain Monte Carlo is the cool technique that makes Bayesian data analysis possible, and wouldn’t it be coolness if you could do all of this in the browser? That was what I thought, at least, and I’ve now made bayes.js: A small JavaScript library that implements an adaptive MCMC sampler and a couple of probability distributions, and that makes it relatively easy to implement simple Bayesian models in JavaScript...
  • Introducing Guesstimate, a Spreadsheet for Things That Aren’t Certain
    A spreadsheet that’s as easy to use as existing spreadsheets, but works for uncertain values. For any cell you can enter confidence intervals (lower and upper bounds) that can represent full probability distributions. 5000 Monte Carlo simulations are performed to find the output interval for each equation, all in the browser...
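
For readers curious how the interval-in, interval-out idea in the Guesstimate blurb might work, here is a minimal Python sketch. It is not Guesstimate's actual implementation: treating each (lower, upper) pair as the central 90% interval of a normal distribution is an assumption, and the cell names `price`, `units`, and `revenue` are hypothetical. Only the figure of 5000 simulations comes from the blurb.

```python
import random

N_SAMPLES = 5000            # number of simulations mentioned in the blurb
Z_90 = 1.6448536269514722   # z-score bounding the central 90% of a normal

def sample_interval(low, high, rng, n=N_SAMPLES):
    """Draw n samples from a normal whose central 90% interval is [low, high].

    Assumption: the blurb does not say which distribution an interval maps to.
    """
    mean = (low + high) / 2
    sd = (high - low) / (2 * Z_90)
    return [rng.gauss(mean, sd) for _ in range(n)]

def output_interval(samples, coverage=0.90):
    """Return the empirical central interval of the simulated outputs."""
    ordered = sorted(samples)
    tail = (1 - coverage) / 2
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int((1 - tail) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

rng = random.Random(0)                      # seeded for repeatable runs
price = sample_interval(8, 12, rng)         # hypothetical input cell
units = sample_interval(900, 1100, rng)     # hypothetical input cell

# The "equation" cell: evaluate the formula once per simulated draw.
revenue = [p * u for p, u in zip(price, units)]

low, high = output_interval(revenue)
print(f"revenue 90% interval: {low:.0f} to {high:.0f}")
```

The point of simulating rather than multiplying the bounds directly is that 8 x 900 and 12 x 1100 would overstate the spread, since both inputs rarely sit at their extremes at the same time.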
 
 

Books

 

 
 
P.S. Interested in reaching fellow readers of this newsletter? Consider sponsoring! Email us for details :) - All the best, Hannah & Sebastian
 