Data Science Weekly Newsletter - Issue 350

Issue #350

Aug 6 2020

Editor Picks
 
  • Can GPT-3 Make Analogies?
    Many articles and social media posts have given examples of GPT-3’s extraordinarily human-like text, its seemingly endless knowledge of (mostly Western) culture, and even its ability to create computer programs just by being given a few input-output examples. My purpose in this article is not to review the success, hype, or counter-hype on GPT-3. Instead, I want to explore its ability to make Copycat letter-string analogies...
  • Why You Should Do NLP Beyond English
    7000+ languages are spoken around the world but NLP research has mostly focused on English. This post outlines why you should work on languages other than English...
 
 

A Message from this week's Sponsor:

 

 
Exclusive deal for Data Science Weekly readers

With authors including the creator of Keras and Google Cloud AI engineers, you can be sure that when you’re learning from Manning, you’re learning from the very best.
 

 

Data Science Articles & Videos

 
  • TikTok and the Sorting Hat
    I’ve been fascinated with TikTok. Here in 2020, TikTok is, for many, including myself, the most entertaining short video app going. The U.S. government is considering banning the app as a national security risk, and while that’s the topic du jour for just about everyone right now, I’m much more interested in tracing how it got a foothold in markets outside of China, especially the U.S. with its powerful incumbents.... The answer, I believe, has significant implications for the future of cross-border tech competition, as well as for understanding how product developers achieve product-market-fit. The rise of TikTok updated my thinking. It turns out that in some categories, a machine learning algorithm significantly responsive and accurate can pierce the veil of cultural ignorance...
  • How graph technologies are being used to solve complex business problems
    In this episode of the Data Exchange I speak with Denise Gosnell, Chief Data Officer at DataStax. This conversation is a great introduction to what has become an important class of technologies and tools. Graph technologies are used to power a wide array of applications, including recommendation engines, fraud detection systems, identity and access management, search, and many other use cases...
  • Dealing with Overconfidence in Neural Networks: Bayesian Approach
    I trained a multi-class classifier on images of cats, dogs and wild animals and passed it an image of myself; it’s 98% confident I’m a dog. The problem isn’t that I passed an inappropriate image, because models in the real world are passed all sorts of garbage. It’s that the model is overconfident about an image far away from the training data. Instead we expect a more uniform distribution over the classes. The overconfidence makes it difficult to post-process model output (setting a threshold on predictions, etc.), which means it needs to be dealt with by the architecture. In this post I explore a Bayesian method for dealing with overconfident predictions for inputs far away from training data in neural networks. The method is called last layer Laplace approximation (LLLA)...
  • A very short history of some times we solved AI
    This is obviously a very selective list, and I could easily find a handful more examples of when we solved the most important challenge for artificial intelligence and created software systems that were truly intelligent. These were all moments that changed everything, after which nothing would ever be the same. Because we made the machine do something that everyone agreed required true intelligence, the writing was on the wall for human cognitive superiority. We've been prognosticating the imminent arrival of our new AI overlords since at least the 50s. Beyond the sarcasm, what is it I want to say with this?...
  • 5 Spark Best Practices For Data Science
    It takes time to learn how to make Spark do its magic, but these 5 practices really pushed my project forward and sprinkled some Spark magic on my code. To conclude, this is the post I was looking for (and didn’t find) when I started my project — I hope you found it just in time...
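
The overconfidence problem described in the Bayesian item above can be illustrated with a toy example. This is only a minimal sketch, not the linked post's implementation: it assumes a binary logistic last layer, made-up weights `w`, and a made-up Gaussian posterior covariance `cov` (in practice a Laplace approximation would estimate it from the training data), and it uses the standard probit approximation to the predictive. All names and numbers here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def map_confidence(w, x):
    # Point-estimate prediction: confidence grows without bound
    # as x moves away from the origin along a fixed direction.
    return sigmoid(dot(w, x))


def laplace_confidence(w, cov, x):
    # Gaussian posterior N(w, cov) over the last-layer weights.
    # Probit approximation: sigmoid(mean / sqrt(1 + pi/8 * var)).
    mean = dot(w, x)
    cov_x = [dot(row, x) for row in cov]
    var = dot(x, cov_x)  # x^T cov x, the variance of the logit
    return sigmoid(mean / math.sqrt(1.0 + math.pi / 8.0 * var))


# Hypothetical values, chosen only for illustration.
w = [1.0, -0.5]
cov = [[0.5, 0.0], [0.0, 0.5]]
x_near = [1.0, 0.0]     # close to where training data might lie
x_far = [100.0, 0.0]    # same direction, far from the data

for name, x in [("near", x_near), ("far", x_far)]:
    print(name, map_confidence(w, x), laplace_confidence(w, cov, x))
```

Far from the data, the logit variance grows with the squared norm of the input, so the Bayesian confidence levels off instead of saturating at 1, which is the behaviour the excerpt asks for.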
 
 

Training*

 

 
Quick Question For You: Do you want a Data Science job?

After helping hundreds of readers like you get Data Science jobs, we've distilled all the real-world-tested advice into a self-directed course.

The course is broken down into three guides:
  1. Data Science Getting Started Guide. This guide shows you how to figure out the knowledge gaps that MUST be closed in order for you to become a data scientist quickly and effectively (as well as the ones you can ignore)

  2. Data Science Project Portfolio Guide. This guide teaches you how to start, structure, and develop your data science portfolio with the right goals and direction so that you are a hiring manager's dream candidate

  3. Data Science Resume Guide. This guide shows how to make your resume promote your best parts, what to leave out, how to tailor it to each job you want, as well as how to make your cover letter so good it can't be ignored!
Click here to learn more ...

*Sponsored post. If you want to be featured here, or as our main sponsor, contact us!
 

 

Jobs

 
  • Data Scientist (Entry Level) - Saturn Cloud - Remote

    Saturn Cloud helps companies perform data science at a new level of scale, with one-click solutions, to solve the world’s hardest problems. Our product is a SaaS platform which equips data science teams with high-leverage automation tools, eliminating hours of traditional, manual work. The platform is user-friendly, scalable and secure.

    You will be an entry-level Data Scientist for Saturn Cloud, an exciting new venture founded by the creators of Anaconda, NumPy, and SciPy. The role features drafting the first generation of Saturn resource materials, tutorials, and technical content...

        Want to post a job here? Email us for details >> team@datascienceweekly.org
 

 

Training & Resources

   
 

Books

 

  • Seven Databases in Seven Weeks:
    A Guide to Modern Databases and the NoSQL Movement


    "A book that tries to cover multiple databases is a risky endeavor; a book that also provides hands-on work with each is even riskier, but if implemented well it leads to a great package. I loved the specific exercises the authors covered. A must-read for all big data architects who don’t shy away from coding..."

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.
     


    P.S. Enjoy the newsletter? Please forward it to your friends and colleagues - we'd love to have them onboard :) All the best, Hannah & Sebastian
 