We recently caught up with Sudeep Das, Astrophysicist and Data Scientist at OpenTable. We were keen to learn more about his background, his work in academia and how he is applying data science in his new role - transforming OpenTable into a local dining expert…
Hi Sudeep, firstly thank you for the interview. Let's start with your background and how you became interested in working with data...
Q - What is your 30 second bio?
A - In no particular order, I am a coffee aficionado, a foodie, and a scientist. For most of my professional life, I have been an astrophysicist. After finishing my Ph.D. at Princeton in 2008, I moved to the Berkeley Center for Cosmological Physics as a Prize Fellow, and then to Argonne National Laboratory as a David Schramm Fellow, working on minute fluctuations in the afterglow of the Big Bang called the cosmic microwave background. During the past year, I became increasingly interested in the booming field of data science and decided to switch fields to start an adventure in this new area. Currently, I am a data scientist at OpenTable, using dining-related data to help personalize the user experience, discover cultural and regional nuances in dining preferences, and provide insights to restaurateurs. I am also an avid blogger, and write about science and data science on my blog, datamusing.info.
Q - How did you get interested in working with data?
A - Well, much of my thesis work involved dealing with vast amounts of data from the extreme edges of the observable universe. In my everyday work, I would perform a significant amount of munging, reduction, and analysis of the raw and noisy data collected by our telescope stationed in Chile. The last stage of the analysis would be applying machine learning and Bayesian techniques to make inferences about fundamental parameters of the universe! This was a long and tortuous process with all kinds of nasty data problems one could imagine, but the results were rewarding! You could not do it if you did not love working with data.
Q - I can imagine! :) So, what was the first data set you remember working with? What did you do with it?
A - The first significant data set I worked on was the cosmic microwave background data on a patch of sky where there was supposed to be a big cluster of galaxies called the Bullet Cluster, and this cluster was supposed to leave an impression in the data. For several weeks, all we saw was noise. I was involved in making a map of that patch of sky, and tried various filters for suppressing the noise and various forms of visualizations. Finally, I was able to find the tiny dark dot at the position where the Bullet Cluster was supposed to be.
Q - That must have been very rewarding! Maybe it was that moment, but was there a specific "aha" moment when you realized the power of data?
A - Undoubtedly, this was when I first saw the signs of an extremely faint signal called the gravitational lensing of the cosmic microwave background (CMB) in the data from our telescope. These are tiny distortions to the patterns in the CMB due to the gravitational pull of massive structures in the universe. It took excellent observations, a large arsenal of statistical tools, excellent teamwork and careful analysis of data to come to this point. It was the first ever detection of this effect. It was amazing, and definitely an "aha" moment for me.
Q - Wow, that's very powerful! … On that note, what excites you most about recent developments in Data Science?
A - While machine learning has been around for a very long time, and the basic methods are well established, what is really new is the enormous scale of data sets, which is prompting both new ways of implementing established algorithms and novel approaches to solving familiar problems at scale. Academic data sets used in traditional machine learning used to be small. Now, the game has changed, with data sets becoming so large and also live (as in streaming). For example, even the apparently simple task of computing similarity between users has warranted new algorithms when the user base is in the billions. Along with algorithmic developments, new ways of introspecting data have also come into play. Visualizations play a huge role in all stages of data science, from initial introspection to the interpretation of results. Modern-day data science demands a multifaceted skill set that ranges from the ability to efficiently clean huge data sets, a solid understanding of basic algorithms, and excellent visualization skills, to creative ways of solving problems and, in many cases, just seeing through the haze and applying common sense. All of this has made data science a very dynamic and colorful space to work in, which is what I like most about my current role.
I also believe that data science can play an important role in solving social problems. I am a mentor in the non-profit Bayes Impact program, and currently I am mentoring the fellows on two projects based in India and the US.
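Editor's note: to make the user-similarity example from the answer above concrete, here is a minimal sketch of the naive approach, written in plain Python. It is not OpenTable's method; the diner names, restaurant names, and ratings are invented for illustration, and cosine similarity is just one common choice of measure.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(a) & set(b)  # only items both users rated contribute to the dot product
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical diners and ratings (assumptions, not real data).
ratings = {
    "alice": {"thai_place": 5, "taqueria": 3},
    "bob": {"thai_place": 4, "taqueria": 2, "bistro": 5},
    "carol": {"bistro": 4},
}


def most_similar(user, ratings):
    """Return (other_user, similarity) for the closest match by brute force."""
    others = [
        (other, cosine_similarity(ratings[user], ratings[other]))
        for other in ratings
        if other != user
    ]
    return max(others, key=lambda pair: pair[1])


print(most_similar("alice", ratings))
```

Comparing every user against every other user costs a number of comparisons that grows with the square of the user count, which is why this brute-force form stops being practical at the scale described in the interview and approximate methods become attractive.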
That must be fascinating! Thanks for sharing all that background. Let's switch gears and talk about your current role at OpenTable...
Q - What attracted you to the intersection of data science and Restaurants/Dining? Where can Data Science create most value?
A - I have always been a die-hard foodie, and even before joining OpenTable, I used the service frequently to make restaurant reservations. While doing so, I always wondered how nice it would be if the app somehow knew my dining habits and preferences and suggested restaurants to my liking. As an academic, I was traveling frequently, so I especially wanted the app to be my local foodie expert, rather than just a tool to book tables. Also, I felt that there should be a way to distill the reviews of a restaurant into a set of succinct insights that would tell me what the restaurant is all about at a glance, without having to read through all the reviews. If I were a restaurateur, I would like to have my restaurant's data analyzed to see how my business is doing, both in general and in comparison to others. Now, as a data scientist at OpenTable, I am helping build many of these data-driven features. The idea is to transform OpenTable from a transactional to an experiential company, and that is where I think Data Science is going to create the most value.
Q - That's great! So what specifically led you to join OpenTable?
A - I have always wanted to work in a space that would marry my data science skills with my passion and domain knowledge in the food and dining space. OpenTable is the world leader in dining reservations and has an extensive and rich data set to work with, so it was an obvious choice.
Q - What are the biggest areas of opportunity/questions you want to tackle?
A - Using data science to help transform OpenTable into a local dining expert who knows me very well, and can help me and others find the best dining experience wherever we travel, is incredibly exciting. This entails a whole slew of tools, from natural language processing and recommendation system engineering to predictions based on internal and external signals, all of which have to work in sync to make that magical experience happen. We also want to use data science to create unprecedented value and tools for the restaurateurs who use our service.
Q - What learnings/skills from your time in academia will be most applicable in your new role?
A - Coming from academia, and especially with a background in math, computation, and a data-intensive field, I feel at home munging through large data sets and have a strong footing in statistical methods and algorithms. Academia has also trained me to pick up a fresh research paper, quickly implement its algorithm, and adapt it to our use cases. Astrophysics is also very visually driven, so I have a knack for visualizing OpenTable data in novel and non-standard ways to extract insight. Another thing that research has taught me is the importance of experimentation. I’d love to play an important role in designing experiments to field test various flavors of our data science solutions.
Q - Makes sense! So what projects are you currently working on, and why/how are they interesting to you?
A - Broadly speaking, I work on extracting insights from reviews and past dining habits of diners using a whole suite of machine learning tools that include Natural Language Processing, Sentiment Analysis, Recommendation Systems, Clustering and classification algorithms, just to name a few.
Q - And how is data science helping? What techniques, models, software etc are you using?
A - I use Python a lot, relying heavily on Pandas, scikit-learn and gensim. For visualizations I use d3.js, Matplotlib, and Bokeh. Recently, I have also been using the Scala-based package Spark to implement machine learning solutions at scale.
Q - What has been the most surprising insight/development you have found?
A - There are many, but nothing we’re ready to share quite yet. Stay tuned.
Q - Will do! Final question on OpenTable … How do you/your group work with the rest of the organization?
A - We sit at the heart of various projects that transcend the boundaries of several teams here. From marketing, to mobile and web in the front end, to architecture, engineering and product, we work in close collaboration with a large number of teams across the organization and around the world.
Thanks for sharing all that detail - very interesting! Good luck with all your endeavors - sounds like a great time to be part of the OpenTable team! Finally, let's talk a bit about the future and share some advice...
Q - What does the future of Data Science look like?
A - It looks very promising, and I feel like we are only just scratching the surface now. With time, this role will take a more defined shape, and will be able to tackle bigger and broader problems. We will have great power in harnessing the information in data to really impact society on various fronts. The internet of things will also bring data science into everyday appliances and how they communicate with each other and the environment. With great power comes great responsibility! I think applying data science in a responsible way will be the key to continued success in this new field.
Q - Any words of wisdom for Data Science students or practitioners starting out?
A - Don’t be afraid to attack a problem from non-standard angles, and always stay on top of new advances in the field. Pick problems that really matter and can have impact. Never stop learning. Share your learnings through open source code and blog posts.
Sudeep - Thank you ever so much for your time! Really enjoyed learning more about your background, your work in academia and your remit and objectives in applying data science at OpenTable. Good luck with all your ongoing projects!
P.S. If you enjoyed this interview and want to learn more about
- what it takes to become a data scientist
- what skills do I need
- what type of work is currently being done in the field