In this month’s post we catch up with Lisa Qian, a Data Scientist at Airbnb, to find out what it’s like to work as a data scientist. Read on to learn about the impact data science has on Airbnb’s success, the programming languages they use on the job, and what researchers need to know in order to succeed in a corporate role.
Q: WHAT ARE THE TOP PROS & CONS OF YOUR JOB?
A: Things happen very quickly and data scientists have a big impact (see answer to next question). At Airbnb, there are so many interesting problems to work on and so much interesting data to play with. The culture of the company also encourages us to work on lots of different things. I have been at Airbnb for less than two years and I have already worked on three completely different product teams. There’s really never a dull moment. This can also be a “con” of the job. Because there are so many interesting things to work on, I often wish that I had more time to go more in depth on a project. I’m often juggling multiple projects at once, and when I’m 90% done with one of them, I’ll just move on to something else. Coming from academia where one spends years and years on one project without leaving a single rock unturned (I did a PhD in physics), this has been a delightful, but sometimes frustrating, cultural transition.
Q: HOW MUCH OF AN IMPACT DO DATA SCIENTISTS HAVE ON AIRBNB’S OVERALL SUCCESS?
A: A ton! As a data scientist, I’m involved in every step of a product’s life cycle. For example, right now I am part of the Search team. I am heavily involved in research and strategizing where I use data to identify areas that we should invest in and come up with concrete product ideas to solve these problems. From there, if the solution is to come up with a data product, I might work with engineers to develop the product. I then design experiments to quantify the effect and impact of the product, and then run and analyze the experiment. Finally, I will take what I learned and provide insights and suggestions for the next product iteration. Every product team at Airbnb has engineers, designers, product managers, and one or more data scientists. You can imagine the impact data scientists have on the company!
Q: WHICH SKILLS OR PROGRAMMING LANGUAGES DO YOU MOST FREQUENTLY USE IN YOUR WORK, AND WHY?
A: At Airbnb, we all use Hive (which is similar to SQL) to query data and build derived tables. I use R to do analysis and build models. I use Hive and R every day of the job. A lot of data scientists use Python instead of R – it’s just a matter of what we were familiar with when we came in. There have also been recent efforts to use Spark to build large-scale machine learning models. I haven’t gotten a chance to try it out yet, but plan on doing so in the near future. It seems very powerful.
Q: WHAT KIND OF PERSON MAKES THE BEST DATA SCIENTIST?
A: Successful data scientists have a strong technical background, but the best data scientists also have great intuition about data. Rather than throwing every feature possible into a black box machine learning model and seeing what comes out, one should first think about whether the data makes sense. Are the features meaningful, and do they reflect what you think they should mean? Given the way your data is distributed, which model should you be using? What does it mean if a value is missing, and what should you do with it? The answers to these questions differ depending on the problem you are solving, the way the data was logged, etc., and the best data scientists look for and adapt to these different scenarios.
The best data scientists are also great at communicating, both to other data scientists and non-technical people. In order to be effective at Airbnb, our analyses have to be both technically rigorous and presented in a clear and actionable way to other members of the company.
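Editor's note: as a rough illustration of the kind of sanity check described in this answer, the sketch below counts how often each field is missing before any modelling begins. It is not taken from the interview or from Airbnb's tooling; the function name `missing_report`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def missing_report(rows, fields):
    """Return, for each field, the fraction of rows where it is missing.

    A value counts as missing if the key is absent, or the value is
    None or an empty string. (This definition is an assumption; what
    "missing" means depends on how the data was logged.)
    """
    missing = Counter()
    for row in rows:
        for field in fields:
            value = row.get(field)
            if value is None or value == "":
                missing[field] += 1
    total = len(rows)
    return {f: (missing[f] / total if total else 0.0) for f in fields}


# Hypothetical records, invented purely for illustration.
rows = [
    {"price": 120, "reviews": 14, "city": "Paris"},
    {"price": None, "reviews": 3, "city": "Paris"},
    {"price": 95, "reviews": None, "city": ""},
    {"price": 80, "city": "Lisbon"},  # "reviews" key absent entirely
]

report = missing_report(rows, ["price", "reviews", "city"])
for field, share in report.items():
    print(f"{field}: {share:.0%} missing")
```

A report like this only surfaces the gaps. Deciding whether a missing value means "zero", "unknown", or "never logged" still takes the judgment the answer describes.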
Q: WHAT ADVICE WOULD YOU OFFER RESEARCHERS PREPARING FOR A POSITION AS A DATA SCIENTIST?
A: Beyond taking programming and statistics courses, I would recommend doing everything possible to get your hands dirty and work with real data. If you don’t have the time to do an internship, sign up to participate in hackathons or offer to help out a local startup by tackling a data problem they have. Courses and books are great for developing fundamental technical skills, but many data science skills can’t be properly developed in a classroom where data sets are well groomed.
This interview was first published on the website Master’s in Data Science; thanks to Josh Thompson for permission to reproduce it here.