Introduction

The case studies in this section of the course focus on how data scientists are helping shape the development of AI products. These are more advanced case studies and assume prior knowledge about general product data science topics (how to design an A/B test, develop metrics, drive product development through insights, etc.). Most of the focus is on what's unique about Gen AI data science vs standard product data science. That is, classical product case studies are not included here unless Gen AI leads to a different approach or a new twist.

Some recurring topics to keep in mind:

There are two kinds of new AI products: those that create a brand new user experience (e.g. Meta AI Studio, NotebookLM, etc.) and those that replace a current user experience (e.g. Google AI Overview replacing search, AI customer service replacing human customer service, etc.). In a job interview, the first step is to categorize the type of AI product the interviewer is asking about, because the approach is completely different for each.

For brand new experiences, the initial goal is getting them into the hands of tech-savvy/power users. A big part of Gen AI DS work is figuring out how to do that. For products that replace a current experience, the goal is identifying the lowest-risk segments and starting from there. Either way, the early data will be biased, and a big part of the job is inferring the effect on the whole population from that biased data.
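One common way to go from a biased early sample to a population-level estimate is post-stratification: compute the metric per segment, then reweight by each segment's share of the full user base. The sketch below is only an illustration. The segment names, population shares, and retention values are made up, and it assumes that early adopters within a segment behave like the rest of that segment, which is a strong assumption that needs to be checked in practice.

```python
from collections import defaultdict


def post_stratified_mean(observations, population_shares):
    """Reweight per-segment means by each segment's population share.

    observations: list of (segment, value) pairs from the biased early sample.
    population_shares: dict mapping segment -> share of the full user base.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for segment, value in observations:
        totals[segment] += value
        counts[segment] += 1

    # Every population segment must be observed at least once,
    # otherwise its mean cannot be estimated from this sample.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no observations for segments: {sorted(missing)}")

    total_share = sum(population_shares.values())
    return sum(
        (share / total_share) * (totals[segment] / counts[segment])
        for segment, share in population_shares.items()
    )


# Hypothetical early data: 1 = retained, 0 = not retained.
# Power users are heavily over-represented among early adopters.
early_sample = (
    [("power", 1)] * 6 + [("power", 0)] * 2
    + [("casual", 1)] * 1 + [("casual", 0)] * 1
)
# Hypothetical shares of each segment in the full user base.
shares = {"power": 0.1, "casual": 0.9}

naive = sum(value for _, value in early_sample) / len(early_sample)
adjusted = post_stratified_mean(early_sample, shares)
print(f"naive retention: {naive:.3f}, post-stratified: {adjusted:.3f}")
```

In this made-up example the naive estimate is inflated because power users dominate the early sample. The reweighted number is closer to what the whole population would show, provided the within-segment assumption holds.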

Unlike in the past, new AI products have virtually infinite possible outputs. This adds new layers of complexity to the DS work. Firstly, this affects the launch strategy (we want to target users more likely to ask specific types of questions). Secondly, post-launch data analysis is not only about analyzing the common product metrics, but also about checking how the AI specifically did on corner cases/outliers. Identifying and analyzing these is a very relevant part of the job. In many cases, DS work follows these two tracks: AI model analysis and product metrics. Many interview questions require you to touch on both.

In standard product DS, the most important thing was typically identifying early actions that led to future retention. In Gen AI DS, that often translates into identifying which kinds of questions lead to retention. In most cases, the question characteristics (topics, complexity, asking follow-ups, etc.) can be used to predict retention. Often, it's useful to separate users acting out of curiosity from those getting actual value. If you can do this, you are effectively identifying retained vs non-retained users. So, prepare in advance by thinking about user experience variables that can be a proxy for curiosity vs getting value for a given product. This is also a good strategy for identifying users affected by the novelty effect.

For a Gen AI DS job, you need to be familiar with how to build an AI product. Not that you will actually build it (unless it's a small company), but you should still know how to do it. This is similar to how a Fraud DS was unlikely to put ML models into production, but needed to have knowledge of ML. Building an AI product means building a wrapper on top of a major LLM; it obviously doesn't mean having to build an LLM.

Most important of all, it is true that many things have changed (e.g. a biased set of users is trying the new product first, average-based metrics might be better than percentile ones, AI products have virtually infinite outcomes, the cost of bad outcomes in a model can be much higher, non-inferiority tests are more popular, etc.). However, what has not changed is the end goal of the job: using data to move key product metrics by improving the user experience. So, while you need to be very clear about what's unique about Gen AI DS, you also don't want to lose track of the #1 reason they are hiring you: you should always be able to trace your answer back to how it will benefit the company's main metrics and what data you would use to justify your product conclusions. Also, the general product DS concepts are as important as ever, for instance removing friction to incentivize positive actions (there is currently a lot of friction in AI products and DSs need to figure out how to smooth it out).
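Since non-inferiority tests come up above, here is a minimal sketch of one for a binary metric (e.g. whether a customer service issue was resolved). The goal is not to show that the AI version is better, but that it is not worse than the current experience by more than a pre-agreed margin. The function name, the counts, and the margin are all hypothetical, and the test uses a simple one-sided z-test with an unpooled standard error, which is only one of several reasonable choices.

```python
import math
from statistics import NormalDist


def non_inferiority_test(success_new, n_new, success_old, n_old, margin, alpha=0.05):
    """One-sided z-test for non-inferiority of a proportion.

    H0: p_new - p_old <= -margin  (new experience is worse by at least the margin)
    H1: p_new - p_old >  -margin  (new experience is non-inferior)
    Returns (z, p_value, is_non_inferior).
    """
    p_new = success_new / n_new
    p_old = success_old / n_old
    # Unpooled (Wald) standard error of the difference in proportions.
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old)
    if se == 0:
        raise ValueError("standard error is zero; cannot compute the test")
    z = (p_new - p_old + margin) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical example: AI support resolves 880 of 1000 tickets,
# human support resolves 890 of 1000, and a 5-point drop is tolerated.
z, p_value, ok = non_inferiority_test(880, 1000, 890, 1000, margin=0.05)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, non-inferior: {ok}")

# With a much stricter 1-point margin, the same data is not enough.
z_strict, p_strict, ok_strict = non_inferiority_test(880, 1000, 890, 1000, margin=0.01)
print(f"z = {z_strict:.2f}, p-value = {p_strict:.4f}, non-inferior: {ok_strict}")
```

The margin is a product decision, not a statistical one: it encodes how much degradation in the metric the company is willing to accept in exchange for the benefits of the AI version (e.g. lower cost), so it should be agreed on before looking at the data.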

P.S. There are many case studies in this section of the course, and ideally they should cover every likely interview question topic. However, if there are additional Gen AI case studies that you would like to see added, email me at [email protected] with the subject line "New Gen AI case study request", using the same email as your account. I will then write the answer and add it to the course.
