Sprint 4 — Product Integration
This is a continuation of a series that started here. The article is based on my own experience as an ML PM; yours may differ.
The last article was all about the documentation required to assist in productizing your work. What does it mean to productize something? And what about ML in a product?
Products are a special consumer of ML. When I start working with a data scientist who is not familiar with product, I try to teach them the most important concepts for integrating ML:
- Automated — The models have to run on their own.
- Monitored — We need to know when their performance degrades.
- Repeatable — The inference needs to be consistent.
- Scalable — The solution needs to go from 1 user to N users.
- Reliable — If the model can’t generate an output for a specific input, there must be a plan B.
The users of a product expect reliable, as-expected behavior. Every time. This may seem like an obvious concept but it is not. So let’s dissect the bulleted items.
Automated means that you will not be launching or manually connecting any of the SQL, Python, R, or other scripts you used to create your output. If you have a data cleaning step that relies on a person pulling data from the correct table and tweaking code because the dates aren’t as expected — well, your solution isn’t automated, is it? If you work for a process-driven product company like I do, a Dev Team will take your 20 different scripts and put them into a software pipeline. That’s great… but it doesn’t let you off the hook. It will be easier for you and everyone else if you think about the implications of removing the human from the process of running your solution. Be aware of how you were interacting with the process to get the desired output.
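As a concrete illustration of that date-tweaking example, here is a minimal sketch (the function name and row layout are my own hypotheticals, not from any particular pipeline) of a cleaning step that handles unexpected dates itself instead of waiting for a human to edit the code:

```python
from datetime import datetime

def clean_batch(rows, expected_format="%Y-%m-%d"):
    """Parse dates defensively so the step can run unattended.

    Rows with missing or malformed dates are skipped (in a real pipeline
    you would also log them), rather than crashing and waiting for a
    person to tweak the script.
    """
    cleaned = []
    for row in rows:
        try:
            row["date"] = datetime.strptime(row["date"], expected_format).date()
            cleaned.append(row)
        except (KeyError, ValueError):
            # Malformed or absent date: skip instead of halting the pipeline.
            continue
    return cleaned
```

The design point is that every decision a human used to make at run time (which format, what to do with bad rows) is now an explicit parameter or rule in the code.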
Monitoring is a requirement. Don’t let anyone tell you any differently. Monitoring is how you will continue your interaction with the model. ML models degrade as soon as they are put into production. Why? Because you will be exposing them to data that doesn’t match your training and validation data, either at the population level (feature drift) or at the individual level (outliers). And we haven’t even touched on the problem of bias. While you are developing the model in your safe, controlled IDE, ask yourself, “How would I know if it has failed?” Then determine what you need to measure, as if those measurements were the only clues you’ll have about its performance. Because once it ships, they will be.
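One common measurement for the population-level drift described above is the Population Stability Index (PSI), which compares the distribution of a feature in production against the training data. This is a minimal stdlib-only sketch, not a production monitoring system:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between training ('expected') and
    live ('actual') values of one numeric feature. 0 means identical
    distributions; a common rule of thumb flags values above ~0.2."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) or 1.0  # avoid division by zero for constant features

    def fractions(values):
        counts = [0] * bins
        for v in values:
            i = int((v - lo) / width * bins)
            counts[max(0, min(i, bins - 1))] += 1  # clamp out-of-range values
        n = len(values)
        # Small smoothing term keeps the log well-defined for empty bins.
        return [(c + 1e-6) / (n + bins * 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Run per feature on a schedule, this gives you exactly the kind of clue the paragraph above asks for: a number that moves when the live population stops looking like the training population.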
Results that are not repeatable are not useful. Your definition of repeatable may not be the same as the user’s. You may know that 2, 2.0001, and 1.9999999999999 are different, but to the user they are the same. Even if users know the inputs differ, they will expect the same results (or at least VERY close). Consider what will cause your solution to behave in a way that is, or appears to be, non-repeatable. Then address it.
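One simple way to make near-identical inputs behave identically is to canonicalize them before inference. This is a sketch with a hypothetical stand-in model (the names `canonicalize` and `predict_stable` are mine), showing the author’s 2 / 2.0001 / 1.9999999999999 example collapsing to one answer:

```python
from functools import lru_cache

def canonicalize(x, decimals=3):
    # Collapse float noise: 2, 2.0001, and 1.9999999999999 all become 2.0.
    return round(float(x), decimals)

@lru_cache(maxsize=4096)
def _predict(x_canonical):
    # Hypothetical stand-in for a real model; deterministic by construction.
    return 3.0 * x_canonical + 1.0

def predict_stable(x):
    # Canonicalize first, so equal-to-the-user inputs hit the same cache key.
    return _predict(canonicalize(x))
```

The caching is optional, but it makes the guarantee explicit: two inputs that canonicalize the same literally cannot produce different outputs.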
Scalable will mean different things depending on the type of product you are working on. It could mean that inferences will be required at a much higher throughput because there are many users, larger data sets, or both. As part of productization, your solution will need to be scaled to match real-world demands, and you are key to supporting these efforts. Frequently, I ask the DS teams I work with the following questions:
- Are any of the features you calculate unnecessarily complex?
- Do you have too many features?
- Can any of the pipeline — from data to inference — be made parallel?
- Can any of the data be processed in advance? Can just the delta, in your data, be processed?
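The last question — processing just the delta — can be as simple as keeping a feature cache keyed by record id, so each arriving batch only computes features for rows it hasn’t seen. A minimal sketch (the function and field names are hypothetical):

```python
def update_features(feature_cache, new_rows, featurize):
    """Compute features only for rows not already in the cache (the delta).

    feature_cache: dict mapping row id -> precomputed features
    new_rows:      iterable of dicts, each with an "id" key
    featurize:     the (possibly expensive) feature function
    """
    for row in new_rows:
        if row["id"] not in feature_cache:
            feature_cache[row["id"]] = featurize(row)
    return feature_cache
```

In a real system the cache would live in a store like Redis or a feature store rather than a dict, but the design question for the data scientist is the same: which parts of your computation are incremental, and which genuinely need the full data set each run?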
While the Dev Team may be amazing at optimizing code, your insight into the design can make the difference between a scalable solution and one that isn’t.
Reliable code doesn’t crash when the unexpected happens. If the model fails because data is missing, not all features can be calculated, there isn’t enough historical information, or for a myriad of other reasons… then what happens? The method that is supposed to consume the output of the model is still expecting an input. So figure out how to impute the model’s output, or its input. Have a plan B.
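The plan B can be made explicit as a thin wrapper around inference. This sketch (names are my own, and the fallback value would be something agreed with the product team, such as a historical average) guarantees the downstream consumer always receives a value:

```python
import logging

def predict_with_fallback(model_predict, features, fallback_value):
    """Return the model's output, or a pre-agreed fallback if it fails.

    The consumer downstream always gets a value, even when features are
    missing or the model itself raises.
    """
    try:
        if features is None or any(v is None for v in features):
            raise ValueError("missing features")
        return model_predict(features)
    except Exception:
        # Plan B: log for the monitoring story above, return the fallback.
        logging.warning("model failed; returning fallback value")
        return fallback_value
```

Note that the fallback path is itself something to monitor: how often plan B fires is one of the clearest signals that the model is degrading in production.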
That’s it for this sprint. Future sprints will include my thoughts on DS requirements, creating a good enough solution, and product iteration. Please LMK what topics are of interest to you.