We want models that perform well not only on whole datasets but also on every potential edge case they might encounter out in the wild. The problem is that, when we strive for that goal, it is easy to be overwhelmed by the number of possibilities and suffer from analysis paralysis.
A better and more realistic approach is to increase the model's performance one slice of data at a time.
The Data distribution tab on the Error analysis panel helps us identify the most common mistakes our model is making. A good strategy, then, is to focus on improving model performance on these error classes iteratively over the next rounds of ML development.
When you click on the Data distribution tab, the Error analysis panel is divided into two parts.
On the left-hand side, you see the labels for your task. In our churn binary classifier, we see the two classes there: "Retained" and "Exited". Furthermore, right beneath each label, we see its performance, measured by aggregate metrics per class. Per-class aggregate metrics are particularly important when working with imbalanced datasets, where the model's performance on the majority class can distort the overall metrics.
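To make the point about per-class metrics concrete, here is a minimal sketch of how precision and recall can be computed separately for each class. The toy predictions below are made up for illustration; only the label names ("Retained" and "Exited") come from our churn example.

```python
def per_class_metrics(y_true, y_pred):
    """Return {label: {"precision": ..., "recall": ...}}, computed per class."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        predicted = sum(1 for p in y_pred if p == label)  # predicted as this class
        actual = sum(1 for t in y_true if t == label)     # truly this class
        metrics[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return metrics

# Imbalanced toy data: mostly "Retained", as in a typical churn dataset.
y_true = ["Retained", "Retained", "Retained", "Retained", "Exited", "Exited"]
y_pred = ["Retained", "Retained", "Retained", "Exited", "Retained", "Exited"]

metrics = per_class_metrics(y_true, y_pred)
```

Here a single aggregate accuracy (4/6) would hide that recall on the minority "Exited" class is only 0.5, which is exactly what per-class metrics surface.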
On the right-hand side, we see the different error classes. This is a flattened confusion matrix: each cell of the matrix where the prediction disagrees with the label becomes its own error class.
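A flattened confusion matrix can be sketched as a count of (actual, predicted) pairs where the model was wrong. The toy data below is invented for illustration and mirrors the imbalance discussed next:

```python
from collections import Counter

def error_classes(y_true, y_pred):
    """Flatten the confusion matrix into counts of (actual, predicted) errors."""
    return Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)

y_true = ["Retained", "Retained", "Retained", "Exited", "Exited", "Exited"]
y_pred = ["Retained", "Retained", "Exited", "Retained", "Retained", "Exited"]

errors = error_classes(y_true, y_pred)
# Sorting by count surfaces the most common mistake first, just like the panel.
top_error = errors.most_common(1)[0][0]
```

With this data, the top error class is ("Exited", "Retained"): users who actually churned but were predicted as retained.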
As a side note, notice that we have an imbalanced dataset and most of the data is from the "Retained" class. Churn problems are indeed often imbalanced: at any given time, there are (hopefully) many more users who will continue using our platform than users churning.
Looking at the Error analysis panel, can you spot the most common mistake our model makes? Can you filter the data to have a closer look?
The most common error class is predicting that users will be retained when in fact they churn. For the next quarter, we might want to focus on improving the model's performance on this error class.
Documenting error classes
Can you filter the dataset to show only the samples our model predicted as "Retained" but whose label was "Exited", and tag them with the name "Q4" (so that the whole team knows that this is the priority for the next quarter)?
- Focus on one digestible chunk of the data at a time, and systematically improve the model's performance, slice by slice.