Exploring explainability

Global, local, and cohort explanations

So far in the tutorial, we’ve been conducting error analysis mainly by identifying slices of the data on which our model does not perform as well as we’d like. However, we haven’t yet asked a fundamental question: why does our model behave the way it does?

To understand some of the driving forces behind our model’s predictions, we will make use of explainability techniques.

In broad strokes, explainability techniques give us justifications for our model’s predictions. These explanations can be local, cohort, or global, and each one provides a distinct perspective to practitioners and businesses. Let’s explore these three layers of explainability for our banking chatbot model, following a bottom-up approach.

Local explanations

Local explanations provide insights into individual model predictions.

For our chatbot, local explanations help us answer the question: why did our model classify a specific sentence with a particular label?

To have a look at local explanations, click on any row of the data shown below the Error analysis panel. With Unbox, you have access to local explanations for all of the model’s predictions, powered by LIME, one of the most popular model-agnostic explainability techniques.

Let’s now understand what we see.

Each token receives a score. Tokens shown in shades of green contributed to pushing the model’s prediction in the correct direction, while tokens shown in shades of red contributed to pushing it in the wrong direction. It is important to remember that these scores are always relative to the true label.

In this specific example, the user simply wanted to get a spare card (the true label is getting_spare_card), but the model thought it belonged to the class card_linking. Notice that the words “card” and “linked” really pushed our model’s prediction in the wrong direction, which is why our model got it wrong.

At the end of the day, the model’s prediction is a balance between features that push it in the right direction and features that nudge it in the wrong direction.
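To make the mechanics behind these scores concrete, here is a minimal, self-contained sketch of how LIME produces signed token scores for a text classifier. The tiny training set, the intent names, and the example sentence are all made up for illustration; in practice, Unbox computes these explanations for your own model.

```python
# A minimal sketch of producing LIME token scores for a text classifier.
# The toy training data, intents, and sentence below are hypothetical.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

class_names = ["card_linking", "getting_spare_card"]
train_texts = [
    "How do I link my card to the app?",
    "Please link my new card to this account",
    "I would like an extra card for my partner",
    "Can I order a spare card?",
]
train_labels = [0, 0, 1, 1]

# Stand-in for the chatbot's intent classifier.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=class_names)
sentence = "I want to get a spare card linked to my account"
true_label = class_names.index("getting_spare_card")

# LIME perturbs the sentence, queries the model on the perturbations, and fits
# a local surrogate model whose weights are the per-token scores.
explanation = explainer.explain_instance(
    sentence,
    classifier_fn=pipeline.predict_proba,
    num_features=6,
    labels=[true_label],
)

# Positive weights push the prediction toward the true label (green in the UI);
# negative weights push it away (red in the UI).
for token, weight in explanation.as_list(label=true_label):
    print(f"{token}: {weight:+.3f}")
```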

Error analysis needs to be a scientific process, with hypothesizing and experimenting at its core. That’s one of the roles of the what-if analysis.

To conduct a what-if analysis with local explanations, we can simply click on Edit, modify the sentence, and click on What-if at the bottom of the page. For example, if we rephrase the original sentence to “How can I create another card for this account?”, what would our model do?

Now we can directly compare the two explanations. Notice that by simply rephrasing the problematic sentence, our model was able to classify it correctly. This might be an indication that our model could benefit from augmenting our training set with synonymous expressions.
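The same what-if check can be sketched outside the platform by re-running the classifier and the explainer on the rephrased sentence and comparing the two explanations side by side. Continuing the hypothetical setup from the previous sketch:

```python
# Compare the model's behavior on the original sentence and a rephrased version
# (a simple what-if check), reusing the toy pipeline and explainer from above.
rephrased = "How can I create another card for this account?"

for text in (sentence, rephrased):
    predicted = class_names[pipeline.predict([text])[0]]
    explanation = explainer.explain_instance(
        text,
        classifier_fn=pipeline.predict_proba,
        num_features=6,
        labels=[true_label],
    )
    print(f"\n{text!r} -> predicted intent: {predicted}")
    for token, weight in explanation.as_list(label=true_label):
        print(f"  {token}: {weight:+.3f}")
```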

👍

Comparing explanations

Feel free to explore some local explanations. Can you use the what-if analysis and play with the feature values to flip our model’s prediction on other samples?

📘

Actionable insights:

  • Help practitioners get to the root cause of the problematic predictions their models are making;
  • Build confidence that the model is taking reasonable signals into consideration to make its predictions, rather than simply over-indexing on certain tokens.

Cohort explanations

Now we move one layer up, to cohort explanations.

Cohort explanations are built by aggregating local explanations and help us understand which tokens or stopwords contributed the most to the (mis)predictions made by the model over a data cohort.
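Conceptually, you can think of a cohort explanation as the sum of the per-sample LIME token weights over every row in the cohort, keeping the positive (predictive) and negative (mispredictive) contributions apart. Here is a rough, hypothetical sketch of that aggregation, reusing the toy pipeline and explainer from the earlier sketches; the helper name and the example cohort are made up for illustration.

```python
from collections import defaultdict

def cohort_feature_importance(texts, labels, explainer, pipeline, class_names):
    """Aggregate LIME token weights over a cohort of (text, true label) pairs,
    keeping predictive (positive) and mispredictive (negative) contributions apart."""
    predictive, mispredictive = defaultdict(float), defaultdict(float)
    for text, label in zip(texts, labels):
        label_idx = class_names.index(label)
        explanation = explainer.explain_instance(
            text,
            classifier_fn=pipeline.predict_proba,
            num_features=6,
            labels=[label_idx],
        )
        for token, weight in explanation.as_list(label=label_idx):
            if weight >= 0:
                predictive[token] += weight
            else:
                mispredictive[token] += -weight
    rank = lambda scores: sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return rank(predictive), rank(mispredictive)

# Hypothetical cohort: messages about cards.
cohort_texts = ["Can I order a spare card?", "Please link my new card to this account"]
cohort_labels = ["getting_spare_card", "card_linking"]
top_predictive, top_mispredictive = cohort_feature_importance(
    cohort_texts, cohort_labels, explainer, pipeline, class_names
)
print("Most predictive tokens:", top_predictive[:3])
print("Most mispredictive tokens:", top_mispredictive[:3])
```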

For example, for the messages about cards, what were the tokens and stopwords that contributed the most to our model’s mispredictions? What about our model’s correct predictions?

These kinds of questions can be easily answered with Unbox. The answers are shown in the Feature importance tab of the Error analysis panel, but first we need to filter the data cohort that we are interested in explaining.

👍

Identifying the most mispredictive tokens

Filter the data to show only the rows for messages about cards. Then, head to the Feature importance tab on the Error analysis panel to look at the most predictive and mispredictive features. Hint: remember we created a tag for this exact query? Can you filter the data using a tag?

When you filter a data cohort, what you see in the Feature importance tab are the most predictive and most mispredictive tokens and stopwords for that specific cohort.

👍

Using the mispredictive tokens

Click on one of the blocks shown to see what happens. Did you notice what happened to the data shown below the Error analysis panel?

When we click on one of the blocks, the data shown below the Error analysis panel is filtered to display only the rows that fall within that category.

If you want to, you can tag these rows to document error patterns and later create tests or generate synthetic data, download them, among other possibilities.

The same analysis can be done with the most predictive features, which are the features that contributed the most in the correct direction to the model’s predictions.

👍

Identifying most mispredictive tokens for an error class

First, clear all the filters in the filter bar. Now, can you have a look at the most mispredictive tokens and stopwords for the samples our model predicted as refund_not_showing_up but whose true label was request_refund? Hint: you can filter the different error classes in the Data distribution tab of the Error analysis panel.

📘

Actionable insights:

  • Practitioners can identify multiple ways to improve their model’s performance. For example, they can identify underrepresented expressions in the dataset, which might be leading to model mistakes, or spot the model over-indexing on certain tokens.

Global explanations

Global explanations help reveal which tokens and stopwords contributed the most to the (mis)predictions made by the model over a dataset.

To look at the global explanations, you need to clear all the filters from the filter bar and go to the Feature importance tab of the Error analysis panel. Since no data cohort is selected, what is shown there are the most predictive and mispredictive tokens for our model across the whole dataset.

For example, let’s have a look at the most predictive features for our banking chatbot across the whole dataset.

We notice that “card” and “money” appear as the most predictive tokens. Curiously, they are also pointed out as the most mispredictive tokens. This overlap between predictive and mispredictive tokens is a sign that the model might be over-indexing on them instead of learning the nuances that differentiate messages from the various classes.
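This kind of check can also be sketched with the hypothetical aggregation helper from the cohort section: running it over the whole dataset (no cohort filter) approximates a global explanation, and tokens that rank highly in both lists are candidates for over-indexing. A rough sketch, reusing the toy data from the earlier sketches:

```python
# Approximate a global explanation by aggregating over the whole (toy) dataset,
# then flag tokens that rank highly as both predictive and mispredictive.
all_texts = train_texts
all_labels = [class_names[i] for i in train_labels]

top_predictive, top_mispredictive = cohort_feature_importance(
    all_texts, all_labels, explainer, pipeline, class_names
)
predictive_tokens = {token for token, _ in top_predictive[:5]}
mispredictive_tokens = {token for token, _ in top_mispredictive[:5]}
print("Possibly over-indexed tokens:", predictive_tokens & mispredictive_tokens)
```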

As usual, clicking on the blocks in the Error analysis panel filters the data slice shown at the bottom. Thus, you can easily tag the displayed rows to document patterns and ensure reproducibility.

