Is it possible to reverse-engineer AI algorithms to understand their process? Can we use a model-agnostic method that explains AI without looking into the code at all? The answer to both questions is yes. One such method is SHAP.

Explainable Artificial Intelligence will become one of the key services AI provides. As we know it today, social media forms a representation of society, not a full-blown social model. Our opinions, consumer habits, browsing data, and location history constitute a formidable data source for AI models. The sum of all of the information about our daily activities is challenging to analyze.

We publish reviews everywhere. We write reviews about books, movies, equipment, smartphones, cars, and sports—everything in our daily lives. In this article, we will analyze IMDb reviews of films. IMDb offers datasets of review information for commercial and non-commercial use.

We will thus focus on SHapley Additive exPlanations (SHAP), part of the Microsoft Azure Machine Learning model interpretability solution. We can use either "interpret" or "explain" for explainable AI; both terms mean that we are providing an explanation or an interpretation of a model. We are also reverse-engineering the process of an algorithm.

SHAP was derived from game theory. Lloyd Stowell Shapley gave his name to this game theory model in the 1950s. In game theory, each player decides to contribute to a coalition of players to produce a total value greater than the sum of their individual values. The Shapley value is the marginal contribution of a given player. The goal is to find and explain the marginal contribution of each participant in a coalition of players.
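A tiny worked example may help (the payoffs here are hypothetical, chosen only to illustrate the idea). Suppose players A and B earn $v(\{A\}) = 1$ and $v(\{B\}) = 2$ alone, but $v(\{A, B\}) = 4$ together. Averaging each player's marginal contribution over the two possible orders in which the players can join the coalition gives:

$$\phi_A = \tfrac{1}{2}\bigl(v(\{A\}) - v(\varnothing)\bigr) + \tfrac{1}{2}\bigl(v(\{A, B\}) - v(\{B\})\bigr) = \tfrac{1}{2}(1) + \tfrac{1}{2}(2) = 1.5$$

$$\phi_B = \tfrac{1}{2}\bigl(v(\{B\}) - v(\varnothing)\bigr) + \tfrac{1}{2}\bigl(v(\{A, B\}) - v(\{A\})\bigr) = \tfrac{1}{2}(2) + \tfrac{1}{2}(3) = 2.5$$

The two values sum to $v(\{A, B\}) = 4$: the coalition's total payoff is fully and fairly distributed.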

This article goes through:

  1. the key principles of SHAP
  2. the mathematical representation of SHAP
  3. a snippet in Python to illustrate the method

Key SHAP principles

In this section, we will learn about Shapley values through the principles of symmetry, null players, and additivity.

Symmetry

If all of the players in a game have the same contribution, their contributions will be symmetrical. Suppose that, for a flight, the plane cannot take off without a pilot and a copilot. They both have the same contribution.
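In standard notation (a formal restatement of this principle, not in the original text): if two players i and j add the same value to every coalition they can join, their Shapley values are equal:

$$v(S \cup \{i\}) = v(S \cup \{j\}) \;\text{ for all } S \subseteq N \setminus \{i, j\} \implies \phi_i(v) = \phi_j(v)$$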

Null player

A null player does not affect the outcome of a model. Suppose we now consider a player in a basketball team. During this game, the player was supposed to play offense. However, when playing offense, this player's leading partner is suddenly absent. The player is both surprised and lost. For some reason, this player will contribute nothing to the result of that specific game.
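Formally: a player i who adds nothing to any coalition receives a Shapley value of zero:

$$v(S \cup \{i\}) = v(S) \;\text{ for all } S \subseteq N \setminus \{i\} \implies \phi_i(v) = 0$$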

Additivity

Our basketball player contributed nothing in the previous section's null player game. The team manager then recognizes the talent of another player and adds this player to the team. The new player will add value to the team.
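Formally (the standard additivity axiom, stated here for completeness), additivity concerns combining two games v and w over the same set of players: the Shapley value of the combined game is the sum of the Shapley values of each game:

$$\phi_i(v + w) = \phi_i(v) + \phi_i(w), \quad \text{where } (v + w)(S) = v(S) + w(S)$$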

Mathematical representation

In a coalition game (N, v), we need to find the payoff division. The payoff division is a unique, fair distribution that assigns each player their marginal contribution.

As such, the Shapley value satisfies the null player, additivity, and symmetry properties we described in the previous sections.

We use words as players. In movie reviews, for example, words are features; in game-theory terms, they are also players.

Words are sequence-sensitive, which makes them an interesting way to illustrate Shapley values.

Consider the following three words in a set, a coalition N:

N = {excellent, not, bad}

They seem deceptively easy to understand and interpret in a review. However, if we begin to permute these three words, we will get several sequences depending on the order in which they appear. Each permutation represents a sequence of words that belongs to S, a subset of the three words that belong to N:

S1 = {excellent, not bad}

S2 = {not excellent, bad}

S3 = {not bad, excellent}

S4 = {bad, not excellent}

S5 = {bad, excellent, not}

S6 = {excellent, bad, not}

We can draw several conclusions from these six subsets of N:

  • The number of sequences, or permutations, of all of the elements of a set S is equal to |S|! (the factorial of the number of elements in S).

|S|! means multiplying that whole number down to 1. In this case, the number of elements in S is 3, so the number of permutations is 3! = 3 × 2 × 1 = 6. This explains why there are six sequences of the three words we are analyzing, as the short snippet after this list illustrates.

  • In the first four subsets, the sequence of the words completely changes the meaning of the phrase. This explains why we calculate the number of permutations.
  • In the last two subsets, the meaning of the sequence is confusing, although each could be part of a longer phrase. For example, S5 = {bad, excellent, not} could be part of a longer sequence such as {bad, excellent, not clear if I like this movie or not}.
  • In this example, N contained three words, and we chose a subset S of N containing all three words. We could have selected a subset S of N that contained only two words, such as S = {excellent, bad}.
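The following minimal snippet (not part of the original notebook) enumerates these permutations with Python's standard library:

from itertools import permutations

# The coalition N of three words
N = ["excellent", "not", "bad"]

# All 3! = 3 x 2 x 1 = 6 orderings of the words in N
for i, p in enumerate(permutations(N), start=1):
    print(f"S{i} = {{{', '.join(p)}}}")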

We would now like to know the contribution of a specific word i to the meaning and sentiment (good or bad) of the phrase. This contribution is the Shapley value.

The Shapley value for a player i is expressed as follows:

$$\phi_i(N, v) = \frac{1}{|N|!} \sum_{S \subseteq N \setminus \{i\}} |S|!\,(|N| - |S| - 1)!\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$

Let's translate each part of this mathematical expression into natural language:

$\phi_i$ is pronounced "phi" (the Shapley value) of player i.

$\phi_i(N, v) =$ means that the Shapley value "phi" of i in a coalition N with a value function v "equals" the terms that follow in the equation.

At this point, we know that we are going to find the marginal contribution "phi" of i in a coalition N of value v. For example, we might want to know how much "excellent" contributed to the meaning of a phrase.

$S \subseteq N \setminus \{i\}$ means that S ranges over the subsets of N that do not include i. We want to compare a set with and without i. If N = {excellent, not, bad}, S could be {bad, not} with i = "excellent".

If we compare S with and without "excellent," we will find a different value v, and thus a marginal contribution. The term $\frac{1}{|N|!}\sum_{S \subseteq N \setminus \{i\}}$ divides the sum (represented by the symbol $\sum$) of the contributions of i over all of the subsets S by $|N|!$.

$|S|!\,(|N| - |S| - 1)!$ means that we are calculating a weight parameter to apply to each contribution. This weight multiplies the number of permutations of S, which is $|S|!$, by the potential permutations of the remaining words that were not part of S, which is $(|N| - |S| - 1)!$. For example, if $|S| = 2$ because S contains {bad, not}, then $|N| - |S| = 3 - 2 = 1$. Then we take i out, represented by the $-1$. You may be puzzled because this adds up to 0. However, keep in mind that $0! = 1$, so the weight here is $2! \times 0! = 2$.

We now calculate the value v of a subset S of N together with i and compare it to the value of the same subset without i:

$$v(S \cup \{i\}) - v(S)$$

We now know the value of $v(S \cup \{i\}) - v(S)$ for this permutation. For example, you can see the value of the word "excellent" by comparing a sentence that contains it with one that doesn't:

v("An excellent job") – v("a job") means that "excellent" has a high marginal value in this case.

Once every possible subset S of N has been evaluated, we divide the weighted sum evenly and fairly by $|N|!$, which represents the number of sequences, or permutations, we calculated.

We can now translate the Shapley value equation back into a single mathematical expression:

$$\phi_i(N, v) = \frac{1}{|N|!} \sum_{S \subseteq N \setminus \{i\}} |S|!\,(|N| - |S| - 1)!\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$
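As a sanity check, here is a minimal, self-contained sketch that computes exact Shapley values for the three-word coalition by brute force. The characteristic function v below is entirely hypothetical, invented only to illustrate the equation; SHAP itself estimates these values far more efficiently:

from itertools import combinations
from math import factorial

# Hypothetical sentiment scores v(S) for each coalition of words (illustration only)
N = ("excellent", "not", "bad")
v = {
    (): 0.0,
    ("excellent",): 0.8,
    ("not",): -0.1,
    ("bad",): -0.7,
    ("excellent", "not"): 0.5,
    ("excellent", "bad"): 0.1,
    ("not", "bad"): 0.4,  # "not bad" reads as mildly positive
    ("excellent", "not", "bad"): 0.6,
}

def key(s):
    # Normalize a coalition to the word order used in the v table
    return tuple(w for w in N if w in s)

def shapley(i):
    # Exact Shapley value of word i: enumerate every subset S of N without i,
    # weight each marginal contribution by |S|! * (|N| - |S| - 1)!, then divide by |N|!
    others = [w for w in N if w != i]
    total = 0.0
    for size in range(len(others) + 1):
        for S in combinations(others, size):
            weight = factorial(len(S)) * factorial(len(N) - len(S) - 1)
            total += weight * (v[key(S + (i,))] - v[key(S)])
    return total / factorial(len(N))

for word in N:
    print(f"phi({word}) = {shapley(word):+.3f}")

# Efficiency property: the Shapley values sum to v(N) - v(empty set)
print(f"sum = {sum(shapley(w) for w in N):.3f}, v(N) = {v[N]:.3f}")

The loop mirrors the equation term by term, which makes it a useful reference implementation even though it scales exponentially with the number of players.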

Mathematical Conclusions

A key property of the Shapley value approach is that it is model-agnostic. We do not need to know how a model produced the value v. We only observe the model's inputs and outputs to explain the marginal contribution of a feature.

Reverse engineering information outputs in the media is a fundamental tool in our information era. Model-agnostic Explainable AI (XAI) is a productive way to understand how information is represented.

A Code Snippet

Let's focus on the output of a SHAP program in Python.

My full open source code is on GitHub:

Full SHAP Open Source Example in Python: https://github.com/PacktPublishing/Hands-On-Explainable-AI-XAI-with-Python/blob/master/Chapter04/SHAP_IMDB.ipynb

Installing SHAP in Python is done in one line:

!pip install shap

The code imports and processes the dataset, which contains IMDb movie reviews.
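The notebook's exact preprocessing may differ, but a minimal sketch of this stage, assuming SHAP's bundled IMDb helper (shap.datasets.imdb) and a scikit-learn TF-IDF vectorizer, looks like this:

import shap
import sklearn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Load the IMDb review corpus bundled with SHAP: raw review texts and binary sentiment labels
corpus, y = shap.datasets.imdb()
corpus_train, corpus_test, y_train, y_test = train_test_split(
    corpus, y, test_size=0.2, random_state=7
)

# Turn each review into a sparse TF-IDF vector; every word becomes a feature (a "player")
vectorizer = TfidfVectorizer(min_df=10)
X_train = vectorizer.fit_transform(corpus_train)
X_test = vectorizer.transform(corpus_test)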

Further in the code, a logistic regression model is fitted to the dataset to determine whether a review was positive or negative (sentiment analysis):

model = sklearn.linear_model.LogisticRegression(penalty="l1", C=0.1, solver="liblinear")  # liblinear supports the l1 penalty
model.fit(X_train, y_train)

However, we are not interested in going into this model. In the Python code, it could have been any machine learning or deep learning model. SHAP is model-agnostic: it does not look into the algorithm but analyzes the inputs and outputs.

Let's go directly to the explanation part of the program to create an explainer and plot a result.

Creating the Linear Model Explainer

Let's create an explainer:

# @title Explain Linear Model
explainer = shap.LinearExplainer(model, X_train, feature_perturbation="interventional")

Now we retrieve the Shapley values of the testing dataset:

shap_values = explainer.shap_values(X_test)

We must now convert the test dataset into a dense array for the plot function:

X_test_array = X_test.toarray() # we need to pass a dense version for the plotting functions

The linear explainer is ready. We can now create the plot function.

Creating the Plot Function

We will now add a plot function to the program and explain a review to understand the process.

The plot function begins with a form to choose a review number from the dataset.

Note: The program is a prototype. Make sure you enter small integers in the form since the test dataset size varies each time the program is run.

Now the program displays a SHAP plot:

# @title Explaining Reviews {display-mode: "form"}
review = 2 #@param {type: "number"}
shap.initjs()  # load the JavaScript library that renders the force plot
ind = int(review)
shap.force_plot(
    explainer.expected_value, shap_values[ind, :], X_test_array[ind, :],
    feature_names=vectorizer.get_feature_names()
)

In this post, the prediction is positive. The sample review is:

"I recommend that everybody go see this movie!"

Once a review number is chosen, the model's prediction for that review will be displayed with the Shapley values that drove the prediction:

[Figure: SHAP force plot of the model's prediction for the selected review]

The Shapley values on the left (red in the color images) push the prediction to the right, toward potentially positive results. In this example, the features (words) "recommend" and "see" contribute to a positive prediction.


The Shapley values on the right (blue in the color images) push the prediction to the left, toward potentially negative results.

The plot also shows a value named "base value," as you can see in Figure 4.4.

The base value is the prediction the model would make, based on the whole dataset, if it did not take the features of the current sample into account.
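You can read this base value directly from the explainer created earlier; it is the same quantity passed to the force plot above:

print(explainer.expected_value)  # average model output used as the plot's base value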

The SHAP plot we implemented explains and interprets the results with Shapley values.

Summary

Explainable AI is a critical field in our information era.

Model-agnostic explainable methods such as SHAP can reverse-engineer any artificial intelligence model. It is important to explain how an AI algorithm reached a conclusion.

The first step to understanding SHAP is to explore the mathematical background of the method.

Finally, running a Python program and displaying a SHAP plot provides a tool to begin explaining the outputs of AI algorithms.

