explain (verb)
- To make known
- To make plain or understandable
- To give the reason for or cause of
- To show the logical development or relationships of
Source: Merriam-Webster
Understanding is an unending activity by which, in constant change and variation, we come to terms with and reconcile ourselves to reality, that is, try to be at home in the world.
Source: Hannah Arendt, Understanding and Politics (1954)
- A scientific theory is like a computer program that predicts our observations, the experimental data.
- A useful theory is a compression of the data; comprehension is compression.
- You compress things into computer programs, into concise algorithmic descriptions.
- The simpler the theory, the better you understand something.
Source: Gregory Chaitin, The Limits of Reason (2006)
Occam’s Razor
Among competing explanations that fit the evidence equally well, the simpler one is preferred.
No one really understands quantum mechanics.
Source: Richard Feynman
Experimental science describes the world using guesses, known as hypotheses.
For a hypothesis to be scientific, it must be testable / falsifiable.
If we can design an experiment that could show the hypothesis to be false, it is scientific.
If we run a repeatable experiment and the hypothesis is not falsified, we say it is verified.
Widely accepted and verified hypotheses are often called scientific theories or scientific laws.
Science does not claim these laws are true, only that they have not yet been falsified.
We never are definitely right, we can only be sure we are wrong.
Source: Richard Feynman in The Feynman Lectures on Physics
An example of mathematical axioms is the set of Euclid’s Axioms / Postulates in Euclidean geometry. These describe how points, lines, and shapes behave.
Models can be loosely categorized as:
Intrinsically interpretable models
“Black-box” models with post-hoc explainability, aka explainable models
One may want to explain:
The explanation may focus on:
The techniques used may be:
The field of explaining complex ML models is often called Explainable AI (XAI).
When explanation techniques work (they don’t always!), they can provide several benefits:
Warning
Explanation does not imply causation!
“Everyone who is serious in the field knows that most of today’s explainable A.I. is nonsense.”
Source: Zachary Lipton in “What’s Wrong with Explainable A.I.?” (2022)
Value/reward function \(v(S)\):
| Combination (\(S\)) | Value \(v(S)\) |
|---|---|
| \(\emptyset\) | 0 |
| \(\{A\}\) | 10 |
| \(\{B\}\) | 0 |
| \(\{C\}\) | 0 |
| \(\{A,B\}\) | 10 |
| \(\{A,C\}\) | 10 |
| \(\{B,C\}\) | 30 |
| \(\{A,B,C\}\) | 60 |
Total value to distribute:
\[ v(\{A,B,C\}) = 60 \]
There are \(3! = 6\) permutations, each representing a different order in which players join the team:
We compute each player’s marginal contribution along each ordering:
- \(A \to B \to C\): A adds 10, B adds 0, C adds 50; sum \(10 + 0 + 50 = 60\)
- \(A \to C \to B\): A adds 10, C adds 0, B adds 50; sum \(60\)
- \(B \to A \to C\): B adds 0, A adds 10, C adds 50; sum \(60\)
- \(B \to C \to A\): B adds 0, C adds 30, A adds 30; sum \(60\)
- \(C \to A \to B\): C adds 0, A adds 10, B adds 50; sum \(60\)
- \(C \to B \to A\): C adds 0, B adds 30, A adds 30; sum \(60\)
Average each player’s contributions across permutations:
Player A \[ \frac{10 + 10 + 10 + 30 + 10 + 30}{6} = \frac{100}{6} = 16\tfrac{2}{3} \]
Player B \[ \frac{0 + 50 + 0 + 0 + 50 + 30}{6} = \frac{130}{6} = 21\tfrac{2}{3} \]
Player C \[ \frac{50 + 0 + 50 + 30 + 0 + 0}{6} = \frac{130}{6} = 21\tfrac{2}{3} \]
Sum of contributions: \(16\tfrac{2}{3} + 21\tfrac{2}{3} + 21\tfrac{2}{3} = 60\), the total value \(v(\{A,B,C\})\).
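The permutation averaging above can be reproduced with a short brute-force script. The value function is taken directly from the table; the player names are as in the example.

```python
from itertools import permutations

# Value function v(S) from the table above (keys are coalitions of players).
v = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 10, frozenset("AC"): 10, frozenset("BC"): 30,
    frozenset("ABC"): 60,
}

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v[with_p] - v[coalition]  # marginal contribution
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}

phi = shapley_values("ABC", v)
# phi["A"] = 100/6 ≈ 16.67, phi["B"] = phi["C"] = 130/6 ≈ 21.67
```

The six orderings and the averages match the hand computation, and the attributions sum to the total value of 60.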
The Shapley value of player \(j\) in a game with player set \(N\) and value function \(v\) is
\[ \phi_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!(|N|-|S|-1)!}{|N|!} [v(S \cup \{j\}) - v(S)] \]
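The closed-form subset sum gives the same result as averaging over permutations; the combinatorial weight counts how many orderings place exactly the coalition \(S\) before player \(j\). A direct implementation, reusing the example's value function:

```python
from itertools import combinations
from math import factorial

# Value function v(S) from the worked example above.
v = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 10, frozenset("AC"): 10, frozenset("BC"): 30,
    frozenset("ABC"): 60,
}

def shapley(j, N, v):
    """phi_j = sum over S subset of N\{j} of |S|!(|N|-|S|-1)!/|N|! * (v(S u {j}) - v(S))."""
    others = [p for p in N if p != j]
    total = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            weight = factorial(k) * factorial(len(N) - k - 1) / factorial(len(N))
            total += weight * (v[frozenset(S) | {j}] - v[frozenset(S)])
    return total

phi = {j: shapley(j, "ABC", v) for j in "ABC"}
# Matches the permutation average: phi["A"] = 100/6, phi["B"] = phi["C"] = 130/6.
```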
In ML, the “players” are the input features, and the “payout” is the model’s prediction.
We want to explain the scalar output \(\hat{f}(x)\) for a specific instance \(x\).
However, we cannot simply remove features from the model.
Instead, we consider the expected model output when only a subset of features \(S\) is known (taken from \(x\)), and the rest are randomly sampled from the data distribution.
Let \(x_S\) be the subset of \(x\) input to \(\hat{f}\) for combination \(S\) and \(X_{-S}\) be the random vector of removed features, to be sampled from the data distribution.
The value function is defined as: \[ v(S) = E_{X_{-S}}\!\left[\hat f(x_S, X_{-S})\right] \]
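The expectation in \(v(S)\) can be estimated by Monte Carlo: clamp the features in \(S\) to their values in \(x\) and fill the remaining features with rows drawn from the data. A minimal sketch, where the toy linear model `f_hat`, the background data, and the instance `x` are all illustrative assumptions, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model and background data, for illustration only.
def f_hat(X):
    return X @ np.array([2.0, -1.0, 0.5])

X_background = rng.normal(size=(1000, 3))  # sample from the data distribution
x = np.array([1.0, 2.0, 3.0])              # instance to explain

def v(S, x, f_hat, X_background):
    """Monte Carlo estimate of v(S) = E[f(x_S, X_{-S})]: features in S
    are fixed to x's values; the rest keep their background-data values."""
    Z = X_background.copy()
    Z[:, list(S)] = x[list(S)]  # clamp the known features to x
    return f_hat(Z).mean()

v_full = v((0, 1, 2), x, f_hat, X_background)   # equals f_hat(x): everything known
v_empty = v((), x, f_hat, X_background)          # approximately the average prediction
```

With all features known, \(v(N)\) reduces to the model's prediction for \(x\); with none known, it is the average model output over the data.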
LIME is a local explainability method that approximates a complex model \(\hat f\) locally around a specific instance \(x\) with a simple, interpretable surrogate model \(g\).
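A minimal LIME-style sketch: perturb \(x\), weight the perturbations by proximity, and fit a weighted linear surrogate whose coefficients serve as local attributions. The black-box model, noise scale, and kernel width below are illustrative assumptions, not prescribed by the method.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_hat(X):
    # Hypothetical nonlinear black-box model (an assumption for illustration).
    return np.sin(X[:, 0]) + X[:, 1] ** 2

def lime_explain(f_hat, x, n_samples=500, width=0.5):
    """Fit a weighted linear surrogate g around instance x.

    Perturb x with Gaussian noise, weight samples by an RBF proximity
    kernel, and solve weighted least squares; the fitted coefficients
    are the local feature attributions."""
    d = x.shape[0]
    Z = x + rng.normal(scale=0.3, size=(n_samples, d))  # local perturbations
    y = f_hat(Z)
    dist2 = ((Z - x) ** 2).sum(axis=1)
    w = np.exp(-dist2 / width**2)                        # proximity weights
    A = np.hstack([np.ones((n_samples, 1)), Z])          # intercept + features
    W = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * W, y * W[:, 0], rcond=None)
    return coef[0], coef[1:]                             # intercept, slopes

intercept, slopes = lime_explain(f_hat, np.array([0.0, 1.0]))
# Locally, d/dx0 sin(x0) = 1 at x0 = 0 and d/dx1 x1^2 = 2 at x1 = 1,
# so the fitted slopes should be roughly [1, 2].
```

The surrogate is only trustworthy near \(x\): the slopes approximate the model's local behavior, not its global shape.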