Translated by AI
Nuances Between Interpretability and Explainability in Machine Learning
Although these two concepts are often used interchangeably, strictly speaking I believe there are subtle differences between them, as follows. If you have any thoughts or opinions, please let me know in the comments!
Interpretability
For a machine learning model (which is often a black box), the ability to lay out the mechanism itself by which the model produces its predictions.
Explainability
The ability to explain, for a given prediction from a machine learning model, why that particular prediction was returned.
High interpretability does not necessarily mean high explainability
In highly interpretable methods such as linear regression or decision trees, one can compute quantities like regression coefficients or feature importances, which describe the mechanism by which the model contributes to its predictions. However, when the model receives a single input in which several high-importance variables all take large values, those global quantities alone cannot explain how that particular prediction came about (i.e., which variable ultimately had the most impact). Likewise, they cannot answer questions such as how much a specific variable would need to change for the prediction itself to change.
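As a minimal sketch of what interpretability looks like in a linear model, the snippet below fits ordinary least squares on made-up data (the data, coefficients, and variable names are all hypothetical, not from the article). The fitted coefficients describe the model's global mechanism, which is exactly what makes linear regression interpretable:

```python
import numpy as np

# Hypothetical toy data: y depends strongly on x0, weakly on x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Ordinary least squares: the fitted coefficients *are* the mechanism
# by which the model maps inputs to predictions.
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)  # close to [3.0, 0.5]
```

Note that these coefficients are a global description: for one specific input where both variables happen to take large values, they alone do not single out which variable drove that particular prediction.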
High explainability does not necessarily mean high interpretability
In recent neural network research, models have been proposed that output the reason for a prediction (for example, as text) alongside the prediction itself. Such models have high explainability, because a reason is given for every input. However, why the model produced that particular reason remains a black box, so interpretability stays low.
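To make the interface of such a model concrete, here is a toy sketch of a predict-with-reason function. The rule-based body is a made-up stand-in for an opaque neural network (function name and logic are mine, not from any specific paper); the point is only the shape of the output, a prediction paired with a generated reason:

```python
def predict_with_reason(text: str) -> tuple[str, str]:
    # Stand-in for an opaque neural network. In a real model, both the
    # label and the reason come out of a black box: the reason explains
    # the prediction (explainability), but nothing explains why this
    # particular reason was generated (interpretability).
    label = "positive" if "great" in text else "negative"
    reason = f"The wording of {text!r} suggests a {label} sentiment."
    return label, reason

label, reason = predict_with_reason("this movie was great")
print(label)  # positive
```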
Discussion