Comparing Shapley Value Approximation Methods for Unsupervised Feature Importance
DOI:
https://doi.org/10.11576/dataninja-1158
Keywords:
Shapley values, feature importance scores, unsupervised learning
Abstract
Assigning importance scores to features is a common way to gain insight into a prediction model’s behavior or even the data itself. Beyond explainability, such scores can also be used for feature selection, making unlabeled high-dimensional data manageable. One way to derive scores is to adopt a game-theoretical view in which features are understood as agents that can form groups and cooperate, obtaining a reward for doing so. Splitting the reward among the features appropriately yields the desired scores. The Shapley value is the most popular reward-sharing solution. However, computing it exactly requires a number of evaluations that grows exponentially with the number of features, which renders it inapplicable to high-dimensional data unless an efficient approximation is available. We empirically compare selected approximation algorithms for quantifying feature importance on unlabeled data.
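To make the gap between exact computation and approximation concrete, the following is a minimal sketch that is not taken from the paper. It contrasts exact Shapley values, obtained by enumerating every coalition, with a generic permutation-sampling estimate. The function names and the value function `toy_value` are hypothetical and chosen for illustration only; the value functions and approximation algorithms actually compared in the work are not specified in this abstract.

```python
import itertools
import math
import random


def exact_shapley(n, value):
    """Exact Shapley values by enumerating all coalitions (about 2^n evaluations per feature)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Probability weight of a coalition of this size preceding feature i.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def sampled_shapley(n, value, num_permutations, seed=0):
    """Monte Carlo estimate: average marginal contributions over random feature orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = frozenset()
        previous = value(coalition)
        for i in order:
            coalition = coalition | {i}
            current = value(coalition)
            phi[i] += current - previous
            previous = current
    return [p / num_permutations for p in phi]


def toy_value(coalition):
    """Hypothetical reward: features 0 and 1 pay off only jointly, feature 2 alone, feature 3 never."""
    reward = 0.0
    if 0 in coalition and 1 in coalition:
        reward += 1.0
    if 2 in coalition:
        reward += 0.5
    return reward


if __name__ == "__main__":
    print("exact:  ", exact_shapley(4, toy_value))
    print("sampled:", sampled_shapley(4, toy_value, num_permutations=2000))
```

The exact routine evaluates the value function on every subset of the remaining features, which is what becomes infeasible as the number of features grows. The sampled routine needs only `n` additional evaluations per permutation, at the price of estimation error that shrinks as more permutations are drawn.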
License
Copyright (c) 2024 Patrick Kolpaczki
This work is licensed under a Creative Commons Attribution 4.0 International License.