[Summary] Add perturbation feature attribution methods #107

@gsarti

🚀 Feature Request

The following is a non-exhaustive list of perturbation-based feature attribution methods that could be added to the library:

| Method name | Source | In Captum | Code implementation | Status |
| --- | --- | --- | --- | --- |
| (Layer) Feature Ablation ¹ | - | ✅ | pytorch/captum | |
| Occlusion | Zeiler and Fergus '13 | ✅ | pytorch/captum | ✅ |
| Shapley Value Sampling | Castro et al. '09 | ✅ | pytorch/captum | |
| Lime | Ribeiro et al. '16 | ✅ | pytorch/captum | ✅ |
| KernelShap | Lundberg and Lee '17 | ✅ | pytorch/captum | |
| Editing ² | - | - | - | |
| Greedy Rationalization ³ | Vafa et al. '21 | - | keyonvafa/sequential-rationales | |
| Information Bottleneck | Jiang et al. '20 | - | DFKI-NLP/thermostat | |
| BayesLime | Slack et al. '21 | - | dylan-slack/Modeling-Uncertainty-Local-Explainability | |
| BayesSHAP | Slack et al. '21 | - | dylan-slack/Modeling-Uncertainty-Local-Explainability | |
| Input Reduction | Feng et al. '18 | - | - | |
| Input Marginalization | Kim et al. '20 | - | - | |
| Occlusion & Language Modeling | Harbecke and Alt '20 | - | DFKI-NLP/OLM | |
| Context Probing ⁴ | Cífka and Liutkus '22 | - | cifkao/context-probing | |
| Weighted SHAP | Kwon and Zou '22 | - | ykwon0407/WeightedSHAP | |
| Value Zeroing | Mohebbi et al. '23 | - | hmohebbi/ValueZeroing | ✅ |
| Comprehensiveness-as-a-metric | Zhou et al. '23 | - | YilunZhou/solvability-explainer | |
| Sufficiency-as-a-metric | Zhou et al. '23 | - | YilunZhou/solvability-explainer | |
| Causal Tracing | Meng et al. '22 | - | kmeng01/rome | |
| Attention Knockout ⁵ | Geva et al. '23 | - | - | |
| ReAGent | Zhao et al. '24 | - | casszhao/ReAGent | ✅ |
| SyntaxSHAP | Amara et al. '24 | - | k-amara/syntax-shap | |
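
For readers new to this family of methods, the idea shared by most of the entries above can be illustrated with a minimal occlusion-style sketch: replace one input token at a time with a baseline and record how much the model score drops. This is not the Captum or library implementation; `occlusion_scores`, `toy_score`, and the `"[MASK]"` baseline are hypothetical names chosen for illustration, and the toy scorer stands in for a real model.

```python
from typing import Callable, List, Sequence


def occlusion_scores(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    baseline: str = "[MASK]",  # assumed baseline token, not taken from the issue
) -> List[float]:
    """Score each token as the drop in score_fn when that token is masked."""
    reference = score_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = list(tokens)
        perturbed[i] = baseline  # perturb exactly one position
        scores.append(reference - score_fn(perturbed))
    return scores


def toy_score(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a model: a sum of fixed per-token weights."""
    weights = {"great": 2.0, "not": -1.5}
    return sum(weights.get(t, 0.0) for t in tokens)


print(occlusion_scores(["the", "movie", "was", "great"], toy_score))
# → [0.0, 0.0, 0.0, 2.0]
```

A positive score means that masking the token lowered the output, so the token supported the prediction. Methods in the table differ mainly in how they choose what to perturb, what to replace it with, and how they aggregate the resulting score changes.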

Notes:

  1. For more information on Editing, see point 3 in [Summary] Add metrics for feature attribution evaluation #112.

Footnotes

  1. Called ablation, but performs masking of features using a baseline.
  2. Editing replaces tokens with their nearest neighbors in the vocabulary embedding space and measures saliency as the drop in performance for the target. In the future, this can allow users to specify a custom editing strategy via an input Callable.
  3. May overlap with feature ablation to some extent.
  4. Valid only for decoder-only models.
  5. Verify whether this is exactly equivalent to Value Zeroing; include it only if functionally different (alias otherwise).
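
The Editing strategy described in footnote 2 could be sketched roughly as follows, assuming a small in-memory embedding table and cosine similarity as the notion of "nearest". All names here (`nearest_neighbor`, `editing_scores`, `edit_fn`, the toy embeddings and scorer) are hypothetical and are not part of any existing API; the `edit_fn` parameter only illustrates how a custom editing strategy might be passed in as a Callable.

```python
import math
from typing import Callable, Dict, List, Sequence

Embeddings = Dict[str, List[float]]


def nearest_neighbor(token: str, embeddings: Embeddings) -> str:
    """Return the other vocabulary token with the highest cosine similarity.

    Assumes `token` is present in `embeddings`.
    """
    def cosine(u: List[float], v: List[float]) -> float:
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    target = embeddings[token]
    candidates = [t for t in embeddings if t != token]
    return max(candidates, key=lambda t: cosine(target, embeddings[t]))


def editing_scores(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    edit_fn: Callable[[str], str],
) -> List[float]:
    """Saliency of each token as the score drop after editing that token."""
    reference = score_fn(tokens)
    scores = []
    for i, tok in enumerate(tokens):
        edited = list(tokens)
        edited[i] = edit_fn(tok)  # user-supplied editing strategy
        scores.append(reference - score_fn(edited))
    return scores


# Toy data, purely illustrative.
toy_embeddings: Embeddings = {
    "great": [1.0, 0.0], "good": [0.9, 0.1], "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0], "film": [0.1, 0.9],
}


def toy_score(tokens: Sequence[str]) -> float:
    weights = {"great": 2.0, "good": 1.0}
    return sum(weights.get(t, 0.0) for t in tokens)


scores = editing_scores(
    ["great", "movie"], toy_score, lambda t: nearest_neighbor(t, toy_embeddings)
)
print(scores)  # → [1.0, 0.0]
```

Here "great" is replaced by "good" and "movie" by "film", so only the first edit changes the toy score. Unlike masking with a fixed baseline, the edited input stays closer to natural text, which is the motivation given for this strategy.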
