[Summary] Add perturbation feature attribution methods

## 🚀 Feature Request

The following is a non-exhaustive list of perturbation-based feature attribution methods that could be added to the library:

<table>
<tr>
	<td> Method name </td>
	<td> Source </td>
	<td> In Captum </td>
	<td> Code implementation </td>
	<td> Status</td>
<tr>
	<td> <a href="https://captum.ai/api/feature_ablation.html">(Layer) Feature Ablation</a><a href="#fnote1" id="fnote1-ref" data-footnote-ref="" aria-describedby="footnote-label">1</a> </td>
	<td> - </td>
	<td> ✅ </td>
	<td> <a href="https://github.com/pytorch/captum"><code>pytorch/captum</code></a> </td>
	<td> </td>
<tr>
	<td> <a href="https://captum.ai/api/occlusion.html">Occlusion</a> </td>
	<td> <a href="https://arxiv.org/abs/1311.2901">Zeiler and Fergus '13</a> </td>
	<td> ✅ </td>
	<td> <a href="https://github.com/pytorch/captum"><code>pytorch/captum</code></a> </td>
	<td> ✅ </td>
<tr>
	<td> <a href="https://captum.ai/api/shapley_value_sampling.html"> Shapley Value Sampling </a> </td>
	<td> <a href="https://www.sciencedirect.com/science/article/abs/pii/S0305054808000804">Castro et al. '09</a> </td>
	<td> ✅ </td>
	<td> <a href="https://github.com/pytorch/captum"><code>pytorch/captum</code></a> </td>
	<td> </td>
<tr>
	<td> <a href="https://captum.ai/api/lime.html"> Lime </a> </td>
	<td> <a href="https://arxiv.org/abs/1602.04938">Ribeiro et al. '16</a> </td>
	<td> ✅ </td>
	<td> <a href="https://github.com/pytorch/captum"><code>pytorch/captum</code></a> </td>
	<td> ✅ </td>
<tr>
	<td> <a href="https://captum.ai/api/kernel_shap.html"> KernelShap </a> </td>
	<td> <a href="https://arxiv.org/abs/1705.07874"> Lundberg and Lee '17 </a> </td>
	<td> ✅ </td>
	<td> <a href="https://github.com/pytorch/captum"><code>pytorch/captum</code></a> </td>
	<td> </td>
<tr>
	<td> Editing <a href="#fnote2" id="fnote2-ref" data-footnote-ref="" aria-describedby="footnote-label">2</a> </td>
	<td> - </td>
	<td> - </td>
	<td> - </td>
	<td> </td>
<tr>
	<td> Greedy Rationalization <a href="#fnote3" id="fnote3-ref" data-footnote-ref="" aria-describedby="footnote-label">3</a> </td>
	<td> <a href="https://arxiv.org/abs/2109.06387"> Vafa et al. '21 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/keyonvafa/sequential-rationales"><code>keyonvafa/sequential-rationales</code></a> </td>
	<td> </td>
<tr>
	<td> Information Bottleneck </td>
	<td> <a href="https://aclanthology.org/2020.findings-emnlp.343/"> Jiang et al. '20 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/DFKI-NLP/thermostat/blob/main/src/thermostat/explainers/iba.py"><code>DFKI-NLP/thermostat</code></a> </td>
	<td> </td>
<tr>
	<td> BayesLime </td>
	<td> <a href="https://arxiv.org/abs/2008.05030"> Slack et al. '21 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/dylan-slack/Modeling-Uncertainty-Local-Explainability"><code>dylan-slack/Modeling-Uncertainty-Local-Explainability</code></a> </td>
	<td> </td>
<tr>
	<td> BayesSHAP </td>
	<td> <a href="https://arxiv.org/abs/2008.05030"> Slack et al. '21 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/dylan-slack/Modeling-Uncertainty-Local-Explainability"><code>dylan-slack/Modeling-Uncertainty-Local-Explainability</code></a> </td>
	<td> </td>
<tr>
	<td> Input Reduction </td>
	<td> <a href="https://aclanthology.org/D18-1407/"> Feng et al. '18 </a> </td>
	<td> - </td>
	<td> - </td>
	<td> </td>
<tr>
	<td> Input Marginalization </td>
	<td> <a href="https://www.aclweb.org/anthology/2020.emnlp-main.255"> Kim et al. '20 </a> </td>
	<td> - </td>
	<td> - </td>
	<td> </td>
<tr>
	<td> Occlusion & Language Modeling </td>
	<td> <a href="https://aclanthology.org/2020.acl-srw.16/"> Harbecke and Alt '20 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/DFKI-NLP/OLM"><code>DFKI-NLP/OLM</code></a> </td>
	<td> </td>
<tr>
	<td><a href="https://cifkao.github.io/context-probing/"> Context Probing </a> <a href="#fnote4" id="fnote4-ref" data-footnote-ref="" aria-describedby="footnote-label">4</a> </td>
	<td> <a href="https://arxiv.org/abs/2212.14815"> Cífka and Liutkus '22 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/cifkao/context-probing"><code>cifkao/context-probing</code></a> </td>
	<td> </td>
<tr>
	<td> Weighted SHAP </td>
	<td> <a href="https://arxiv.org/abs/2209.13429"> Kwon and Zou '22 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/ykwon0407/WeightedSHAP"><code>ykwon0407/WeightedSHAP</code></a> </td>
	<td> </td>
<tr>
	<td> Value Zeroing </td>
	<td> <a href="http://arxiv.org/abs/2301.12971"> Mohebbi et al. '23 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/hmohebbi/ValueZeroing"><code>hmohebbi/ValueZeroing</code></a> </td>
	<td> ✅ </td>
<tr>
	<td> Comprehensiveness-as-a-metric </td>
	<td> <a href="https://arxiv.org/abs/2205.08696"> Zhou et al. '23 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/YilunZhou/solvability-explainer"><code>YilunZhou/solvability-explainer</code></a> </td>
	<td> </td>
<tr>
	<td> Sufficiency-as-a-metric </td>
	<td> <a href="https://arxiv.org/abs/2205.08696"> Zhou et al. '23 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/YilunZhou/solvability-explainer"><code>YilunZhou/solvability-explainer</code></a> </td>
	<td> </td>
<tr>
	<td> Causal Tracing </td>
	<td> <a href="https://rome.baulab.info/"> Meng et al. '22 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/kmeng01/rome#causal-tracing"><code>kmeng01/rome</code></a> </td>
	<td> </td>
<tr>
	<td> Attention Knockout <a href="#fnote5" id="fnote5-ref" data-footnote-ref="" aria-describedby="footnote-label">5</a> </td>
	<td> <a href="https://arxiv.org/abs/2304.14767"> Geva et al. '23 </a> </td>
	<td> - </td>
	<td> - </td>
	<td> </td>
<tr>
	<td> ReAGent</td>
	<td> <a href="https://arxiv.org/abs/2402.00794"> Zhao et al. '24 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/casszhao/ReAGent"><code>casszhao/ReAGent</code></a> </td>
	<td> ✅ </td>
<tr>
	<td> SyntaxSHAP</td>
	<td> <a href="https://arxiv.org/abs/2402.09259"> Amara et al. '24 </a> </td>
	<td> - </td>
	<td> <a href="https://github.com/k-amara/syntax-shap"><code>k-amara/syntax-shap</code></a> </td>
	<td> </td>
</table>

**Notes**:

1. For more information on Editing, see point 3 in #112.

<section data-footnotes="" class="footnotes"><h2 id="footnote-label" class="sr-only"></h2>
<ol dir="auto">
<li id="fnote1"> Called ablation, but it masks features using a baseline rather than removing them.
</li>
<li id="fnote2"> Editing replaces tokens with their nearest neighbors in the vocabulary embedding space and measures saliency as the drop in performance for the target. In the future, this could allow users to specify a custom editing strategy via an input <code>Callable</code>.
</li>
<li id="fnote3"> Possibly overlaps with feature ablation to some extent.
</li>
<li id="fnote4"> Valid only for decoder-only models.
</li>
<li id="fnote5"> Verify whether it is exactly equivalent to Value Zeroing. Include it as a separate method only if it is functionally different; otherwise, add it as an alias.
</li>
</ol>
</section>

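The perturbation loop described in footnotes 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions, not the library's or Captum's implementation: every name here (`perturbation_saliency`, `nearest_neighbor`, `mask_with_baseline`, `toy_score`, `EMBEDDINGS`) is hypothetical, the embedding table is a toy, and `toy_score` stands in for a real model's target score. The `edit_fn` argument plays the role of the custom editing strategy passed as a `Callable`.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical toy embedding table; a real model would supply its own.
EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.0],
    "great": [0.9, 0.1],
    "bad": [-1.0, 0.0],
    "awful": [-0.9, -0.1],
    "movie": [0.0, 1.0],
    "film": [0.1, 0.9],
}


def nearest_neighbor(token: str) -> str:
    """Editing strategy: closest *other* vocabulary token (Euclidean distance)."""
    source = EMBEDDINGS[token]
    candidates = [t for t in EMBEDDINGS if t != token]
    return min(candidates, key=lambda t: math.dist(source, EMBEDDINGS[t]))


def mask_with_baseline(token: str) -> str:
    """Ablation-style strategy: replace the token with a fixed baseline token."""
    return "[MASK]"


def toy_score(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a model's target score.

    Sums the first embedding dimension; unknown tokens (e.g. "[MASK]") count as 0.
    """
    return sum(EMBEDDINGS.get(t, [0.0, 0.0])[0] for t in tokens)


def perturbation_saliency(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    edit_fn: Callable[[str], str],
) -> List[float]:
    """Saliency of each token = drop in score when that token alone is edited."""
    base = score_fn(tokens)
    saliency = []
    for i, token in enumerate(tokens):
        edited = list(tokens)
        edited[i] = edit_fn(token)  # perturb one position at a time
        saliency.append(base - score_fn(edited))
    return saliency


if __name__ == "__main__":
    sentence = ["good", "movie"]
    print(perturbation_saliency(sentence, toy_score, mask_with_baseline))
    print(perturbation_saliency(sentence, toy_score, nearest_neighbor))
```

With the baseline strategy, masking "good" removes the whole toy score while masking "movie" changes nothing. With nearest-neighbor editing, the drops are much smaller, because "great" and "film" sit close to the tokens they replace.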
