Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks

  
 

Authors: Nobin Sarwar, Shubhashis Roy Dipta, Zheyuan Liu, Vaidehi Patil
Affiliations: University of Maryland, Baltimore County  ·  University of Notre Dame  ·  UNC Chapel Hill

We welcome issues pointing out related work not yet discussed and will consider it for inclusion in future updates.

🎉 Latest News

  • [2026-04-06] 🎉 Our Multimodal Unlearning Survey is accepted to the Findings of ACL 2026. We will present this work at ACL 2026 in San Diego, CA.
  • [2026-02-28] 🌐 We release the Project Page for our Multimodal Unlearning survey.
  • [2026-02-14] 🚀 We launch the Awesome Multimodal Unlearning repository to track methods, datasets, and benchmarks. Check it out: GitHub
  • [2026-01-26] 📄 Our survey on Multimodal Unlearning is released on TechRxiv. See the paper: TechRxiv

📌 Citation

If our work supports your research or applications, we would appreciate a ⭐ and a citation using the BibTeX below.

@inproceedings{sarwar2026mm-unlearning-survey,
  title     = {{Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks}},
  author    = {Sarwar, Nobin and Roy Dipta, Shubhashis and Liu, Zheyuan and Patil, Vaidehi},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2026},
  year      = {2026},
  month     = jul,
  publisher = {Association for Computational Linguistics},
  url       = {https://doi.org/10.36227/techrxiv.176945748.88280394/v1}
}

🧭 Overview

Multimodal unlearning requires identifying effective intervention points within the model pipeline. Figure 2 illustrates methods spanning data-side, training-time, architecture-constrained, and decoding-time stages, producing an updated model (MFM′). Training-free approaches instead apply direct parameter or representation edits (Δ).


Figure 2: System-level intervention points for multimodal unlearning across the model pipeline.

📊 Comparison with Existing Surveys

While several existing surveys touch on unlearning, most are limited to unimodal or text–image settings and adopt algorithm-centric taxonomies that overlook practical intervention points. A unified cross-modal perspective is still lacking; we therefore summarize representative surveys by modality coverage and whether they adopt a system-first taxonomy.

| Date | Title | Authors | Paper | System-first | Text | Image | Video | Audio |
|---|---|---|---|---|---|---|---|---|
| 2023‑11 | Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges | Si et al. | arXiv | | | | | |
| 2024‑02 | Rethinking Machine Unlearning for Large Language Models | Liu et al. | arXiv | | | | | |
| 2024‑04 | Digital Forgetting in Large Language Models: A Survey of Unlearning Methods | Blanco-Justicia et al. | arXiv | | | | | |
| 2024‑07 | Machine Unlearning in Generative AI: A Survey | Liu et al. | arXiv | | | | | |
| 2025‑03 | A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models | Geng et al. | arXiv | | | | | |
| 2025‑07 | A Survey on Generative Model Unlearning: Fundamentals, Taxonomy, Evaluation, and Future Direction | Feng et al. | arXiv | | | | | |
| 2026‑01 | Ours | - | - | | | | | |

📂 Taxonomy of Multimodal Unlearning

We organize multimodal unlearning via a system-first taxonomy across five intervention stages:

  • Data-Side Interventions (Section 3.1) – Modify inputs or data distributions to reduce learnability of target content.
  • Training-Time Edits (Section 3.2) – Update model parameters to suppress target behavior.
  • Architecture-Constrained Unlearning (Section 3.3) – Restrict updates to specific layers or structural components.
  • Training-Free Unlearning (Section 3.4) – Apply closed-form parameter or representation edits without retraining.
  • Decoding-Time Unlearning (Section 3.5) – Control generation without modifying model parameters.


Figure 1: Taxonomy of multimodal unlearning by intervention stage and control pathway.
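To make the training-free branch concrete, here is a minimal sketch of unlearning by task-vector negation (in the spirit of "Editing Models with Task Arithmetic", listed in the paper list below): subtracting the fine-tuning direction from the base weights suppresses the fine-tuned behavior without any gradient steps. The parameter names and toy weights are purely illustrative, not taken from any real model.

```python
import numpy as np

def negate_task_vector(base, finetuned, alpha=1.0):
    """Training-free edit: theta' = theta_base - alpha * (theta_ft - theta_base).

    Subtracting the task vector (theta_ft - theta_base) moves the model
    away from the behavior acquired during fine-tuning, with no retraining.
    """
    return {name: base[name] - alpha * (finetuned[name] - base[name])
            for name in base}

# Toy two-parameter "model": fine-tuning shifted the weights toward the
# target (to-be-forgotten) task.
base = {"w": np.array([1.0, -2.0])}
ft   = {"w": np.array([1.5, -1.0])}

edited = negate_task_vector(base, ft, alpha=1.0)
print(edited["w"])  # [ 0.5 -3. ]
```

The scaling coefficient alpha trades off forgetting strength against retained utility; alpha=0 recovers the base model unchanged.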

📈 Benchmarks for Multimodal Unlearning

We organize representative multimodal unlearning benchmarks into three categories: unified suites, identity and privacy, and content and knowledge, based on their targets and evaluation focus (Table 3).

🧪 Unified Benchmark Suites

| Date | Title | Paper |
|---|---|---|
| 2025‑12 | UMU-Bench: Closing the Modality Gap in Multimodal Unlearning Evaluation | OpenReview |
| 2025‑03 | PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models | arXiv |
| 2024‑10 | Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | arXiv |
| 2024‑06 | MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning | arXiv |

🔐 Identity and Privacy Unlearning

| Date | Title | Paper |
|---|---|---|
| 2025‑05 | Alexa, can you forget me? Machine Unlearning Benchmark in Spoken Language Understanding | arXiv |
| 2024‑11 | Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset | arXiv |
| 2024‑10 | CLEAR: Character Unlearning in Textual and Visual Modalities | arXiv |

📚 Content and Knowledge Unlearning

| Date | Title | Paper |
|---|---|---|
| 2025‑05 | Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation | arXiv |
| 2025‑02 | SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning | arXiv |
| 2024‑10 | Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning | arXiv |
| 2024‑06 | Six-CD: Benchmarking Concept Removals for Text-to-image Diffusion Models | arXiv |
| 2024‑05 | Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models | arXiv |
| 2024‑03 | A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models | arXiv |
| 2024‑02 | UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models | arXiv |

📏 Evaluation Metrics

Evaluation uses metric suites that assess forgetting, utility retention, robustness, and efficiency, as summarized in Figure 3. We defer detailed metric definitions and evaluation protocols to Appendix B.


Figure 3: Evaluation dimensions and representative metrics for multimodal unlearning.
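As a minimal illustration of how these dimensions are typically combined, the sketch below computes two hypothetical summary scores from set-level accuracies: forget efficacy (relative accuracy drop on the forget set versus the pre-unlearning model) and utility retention (accuracy preserved on the retain set). The function name and score definitions are illustrative assumptions, not the survey's official metrics; see Appendix B for the protocols actually surveyed.

```python
def unlearning_scores(forget_acc, retain_acc, pre_forget_acc, pre_retain_acc):
    """Illustrative summary scores (not the survey's official metrics).

    forget efficacy:   1 - (post / pre) accuracy on the forget set
                       (1.0 = forgetting complete, 0.0 = nothing forgotten)
    utility retention: (post / pre) accuracy on the retain set
                       (1.0 = utility fully preserved)
    """
    efficacy = 1.0 - forget_acc / pre_forget_acc
    retention = retain_acc / pre_retain_acc
    return efficacy, retention

# Toy numbers: forget-set accuracy drops 0.95 -> 0.10 after unlearning,
# while retain-set accuracy dips only slightly, 0.80 -> 0.76.
eff, ret = unlearning_scores(0.10, 0.76, 0.95, 0.80)
print(f"forget efficacy {eff:.2f}, utility retention {ret:.2f}")
# forget efficacy 0.89, utility retention 0.95
```

Robustness and efficiency (e.g. resistance to relearning attacks, compute cost) are evaluated separately and are not captured by these two scalars.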

🧩 Applications of Multimodal Unlearning

Multimodal unlearning enables selective removal of specific identities, attributes, or concepts without full retraining while preserving overall capability and stability. Detailed use cases and representative studies are provided in Appendix E.


Figure 4: Core application scenarios of multimodal unlearning.

🧩 Open Challenges in Multimodal Unlearning

Multimodal unlearning faces challenges in theory, generalization, evaluation, robustness, utility trade-offs, and benchmarking, limiting reliable and scalable deployment.

Further discussion is provided in Appendix F, covering modality-specific limitations, evaluation considerations, and emerging research directions.


Figure 5: Key open challenges in multimodal unlearning.

📑 Curated Paper List

We curate 111 papers, 55 on Vision–Language Models (VLMs) and 56 on Diffusion Models (DMs), covering developments through August 2025. The collection reflects the rapid expansion of multimodal unlearning, with the two paradigms pursuing complementary research directions.


Figure: Year-wise distribution of multimodal unlearning papers across VLMs and DMs, 2022–2025 (through August 2025).

Vision-Language Models (VLMs)

| Date | Title | Paper |
|---|---|---|
| 2025‑12 | UMU-Bench: Closing the Modality Gap in Multimodal Unlearning Evaluation | paper |
| 2025‑09 | No Encore: Unlearning as Opt-Out in Music Generation | arXiv |
| 2025‑08 | Unleashing Uncertainty: Efficient Machine Unlearning for Generative AI | arXiv |
| 2025‑08 | Unlearning LLM-Based Speech Recognition Models | paper |
| 2025‑07 | Unlearning the Noisy Correspondence Makes CLIP More Robust | arXiv |
| 2025‑07 | PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning | arXiv |
| 2025‑07 | Quantum-Inspired Audio Unlearning: Towards Privacy-Preserving Voice Biometrics | arXiv |
| 2025‑07 | Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech | arXiv |
| 2025‑07 | Automating Evaluation of Diffusion Model Unlearning with (Vision-) Language Model World Knowledge | arXiv |
| 2025‑06 | Rethinking Post-Unlearning Behavior of Large Vision-Language Models | arXiv |
| 2025‑06 | Quantifying Cross-Modality Memorization in Vision-Language Models | arXiv |
| 2025‑06 | Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models | arXiv |
| 2025‑06 | SUA: Stealthy Multimodal Large Language Model Unlearning Attack | arXiv |
| 2025‑06 | Speech Unlearning | arXiv |
| 2025‑05 | Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation | arXiv |
| 2025‑05 | Alexa, can you forget me? Machine Unlearning Benchmark in Spoken Language Understanding | arXiv |
| 2025‑04 | Prompting Forgetting: Unlearning in GANs via Textual Guidance | arXiv |
| 2025‑03 | PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models | arXiv |
| 2025‑03 | Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-Tuning and Can Be Mitigated by Machine Unlearning | arXiv |
| 2025‑02 | Machine Unlearning in Audio: Bridging the Modality Gap via the Prune and Regrow Paradigm | paper |
| 2025‑02 | MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models | arXiv |
| 2025‑02 | SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning | arXiv |
| 2025‑02 | Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models | arXiv |
| 2025‑02 | SEMU: Singular Value Decomposition for Efficient Machine Unlearning | arXiv |
| 2025‑01 | Zero-shot CLIP Class Forgetting via Text-image Space Adaptation | paper |
| 2025‑01 | Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models | arXiv |
| 2024‑11 | Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset | arXiv |
| 2024‑10 | CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP | arXiv |
| 2024‑10 | Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench | arXiv |
| 2024‑10 | CLEAR: Character Unlearning in Textual and Visual Modalities | arXiv |
| 2024‑10 | NegMerge: Sign-Consensual Weight Merging for Machine Unlearning | arXiv |
| 2024‑10 | UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation | arXiv |
| 2024‑09 | Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats | arXiv |
| 2024‑07 | Zero-Shot Class Unlearning in CLIP with Synthetic Samples | arXiv |
| 2024‑07 | Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning | arXiv |
| 2024‑07 | Direct Unlearning Optimization for Robust and Safe Text-to-Image Models | arXiv |
| 2024‑07 | Targeted Unlearning with Single Layer Unlearning Gradient | arXiv |
| 2024‑06 | MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning | arXiv |
| 2024‑06 | MUC: Machine Unlearning for Contrastive Learning with Black-box Evaluation | arXiv |
| 2024‑06 | Can Textual Unlearning Solve Cross-Modality Safety Alignment? | arXiv |
| 2024‑05 | Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models | arXiv |
| 2024‑05 | Multi-Modal Recommendation Unlearning for Legal, Licensing, and Modality Constraints | arXiv |
| 2024‑05 | Automatic Jailbreaking of the Text-to-Image Generative AI Systems | arXiv |
| 2024‑03 | Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning | arXiv |
| 2024‑03 | CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? | arXiv |
| 2024‑02 | EFUF: Efficient Fine-Grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models | arXiv |
| 2024‑02 | Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning | arXiv |
| 2024‑02 | Visual In-Context Learning for Large Vision-Language Models | arXiv |
| 2023‑11 | MultiDelete for Multimodal Machine Unlearning | arXiv |
| 2023‑11 | Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models | arXiv |
| 2023‑11 | BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning | arXiv |
| 2023‑03 | CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning | arXiv |
| 2023‑01 | Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples | arXiv |
| 2022‑12 | Editing Models with Task Arithmetic | arXiv |
| 2022‑12 | Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning | arXiv |

Diffusion Models (DMs)

| Date | Title | Paper |
|---|---|---|
| 2025‑08 | Sealing The Backdoor: Unlearning Adversarial Text Triggers In Diffusion Models Using Knowledge Distillation | arXiv |
| 2025‑08 | UnGuide: Learning to Forget with LoRA-Guided Diffusion Models | arXiv |
| 2025‑08 | Steering Guidance for Personalized Text-to-Image Diffusion Models | arXiv |
| 2025‑07 | LoReUn: Data Itself Implicitly Provides Cues to Improve Machine Unlearning | arXiv |
| 2025‑07 | Towards Resilient Safety-driven Unlearning for Diffusion Models against Downstream Fine-tuning | arXiv |
| 2025‑07 | Image Can Bring Your Memory Back: A Novel Multi-Modal Guided Attack against Image Generation Model Unlearning | arXiv |
| 2025‑07 | Concept Unlearning by Modeling Key Steps of Diffusion Process | arXiv |
| 2025‑06 | Large-Scale Training Data Attribution for Music Generative Models via Unlearning | arXiv |
| 2025‑06 | Video Unlearning via Low-Rank Refusal Vector | arXiv |
| 2025‑05 | CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models | arXiv |
| 2025‑04 | The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning | arXiv |
| 2025‑04 | Backdoor Defense in Diffusion Models via Spatial Attention Unlearning | arXiv |
| 2025‑04 | Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization | arXiv |
| 2025‑03 | Human Motion Unlearning | arXiv |
| 2025‑03 | Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization | arXiv |
| 2025‑03 | Data Unlearning in Diffusion Models | arXiv |
| 2025‑01 | SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders | arXiv |
| 2024‑12 | Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models | arXiv |
| 2024‑12 | Boosting Alignment for Post-Unlearning Text-to-Image Generative Models | arXiv |
| 2024‑12 | Learning to Forget using Hypernetworks | arXiv |
| 2024‑11 | MUNBa: Machine Unlearning via Nash Bargaining | arXiv |
| 2024‑11 | Model Integrity when Unlearning with T2I Diffusion Models | arXiv |
| 2024‑10 | Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | arXiv |
| 2024‑10 | SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | arXiv |
| 2024‑10 | Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning | arXiv |
| 2024‑10 | Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models | arXiv |
| 2024‑10 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | arXiv |
| 2024‑10 | Dynamic Negative Guidance of Diffusion Models | arXiv |
| 2024‑09 | Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models | arXiv |
| 2024‑09 | Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning | arXiv |
| 2024‑09 | Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models | arXiv |
| 2024‑08 | Moderator: Moderating Text-to-Image Diffusion Models through Fine-grained Context-based Policies | arXiv |
| 2024‑08 | Controllable Unlearning for Image-to-Image Generative Models via ε-Constrained Optimization | arXiv |
| 2024‑08 | DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization | arXiv |
| 2024‑08 | On the Limitations and Prospects of Machine Unlearning for Generative AI | arXiv |
| 2024‑07 | Unlearning Concepts from Text-to-Video Diffusion Models | arXiv |
| 2024‑06 | Diffusion Soup: Model Merging for Text-to-Image Diffusion Models | arXiv |
| 2024‑06 | Six-CD: Benchmarking Concept Removals for Text-to-image Diffusion Models | arXiv |
| 2024‑05 | FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing | arXiv |
| 2024‑05 | Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient | arXiv |
| 2024‑04 | Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective | arXiv |
| 2024‑04 | SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models | arXiv |
| 2024‑03 | A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models | arXiv |
| 2024‑03 | CPR: Retrieval Augmented Generation for Copyright Protection | arXiv |
| 2024‑03 | Hiding and Recovering Knowledge in Text-to-Image Diffusion Models via Learnable Prompts | arXiv |
| 2024‑02 | Machine Unlearning for Image-to-Image Generative Models | arXiv |
| 2024‑02 | UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models | arXiv |
| 2024‑02 | Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models | arXiv |
| 2024‑01 | Adaptive Median Smoothing: Adversarial Defense for Unlearned Text-to-Image Diffusion Models at Inference Time | paper |
| 2023‑11 | MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning | arXiv |
| 2023‑10 | SalUn: Empowering Machine Unlearning via Gradient-Based Weight Saliency in Both Image Classification and Generation | arXiv |
| 2023‑07 | Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models | arXiv |
| 2023‑06 | Training Data Attribution for Diffusion Models | arXiv |
| 2023‑03 | Ablating Concepts in Text-to-Image Diffusion Models | arXiv |
| 2023‑03 | Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models | arXiv |
| 2022‑09 | Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis | arXiv |

🤝 Contributing

Please read contributing.md before submitting a pull request.

📧 Contact

This repository is actively maintained and continuously updated 🚀. If you notice any issues or would like your work on Multimodal Unlearning included, please open an issue or contact us via email.

Corresponding author: Nobin Sarwar (smsarwar96@gmail.com)


✨ Star History

Star History Chart


⭐ If you find this repository useful, please consider starring it! ⭐



Thanks for visiting ✨ Awesome Multimodal Unlearning
