B-DENSE: Branching for Dense Ensemble Network Supervision Efficiency
An efficient supervision strategy that uses branching in dense ensemble networks to improve training efficiency while preserving predictive performance.
ICLR
Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling Autonomous LLM Systems
An architecture with built-in attestations, audit agents, and challenge-response checks, alongside OPERA, a benchmark for detecting and remediating agent misalignment.
AAAI
DAC-LoRA: Dynamic Adversarial Curriculum for Efficient and Robust Few-Shot Adaptation
A generalized framework that uses a dynamic adversarial curriculum to make Vision-Language Models (VLMs) more robust against attacks while improving efficiency and few-shot adaptation.
ICCV
DINOHash: Learning Adversarially Robust Perceptual Hashes from Self-Supervised Features
An open-source framework for robust perceptual image hashing, DINOHash enables secure and transformation-resilient provenance detection of AI-generated images.
ICML
SPD Attack - Prevention of AI-Powered Image Editing by Image Immunization
An analysis of methods to safeguard images against misuse in image-to-image editing models through reproduction and extension of existing research across various models and datasets.
ICLR
From Teacher to Student: Tracking Memorization Through Model Distillation
An analysis of knowledge distillation effects on memorization in fine-tuned language models, showing that distillation from large teachers to smaller students mitigates memorization risks while improving efficiency.
ACL
Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models
A comprehensive reproducibility study of "An Image is Worth 1000 Lies: Adversarial Transferability Across Prompts on Vision-Language Models", validating the Cross-Prompt Attack (CroPA) and proposing several key improvements to the framework.
TMLR
[Re] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
Using class-wise degrees of data augmentation to tackle class imbalance in long-tailed datasets.
MLRC
Riemann Sum Optimization for Accurate Integrated Gradients Computation
A mathematical framework to reduce the computational complexity of Integrated Gradients.
NeurIPS
Strengthening Interpretability: An Investigative Study of Integrated Gradient Methods
This study reproduces IDGI, showing reduced noise and better stability than Integrated Gradients, while analyzing the effect of step size.
TMLR
Rethinking Randomized Smoothing from the Perspective of Scalability
A study of randomized smoothing, analyzed from the perspective of scalability as a challenge to its continued application.
NeurIPS
Image-Alchemy: Advancing Subject Fidelity in Personalized Text-to-Image Generation
A two-stage pipeline for personalized image generation using LoRA-based attention fine-tuning and segmentation-guided Img2Img synthesis.
ICLR
Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models
Tree Ring Watermarks are harder to detect in modern rectified flow-based models compared to traditional diffusion models, especially under image attacks.
ICLR
One Noise to Fool Them All: Universal Adversarial Defenses Against Image Editing
Image immunization adds imperceptible noise to images to prevent editing via diffusion models; we extend this approach to multiple images using a single universal noise.
CVPR