📝 Publications
🤖 LLM Unlearning
under review

Label Smoothing Improves Gradient Ascent in LLM Unlearning
Zirui Pang, Hao Zheng, Zhijie Deng, Ling Li, Zixin Zhong, Jiaheng Wei
- We identify the instability of Gradient Ascent in LLM unlearning.
- We propose Smoothed Gradient Ascent (SGA) with a tunable smoothing rate.
- SGA achieves more stable and effective unlearning across benchmarks.
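The core idea above (label smoothing applied to the gradient-ascent unlearning objective) can be sketched as follows. This is an illustrative toy, not the paper's implementation: the function name, the exact way the smoothing rate mixes the one-hot target with a uniform distribution, and the single-token setting are all assumptions for demonstration.

```python
import math

def smoothed_ascent_loss(logits, target, smoothing=0.1):
    """Toy label-smoothed gradient-ascent objective (illustrative only).

    Plain gradient ascent maximizes cross-entropy on a forget sample,
    i.e. minimizes -CE with a one-hot target, which can diverge. Label
    smoothing mixes the one-hot target with a uniform distribution,
    bounding the per-token objective and stabilizing the ascent.
    """
    n = len(logits)
    # Log-softmax with max-subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # Smoothed target: (1 - smoothing) on the true token,
    # smoothing spread uniformly over all tokens.
    soft = [smoothing / n] * n
    soft[target] += 1.0 - smoothing
    cross_entropy = -sum(t * lp for t, lp in zip(soft, log_probs))
    # Negate: minimizing this value *raises* loss on the forget sample.
    return -cross_entropy
```

With `smoothing=0.0` this reduces to ordinary gradient ascent (negated cross-entropy on the one-hot target); larger smoothing rates interpolate toward a uniform target.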
under review

OFFSIDE: Benchmarking Unlearning Misinformation in Multimodal Large Language Models
Hao Zheng, Zirui Pang, Ling Li, Zhijie Deng, Yuhan Pu, Zhaowei Zhu, Xiaobo Xia, Jiaheng Wei
- We present OFFSIDE, a benchmark for multimodal unlearning based on football transfer rumors.
- It provides real-world, manually curated data and four evaluation settings to test forgetting, utility, and robustness.
- Our results reveal that current methods fail to unlearn visual rumors and are vulnerable to recovery and prompt attacks.
under review

GUARD: Generation-time LLM Unlearning via Adaptive Restriction and Detection
Zhijie Deng, Chris Yuhao Liu, Zirui Pang, Xinlei He, Lei Feng, Qi Xuan, Zhaowei Zhu, Jiaheng Wei
- We propose GUARD, a generation-time unlearning framework for LLMs.
- It detects forget-related prompts and blocks forbidden tokens during generation.
- GUARD achieves effective forgetting without harming model utility.
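The detect-then-restrict mechanism described above can be illustrated with a single decoding step. This is a hedged sketch, not GUARD's actual detector or restriction logic: keyword matching stands in for the paper's prompt detection, and the function name and arguments are hypothetical.

```python
def guard_decode_step(logits, prompt, forget_keywords, forbidden_token_ids):
    """Toy generation-time restriction (illustrative only).

    If the prompt is flagged as forget-related (here: naive keyword
    matching as a stand-in for a learned detector), forbidden token ids
    have their logits set to -inf so they can never be sampled;
    otherwise the logits pass through unchanged.
    """
    flagged = any(k.lower() in prompt.lower() for k in forget_keywords)
    if not flagged:
        return logits
    masked = list(logits)
    for tid in forbidden_token_ids:
        masked[tid] = float("-inf")
    return masked
```

Because the restriction only fires on flagged prompts, generation on unrelated inputs is untouched, which is how a scheme like this preserves model utility.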
🏞️ Label Noise
under review

When VLMs Meet Image Classification: Test Sets Renovation via Missing Label Identification
Zirui Pang, Haosheng Tan, Yuhan Pu, Zhijie Deng, Zhouan Shen, Keyu Hu, Jiaheng Wei
- We propose REVEAL, a framework that uses vision-language models to find and fix missing and noisy labels in image classification benchmarks.
- It combines predictions from an ensemble of VLMs with human feedback to renovate test sets with accurate soft labels.
- REVEAL greatly improves dataset quality and aligns closely with human judgments.
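The ensembling step described above can be sketched in a few lines. This is an illustrative toy, not REVEAL's pipeline: the function name, the plain averaging of per-class probabilities, and the consensus threshold that routes samples to human review are all assumptions for demonstration.

```python
def ensemble_soft_labels(model_probs, agreement_threshold=0.8):
    """Toy VLM-ensemble label renovation (illustrative only).

    Averages per-class probabilities from several models into one soft
    label, and flags low-consensus samples (no class reaches the
    threshold) for human review instead of auto-relabeling them.
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    soft_label = [
        sum(p[c] for p in model_probs) / n_models for c in range(n_classes)
    ]
    needs_review = max(soft_label) < agreement_threshold
    return soft_label, needs_review
```

For example, two models voting `[1.0, 0.0]` and `[0.8, 0.2]` yield the soft label `[0.9, 0.1]` with no review needed, while a split vote falls below the threshold and is escalated to a human.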