Zhang Jingyu

Certified Mitigation of Worst-Case LLM Copyright Infringement

Association for Computational Linguistics 2025, Khashabi, Daniel, Marone, Marc, Van Durme, Benjamin, Yu, Jiacan, et al. · 2025

The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement post deployment. This has driven the development of "copyright takedown" methods—post-train…

Jailbreak Distillation: Renewable Safety Benchmarking

Association for Computational Linguistics 2025, Elgohary, Ahmed, Iftekhar, A. S. M., Jackson, Kyle, Khashabi, Daniel, et al. · 2025

Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that "distills" jailbr…