Zhang Jingyu
Certified Mitigation of Worst-Case LLM Copyright Infringement
The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement post deployment. This has driven the development of "copyright takedown" methods—post-train…
Jailbreak Distillation: Renewable Safety Benchmarking
Large language models (LLMs) are rapidly deployed in critical applications, raising urgent needs for robust safety benchmarking. We propose Jailbreak Distillation (JBDistill), a novel benchmark construction framework that "distills" jailbr…