Zhouxing Shi
PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch
Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone for aligning large language models (LLMs). However, its effectiveness depends on high-quality instruction data. Most existing alignment datasets are either private …
Neural Network Verification with Branch-and-Bound for General Nonlinearities
Branch-and-bound (BaB) is among the most effective techniques for neural network (NN) verification. However, existing works on BaB for NN verification have mostly focused on NNs with piecewise linear activations, especially ReLU networks. …
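To make the branch-and-bound idea concrete, here is a minimal sketch of input-domain BaB with interval bounds on a toy two-layer ReLU network. This is a generic illustration of the technique, not the paper's algorithm (which extends BaB beyond piecewise-linear activations); all function names are illustrative.

```python
import numpy as np

def interval_forward(W1, b1, W2, b2, x_l, x_u):
    """Interval bounds of f(x) = W2 @ relu(W1 @ x + b1) + b2 over the box [x_l, x_u]."""
    c, r = (x_l + x_u) / 2, (x_u - x_l) / 2
    c1, r1 = W1 @ c + b1, np.abs(W1) @ r          # affine layer in center/radius form
    h_l, h_u = np.maximum(c1 - r1, 0.0), np.maximum(c1 + r1, 0.0)  # ReLU is monotone
    c2, r2 = (h_l + h_u) / 2, (h_u - h_l) / 2
    y = W2 @ c2 + b2
    rad = np.abs(W2) @ r2
    return y - rad, y + rad

def bab_verify(W1, b1, W2, b2, x_l, x_u, max_splits=1000):
    """Try to verify f(x) > 0 on the box by recursively splitting the widest input dim."""
    queue = [(x_l, x_u)]
    splits = 0
    while queue and splits < max_splits:
        l, u = queue.pop()
        f_l, _ = interval_forward(W1, b1, W2, b2, l, u)
        if f_l.min() > 0:
            continue  # this subdomain is verified
        splits += 1
        d = int(np.argmax(u - l))          # branch on the widest dimension
        mid = (l[d] + u[d]) / 2
        for lo, hi in ((l[d], mid), (mid, u[d])):
            nl, nu = l.copy(), u.copy()
            nl[d], nu[d] = lo, hi
            queue.append((nl, nu))
    return not queue  # True iff every subdomain was verified within the budget
```

On networks where a single interval bound is too loose (e.g. two neurons that cancel each other), splitting the domain tightens the bounds until verification succeeds, which is exactly why BaB is effective.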
SoundnessBench: A Soundness Benchmark for Neural Network Verifiers
Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and b…
Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control
We study the problem of learning verifiably Lyapunov-stable neural controllers that provably satisfy the Lyapunov asymptotic stability condition within a region-of-attraction (ROA). Unlike previous works that adopted counterexample-guided …
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation
Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control. However, formal (Lyapunov) stability guarantees over the region-of-attraction (ROA) for NN co…
Defending LLMs against Jailbreaking Attacks via Backtranslation
Although many large language models (LLMs) have been trained to refuse harmful requests, they are still vulnerable to jailbreaking attacks which rewrite the original prompt to conceal its harmful intent. In this paper, we propose a new met…
Red Teaming Language Model Detectors with Language Models
The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive usage of LLMs, recent work has proposed algorithms to d…
Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring
The strong general capabilities of Large Language Models (LLMs) bring potential ethical risks if they are unrestrictedly accessible to malicious users. Token-level watermarking inserts watermarks in the generated texts by altering the toke…
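The token-level watermarking the abstract refers to can be sketched with the standard green-list scheme: hash the previous token to seed a pseudorandom partition of the vocabulary, bias the "green" half at generation time, and detect by counting green tokens. This is a minimal illustration of the baseline mechanism, not the paper's word-importance-scoring modification; the toy vocabulary size and parameter defaults are assumptions.

```python
import hashlib
import numpy as np

VOCAB_SIZE = 50  # toy vocabulary, for illustration only

def green_list(prev_token, gamma=0.5, vocab_size=VOCAB_SIZE):
    """Seed a PRNG with the previous token to pick a 'green' fraction gamma of the vocab."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def watermark_logits(logits, prev_token, delta=2.0, gamma=0.5):
    """Add a bias delta to green-list tokens before sampling (token-level watermark)."""
    out = logits.copy()
    for t in green_list(prev_token, gamma, len(logits)):
        out[t] += delta
    return out

def detect(tokens, gamma=0.5):
    """z-score of the green-token count; a large z suggests watermarked text."""
    hits = sum(t in green_list(p, gamma) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - gamma * n) / np.sqrt(gamma * (1 - gamma) * n)
```

Because the bias shifts which tokens are chosen, it can degrade text quality; weighting the bias by each token's importance (the direction this paper takes) is one way to trade detectability against quality.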
Effective Robustness against Natural Distribution Shifts for Models with Different Training Data
"Effective robustness" measures the extra out-of-distribution (OOD) robustness beyond what can be predicted from the in-distribution (ID) performance. Existing effective robustness evaluations typically use a single test set such as ImageN…
Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation
Lipschitz constants are connected to many properties of neural networks, such as robustness, fairness, and generalization. Existing methods for computing Lipschitz constants either produce relatively loose upper bounds or are limited to sm…
On the Convergence of Certified Robust Training with Interval Bound Propagation
Interval Bound Propagation (IBP) is so far the basis of state-of-the-art methods for training neural networks with certifiable robustness guarantees in the presence of potential adversarial perturbations, while the convergence of IBP training rem…
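For readers unfamiliar with IBP, the propagation rule itself is simple: push an axis-aligned box through each layer in center/radius form. The sketch below shows the standard rules for an affine layer and ReLU; it illustrates the mechanism the paper analyzes, not its convergence results.

```python
import numpy as np

def ibp_affine(W, b, l, u):
    """Propagate the interval [l, u] through x -> W @ x + b using center/radius form."""
    c, r = (l + u) / 2, (u - l) / 2
    c2 = W @ c + b
    r2 = np.abs(W) @ r  # |W| gives the worst-case radius in each output coordinate
    return c2 - r2, c2 + r2

def ibp_relu(l, u):
    """ReLU is elementwise monotone, so intervals map through it elementwise."""
    return np.maximum(l, 0.0), np.maximum(u, 0.0)
```

Certified training minimizes a loss on the worst-case logits implied by these bounds, so the bounds must be sound: every network output reachable from the input box must lie inside the propagated interval.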
On the Sensitivity and Stability of Model Interpretations in NLP
Recent years have witnessed the emergence of a variety of post-hoc interpretations that aim to uncover how natural language processing (NLP) models make predictions. Despite the surge of new interpretation methods, it remains an open probl…
On the Faithfulness Measurements for Model Interpretations.
Recent years have witnessed the emergence of a variety of post-hoc interpretations that aim to uncover how natural language processing (NLP) models make predictions. Despite the surge of new interpretations, it remains an open problem how …
Fast Certified Robust Training via Better Initialization and Shorter Warmup.
Recently, bound propagation based certified adversarial defenses have been proposed for training neural networks with certifiable robustness guarantees. Despite state-of-the-art (SOTA) methods including interval bound propagation (IBP) and …
Fast Certified Robust Training with Short Warmup
Recently, bound propagation based certified robust training methods have been proposed for training neural networks with certifiable robustness guarantees. Despite that state-of-the-art (SOTA) methods including interval bound propagation (…
On the Adversarial Robustness of Visual Transformers
Following the success in advancing natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision. This work provides the first and comprehensive study on the robustness of vision…
On the Adversarial Robustness of Vision Transformers
Following the success in advancing natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision. This work provides a comprehensive study on the robustness of vision transformer…
Robust Text CAPTCHAs Using Adversarial Examples
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a widely used technology to distinguish real users from automated users such as bots. However, the advance of AI technologies weakens many CAPTCHA tests…
Knowledge-Aided Open-Domain Question Answering
Open-domain question answering (QA) aims to find the answer to a question from a large collection of documents. Though many models for single-document machine comprehension have achieved strong performance, there is still much room for impr…
Automatic Perturbation Analysis on General Computational Graphs.
Linear relaxation based perturbation analysis for neural networks, which aims to compute tight linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and certifi…
Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond
Linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and cer…
Provable, Scalable and Automatic Perturbation Analysis on General Computational Graphs
Linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and cer…
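The core primitive in linear relaxation based perturbation analysis is bounding each nonlinearity by two linear functions, which can then be propagated through the network. Below is a sketch of the standard ReLU relaxation used by LiRPA-style methods: given pre-activation bounds [l, u], return coefficients with a_low*x + c_low <= relu(x) <= a_up*x + c_up. This illustrates the relaxation idea only, not the full backward bound propagation; the adaptive lower-slope choice is one common heuristic.

```python
import numpy as np

def relu_linear_relaxation(l, u):
    """Linear bounds for ReLU on the pre-activation interval [l, u].

    Returns (a_low, c_low, a_up, c_up) such that for all x in [l, u]:
        a_low * x + c_low <= relu(x) <= a_up * x + c_up
    """
    if l >= 0:            # always active: ReLU is the identity here
        return 1.0, 0.0, 1.0, 0.0
    if u <= 0:            # always inactive: ReLU is zero here
        return 0.0, 0.0, 0.0, 0.0
    s = u / (u - l)       # slope of the chord through (l, 0) and (u, u)
    a_low = 1.0 if u >= -l else 0.0  # adaptive lower-slope heuristic
    return a_low, 0.0, s, -s * l
```

Propagating such coefficient pairs backward through the weight matrices of every layer yields provable linear bounds on the output neurons, which is what makes the analysis both sound and scalable.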
Robustness Verification for Transformers
Robustness verification that aims to formally certify the prediction behavior of neural networks has become an important tool for understanding model behavior and obtaining safety guarantees. However, previous methods can usually only hand…
Robustness to Modification with Shared Words in Paraphrase Identification
Revealing the robustness issues of natural language processing models and improving their robustness is important to their performance under difficult situations. In this paper, we study the robustness of paraphrase identification models f…
Adversarial Examples with Difficult Common Words for Paraphrase Identification
Revealing the robustness issues of natural language processing models and improving their robustness is important to their performance under difficult situations. In this paper, we study the robustness of paraphrase identification models f…
A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues
Discourse structures are beneficial for various NLP tasks such as dialogue understanding, question answering, sentiment analysis, and so on. This paper presents a deep sequential model for parsing discourse dependency structures of multi-p…