Samson Tan
Learning to Generate Answers with Citations via Factual Consistency Models
Large Language Models (LLMs) frequently hallucinate, impeding their reliability in mission-critical situations. One approach to address this issue is to provide citations to relevant sources alongside generated content, enhancing the verif…
Lessons from the Trenches on Reproducible Evaluation of Language Models
Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the …
Extreme Miscalibration and the Illusion of Adversarial Robustness
Deep learning-based Natural Language Processing (NLP) models are vulnerable to adversarial attacks, where small perturbations can cause a model to misclassify. Adversarial Training (AT) is often used to increase model robustness. However, …
Automatic Feature Fairness in Recommendation via Adversaries
Fairness is a widely discussed topic in recommender systems, but its practical implementation faces challenges in defining sensitive features while maintaining recommendation accuracy. We propose feature fairness as the foundation to achie…
Large Language Models of Code Fail at Completing Code with Potential Bugs
Large language models of code (Code-LLMs) have recently brought tremendous advances to code completion, a fundamental feature of programming assistance and code intelligence. However, most existing works ignore the possible presence of bug…
NL-Augmenter 🦎 → 🐍 A Framework for Task-Sensitive Natural Language Augmentation
Data augmentation is an important method for evaluating the robustness of NLP models and for enhancing the diversity of their training data. In this paper, we present NL-Augmenter, a new participatory Python-based n…
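To make the framework's design concrete, here is a minimal, self-contained sketch of a transformation written in the style of NL-Augmenter's SentenceOperation interface; the base class is re-declared locally for illustration, so the snippet does not depend on the library's actual import paths or registration machinery.

```python
# A minimal, self-contained sketch in the style of NL-Augmenter's
# SentenceOperation interface (the real library's import paths and
# registration machinery are omitted; names here are illustrative).
from typing import List


class SentenceOperation:
    """Base class: a transformation maps one sentence to candidate variants."""

    def generate(self, sentence: str) -> List[str]:
        raise NotImplementedError


class NumberToWords(SentenceOperation):
    """Toy task-sensitive augmentation: spell out small digits."""

    WORDS = {"0": "zero", "1": "one", "2": "two", "3": "three"}

    def generate(self, sentence: str) -> List[str]:
        tokens = [self.WORDS.get(tok, tok) for tok in sentence.split()]
        return [" ".join(tokens)]


print(NumberToWords().generate("I have 2 cats and 1 dog"))
# ['I have two cats and one dog']
```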
TraVLR: Now You See It, Now You Don’t! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning
Numerous visio-linguistic (V+L) representation learning methods have been developed, yet existing datasets do not adequately evaluate the extent to which they represent visual and linguistic concepts in a unified space. We propose several …
ReCode: Robustness Evaluation of Code Generation Models
Code generation models have achieved impressive performance. However, they tend to be brittle as slight edits to a prompt could lead to very different generations; these robustness properties, critical for user experience when deployed in …
Shiqi Wang, Zheng Li, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang. Proceedings of the 61st Annual Meet…
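As an illustration of the kind of perturbation such a benchmark can apply (a sketch, not ReCode's actual implementation), the snippet below consistently renames one variable in a Python prompt; a robust model should complete both versions equivalently.

```python
# Sketch of one perturbation family: consistent variable renaming in a
# partial-code prompt (illustrative only, not ReCode's implementation).
import ast


class RenameVariable(ast.NodeTransformer):
    """Renames every occurrence of `old` to `new` in a Python AST."""

    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if node.arg == self.old:
            node.arg = self.new
        return node


def perturb(source: str, old: str, new: str) -> str:
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


prompt = "def add(a, b):\n    return a + b"
print(perturb(prompt, "a", "x"))
# def add(x, b):
#     return x + b
```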
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems
We present BotSIM, a data-efficient end-to-end Bot SIMulation toolkit for commercial text-based task-oriented dialog (TOD) systems. BotSIM consists of three major components: 1) a Generator that can infer semantic-level dialog acts and ent…
Whodunit? Learning to Contrast for Authorship Attribution
Authorship attribution is the task of identifying the author of a given text. The key is finding representations that can differentiate between authors. Existing approaches typically use manually designed features that capture a dataset's …
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems
We present BotSIM, a data-efficient end-to-end Bot SIMulation framework for commercial task-oriented dialog (TOD) systems. BotSIM consists of three major components: 1) a Generator that can infer semantic-level dialog acts and entities fro…
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
What are the units of text that we want to model? From bytes to multi-word expressions, text can be analyzed and generated at many granularities. Until recently, most natural language processing (NLP) models operated over words, treating t…
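As one concrete point on that spectrum of granularities, the toy snippet below runs a few byte-pair encoding (BPE) merges, the classic subword scheme the survey discusses; it is a pedagogical sketch, not a production tokenizer.

```python
# Toy byte-pair encoding (BPE) merges: repeatedly fuse the most
# frequent adjacent symbol pair in a word-frequency table.
from collections import Counter


def most_frequent_pair(words):
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]


def merge_pair(words, pair):
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])  # fuse the pair
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged


# Words split into characters, with corpus frequencies.
vocab = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
for _ in range(3):  # learn three merges
    pair = most_frequent_pair(vocab)
    vocab = merge_pair(vocab, pair)
    print("merged", pair, "->", list(vocab))
```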
Interpreting the Robustness of Neural NLP Models to Textual Perturbations
Modern Natural Language Processing (NLP) models are known to be sensitive to input perturbations and their performance can decrease when applied to real-world, noisy data. However, it is still unclear why models are less robust to some per…
10.18653/v1/2022.findings-acl.315
Causally Estimating the Sensitivity of Neural NLP Models to Spurious Features
Recent work finds that modern natural language processing (NLP) models rely on spurious features for prediction. Mitigating such effects is thus important. Despite this need, there is no quantitative measure to evaluate or compare the effect…
Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots
Multilingual models have demonstrated impressive cross-lingual transfer performance. However, test sets like XNLI are monolingual at the example level. In multilingual communities, it is common for polyglots to code-mix when conversing wit…
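To sketch the core idea (this toy is illustrative, not the paper's actual attack), the snippet below greedily swaps words for dictionary translations whenever the swap lowers a victim classifier's confidence; the lexicon and the victim model are stand-ins.

```python
# Toy sketch of an adversarial code-mixing search (illustrative only):
# greedily swap words for translations whenever the swap lowers the
# victim model's confidence in the correct label.
from typing import Callable, Dict, List

# Stand-in English -> Malay dictionary; a real attack would draw on
# bilingual lexicons for many language pairs.
LEXICON: Dict[str, str] = {"eat": "makan", "now": "sekarang", "can": "boleh"}


def code_mix_attack(
    sentence: str,
    confidence: Callable[[str], float],  # victim's P(correct label | text)
) -> str:
    tokens: List[str] = sentence.split()
    best = confidence(sentence)
    for i, tok in enumerate(tokens):
        if tok.lower() in LEXICON:
            candidate = tokens.copy()
            candidate[i] = LEXICON[tok.lower()]
            score = confidence(" ".join(candidate))
            if score < best:  # keep swaps that hurt the model
                tokens, best = candidate, score
    return " ".join(tokens)


# Dummy victim: pretends to be less confident on mixed-language input.
dummy = lambda text: 1.0 / (1 + sum(t in LEXICON.values() for t in text.split()))
print(code_mix_attack("I can eat now", dummy))  # 'I boleh makan sekarang'
```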
Robustness Gym: Unifying the NLP Evaluation Landscape
Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems. Consequently, recent research has focused on testing the robustness of such models, resulting in a diverse s…
Mind Your Inflections! Improving NLP for Non-Standard English with Base-Inflection Encoding
Morphological inflection is a process of word formation where base words are modified to express different grammatical categories such as tense, case, voice, person, or number. World Englishes, such as Colloquial Singapore English (CSE) an…
Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding
Inflectional variation is a common feature of World Englishes such as Colloquial Singapore English and African American Vernacular English. Although comprehension by human readers is usually unimpaired by non-standard inflections, current …
10.18653/v1/2020.emnlp-main.455
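The encoding idea can be sketched in a few lines (illustrative only; the paper uses a real morphological analyzer rather than a lookup table): each word is rewritten as its base form plus a separate inflection symbol, and the inflection is reapplied on decoding.

```python
# Toy sketch of base-inflection encoding (illustrative; the lookup
# tables below stand in for a real morphological analyzer).
ANALYZE = {"dogs": ("dog", "NNS"), "ate": ("eat", "VBD"), "dog": ("dog", "NN")}
INFLECT = {("dog", "NNS"): "dogs", ("eat", "VBD"): "ate", ("dog", "NN"): "dog"}


def encode(sentence: str) -> list:
    out = []
    for word in sentence.split():
        base, tag = ANALYZE.get(word, (word, None))
        out.append(base)
        if tag:
            out.append(f"<{tag}>")  # inflection carried as a separate symbol
    return out


def decode(tokens: list) -> str:
    words, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i + 1].startswith("<"):
            tag = tokens[i + 1].strip("<>")
            words.append(INFLECT.get((tokens[i], tag), tokens[i]))
            i += 2
        else:
            words.append(tokens[i])
            i += 1
    return " ".join(words)


encoded = encode("the dogs ate")       # ['the', 'dog', '<NNS>', 'eat', '<VBD>']
print(encoded, "->", decode(encoded))  # round-trips to: the dogs ate
```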
It’s Morphin’ Time! Combating Linguistic Discrimination with Inflectional Perturbations
Training on only perfect Standard English corpora predisposes pre-trained neural networks to discriminate against minorities from non-standard linguistic backgrounds (e.g., African American Vernacular English, Colloquial Singapore Engli…
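A toy version of such an inflectional perturbation (illustrative only; the paper derives candidate forms from a real morphological inflection tool, not a hand-written table) replaces a word with another inflection of the same lemma and keeps the variant that most reduces a victim model's confidence:

```python
# Toy sketch of an inflectional perturbation search (illustrative only).
from typing import Callable

# Stand-in inflection table; a real attack enumerates forms with a
# morphological inflection tool.
INFLECTIONS = {"eat": ["eat", "eats", "ate", "eaten", "eating"]}
LEMMA = {w: lemma for lemma, forms in INFLECTIONS.items() for w in forms}


def perturb(sentence: str, confidence: Callable[[str], float]) -> str:
    tokens = sentence.split()
    best_text, best_score = sentence, confidence(sentence)
    for i, tok in enumerate(tokens):
        for form in INFLECTIONS.get(LEMMA.get(tok, ""), []):
            candidate = " ".join(tokens[:i] + [form] + tokens[i + 1 :])
            score = confidence(candidate)
            if score < best_score:  # keep the most damaging inflection
                best_text, best_score = candidate, score
    return best_text


# Dummy victim that is thrown off by the bare form "eat" after "he".
dummy = lambda text: 0.4 if "he eat " in f"{text} " else 0.9
print(perturb("he eats lunch", dummy))  # 'he eat lunch'
```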