Raphael Tang
Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation
In arena-style evaluation of large language models (LLMs), two LLMs respond to a user query, and the user chooses the winning response or deems the "battle" a draw, resulting in an adjustment to the ratings of both models. The prevailing a…
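The rating adjustment described here follows the usual arena convention of Elo-style updates, where a draw is scored as half a win for each model. Below is a minimal Python sketch of that conventional update; the rating values and K-factor are illustrative and are not taken from the paper.

# Minimal Elo-style rating update for an arena battle; a draw counts as 0.5.
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of model A against model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a: float, r_b: float, outcome: float, k: float = 32.0):
    """outcome: 1.0 if A wins, 0.0 if B wins, 0.5 for a draw."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (outcome - e_a)
    new_b = r_b + k * ((1.0 - outcome) - (1.0 - e_a))
    return new_a, new_b

# A draw between a higher- and a lower-rated model moves their ratings closer:
print(update(1200.0, 1000.0, 0.5))  # the stronger model loses points, the weaker gains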
Geo-R1: Unlocking VLM Geospatial Reasoning with Cross-View Reinforcement Learning
We introduce Geo-R1, a reasoning-centric post-training framework that unlocks geospatial reasoning in vision-language models by combining thinking scaffolding and elevating. In the scaffolding stage, Geo-R1 instills a "geospatial thinking…
Lost in Embeddings: Information Loss in Vision-Language Models
Vision-language models (VLMs) often process visual inputs through a pretrained vision encoder, followed by a projection into the language model's embedding space via a connector component. While crucial for modality fusion, the potential …
WhisTLE: Deeply Supervised, Text-Only Domain Adaptation for Pretrained Speech Recognition Transformers
Pretrained automatic speech recognition (ASR) models such as Whisper perform well but still need domain adaptation to handle unseen vocabulary and parlance. In many real-world settings, collecting speech data is impractical, necessitating …
Geospatial Foundational Embedder: Top-1 Winning Solution on EarthVision Embed2Scale Challenge (CVPR 2025)
The EarthVision Embed2Scale challenge (CVPR 2025) aims to develop foundational geospatial models to embed SSL4EO-S12 hyperspectral geospatial data cubes into embedding vectors that facilitate various downstream tasks, e.g., classification, r…
Multilingual Language Model Pretraining using Machine-translated Data
High-resource languages such as English enable the pretraining of high-quality large language models (LLMs). The same cannot be said for most other languages, as LLMs still underperform for non-English languages, likely due to a gap in t…
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language
English, as a very high-resource language, enables the pretraining of high-quality large language models (LLMs). The same cannot be said for most other languages, as leading LLMs still underperform for non-English languages, likely due to …
View article: "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time
"Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time Open
Customer service is how companies interface with their customers. It can contribute heavily towards the overall customer satisfaction. However, high-quality service can become expensive, creating an incentive to make it as cost efficien…
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture
Food is a rich and varied dimension of cultural heritage, crucial to both individuals and social groups. To bridge the gap in the literature on the often-overlooked regional diversity in this domain, we introduce FoodieQA, a manually curat…
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation
Diffusion models are the state of the art in text-to-image generation, but their perceptual variability remains understudied. In this paper, we examine how prompts affect image variability in black-box diffusion-based models. We propose W1…
Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning
Recent advances in retrieval-augmented models for image captioning highlight the benefit of retrieving related captions for efficient, lightweight models with strong domain-transfer capabilities. While these models demonstrate the success …
Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models
Listwise rerankers based on large language models (LLM) are the zero-shot state-of-the-art. However, current works in this direction all depend on the GPT models, making it a single point of failure in scientific reproducibility. Moreover,…
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond? To bypass their refusal to "speak," we study this research question by probing contextualized embeddings and exploring whether this bias is…
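As a rough illustration of what probing contextualized embeddings can look like, the sketch below fits a linear probe on placeholder hidden-state vectors. The array shapes, the binary labels, and the use of scikit-learn's LogisticRegression are assumptions made for illustration, not the paper's setup.

# A minimal linear-probe sketch over (placeholder) contextualized embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 768))   # stand-in for extracted hidden states
labels = rng.integers(0, 2, size=200)      # stand-in for preferred vs. dispreferred

probe = LogisticRegression(max_iter=1000).fit(embeddings[:150], labels[:150])
print("probe accuracy:", probe.score(embeddings[150:], labels[150:]))
# Held-out accuracy well above chance would indicate the preference is linearly decodable.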
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models
Large language models (LLMs) exhibit positional bias in how they use context, which especially complicates listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over ranking list outputs of …
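A minimal sketch of the shuffle-and-aggregate idea: rank the same list under several input orders and combine the resulting rankings. The aggregation by mean position and the rank_with_llm callable are illustrative stand-ins, not necessarily the paper's exact aggregation or model interface.

# Permutation self-consistency sketch: shuffle inputs, collect rankings, aggregate.
import random
from statistics import mean

def permutation_self_consistency(items, rank_with_llm, num_shuffles=5, seed=0):
    rng = random.Random(seed)
    positions = {item: [] for item in items}
    for _ in range(num_shuffles):
        shuffled = items[:]
        rng.shuffle(shuffled)
        ranking = rank_with_llm(shuffled)      # hypothetical call; returns items, best first
        for pos, item in enumerate(ranking):
            positions[item].append(pos)
    # Aggregate: order items by their average position across permutations.
    return sorted(items, key=lambda item: mean(positions[item]))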
“Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors
Deep neural networks (DNNs) are often used for text classification due to their high accuracy. However, DNNs can be computationally intensive, requiring millions of parameters and large amounts of labeled data, which can make them expensiv…
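Assuming the compressor-based method pairs gzip compression lengths with the normalized compression distance (NCD) and a k-nearest-neighbor vote, a minimal sketch looks like the following; the toy training pairs are made up for illustration.

# Parameter-free text classification with gzip: NCD distance plus a kNN vote.
import gzip
from collections import Counter

def clen(s: str) -> int:
    return len(gzip.compress(s.encode("utf-8")))

def ncd(x: str, y: str) -> float:
    cx, cy, cxy = clen(x), clen(y), clen(x + " " + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(query: str, train, k: int = 3) -> str:
    neighbors = sorted(train, key=lambda pair: ncd(query, pair[0]))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [("the team won the match", "sports"),
         ("stocks fell sharply today", "finance"),
         ("the striker scored twice", "sports"),
         ("the central bank raised rates", "finance")]
print(classify("the goalkeeper saved a penalty", train, k=3))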
Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers
There exists a wide variety of efficiency methods for natural language processing (NLP) tasks, such as pruning, distillation, dynamic inference, quantization, etc. From a different perspective, we can consider an efficiency method as an op…
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, Ferhan Ture. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 20…
Less is More: Parameter-Free Text Classification with Gzip
Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive with billions of parameters and large amounts of labeled data, which c…
SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale
End-to-end automatic speech recognition systems represent the state of the art, but they rely on thousands of hours of manually annotated speech for training, as well as heavyweight computation for inference. Of course, this impedes commer…
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Large-scale diffusion neural networks represent a substantial milestone in text-to-image generation, but they remain poorly understood, lacking interpretability analyses. In this paper, we perform a text-image attribution analysis on Stabl…
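A schematic sketch of token-level attribution from cross-attention maps, assuming maps of shape (heads, height, width, tokens) have been collected during denoising. The shapes, the nearest-neighbor upscaling, and the function name are illustrative assumptions, not the released DAAM implementation.

# Aggregate a token's cross-attention over heads, layers, and timesteps into a heat map.
import numpy as np

def aggregate_attribution(attn_maps, token_index: int, out_size: int = 64) -> np.ndarray:
    """attn_maps: list of arrays, one per (layer, timestep), shaped (heads, h, w, tokens)."""
    heat = np.zeros((out_size, out_size))
    for attn in attn_maps:
        token_map = attn[..., token_index].mean(axis=0)      # average over heads -> (h, w)
        scale = out_size // token_map.shape[0]                # assumes out_size divisible by h
        heat += np.kron(token_map, np.ones((scale, scale)))   # nearest-neighbor upscale
    return heat / heat.max()                                  # normalize for visualization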
Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers
There exists a wide variety of efficiency methods for natural language processing (NLP) tasks, such as pruning, distillation, dynamic inference, quantization, etc. We can consider an efficiency method as an operator applied on a model. Nat…
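To make the operator view concrete, the toy sketch below applies two illustrative "operators" (magnitude pruning and uniform quantization) to a weight matrix in both orders and checks whether the results agree. These toy operators are assumptions for illustration, not the operators studied in the paper.

# Efficiency methods as operators on a model: does their order of application matter?
import numpy as np

def prune(w: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def quantize(w: np.ndarray, levels: int = 16) -> np.ndarray:
    scale = np.abs(w).max() / (levels / 2)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
a = quantize(prune(w))    # prune, then quantize
b = prune(quantize(w))    # quantize, then prune
print(np.allclose(a, b))  # the two orders need not commute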
SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale
Raphael Tang, Karun Kumar, Gefei Yang, Akshat Pandey, Yajie Mao, Vladislav Belyaev, Madhuri Emmadi, Craig Murray, Ferhan Ture, Jimmy Lin. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Trac…
Voice Query Auto Completion
Query auto completion (QAC) is the task of predicting a search engine user’s final query from their intermediate, incomplete query. In this paper, we extend QAC to the streaming voice search setting, where automatic speech recognition syst…
BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression
The slow speed of BERT has motivated much research on accelerating its inference, and the early exiting idea has been proposed to make trade-offs between model quality and efficiency. This paper aims to address two weaknesses of previous w…
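A minimal sketch of the early exiting idea, assuming one exit classifier per layer and a confidence threshold. The layers and classifiers arguments are hypothetical callables standing in for BERT blocks and their exit heads, not the BERxiT code.

# Early exiting: stop at the first layer whose exit head is confident enough.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(x, layers, classifiers, threshold=0.9):
    hidden = x
    for i, (layer, clf) in enumerate(zip(layers, classifiers)):
        hidden = layer(hidden)
        probs = softmax(clf(hidden))
        if probs.max() >= threshold:           # confident enough: exit early
            return int(probs.argmax()), i + 1  # prediction and number of layers used
    return int(probs.argmax()), len(layers)    # otherwise fall through to the last layer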
The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing
Ji Xin, Raphael Tang, Yaoliang Yu, Jimmy Lin. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.