Philipp Koehn
HadaSmileNet: Hadamard fusion of handcrafted and deep-learning features for enhancing facial emotion recognition of genuine smiles
The distinction between genuine and posed emotions represents a fundamental pattern recognition challenge with significant implications for data mining applications in social sciences, healthcare, and human-computer interaction. While rece…
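As a quick illustration of the fusion operation named in the title, here is a minimal sketch of Hadamard (element-wise) fusion of a handcrafted feature vector with a deep feature vector. All dimensions, layer names, and the classifier head are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of Hadamard feature fusion; dimensions are placeholders.
import torch
import torch.nn as nn

class HadamardFusion(nn.Module):
    def __init__(self, handcrafted_dim=68, deep_dim=512, fused_dim=128, n_classes=2):
        super().__init__()
        # Project both feature types into a shared space so the
        # element-wise (Hadamard) product is well defined.
        self.proj_hand = nn.Linear(handcrafted_dim, fused_dim)
        self.proj_deep = nn.Linear(deep_dim, fused_dim)
        self.classifier = nn.Linear(fused_dim, n_classes)

    def forward(self, handcrafted, deep):
        h = torch.relu(self.proj_hand(handcrafted))
        d = torch.relu(self.proj_deep(deep))
        fused = h * d  # Hadamard (element-wise) product fuses the two views
        return self.classifier(fused)

model = HadamardFusion()
logits = model(torch.randn(4, 68), torch.randn(4, 512))  # genuine vs. posed
```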
Preliminary Ranking of WMT25 General Machine Translation Systems
We present the preliminary rankings of machine translation (MT) systems submitted to the WMT25 General Machine Translation Shared Task, as determined by automatic evaluation metrics. Because these rankings are derived from automatic evalua…
HiMATE: A Hierarchical Multi-Agent Framework for Machine Translation Evaluation
The advancement of Large Language Models (LLMs) enables flexible and interpretable automatic evaluations. In the field of machine translation evaluation, utilizing LLMs with translation error annotations based on Multidimensional Quality M…
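For context on the MQM-based error annotations mentioned above, a hedged sketch of severity-weighted MQM-style scoring follows. The severity weights reflect one common convention and are an assumption; HiMATE's actual scoring scheme may differ.

```python
# Sketch of MQM-style scoring from error annotations (lower is better).
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights

def mqm_score(errors):
    """Sum severity-weighted penalties over annotated translation errors."""
    return sum(SEVERITY_WEIGHTS[e["severity"]] for e in errors)

errors = [
    {"span": "bank", "category": "mistranslation", "severity": "major"},
    {"span": "the the", "category": "fluency", "severity": "minor"},
]
print(mqm_score(errors))  # 6
```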
Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation
Following last year's edition, we continue to host the WMT shared task on Discourse-Level Literary Translation, now in its second edition. We focus on three language directions: Chinese-English, Chinese-German, and Chinese-R…
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Large language models (LLMs) have achieved remarkable success across various NLP tasks with a focus on English due to English-centric pre-training and limited multilingual data. In this work, we focus on the problem of translation, and whi…
Measuring family influence from the non-family employee perspective: The perceived family influence scale (PFIS)
To further our understanding of family influence in family businesses, this study introduces the Perceived Family Influence Scale (PFIS). Departing from existing owner-centric methodologies, the PFIS uses social constructivism theory to ca…
Preliminary WMT24 Ranking of General MT Systems and LLMs
This is the preliminary ranking of WMT24 General MT systems based on automatic metrics. The official ranking will be a human evaluation, which is superior to the automatic ranking and supersedes it. The purpose of this report is not to int…
Learn and Unlearn: Addressing Misinformation in Multilingual LLMs
This paper investigates the propagation of harmful information in multilingual large language models (LLMs) and evaluates the efficacy of various unlearning methods. We demonstrate that fake information, regardless of the language it is in…
Recovering document annotations for sentence-level bitext
Data availability limits the scope of any given task. In machine translation, historical models were incapable of handling longer contexts, so the lack of document-level datasets was less noticeable. Now, despite the emergence of long-sequ…
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
Non-autoregressive Transformers (NATs) have recently been applied in direct speech-to-speech translation systems, which convert speech across different languages without intermediate text data. Although NATs generate high-quality outputs and off…
Streaming Sequence Transduction through Dynamic Compression
We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor…
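To illustrate the compression idea, a minimal sketch that pools stream segments into compact anchor vectors. STAR's segmentation is dynamic and learned, so the fixed segment length here is purely an assumption made for illustration.

```python
# Illustrative only: compressing a stream by mean-pooling fixed segments.
import torch

def compress_stream(frames: torch.Tensor, segment_len: int = 8) -> torch.Tensor:
    """frames: (T, d) -> anchors: (ceil(T / segment_len), d) via mean pooling.

    The final segment is zero-padded, which slightly biases its anchor; a
    real system would pool only over valid frames.
    """
    T, d = frames.shape
    pad = (-T) % segment_len
    if pad:
        frames = torch.cat([frames, frames.new_zeros(pad, d)])
    return frames.view(-1, segment_len, d).mean(dim=1)

anchors = compress_stream(torch.randn(50, 256))
print(anchors.shape)  # torch.Size([7, 256])
```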
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LL…
Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
Translating literary works has perennially stood as an elusive dream in machine translation (MT), a journey steeped in intricate challenges. To foster progress in this domain, we hold a new shared task at WMT 2023, the first edition of the…
Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles
Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations a…
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
Text generation models are notoriously vulnerable to errors in the training data. With massive amounts of web-crawled data becoming more commonplace, how can we enhance the robustness of models trained on a …
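A hedged sketch of the error-norm-truncation idea: skip the loss on tokens whose predicted distribution is far (in L2 norm) from the one-hot target, on the assumption that such tokens are likely noise. The threshold and masking details below are illustrative, not the paper's exact recipe.

```python
# Sketch: truncate loss on tokens with large error norm.
import torch
import torch.nn.functional as F

def ent_loss(logits, targets, threshold=1.2):
    # logits: (N, V); targets: (N,)
    probs = F.softmax(logits, dim=-1)
    one_hot = F.one_hot(targets, num_classes=logits.size(-1)).float()
    error_norm = (probs - one_hot).norm(p=2, dim=-1)   # lies in [0, sqrt(2)]
    keep = (error_norm <= threshold).float()           # drop likely-noisy tokens
    token_loss = F.cross_entropy(logits, targets, reduction="none")
    return (token_loss * keep).sum() / keep.sum().clamp(min=1.0)

loss = ent_loss(torch.randn(16, 32000), torch.randint(0, 32000, (16,)))
```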
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules
Incorporating language-specific (LS) modules is a proven method to boost performance in multilingual machine translation. This approach bears similarity to Mixture-of-Experts (MoE) because it does not inflate FLOPs. However, the scalabilit…
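An illustrative sketch of hard-routed language-specific (LS) modules: one small feed-forward module per language, selected by language ID, so per-token FLOPs stay constant as languages are added (the MoE-like property noted above). The paper's lightweight parameterization differs; this only shows the routing idea, and all sizes are assumptions.

```python
# Sketch: per-language feed-forward adapters with hard routing by language ID.
import torch
import torch.nn as nn

class LSFeedForward(nn.Module):
    def __init__(self, d_model=512, d_ff=128, n_langs=4):
        super().__init__()
        self.modules_per_lang = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_langs)
        )

    def forward(self, x, lang_id: int):
        # Only one module runs per batch, so FLOPs do not grow with n_langs.
        return x + self.modules_per_lang[lang_id](x)  # residual LS adapter

layer = LSFeedForward()
out = layer(torch.randn(2, 10, 512), lang_id=1)
```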
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
We introduce and demonstrate how to effectively train multilingual machine translation models with pixel representations. We experiment with two different data settings with a variety of language and script coverage, demonstrating improved…
Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA
While large language models have made remarkable advancements in natural language generation, their potential in machine translation, especially when fine-tuned, remains under-explored. In our study, we conduct comprehensive experiments, e…
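For the fine-tuning setting named in the title, a hedged sketch of a QLoRA setup with Hugging Face transformers and peft: the base model is 4-bit-quantized and only low-rank adapters are trained. The model name, rank, and target modules are placeholders, not necessarily the paper's configuration, and running this requires a GPU with bitsandbytes installed.

```python
# Sketch: QLoRA = 4-bit quantized base model + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder base model
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed adapter placement
))
model.print_trainable_parameters()  # only the low-rank adapters are trained
```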