Peiqin Lin
Why Do More Experts Fail? A Theoretical Analysis of Model Merging
Model merging dramatically reduces storage and computational resources by combining multiple expert models into a single multi-task model. Although recent model merging methods have shown promising results, they struggle to maintain perfor…
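For context, the simplest merging baseline this line of work builds on is plain parameter averaging of expert checkpoints that share one architecture. The sketch below illustrates that generic setting only; the function name and checkpoint paths are illustrative, and this is not the paper's own method.

```python
# Minimal sketch: uniform (or weighted) parameter averaging of expert checkpoints.
import torch

def average_merge(state_dicts, weights=None):
    """Merge expert state dicts by (weighted) parameter averaging."""
    n = len(state_dicts)
    if weights is None:
        weights = [1.0 / n] * n  # uniform averaging
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage (hypothetical checkpoint paths):
# experts = [torch.load(p, map_location="cpu") for p in ["expert_a.pt", "expert_b.pt"]]
# model.load_state_dict(average_merge(experts))
```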
SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation
Large language models (LLMs) have transformed code generation. However, most existing approaches focus on mainstream languages such as Python and Java, neglecting the Solidity language, the predominant programming language for Ethereum sma…
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu
In-context machine translation (MT) with large language models (LLMs) is a promising approach for low-resource MT, as it can readily take advantage of linguistic resources such as grammar books and dictionaries. Such resources are usually …
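A rough sketch of how dictionary and example resources can be injected into a translation prompt is shown below; the dictionary lookup, example formatting, and wording are assumptions for illustration, not the paper's exact prompt.

```python
# Minimal sketch of dictionary-augmented in-context MT prompting.
def build_prompt(src_sentence, bilingual_dict, parallel_examples):
    lines = ["Translate from Manchu to English.", "", "Relevant dictionary entries:"]
    for word in src_sentence.split():
        if word in bilingual_dict:
            lines.append(f"  {word} -> {bilingual_dict[word]}")
    lines.append("")
    lines.append("Examples:")
    for src, tgt in parallel_examples:
        lines.append(f"  Manchu: {src}\n  English: {tgt}")
    lines.append("")
    lines.append(f"Manchu: {src_sentence}\nEnglish:")
    return "\n".join(lines)
```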
SSMLoRA: Enhancing Low-Rank Adaptation with State Space Model
Fine-tuning is a key approach for adapting language models to specific downstream tasks, but updating all model parameters becomes impractical as model sizes increase. Parameter-Efficient Fine-Tuning (PEFT) methods, such as Low-Rank Adapta…
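For reference, here is a minimal sketch of the standard LoRA layer that SSMLoRA builds on; the initialization constants and class name are illustrative, and the state-space extension itself is not shown.

```python
# Minimal sketch of a LoRA-adapted linear layer: frozen base weight plus a
# trainable low-rank update B @ A, scaled by alpha / r.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # frozen base projection + low-rank trainable update
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```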
GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models
SolEval: Benchmarking Large Language Models for Repository-level Solidity Smart Contract Generation
Construction-Based Reduction of Translationese for Low-Resource Languages: A Pilot Study on Bavarian
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
In this work, we introduce EMMA-500, a large-scale multilingual language model continually trained on texts across 546 languages and designed for enhanced multilingual performance, focusing on improving language coverage for low-resource language…
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, and general-purpose tasks, e.g., text cla…
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of English. However, adapting these metho…
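A minimal sketch of similarity-based example retrieval, the generic idea behind such retrievers, is shown below; the embeddings are assumed to be precomputed, and XAMPLER additionally trains the retriever rather than using fixed vectors.

```python
# Minimal sketch: pick the k candidate examples most similar to the query
# under cosine similarity of their embeddings.
import numpy as np

def retrieve_examples(query_vec, candidate_vecs, candidates, k=4):
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity to the query
    top = np.argsort(-scores)[:k]
    return [candidates[i] for i in top]
```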
MaLA-500: Massive Language Adaptation of Large Language Models
Large language models (LLMs) have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource l…
Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark
We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate a…
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining. However, this method usua…
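One common, simple way to initialize embeddings for newly added subwords is a similarity-weighted average of existing embeddings. The sketch below illustrates only that generic idea and differs from OFA's factorized initialization in detail; all helper names and inputs are assumptions.

```python
# Minimal sketch: initialize a new subword's embedding as a softmax-weighted
# average of the source model's embeddings for its most similar subwords,
# where similarity comes from external (e.g. word-vector) representations.
import numpy as np

def init_new_embedding(new_subword_vec, src_subword_vecs, src_embeddings, top_k=10):
    sims = src_subword_vecs @ new_subword_vec
    sims /= (np.linalg.norm(src_subword_vecs, axis=1) * np.linalg.norm(new_subword_vec) + 1e-9)
    top = np.argsort(-sims)[:top_k]
    weights = np.exp(sims[top]) / np.exp(sims[top]).sum()   # softmax over top-k
    return weights @ src_embeddings[top]
```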
Universal NER v1
This is the first release of Universal NER: UNER v1
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining. It remains an open question whether it is feasible to employ mPLMs to …
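A minimal sketch of one way to derive language-to-language similarity from an mPLM, by comparing mean-pooled sentence embeddings per language, is given below; the embed() helper, layer choice, and pooling are assumptions, not mPLM-Sim's exact recipe.

```python
# Minimal sketch: per-language embedding centroids, compared by cosine similarity.
import numpy as np

def language_similarity(sents_by_lang, embed):
    """sents_by_lang: {lang: [sentences]}; embed: maps a list of sentences to an (n, d) array."""
    centroids = {lang: embed(sents).mean(axis=0) for lang, sents in sents_by_lang.items()}
    langs = sorted(centroids)
    sim = np.zeros((len(langs), len(langs)))
    for i, a in enumerate(langs):
        for j, b in enumerate(langs):
            va, vb = centroids[a], centroids[b]
            sim[i, j] = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
    return langs, sim
```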
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
The NLP community has mainly focused on scaling Large Language Models (LLMs) vertically, i.e., making them better for about 100 languages. We instead scale LLMs horizontally: we create, through continued pretraining, Glot500-m, an LLM that…
Modeling Content-Emotion Duality via Disentanglement for Empathetic Conversation
The task of empathetic response generation aims to understand what feelings a speaker expresses about his/her experiences and then to reply to the speaker appropriately. To solve the task, it is essential to model the content-emotion duality of …
Empathetic Response Generation through Graph-based Multi-hop Reasoning on Emotional Causality
Hierarchical Attention Network with Pairwise Loss for Chinese Zero Pronoun Resolution
Recent neural network methods for Chinese zero pronoun resolution did not take bidirectional attention between zero pronouns and candidate antecedents into consideration and simply treated the task as classification, ignoring the re…
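For illustration, a generic margin-based pairwise ranking loss of the kind the title's "pairwise loss" refers to is sketched below, scoring the gold antecedent above negative candidates; the tensor shapes and margin value are assumptions, not the paper's exact formulation.

```python
# Minimal sketch: the gold antecedent should outscore every negative candidate
# by at least a fixed margin.
import torch

def pairwise_ranking_loss(pos_score, neg_scores, margin=0.5):
    """pos_score: scalar tensor for the gold antecedent;
    neg_scores: 1-D tensor of scores for non-antecedent candidates."""
    return torch.clamp(margin - pos_score + neg_scores, min=0.0).mean()
```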
A Shared-Private Representation Model with Coarse-to-Fine Extraction for Target Sentiment Analysis
Target sentiment analysis aims to detect opinion targets along with recognizing their sentiment polarities from a sentence. Some models with span-based labeling have achieved promising results in this task. However, the relation between th…
Deep Mask Memory Network with Semantic Dependency and Context Moment for Aspect Level Sentiment Classification
Aspect level sentiment classification aims at identifying the sentiment of each aspect term in a sentence. Deep memory networks often use location information between context words and the aspect to generate the memory. Although improved result…
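A minimal sketch of position-weighted memory, the standard way such memory networks encode the distance between each context word and the aspect, is given below; the linear weighting scheme is one common choice, not necessarily the paper's.

```python
# Minimal sketch: downweight each context word's embedding by its distance
# to the aspect term before writing it into the memory.
import numpy as np

def location_weighted_memory(word_embs, aspect_index):
    n = len(word_embs)
    memory = []
    for i, emb in enumerate(word_embs):
        w = 1.0 - abs(i - aspect_index) / n   # closer to the aspect -> larger weight
        memory.append(w * emb)
    return np.stack(memory)
```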