Peiqin Lin
Why Do More Experts Fail? A Theoretical Analysis of Model Merging
Model merging dramatically reduces storage and computational resources by combining multiple expert models into a single multi-task model. Although recent model merging methods have shown promising results, they struggle to maintain perfor…
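For context, the simplest merging baseline this line of work builds on is plain parameter averaging of expert checkpoints that share one architecture. The sketch below illustrates that generic setting only; the function name and checkpoint paths are illustrative, and this is not the paper's own method.

```python
# Minimal sketch: uniform (or weighted) parameter averaging of expert checkpoints.
import torch

def average_merge(state_dicts, weights=None):
    """Merge expert state dicts by (weighted) parameter averaging."""
    n = len(state_dicts)
    if weights is None:
        weights = [1.0 / n] * n  # uniform averaging
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage (hypothetical checkpoint paths):
# experts = [torch.load(p, map_location="cpu") for p in ["expert_a.pt", "expert_b.pt"]]
# model.load_state_dict(average_merge(experts))
```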
SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation
Large language models (LLMs) have transformed code generation. However, most existing approaches focus on mainstream languages such as Python and Java, neglecting the Solidity language, the predominant programming language for Ethereum sma…
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu
In-context machine translation (MT) with large language models (LLMs) is a promising approach for low-resource MT, as it can readily take advantage of linguistic resources such as grammar books and dictionaries. Such resources are usually …
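A rough sketch of how dictionary and example resources can be injected into a translation prompt is shown below; the dictionary lookup, example formatting, and wording are assumptions for illustration, not the paper's exact prompt.

```python
# Minimal sketch of dictionary-augmented in-context MT prompting.
def build_prompt(src_sentence, bilingual_dict, parallel_examples):
    lines = ["Translate from Manchu to English.", "", "Relevant dictionary entries:"]
    for word in src_sentence.split():
        if word in bilingual_dict:
            lines.append(f"  {word} -> {bilingual_dict[word]}")
    lines.append("")
    lines.append("Examples:")
    for src, tgt in parallel_examples:
        lines.append(f"  Manchu: {src}\n  English: {tgt}")
    lines.append("")
    lines.append(f"Manchu: {src_sentence}\nEnglish:")
    return "\n".join(lines)
```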
SSMLoRA: Enhancing Low-Rank Adaptation with State Space Model
Fine-tuning is a key approach for adapting language models to specific downstream tasks, but updating all model parameters becomes impractical as model sizes increase. Parameter-Efficient Fine-Tuning (PEFT) methods, such as Low-Rank Adapta…
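For reference, here is a minimal sketch of the standard LoRA layer that SSMLoRA builds on; the initialization constants and class name are illustrative, and the state-space extension itself is not shown.

```python
# Minimal sketch of a LoRA-adapted linear layer: frozen base weight plus a
# trainable low-rank update B @ A, scaled by alpha / r.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # frozen base projection + low-rank trainable update
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```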
GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models
SolEval: Benchmarking Large Language Models for Repository-level Solidity Smart Contract Generation
Construction-Based Reduction of Translationese for Low-Resource Languages: A Pilot Study on Bavarian
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
In this work, we introduce EMMA-500, a large-scale multilingual language model continually trained on texts across 546 languages and designed for enhanced multilingual performance, focusing on improving language coverage for low-resource language…
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, and general-purpose tasks, e.g., text cla…
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving relevant in-context examples tailored to the input query, enhances few-shot in-context learning of English. However, adapting these metho…
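A minimal sketch of similarity-based example retrieval, the generic idea behind such retrievers, is shown below; the embeddings are assumed to be precomputed, and XAMPLER additionally trains the retriever rather than using fixed vectors.

```python
# Minimal sketch: pick the k candidate examples most similar to the query
# under cosine similarity of their embeddings.
import numpy as np

def retrieve_examples(query_vec, candidate_vecs, candidates, k=4):
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q                      # cosine similarity to the query
    top = np.argsort(-scores)[:k]
    return [candidates[i] for i in top]
```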
MaLA-500: Massive Language Adaptation of Large Language Models
Large language models (LLMs) have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource l…
Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark
We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate a…
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining. However, this method usua…
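One common, simple way to initialize embeddings for newly added subwords is a similarity-weighted average of existing embeddings. The sketch below illustrates only that generic idea and differs from OFA's factorized initialization in detail; all helper names and inputs are assumptions.

```python
# Minimal sketch: initialize a new subword's embedding as a softmax-weighted
# average of the source model's embeddings for its most similar subwords,
# where similarity comes from external (e.g. word-vector) representations.
import numpy as np

def init_new_embedding(new_subword_vec, src_subword_vecs, src_embeddings, top_k=10):
    sims = src_subword_vecs @ new_subword_vec
    sims /= (np.linalg.norm(src_subword_vecs, axis=1) * np.linalg.norm(new_subword_vec) + 1e-9)
    top = np.argsort(-sims)[:top_k]
    weights = np.exp(sims[top]) / np.exp(sims[top]).sum()   # softmax over top-k
    return weights @ src_embeddings[top]
```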
Universal NER v1
This is the first release of Universal NER: UNER v1
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Recent multilingual pretrained language models (mPLMs) have been shown to encode strong language-specific signals, which are not explicitly provided during pretraining. It remains an open question whether it is feasible to employ mPLMs to …
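A minimal sketch of one way to derive language-to-language similarity from an mPLM, by comparing mean-pooled sentence embeddings per language, is given below; the embed() helper, layer choice, and pooling are assumptions, not mPLM-Sim's exact recipe.

```python
# Minimal sketch: per-language embedding centroids, compared by cosine similarity.
import numpy as np

def language_similarity(sents_by_lang, embed):
    """sents_by_lang: {lang: [sentences]}; embed: maps a list of sentences to an (n, d) array."""
    centroids = {lang: embed(sents).mean(axis=0) for lang, sents in sents_by_lang.items()}
    langs = sorted(centroids)
    sim = np.zeros((len(langs), len(langs)))
    for i, a in enumerate(langs):
        for j, b in enumerate(langs):
            va, vb = centroids[a], centroids[b]
            sim[i, j] = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
    return langs, sim
```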
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
The NLP community has mainly focused on scaling Large Language Models (LLMs) vertically, i.e., making them better for about 100 languages. We instead scale LLMs horizontally: we create, through continued pretraining, Glot500-m, an LLM that…
Modeling Content-Emotion Duality via Disentanglement for Empathetic Conversation
The task of empathetic response generation aims to understand what feelings a speaker expresses about his/her experiences and then to reply to the speaker appropriately. To solve the task, it is essential to model the content-emotion duality of …
Empathetic Response Generation through Graph-based Multi-hop Reasoning on Emotional Causality
Hierarchical Attention Network with Pairwise Loss for Chinese Zero Pronoun Resolution
Recent neural network methods for Chinese zero pronoun resolution did not take bidirectional attention between zero pronouns and candidate antecedents into consideration and simply treated the task as classification, ignoring the re…
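For illustration, a generic margin-based pairwise ranking loss of the kind the title's "pairwise loss" refers to is sketched below, scoring the gold antecedent above negative candidates; the tensor shapes and margin value are assumptions, not the paper's exact formulation.

```python
# Minimal sketch: the gold antecedent should outscore every negative candidate
# by at least a fixed margin.
import torch

def pairwise_ranking_loss(pos_score, neg_scores, margin=0.5):
    """pos_score: scalar tensor for the gold antecedent;
    neg_scores: 1-D tensor of scores for non-antecedent candidates."""
    return torch.clamp(margin - pos_score + neg_scores, min=0.0).mean()
```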
A Shared-Private Representation Model with Coarse-to-Fine Extraction for Target Sentiment Analysis
Target sentiment analysis aims to detect opinion targets along with recognizing their sentiment polarities from a sentence. Some models with span-based labeling have achieved promising results in this task. However, the relation between th…
Deep Mask Memory Network with Semantic Dependency and Context Moment for Aspect Level Sentiment Classification
Aspect level sentiment classification aims at identifying the sentiment of each aspect term in a sentence. Deep memory networks often use location information between context words and the aspect to generate the memory. Although improved result…
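A minimal sketch of position-weighted memory, the standard way such memory networks encode the distance between each context word and the aspect, is given below; the linear weighting scheme is one common choice, not necessarily the paper's.

```python
# Minimal sketch: downweight each context word's embedding by its distance
# to the aspect term before writing it into the memory.
import numpy as np

def location_weighted_memory(word_embs, aspect_index):
    n = len(word_embs)
    memory = []
    for i, emb in enumerate(word_embs):
        w = 1.0 - abs(i - aspect_index) / n   # closer to the aspect -> larger weight
        memory.append(w * emb)
    return np.stack(memory)
```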