Tiejun Zhao
Beyond Global Emotion: Fine-Grained Emotional Speech Synthesis with Dynamic Word-Level Modulation
Emotional text-to-speech (E-TTS) is central to creating natural and trustworthy human-computer interaction. Existing systems typically rely on sentence-level control through predefined labels, reference audio, or natural language prompts. …
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
The advancement of Large Language Models (LLMs) has spurred significant interest in Role-Playing Agents (RPAs) for applications such as emotional companionship and virtual interaction. However, recent RPAs are often built on explicit dialo…
Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
Speculative decoding and quantization effectively accelerate memory-bound inference of large language models. Speculative decoding mitigates the memory bandwidth bottleneck by verifying multiple tokens within a single forward pass, which i…
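The abstract above describes the core verification step of speculative decoding: a small draft model proposes several tokens, and the target model checks them all in a single forward pass. The toy sketch below illustrates that greedy-acceptance idea only; it is not the paper's framework, and the function name and array shapes are hypothetical.

```python
import numpy as np

def verify_draft(target_logits, draft_tokens):
    """Greedy verification step of speculative decoding (toy sketch).

    target_logits: (k, vocab) array of logits the target model computed
    for the k draft positions in one forward pass (hypothetical shapes).
    draft_tokens: the k tokens proposed by the small draft model.
    Returns the accepted prefix; on the first disagreement, the target
    model's own token is substituted and verification stops.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        best = int(np.argmax(target_logits[i]))
        if best == tok:          # draft agrees with target: accept and continue
            accepted.append(tok)
        else:                    # first mismatch: take the target's token, stop
            accepted.append(best)
            break
    return accepted
```

Because all k positions are scored in one target-model pass, every accepted draft token saves a full autoregressive step, which is where the memory-bandwidth savings come from.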
Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory
The evaluation of large language models (LLMs) via benchmarks is widespread, yet inconsistencies between different leaderboards and poor separability among top models raise concerns about their ability to accurately reflect authentic model…
Empowering LLMs in Task-Oriented Dialogues: A Domain-Independent Multi-Agent Framework and Fine-Tuning Strategy
Task-oriented dialogue systems based on Large Language Models (LLMs) have gained increasing attention across various industries and achieved significant results. Current approaches condense complex procedural workflows into a single agent …
MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training
Complex instruction-following with elaborate constraints is imperative for Large Language Models (LLMs). While existing methods have constructed data for complex instruction alignment, they all rely on a more advanced model, especially GPT…
Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis
The o1-Like LLMs are transforming AI by simulating human cognitive processes, but their performance in multilingual machine translation (MMT) remains underexplored. This study examines: (1) how o1-Like LLMs perform in MMT tasks and (2) wha…
Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
A Knowledge-Fused Maximum Mean Discrepancy for Cross-Lingual Named Entity Recognition
Memory-augmented Query Reconstruction for LLM-based Knowledge Graph Reasoning
Empowering Beyond English-Centric Machine Translation on LLMs by Multilingual Fusion Instruction Tuning
ASMem: Anchor Sparse Memory for Multi-Domain Knowledge Editing of Large Language Models
LLM-based Discriminative Reasoning for Knowledge Graph Question Answering
Large language models (LLMs) based on the generative pre-trained Transformer architecture have achieved remarkable performance on knowledge graph question-answering (KGQA) tasks. However, LLMs often produce ungrounded subgraph planning or reasoning results…
Make Imagination Clearer! Stable Diffusion-based Visual Imagination for Multimodal Machine Translation
Visual information has been introduced for enhancing machine translation (MT), and its effectiveness heavily relies on the availability of large amounts of bilingual parallel sentence pairs with manual image annotations. In this paper, we …
UCFA‐Net: A U‐shaped cross‐fusion network with attention mechanism for enhanced polyp segmentation
Enhancing the precision of computer‐assisted polyp segmentation and delineation during colonoscopies assists in the removal of potentially precancerous tissue, thus reducing the risk of malignant transformation. Most of the current medical…
LLM-based Translation Inference with Iterative Bilingual Understanding
The remarkable understanding and generation capabilities of large language models (LLMs) have greatly improved translation performance. However, incorrect understanding of the sentence to be translated can degrade translation quality. To a…
Mitigating the Bias of Large Language Model Evaluation
Recently, there has been a trend of evaluating Large Language Model (LLM) quality in the style of LLM-as-a-Judge, namely leveraging another LLM to evaluate the current output quality. However, existing judges are proven to be biased, …
Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving
Unlike traditional translation tasks, classical Chinese poetry translation requires both adequacy and fluency in translating culturally and historically significant content, as well as linguistic poetic elegance. Large language models …
STAR: Scale-wise Text-conditioned AutoRegressive image generation
We introduce STAR, a text-to-image model that employs a scale-wise auto-regressive paradigm. Unlike VAR, which is constrained to class-conditioned synthesis for images up to 256×256, STAR enables text-driven image generation up to 1…
DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms
Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is guiding LLMs to generate translation with human-like feedback. However, existing self-reflection…
Morphological Identification Characteristics of Basil (Ocimum spp.) in Tabanan Regency, Bali, Indonesia
Basil (Ocimum spp.) is an aromatic plant and one of the richest essential oil-producing genera in the Lamiaceae family. Owing to its varied phytochemical compounds and secondary metabolites, basil has the potential of medicinal plant germpl…
DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models
A well-executed graphic design typically achieves harmony in two levels, from the fine-grained design elements (color, font and layout) to the overall design. This complexity makes the comprehension of graphic design challenging, for it ne…
Dual Instruction Tuning with Large Language Models for Mathematical Reasoning
Recent advancements highlight the success of instruction tuning with large language models (LLMs) utilizing Chain-of-Thought (CoT) data for mathematical reasoning tasks. Despite the fine-tuned LLMs, challenges persist, such as incorrect, m…
Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving
Most Bilingual Lexicon Induction (BLI) methods retrieve word translation pairs by finding the closest target word for a given source word based on cross-lingual word embeddings (WEs). However, we find that solely retrieving translation fro…
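The baseline the abstract above describes, retrieving the closest target word for each source word in a shared cross-lingual embedding space, can be sketched in a few lines. This is only the standard nearest-neighbor baseline the paper improves upon, not the paper's bi-directional method; the function name, embeddings, and words below are hypothetical.

```python
import numpy as np

def retrieve_translations(src_emb, tgt_emb, tgt_words):
    """Nearest-neighbor BLI baseline (toy sketch).

    For each source word embedding, return the target word whose
    embedding is closest by cosine similarity in a shared cross-lingual
    space. Real systems typically add refinements such as CSLS or the
    bi-directional retrieval discussed in the paper.
    """
    # Normalize rows so the dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T  # (n_src, n_tgt) cosine similarity matrix
    return [tgt_words[i] for i in np.argmax(sims, axis=1)]
```

One known weakness of this one-directional retrieval, which motivates bi-directional approaches, is hubness: a few target words can be the nearest neighbor of many unrelated source words.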
Self-Evaluation of Large Language Model based on Glass-box Features
The proliferation of open-source Large Language Models (LLMs) underscores the pressing need for evaluation methods. Existing works primarily rely on external evaluators, focusing on training and prompting strategies. However, a crucial asp…
Hierarchical Latent Alignment for Non-Autoregressive Generation under High Compression Ratio
Non-autoregressive generation has attracted increasing attention due to its fast decoding speed. Latent alignment objectives, such as CTC, are designed to capture the monotonic alignments between the predicted and output tokens, which h…