Chanjun Park
Code Redteaming: Probing Ethical Sensitivity of LLMs Through Natural Language Embedded in Code
Large language models are increasingly used in code generation and developer tools, yet their robustness to ethically problematic natural language embedded in source code is underexplored. In this work, we study content-safety vulnerabilit…
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
Large Language Models are commonly judged by their scores on standard benchmarks, yet such scores often overstate real capability since they mask the mix of skills a task actually demands. For example, ARC is assumed to test reasoning, whi…
Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning
A sparse Mixture-of-Experts (MoE) architecture has emerged as a highly scalable solution by conditionally activating sub-modules without a proportional increase in computational costs. However, improving expert specialization to enhance pe…
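The conditional activation the abstract describes can be sketched in a minimal top-k gating routine. This is an illustrative toy, not the paper's method: the router scores every expert, activates only the k best, and mixes their outputs by a softmax over the selected scores, so compute grows with k rather than with the total number of experts.

```python
import math
import random

random.seed(0)

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def top_k_gate(x, W_gate, k=2):
    """Sparse gating: score every expert, keep only the top-k, softmax over them."""
    logits = matvec(W_gate, x)
    top = sorted(range(len(logits)), key=lambda i: logits[i])[-k:]
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    z = sum(exps)
    return top, [e / z for e in exps]

def moe_forward(x, W_gate, experts, k=2):
    """Run only the activated experts and mix their outputs by gate weight."""
    top, weights = top_k_gate(x, W_gate, k)
    out = [0.0] * len(x)
    for i, w in zip(top, weights):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

# Toy setup: 3 linear "experts" over 4-dim inputs (all values illustrative).
d, n_experts = 4, 3
rand_matrix = lambda r, c: [[random.gauss(0, 1) for _ in range(c)] for _ in range(r)]
W_gate = rand_matrix(n_experts, d)
experts = [(lambda W: (lambda v: matvec(W, v)))(rand_matrix(d, d)) for _ in range(n_experts)]
x = [random.gauss(0, 1) for _ in range(d)]
y = moe_forward(x, W_gate, experts, k=2)
```

With k=2 of 3 experts active, only two expert matmuls run per token, which is the source of MoE's favorable compute scaling.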
Evaluating the Influence of Demographic Identity in the Medical Use of Large Language Models
As large language models (LLMs) are increasingly adopted in medical decision-making, concerns about demographic biases in AI-generated recommendations remain unaddressed. In this study, we systematically investigate how demographic attribut…
From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems
Retrieval-Augmented Generation (RAG) has emerged as a crucial framework in natural language processing (NLP), improving factual consistency and reducing hallucinations by integrating external document retrieval with large language models (…
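The ambiguity problem the title refers to can be illustrated with a toy pipeline: a follow-up query like "When was it released?" retrieves poorly until the pronoun is resolved. The resolver below is a crude last-mentioned-entity heuristic standing in for a trained coreference model, and the retriever is plain word overlap; both are illustrative, not the paper's system.

```python
def resolve_coreference(query, entity_history):
    """Toy stand-in for a coreference model: swap a pronoun for the most
    recently mentioned entity. Real systems use trained resolvers."""
    pronouns = {"it", "he", "she", "they", "this", "that"}
    out = []
    for tok in query.split():
        core = tok.strip("?.,!").lower()
        if core in pronouns and entity_history:
            trailing = tok[len(tok.rstrip("?.,!")):]  # keep punctuation
            out.append(entity_history[-1] + trailing)
        else:
            out.append(tok)
    return " ".join(out)

def retrieve(query, docs, k=1):
    """Rank documents by bag-of-words overlap with the query."""
    q = {w.strip("?.,!").lower() for w in query.split()}
    scored = sorted(docs,
                    key=lambda d: len(q & {w.strip("?.,!").lower() for w in d.split()}),
                    reverse=True)
    return scored[:k]

# Illustrative corpus and conversation state.
docs = ["BERT was released by Google in 2018.", "The weather in Paris is mild."]
resolved = resolve_coreference("When was it released?", ["BERT"])
hits = retrieve(resolved, docs)
```

The unresolved query shares almost no content words with the relevant document; after resolution, the entity name anchors retrieval to the right passage.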
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval
Automatic Term Extraction (ATE) identifies domain-specific expressions that are crucial for downstream tasks such as machine translation and information retrieval. Although large language models (LLMs) have significantly advanced various N…
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation
Retrieval-Augmented Generation (RAG) has gained prominence as an effective method for enhancing the generative capabilities of Large Language Models (LLMs) through the incorporation of external knowledge. However, the evaluation of RAG sys…
Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning
Multiagent collaboration has emerged as a promising framework for enhancing the reasoning capabilities of large language models (LLMs). Despite improvements in reasoning, the approach introduces substantial computational overhead resulting…
ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction
Recent efforts in LLM alignment have focused on constructing large-scale preference datasets via human or Artificial Intelligence (AI) annotators. However, such approaches rely on instance-wise supervision, incurring substantial annotation…
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
While large language models (LLMs) excel in generating coherent and contextually rich outputs, their capacity to efficiently handle long-form contexts is limited by fixed-length position embeddings. Additionally, the computational cost of …
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models
One of the key strengths of Large Language Models (LLMs) is their ability to interact with humans by generating appropriate responses to given instructions. This ability, known as instruction-following capability, has established a foundat…
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models
Creating high-quality, large-scale datasets for large language models (LLMs) often relies on resource-intensive, GPU-accelerated models for quality filtering, making the process time-consuming and costly. This dependence on GPUs limits acc…
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models
The rapid advancement of large language models (LLMs) has highlighted the need for robust evaluation frameworks that assess their core capabilities, such as reasoning, knowledge, and commonsense, leading to the inception of certain widely-…
InstaTrans: An Instruction-Aware Translation Framework for Non-English Instruction Datasets
It is challenging to generate high-quality instruction datasets for non-English languages due to tail phenomena, which limit performance on less frequently observed data. To mitigate this issue, we propose translating existing high-quality…
Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora
With the increasing demand for substantial amounts of high-quality data to train large language models (LLMs), efficiently filtering large web corpora has become a critical challenge. For this purpose, KenLM, a lightweight n-gram-based lan…
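The good/bad ensemble idea can be sketched with a toy stand-in: train one language model on clean text and another on noisy text, then keep a line only when its per-word log-likelihood is higher under the "good" model. A smoothed unigram model replaces KenLM here, and the corpora and threshold are illustrative, not from the paper.

```python
import math
from collections import Counter

def train_unigram(corpus):
    """Stand-in for a KenLM n-gram model: unigram LM with add-one smoothing."""
    counts = Counter(w for line in corpus for w in line.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return lambda w: math.log((counts[w] + 1) / (total + vocab))

def ensemble_score(text, good_lm, bad_lm):
    """Per-word log-likelihood ratio: positive means closer to the 'good' model."""
    words = text.lower().split()
    return sum(good_lm(w) - bad_lm(w) for w in words) / max(len(words), 1)

# Illustrative training text for the two models.
good_lm = train_unigram(["the cat sat on the mat", "a dog ran in the park"])
bad_lm = train_unigram(["buy cheap pills now", "click here to buy now"])

# Filter: keep only lines the 'good' model prefers.
lines = ["the dog sat in the park", "buy cheap pills here now"]
kept = [l for l in lines if ensemble_score(l, good_lm, bad_lm) > 0.0]
```

Because both scorers are lightweight n-gram-style models, this ratio test runs on CPU over web-scale corpora, which is the efficiency argument the abstract makes against GPU-based quality filters.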
ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction
We explore and improve the capabilities of LLMs to generate data for grammatical error correction (GEC). When merely producing parallel sentences, their patterns are too simplistic to be valuable as a corpus. To address this issue, we prop…
Analysis of the Effectiveness of Model, Data, and User-Centric Approaches for Chat Application: A Case Study of BlenderBot 2.0
BlenderBot 2.0 represents a significant advancement in open-domain chatbots by incorporating real-time information and retaining user information across multiple sessions through an internet search module. Despite its innovations, there ar…
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
This paper introduces the Open Ko-LLM Leaderboard and the Ko-H5 Benchmark as vital tools for evaluating Large Language Models (LLMs) in Korean. Incorporating private test sets while mirroring the English Open LLM Leaderboard, we establish …
CharacterGPT: A Persona Reconstruction Framework for Role-Playing Agents
The recent introduction of the Assistants API highlights its potential for large language models (LLMs) in role-playing agents (RPA). However, maintaining consistent character personas remains a significant challenge due to variability in …
Translation of Multifaceted Data without Re-Training of Machine Translation Systems
Translating major-language resources to build minor-language resources has become a widely used approach. In particular, when translating complex data points composed of multiple components, it is common to translate each component separately. Ho…
Towards Harnessing the Most of ChatGPT for Korean Grammatical Error Correction
In this study, we conduct a pioneering and comprehensive examination of ChatGPT’s (GPT-3.5 Turbo) capabilities within the realm of Korean Grammatical Error Correction (K-GEC). Given the Korean language’s agglutinative nature and its rich l…