Junjie Ye
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels
Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains under…
Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning
The capability of the reward model (RM) is crucial for the success of Reinforcement Learning from Human Feedback (RLHF) in aligning with human preferences. However, as training progresses, the output space distribution of the policy model …
SMART: Advancing Scalable Map Priors for Driving Topology Reasoning
Topology reasoning is crucial for autonomous driving as it enables comprehensive understanding of connectivity and relationships between lanes and traffic elements. While recent approaches have shown success in perceiving driving topology …
Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training
The GPT-4 technical report suggests that downstream performance can be predicted from pre-training signals, but offers little methodological detail on how to quantify this. This work addresses this gap by modeling knowledge retention, the ca…
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
Large language model (LLM) agents are increasingly pivotal for addressing complex tasks in interactive environments. Existing work mainly focuses on enhancing performance through behavior cloning from stronger experts, yet such approache…
Discontinuous Galerkin simulation of sliding geometries using a point-to-point interpolation technique
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Effective evaluation of multi-hop tool use is critical for analyzing the understanding, reasoning, and function-calling capabilities of large language models (LLMs). However, progress has been hindered by a lack of reliable evaluation data…
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios
High-fidelity sliding mesh method for moving geometries in turbomachinery applications
TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use
Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with environments, a critical step toward generalized AI. However, the standard supervised fine-tuning (SFT) approach, which relies on large-scale…
Learning from Massive Human Videos for Universal Humanoid Pose Control
Scalable learning of humanoid robots is crucial for their deployment in real-world applications. While traditional approaches primarily rely on reinforcement learning or teleoperation to achieve whole-body control, they are often limited b…
60 Data Points are Sufficient to Fine-Tune LLMs for Question-Answering
Large language models (LLMs) encode extensive world knowledge through pre-training on massive datasets, which can then be fine-tuned for the question-answering (QA) task. However, effective strategies for fine-tuning LLMs for the QA task r…
Variation of volatile flavor substances in salt-baked chicken during processing
Salt-baked chicken is a traditional delicacy of Guangdong, but the effects of volatile flavor compounds at different stages of its processing remain unclear. This study utilized sensory analysis, e-nose, and GC-MS to investigate the cha…
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
This work proposes a retrieve-and-transfer framework for zero-shot robotic manipulation, dubbed RAM, featuring generalizability across various objects, environments, and embodiments. Unlike existing approaches that learn manipulation from …
SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance
As the development of large language models (LLMs) rapidly advances, securing these models effectively without compromising their utility has become a pivotal area of research. However, current defense strategies against jailbreak attacks …
Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition
Open Named Entity Recognition (NER), which involves identifying arbitrary types of entities from arbitrary domains, remains challenging for Large Language Models (LLMs). Recent studies suggest that fine-tuning LLMs on extensive NER data ca…
MetaRM: Shifted Distributions Alignment via Meta-Learning
The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as the training process progresses, the output distribution of the po…
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
Adversarial misuse, particularly through "jailbreaking" that circumvents a model's safety and ethical protocols, poses a significant challenge for Large Language Models (LLMs). This paper delves into the mechanisms behind such successful a…
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
Despite the impressive capabilities of large language models (LLMs), their performance on information extraction tasks is still not entirely satisfactory. However, their remarkable rewriting capabilities and extensive world knowledge offer…
LLM can Achieve Self-Regulation via Hyperparameter Aware Generation
In the realm of Large Language Models (LLMs), users commonly employ diverse decoding strategies and adjust hyperparameters to control the generated text. However, a critical question emerges: Are LLMs conscious of the existence of these de…
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
Tool learning is widely acknowledged as a foundational approach for deploying large language models (LLMs) in real-world scenarios. While current research primarily emphasizes leveraging tools to augment LLMs, it frequently neglects emergin…
A Study on the Rock-Breaking Characteristics of an Arcing-Blade Cutter under Different Cutting Parameters
To analyze the rock-breaking characteristics of an arcing-blade cutter in cutting red sandstone, a two-cutter cutting model was established based on the finite element method. Then, the cutting processes of the arcing-blade cutter at penet…
MouSi: Poly-Visual-Expert Vision-Language Models
Current large vision-language models (VLMs) often encounter challenges such as insufficient capabilities of a single visual component and excessively long visual tokens. These issues can limit the model's effectiveness in accurately interp…
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
The success of AI assistants based on Large Language Models (LLMs) hinges on Reinforcement Learning from Human Feedback (RLHF) to comprehend and align with user intentions. However, traditional alignment algorithms, such as PPO, are hampered by …
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
Tool learning has generated widespread interest as a vital means of interaction between Large Language Models (LLMs) and the physical world. Current research predominantly emphasizes LLMs' capacity to utilize tools in well-structured envir…
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
Existing evaluations of tool learning primarily focus on validating the alignment of selected tools for large language models (LLMs) with expected outcomes. However, these approaches rely on a limited set of scenarios where answers can be …