Chengao Li
Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
Reinforcement Learning from Human Feedback (RLHF) has emerged as a powerful technique for aligning large language models (LLMs) with human preferences. However, effectively aligning LLMs with diverse human preferences remains a significant…
Make sport-related self-control better: Ritualized behavior in Chinese athletes
Research suggests that ritualized behavior helps individuals gain self-control, thereby influencing their performance. Although ritualized behavior is most widely applied among athletes, these studies have been found to have no clear quant…
Controlling Large Language Models Through Concept Activation Vectors
As large language models (LLMs) are widely deployed across various domains, the ability to control their generated outputs has become more critical. This control involves aligning LLMs' outputs with human values and ethical principles or cu…