Aswin Rrv
YOU?
Author Swipe
View article: Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter? Open
Solving grid puzzles involves a significant amount of logical reasoning. Hence, it is a good domain to evaluate the reasoning capability of a model which can then guide us to improve the reasoning ability of models. However, most existing …
View article: Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies
Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies Open
This study explores the sycophantic tendencies of Large Language Models (LLMs), where these models tend to provide answers that match what users want to hear, even if they are not entirely correct. The motivation behind this exploration st…
View article: Triple Preference Optimization: Achieving Better Alignment using a Single Step Optimization
Triple Preference Optimization: Achieving Better Alignment using a Single Step Optimization Open
Reinforcement Learning with Human Feedback (RLHF) enhances the alignment of Large Language Models (LLMs). However, its limitations have led to the development of Direct Preference Optimization (DPO), an RL-free approach designed to overcom…