Kuleen Sasse
When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior
When Helpfulness Backfires: LLMs and the Risk of Misinformation Due to Sycophantic Behavior
Sparse Autoencoder Features for Classifications and Transferability
Sparse Autoencoders (SAEs) offer the potential to uncover structured, human-interpretable representations in Large Language Models (LLMs), making them a crucial tool for transparent and controllable AI systems. We systematically analyze…
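The abstract is truncated here, but the core object it names is standard: a sparse autoencoder trained to reconstruct LLM activations through an overcomplete, sparsely activating feature layer. A minimal sketch follows, assuming a vanilla ReLU SAE with an L1 sparsity penalty; the dimensions and coefficient are illustrative and not taken from the paper.

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: maps model activations into an overcomplete,
    sparsely activating feature space and reconstructs them."""
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps features non-negative; sparsity itself comes
        # from the L1 penalty applied in the loss below.
        f = torch.relu(self.encoder(x))
        x_hat = self.decoder(f)
        return x_hat, f

def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes most
    # feature activations toward zero.
    return ((x - x_hat) ** 2).mean() + l1_coeff * f.abs().mean()

# Hypothetical usage on a batch of residual-stream activations:
# x = torch.randn(64, 768)
# sae = SparseAutoencoder(d_model=768, d_features=8 * 768)
# x_hat, f = sae(x)
# loss = sae_loss(x, x_hat, f)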
Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats
WARNING: This paper contains content that may be upsetting or offensive to some readers. Dog whistles are coded expressions with dual meanings: one intended for the general public (outgroup) and another that conveys a specific message to an…
debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias
As Vision Language Models (VLMs) gain widespread use, their fairness remains under-explored. In this paper, we analyze demographic biases across five models and six datasets. We find that portrait datasets like UTKFace and CelebA are the b…
Disease Entity Recognition and Normalization is Improved with Large Language Model Derived Synthetic Normalized Mentions
Background: Machine learning methods for clinical named entity recognition and entity normalization systems can utilize both labeled corpora and Knowledge Graphs (KGs) for learning. However, infrequently occurring concepts may have few men…
Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation
Background: Large language models (LLMs) are trained to follow directions, but this introduces a vulnerability: they may blindly comply with user requests even when doing so produces wrong information. In medicine, this could accelerate the generation …
The Risks of Medical Misinformation Generation in Large Language Models
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Recently, work in NLP has shifted to few-shot (in-context) learning, with large language models (LLMs) performing well across a range of tasks. However, while fairness evaluations have become a standard for supervised methods, little is kn…
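For context on the setup this abstract describes: few-shot (in-context) learning prepends labeled examples ("shots") to the prompt, and which shots get selected is the lever such fairness studies examine. Below is a minimal sketch of prompt assembly; the select-then-format flow and the example data are hypothetical illustrations, not the paper's method.

def build_fewshot_prompt(shots, query):
    """Assemble an in-context learning prompt from labeled examples.

    shots: list of (text, label) pairs chosen by some selection policy.
    How that policy samples across demographic groups (randomly,
    balanced, similarity-based) is what drives fairness outcomes.
    """
    lines = []
    for text, label in shots:
        lines.append(f"Input: {text}\nLabel: {label}\n")
    lines.append(f"Input: {query}\nLabel:")
    return "\n".join(lines)

# Hypothetical usage with toy sentiment examples:
shots = [("the movie was great", "positive"),
         ("terrible service", "negative")]
print(build_fewshot_prompt(shots, "an okay experience"))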