Mark Dredze
Automated image analysis of instagram posts: Implications for risk perception and communication in public health using a case study of #HIV
People's perceptions about health risks, including their risk of acquiring HIV, are impacted in part by who they see portrayed as at risk in the media. Viewers in these cases are asking themselves "do those portrayed as at risk look like m…
The Charlie Sheen Effect on Rapid In-home Human Immunodeficiency Virus Test Sales
Can a selfie promote public engagement with skin cancer?
Demo: Statistically Significant Results On Biases and Errors of LLMs Do Not Guarantee Generalizable Results
Recent research has shown that hallucinations, omissions, and biases are prevalent in everyday use-cases of LLMs. However, chatbots used in medical contexts must provide consistent advice in situations where non-medical factors are involve…
Explaining Twitter’s inability to effectively moderate content during the COVID-19 pandemic
Social media platforms routinely face pressure to restrict harmful content while protecting free speech; however, prior theory suggests that platform design might undermine the efficacy of content moderation. During the COVID-19 pandemic, …
Waldo: Automated discovery of adverse events from unstructured self reports
Adverse event (AE) detection is labor-intensive and costly given the task is to find rare events. Automated solutions to enhance efficiency, reduce costs, and capture unnoticed safety signals are needed. To develop and evaluate an automate…
The power of social media activism in the #YesAllWomen Movement
Social media has played a significant role in activism, with hashtags becoming a powerful tool for community organizing and raising awareness about social and political issues. Twitter, now X, supported the rise of hashtag activism. The ha…
Dashboard Intervention for Tracking Digital Social Media Activity in the Clinical Care of Individuals With Mood and Anxiety Disorders: Randomized Trial
Background: Digital social activity, defined as interactions on social media and electronic communication platforms, has become increasingly important. Social factors impact mental health and can contribute to depression and anxiety. Theref…
Racial bias in clinician assessment of patient credibility: Evidence from electronic health records
Objective: Black patients disproportionately report feeling disbelieved or having concerns dismissed in medical encounters, suggesting potential racial bias in clinicians’ assessment of patient credibility. Because this bias may be evident …
Understanding and Mitigating Risks of Generative AI in Financial Services
To responsibly develop Generative AI (GenAI) products, it is critical to define the scope of acceptable inputs and outputs. What constitutes a "safe" response is an actively debated question. Academic work puts an outsized focus on evaluat…
MedScore: Generalizable Factuality Evaluation of Free-Form Medical Answers by Domain-adapted Claim Decomposition and Verification
While Large Language Models (LLMs) can generate fluent and convincing responses, they are not necessarily correct. This is especially apparent in the popular decompose-then-verify factuality evaluation pipeline, where LLMs evaluate generat…
Results from a Randomized Trial of a Dashboard Intervention for Tracking Digital Social Media Activity in Clinical Care of Individuals with Mood and Anxiety Disorders (Preprint)
Background: Digital social activity, defined as interactions on social media and electronic communication platforms, has become increasingly important. Social factors impact mental health and can contribute to depression and anxiety. There…
Can one size fit all?: Measuring Failure in Multi-Document Summarization Domain Transfer
Abstractive multi-document summarization (MDS) is the task of automatically summarizing information in multiple documents, from news articles to conversations with multiple speakers. The training approaches for current MDS models can be grouped in…
Testing a Dashboard Intervention for Tracking Digital Social Media Activity in Clinical Care of Individuals With Mood and Anxiety Disorders: Protocol and Design Considerations for a Pragmatic Randomized Trial
Background: Mood and anxiety disorders are prevalent mental health diagnoses. Numerous studies have shown that measurement-based care, which is used to monitor patient symptoms, functioning, and treatment progress and help guide clinical de…
Automated identification of incidental hepatic steatosis on Emergency Department imaging using large language models.
Large language models can assist in identifying incidental conditions from imaging reports that otherwise may be missed opportunities for early disease intervention. Large language models are a democratization of natural language processin…
Explaining Twitter’s inability to reduce vaccine misinformation during the COVID-19 pandemic
Users dissatisfied with exposure to objectionable online content have begun to migrate en masse to new social media platforms. These new platforms share architectural features with legacy platforms, but offer content moderation services th…
RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
A Novel Multi-Document Retrieval Benchmark: Journalist Source-Selection in Newswriting
LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Evaluating the Evaluators: Are readability metrics good measures of readability?
DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation
The decompose-then-verify strategy for verification of Large Language Model (LLM) generations decomposes claims that are then independently verified. Decontextualization augments text (claims) to ensure it can be verified outside of the or…
Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats
WARNING: This paper contains content that may be upsetting or offensive to some readers. Dog whistles are coded expressions with dual meanings: one intended for the general public (outgroup) and another that conveys a specific message to an…
Are Clinical T5 Models Better for Clinical Text?
Large language models with a transformer-based encoder/decoder architecture, such as T5, have become standard platforms for supervised tasks. To bring these technologies to the clinical domain, recent work has trained new or adapted existi…
Give me Some Hard Questions: Synthetic Data Generation for Clinical QA
Clinical Question Answering (QA) systems enable doctors to quickly access patient information from electronic health records (EHRs). However, training these systems requires significant annotated data, which is limited due to the expertise…
Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts
Large language models (LLMs) acquire beliefs about gender from training data and can therefore generate text with stereotypical gender attitudes. Prior studies have demonstrated model generations favor one gender or exhibit stereotypes abo…
Can Optimization Trajectories Explain Multi-Task Transfer?
Despite the widespread adoption of multi-task training in deep learning, little is understood about how multi-task learning (MTL) affects generalization. Prior work has conjectured that the negative effects of MTL are due to optimization c…