Joseph Gatto
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction
Event argument extraction identifies arguments for predefined event roles in text. Existing work evaluates this task with exact match (EM), where predicted arguments must align exactly with annotated spans. While suitable for span-based mo…
In-Context Learning for Preserving Patient Privacy: A Framework for Synthesizing Realistic Patient Portal Messages
Since the COVID-19 pandemic, clinicians have seen a large and sustained influx in patient portal messages, significantly contributing to clinician burnout. To the best of our knowledge, there are no large-scale public patient portal messag…
Explicit, Implicit, and Scattered: Revisiting Event Extraction to Capture Complex Arguments
Prior works formulate the extraction of event-specific arguments as a span extraction problem, where event arguments are explicit -- i.e. assumed to be contiguous spans of text in a document. In this study, we revisit this definition of Ev…
Depth $F_1$: Improving Evaluation of Cross-Domain Text Classification by Measuring Semantic Generalizability
Recent evaluations of cross-domain text classification models aim to measure the ability of a model to obtain domain-invariant performance in a target domain given labeled samples in a source domain. The primary strategy for this evaluatio…
Theme-Driven Keyphrase Extraction to Analyze Social Media Discourse
Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis…
Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit
Large language models (LLMs) have been shown to be proficient in correctly answering questions in the context of online discourse. However, the study of using LLMs to model human-like answers to fact-driven social media questions is still …
Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse
In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem between (title, comment…
Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types
Event Argument Extraction (EAE) is an extremely difficult information extraction problem -- with significant limitations in few-shot cross-domain (FSCD) settings. A common solution to FSCD modeling is data augmentation. Unfortunately, exis…
Chain-of-Thought Embeddings for Stance Detection on Social Media
Stance detection on social media is challenging for Large Language Models (LLMs), as emerging slang and colloquial language in online conversations often contain deeply implicit stance labels. Chain-of-Thought (COT) prompting has recently …
Not Enough Labeled Data? Just Add Semantics: A Data-Efficient Method for Inferring Online Health Texts
User-generated texts available on the web and social platforms are often long and semantically challenging, making them difficult to annotate. Obtaining human annotation becomes increasingly difficult as problem domains become more special…
Text Encoders Lack Knowledge: Leveraging Generative LLMs for Domain-Specific Semantic Textual Similarity
Amidst the sharp rise in the evaluation of large language models (LLMs) on various tasks, we find that semantic textual similarity (STS) has been under-explored. In this study, we show that STS can be cast as a text generation problem whil…
HealthE: Recognizing Health Advice & Entities in Online Health Communities
The task of extracting and classifying entities is at the core of important Health-NLP systems such as misinformation detection, medical dialogue modeling, and patient-centric information tools. Granular knowledge of textual entities allow…
Scope of Pre-trained Language Models for Detecting Conflicting Health Information
An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual health information has become a safety-critical task. Health advice data poses a uniqu…
The Scope of In-Context Learning for the Extraction of Medical Temporal Constraints
Medications often impose temporal constraints on everyday patient activity. Violations of such medical temporal constraints (MTCs) lead to a lack of treatment adherence, in addition to poor health outcomes and increased healthcare expenses…
Theme-driven Keyphrase Extraction to Analyze Social Media Discourse
Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis…
ActSafe: Predicting Violations of Medical Temporal Constraints for Medication Adherence
Prescription medications often impose temporal constraints on regular health behaviors (RHBs) of patients, e.g., eating before taking medication. Violations of such medical temporal constraints (MTCs) can result in adverse effects. Detecti…
HealthE
# HealthE Dataset HealthE contains 3,400 pieces of health advice gathered 1) from public health websites (i.e. WebMD.com, MedlinePlus.gov, CDC.gov, and MayoClinic.org) 2) from the publicly available [Preclude dataset]([https://userpages.um…
HealthE: Classifying Entities in Online Textual Health Advice
The processing of entities in natural language is essential to many medical NLP systems. Unfortunately, existing datasets vastly under-represent the entities required to model public health relevant texts such as health advice often found …
Identifying the Perceived Severity of Patient-Generated Telemedical Queries Regarding COVID: Developing and Evaluating a Transfer Learning–Based Solution
Background: Triage of textual telemedical queries is a safety-critical task for medical service providers with limited remote health resources. The prioritization of patient queries containing medically severe text is necessary to optimize …
Detecting Inconsistent Health Information by Leveraging Abstract Meaning Representation Graphs
An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual medical information has become a safety-critical task. While there are ongoing efforts…
Identifying the Perceived Severity of Patient-Generated Telemedicine Queries Regarding COVID: Developing and Evaluating a Transfer Learning Based Solution (Preprint)
Background: Triage of textual telemedical queries is a safety-critical task for medical service providers with limited remote health resources. The prioritization of patient queries containing medically severe text is necessary to optimize…
Single Sample Feature Importance: An Interpretable Algorithm for Low-Level Feature Analysis
Have you ever wondered how your feature space is impacting the prediction of a specific sample in your dataset? In this paper, we introduce Single Sample Feature Importance (SSFI), which is an interpretable feature importance algorithm tha…