Aron Henriksson
Mind the gap: from plausible to valid self-explanations in large language models Open
This paper investigates the reliability of explanations generated by large language models (LLMs) when prompted to explain their previous output. We evaluate two kinds of such self-explanations (SE) – extractive and counterfactual – using st…
Identifying Adverse Drug Events in Clinical Text Using Fine-Tuned Clinical Language Models: Machine Learning Study Open
Background Medications are essential for health care but can cause adverse drug events (ADEs), which are harmful and sometimes fatal. Detecting ADEs is a challenging task because they are often not documented in the structured data of elec…
The future of healthcare‐associated infection surveillance: Automated surveillance and using the potential of artificial intelligence Open
Healthcare‐associated infections (HAIs) are common adverse events, and surveillance is considered a core component of effective HAI reduction programmes. Recently, efforts have focused on automating the traditional manual surveillance proc…
Mind the Gap: From Plausible to Valid Self-Explanations in Large Language Models Open
This paper investigates the reliability of explanations generated by large language models (LLMs) when prompted to explain their previous output. We evaluate two kinds of such self-explanations (SE) – extractive and counterfactual – using …
Data-Constrained Synthesis of Training Data for De-Identification Open
Many sensitive domains -- such as the clinical domain -- lack widely available datasets due to privacy risks. The increasing generative capabilities of large language models (LLMs) have made synthetic datasets a viable path forward. In thi…
Fine-tuning Clinical Language Models to Identify Adverse Drug Events in Clinical Text (Preprint) Open
BACKGROUND Medications are essential for health care but can cause adverse drug events (ADEs), which are harmful and sometimes fatal. Detecting ADEs is a challenging task because they are often not documented in the structured data of ele…
Text Retrieval in Restricted Domains by Pairwise Term Co-occurrence Open
Text similarity calculation by text embeddings requires fine-tuning of the language model by a large amount of labeled data, which may not be available for small text collections in their specific knowledge domains, in particular, in publi…
Evaluating the Reliability of Self-Explanations in Large Language Models Open
This paper investigates the reliability of explanations generated by large language models (LLMs) when prompted to explain their previous output. We evaluate two kinds of such self-explanations - extractive and counterfactual - using three…
End-to-end pseudonymization of fine-tuned clinical BERT models Open
Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained language models (PLMs). These models consist of large amounts of parameters that are tuned using vast amounts of training data. These factors caus…
CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Open
Contaminated or adulterated food poses a substantial risk to human health. Given sets of labeled web texts for training, Machine Learning and Natural Language Processing can be applied to automatically detect such risks. We publish a datas…
Multimodal fine-tuning of clinical language models for predicting COVID-19 outcomes Open
Clinical prediction models tend only to incorporate structured healthcare data, ignoring information recorded in other data modalities, including free-text clinical notes. Here, we demonstrate how multimodal models that effectively leverag…
The augmented value of using clinical notes in semi-automated surveillance of deep surgical site infections after colorectal surgery Open
Background In patients who underwent colorectal surgery, an existing semi-automated surveillance algorithm based on structured data achieves high sensitivity in detecting deep surgical site infections (SSI), however, generates a significan…
End-to-End Pseudonymization of Fine-Tuned Clinical BERT Models Open
Many state-of-the-art results in natural language processing (NLP) rely on large pre-trained language models (PLMs). These models consist of large amounts of parameters that are tuned using vast amounts of training data. These factors caus…
The accuracy of fully automated algorithms for surveillance of healthcare-onset Clostridioides difficile infections in hospitalized patients Open
We developed and validated a set of fully automated surveillance algorithms for healthcare-onset CDI using electronic health records. In a validation data set of 750 manually annotated admissions, the algorithm based on International Class…
Holistic data-driven requirements elicitation in the big data era Open
Digital transformation stimulates continuous generation of large amounts of digital data, both in organizations and in society at large. As a consequence, there have been growing efforts in the Requirements Engineering community to conside…
Data-Driven Agile Requirements Elicitation through the Lenses of Situational Method Engineering Open
Ubiquitous digitalization has led to the continuous generation of large amounts of digital data, both in organizations and in society at large. In the requirements engineering community, there has been a growing interest in considering dig…
The accuracy of fully automated algorithms for surveillance of healthcare-associated urinary tract infections in hospitalized patients Open
A fully automated surveillance algorithm based on NLP to find UTI symptoms in free-text had acceptable performance to detect HA-UTI compared to manual record review. Algorithms based on administrative and microbiology data only were not su…
Data-Driven Requirements Elicitation: A Systematic Literature Review Open
Requirements engineering has traditionally been stakeholder-driven. In addition to domain knowledge, widespread digitalization has led to the generation of vast amounts of data (Big Data) from heterogeneous digital sources such as the Inte…
Terminology Expansion with Prototype Embeddings: Extracting Symptoms of Urinary Tract Infection from Clinical Text Open
Many natural language processing applications rely on the availability of domain-specific terminologies containing synonyms. To that end, semi-automatic methods for extracting additional synonyms of a given concept from corpora are useful,…
HAI-Proactive: Development of an Automated Surveillance System for Healthcare-Associated Infections in Sweden Open
Background: Healthcare-associated infection (HAI) surveillance is essential for most infection prevention programs and continuous epidemiological data can be used to inform healthcare personnel, allocate resources, and evaluate intervention…
Validation of automated sepsis surveillance based on the Sepsis-3 clinical criteria against physician record review in a general hospital population: observational study using electronic health records data Open
Background Surveillance of sepsis incidence is important for directing resources and evaluating quality-of-care interventions. The aim was to develop and validate a fully-automated Sepsis-3 based surveillance system in non-intensive care w…
The Impact of De-identification on Downstream Named Entity Recognition in Clinical Text Open
The impact of de-identification on data quality and, in particular, utility for developing models for downstream tasks has been more thoroughly studied for structured data than for unstructured text. While previous studies indicate that te…
Deep Learning from Heterogeneous Sequences of Sparse Medical Data for Early Prediction of Sepsis Open
Sepsis is a life-threatening complication of infections, and early treatment is key for survival. Symptoms of sepsis are difficult to recognize, but prediction models using data from electronic health records (EHRs) can facilitate early de…