Ben Hachey
Less is More: Explainable and Efficient ICD Code Prediction with Clinical Entities
Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review
Clinical coding is crucial for healthcare billing and data analysis. Manual clinical coding is labour-intensive and error-prone, which has motivated research towards full automation of the process. However, our analysis, based on US Englis…
Synthetic Data, Common Data Models and Federation: Holy Trinity or unholy mess?
The healthcare sector's adoption of data and digital technologies is hindered by stringent data privacy regulations. Synthetic data, common data models (CDMs) and federated data ecosystems present promising solutions to these challenges. T…
SynD: Australian synthetic health data community of practice
Objectives: The current workflow for health data research in Australia is inefficient. After funding is secured, researchers often face delays of months or years to access the necessary data. Synthetic data could significantly improve the pa…
Designing a utility evaluation framework for synthetic health data
Objectives: Synthetic data (SD) promises to unlock health data for training, research, and innovation. However, where utility evaluation is performed, it is applied ad hoc for a single task of interest. We produce an initial design for a rob…
MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction
Active adverse event surveillance monitors Adverse Drug Events (ADE) from different data sources, such as electronic health records, medical literature, social media and search engine logs. Over the years, many datasets have been created, …
Effects of a comprehensive brain computed tomography deep learning model on radiologist detection accuracy
Effects of a comprehensive brain computed tomography deep-learning model on radiologist detection accuracy: a multireader, multicase study
Background: Non-contrast computed tomography of the brain (NCCTB) is commonly used in clinical practice to detect intracranial pathology but is subject to interpretation errors. Machine learning is capable of augmenting clinical decision m…
Charting the potential of brain computed tomography deep learning systems
Brain computed tomography (CTB) scans are widely used to evaluate intracranial pathology. The implementation and adoption of CTB has led to clinical improvements. However, interpretation errors occur and may have substantial morbidity and …
AI-CLRA 2021 Workshop Organizing Committee
Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: a retrospective, multireader multicase study
Annalise.ai.
Chest radiographs and machine learning – Past, present and future
Summary: Despite its simple acquisition technique, the chest X‐ray remains the most common first‐line imaging tool for chest assessment globally. Recent evidence for image analysis using modern machine learning points to possible improvemen…
Cost-effective Selection of Pretraining Data: A Case Study of Pretraining BERT on Social Media
Recent studies on domain-specific BERT models show that effectiveness on downstream tasks can be improved when models are pretrained on in-domain data. Often, the pretraining data used in these models are selected based on their subject ma…
An Effective Transition-based Model for Discontinuous NER
Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans. Conventional sequence tagging techniques encode Markov assumptions that are …
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Using Similarity Measures to Select Pretraining Data for NER
Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks. However, the measure and impact of similarity between pretraining data and tar…
Using Similarity Measures to Select Pretraining Data for NER
Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
NNE: A Dataset for Nested Named Entity Recognition in English Newswire
Named entity recognition (NER) is widely used in natural language processing applications and downstream tasks. However, most NER tools target flat annotation from popular datasets, eschewing the semantic information available in nested en…
Face Value: business leaders nervous about consumers spending less and regulation
Can adult mental health be predicted by childhood future-self narratives? Insights from the CLPsych 2018 Shared Task
Kylie Radford, Louise Lavrencic, Ruth Peters, Kim Kiely, Ben Hachey, Scott Nowson, Will Radford. Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic. 2018.
Learning to generate one-sentence biographies from Wikidata
We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summa…
Post-edit Analysis of Collective Biography Generation
Text generation is increasingly common but often requires manual post-editing where high precision is critical to end users. However, manual editing is expensive so we want to ensure this effort is focused on high-value tasks. And we want …
English Event Detection With Translated Language Features
We propose novel radical features from automatic translation for event extraction. Event detection is a complex language processing task for which it is expensive to collect training data, making generalisation challenging. We derive meani…
Presenting a New Dataset for the Timeline Generation Problem
The timeline generation task summarises an entity's biography by selecting stories representing key events from a large pool of relevant documents. This paper addresses the lack of a standard dataset and evaluative methodology for the prob…
:telephone::person::sailboat::whale::okhand:; or "Call me Ishmael" - How do you translate emoji?
We report on an exploratory analysis of Emoji Dick, a project that leverages crowdsourcing to translate Melville's Moby Dick into emoji. This distinctive use of emoji removes textual context, and leads to a varying translation quality. In …