Biomedical text mining
BioBERT: a pre-trained biomedical language representation model for biomedical text mining Open
Motivation: Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processing (NLP), extracting valuable information from biomedical literature ha…
Publicly Available Clinical BERT Embeddings Open
Contextual word embedding models such as ELMo and BERT have dramatically improved performance for many natural language processing (NLP) tasks in recent months. However, these models have been minimally explored on specialty corpora, such …
Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets Open
Inspired by the success of the General Language Understanding Evaluation benchmark, we introduce the Biomedical Language Understanding Evaluation (BLUE) benchmark to facilitate research in the development of pre-training language represent…
A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques Open
The amount of text that is generated every day is increasing dramatically. This tremendous volume of mostly unstructured text cannot be simply processed and perceived by computers. Therefore, efficient and effective techniques and algorith…
BioWordVec, improving biomedical word embeddings with subword information and MeSH Open
Distributed word representations have become an essential foundation for biomedical natural language processing (BioNLP), text mining and information retrieval. Word embeddings are traditionally computed at the word level from a large corp…
Text Mining in Big Data Analytics Open
Text mining in big data analytics is emerging as a powerful tool for harnessing the power of unstructured textual data by analyzing it to extract new knowledge and to identify significant patterns and correlations hidden in the data. This …
The application of text mining methods in innovation research: current state, evolution patterns, and development priorities Open
Unstructured data in the form of digitized text is rapidly increasing in volume, accessibility, and relevance for research on innovation and beyond. While traditional attempts to analyze text (i.e., qualitative analysis) are limited in pro…
Extraction of Pharmacokinetic Evidence of Drug–Drug Interactions from the Literature Open
Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from…
GNormPlus: An Integrative Approach for Tagging Genes, Gene Families, and Protein Domains Open
The automatic recognition of gene names and their associated database identifiers from biomedical text has been widely studied in recent years, as these tasks play an important role in many downstream text-mining applications. Despite sign…
Community challenges in biomedical text mining over 10 years: success, failure and the future Open
One effective way to improve the state of the art is through competitions. Following the success of the Critical Assessment of protein Structure Prediction (CASP) in bioinformatics research, a number of challenge evaluations have been orga…
Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task Open
Manually curating chemicals, diseases and their relationships is significantly important to biomedical research, but it is plagued by its high cost and the rapid growth of the biomedical literature. In recent years, there has been a growin…
Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery Open
Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease…
Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art Open
A large array of pretrained models are available to the biomedical NLP (BioNLP) community. Finding the best model for a particular task can be difficult and time-consuming. For many applications in the biomedical and clinical domains, it i…
Named Entity Recognition and Relation Detection for Biomedical Information Extraction Open
The number of scientific publications in the literature is steadily growing, containing our knowledge in the biomedical, health, and clinical sciences. Since there is currently no automatic archiving of the obtained results, much of this i…
KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences Open
KnowLife is a large knowledge base for health and life sciences, automatically constructed from different Web sources. As a unique feature, KnowLife is harvested from different text genres such as scientific publications, health portals, a…
A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining Open
The amount of biomedical literature is vast and growing quickly, and accurate text mining techniques could help researchers to efficiently extract useful information from the literature. However, existing named entity recognition models us…
A global network of biomedical relationships derived from text Open
Motivation: The biomedical community’s collective understanding of how chemicals, genes and phenotypes interact is distributed across the text of over 24 million research articles. These interactions offer insights into the mechanisms behin…
A Complete Process of Text Classification System Using State-of-the-Art NLP Models Open
With the rapid advancement of information technology, online information has been exponentially growing day by day, especially in the form of text documents such as news events, company reports, reviews on products, stocks-related reports,…
BioRED: a rich biomedical relation extraction dataset Open
Automated relation extraction (RE) from biomedical literature is critical for many downstream text mining applications in both research and real-world settings. However, most existing benchmarking datasets for biomedical RE only focus on r…
Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text Open
Summary Background: Natural Language Processing (NLP) methods are increasingly being utilized to mine knowledge from unstructured health-related texts. Recent advances in noisy text processing techniques are enabling researchers and medica…
Biomedical named entity recognition using deep neural networks with contextual information Open
Background: In biomedical text mining, named entity recognition (NER) is an important task used to extract information from biomedical articles. Previously proposed methods for NER are dictionary- or rule-based methods and machine learning …
Natural Language to Structured Query Generation via Meta-Learning Open
Po-Sen Huang, Chenglong Wang, Rishabh Singh, Wen-tau Yih, Xiaodong He. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers).…
Corpus domain effects on distributional semantic modeling of medical terms Open
Motivation: Automatically quantifying semantic similarity and relatedness between clinical terms is an important aspect of text mining from electronic health records, which are increasingly recognized as valuable sources of phenotypic info…
D3NER: biomedical named entity recognition using CRF-biLSTM improved with fine-tuned embeddings of various linguistic information Open
Motivation: Recognition of biomedical named entities in the textual literature is a highly challenging research topic with great interest, playing as the prerequisite for extracting huge amount of high-valued biomedical knowledge deposited …
Adverse drug events and medication relation extraction in electronic health records with ensemble deep learning methods Open
Objective: Identification of drugs, associated medication entities, and interactions among them are crucial to prevent unwanted effects of drug therapy, known as adverse drug events. This article describes our participation to the n2c2 shar…
Recent advances in biomedical literature mining Open
The recent years have witnessed a rapid increase in the number of scientific articles in the biomedical domain. These articles are mostly available and readily accessible in electronic format. The domain knowledge hidden in them is critical …
Tracking State Changes in Procedural Text: a Challenge Dataset and Models for Process Paragraph Comprehension Open
Bhavana Dalvi, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.
BioELECTRA: Pretrained Biomedical text Encoder using Discriminators Open
Recent advancements in pretraining strategies in NLP have shown a significant improvement in the performance of models on various text mining tasks. We apply ‘replaced token detection’ pretraining technique proposed by ELECTRA and pretrain…
BERN2: an advanced neural biomedical named entity recognition and normalization tool Open
In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g. diseases and drugs) from the ever-growing biomed…
Overview of the Bacteria Biotope Task at BioNLP Shared Task 2016 Open
This paper presents the Bacteria Biotope task of the BioNLP Shared Task 2016, which follows the previous 2013 and 2011 editions. The task focuses on the extraction of the locations (biotopes and geographical places) of bacteria from PubMed …