Jan-Christoph Klie
On Efficient and Statistical Quality Estimation for Data Annotation
Annotated datasets are an essential ingredient to train, evaluate, compare and productionalize supervised machine learning models. It is therefore imperative that annotations are of high quality. For their creation, good quality management…
Analyzing Dataset Annotation Quality Management in the Wild
Data quality is crucial for training accurate, unbiased, and trustworthy machine learning models as well as for their correct evaluation. Recent work, however, has shown that even popular datasets used to train and evaluate state-of-the-ar…
Lessons Learned from a Citizen Science Project for Natural Language Processing
Many Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsour…
Jan-Christoph Klie, Ji-Ung Lee, Kevin Stowe, Gözde Şahin, Nafise Sadat Moosavi, Luke Bates, Dominic Petrak, Richard Eckart De Castilho, Iryna Gurevych. Proceedings of the 17th Conference of the European Chapter of the Association for Compu…
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that se…
Annotation Curricula to Implicitly Train Non-Expert Annotators
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; …
Erratum: Annotation Curricula to Implicitly Train Non-Expert Annotators
The authors of this work (“Annotation Curricula to Implicitly Train Non-Expert Annotators” by Ji-Ung Lee, Jan-Christoph Klie, and Iryna Gurevych in Computational Linguistics 48:2 https://doi.org/10.1162/coli_a_00436) discovered an incorrec…
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
Domain-specific conceptual bases use key concepts to capture domain scope and relevant information. Conceptual bases serve as a foundation for various downstream tasks, including ontology construction, information mapping, and analysis. Howe…
Human-In-The-Loop Entity Linking for Low Resource Domains
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB). To quickly annotate texts with EL even in low-resource domains and noisy text, we present a novel Human-In-The-Loop EL approach. W…
From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB). It is crucial in a considerable number of fields like humanities, technical writing and biomedical sciences to enrich texts with s…
Towards cross-platform interoperability for machine-assisted text annotation
In this paper we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes t…
A Multi-Platform Annotation Ecosystem for Domain Adaptation
This paper describes an ecosystem consisting of three independent text annotation platforms. To demonstrate their ability to work in concert, we illustrate how to use them to address an interactive domain adaptation task in biomedical enti…
Integrating Knowledge-Supported Search into the INCEpTION Annotation Platform
Beto Boullosa, Richard Eckart de Castilho, Naveen Kumar, Jan-Christoph Klie, Iryna Gurevych. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2018.
Multimodal Frame Identification with Multilingual Evaluation
Teresa Botschen, Iryna Gurevych, Jan-Christoph Klie, Hatem Mousselly-Sergieh, Stefan Roth. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volu…