Jan-Christoph Klie
On Efficient and Statistical Quality Estimation for Data Annotation
Annotated datasets are an essential ingredient to train, evaluate, compare and productionalize supervised machine learning models. It is therefore imperative that annotations are of high quality. For their creation, good quality management…
Analyzing Dataset Annotation Quality Management in the Wild
Data quality is crucial for training accurate, unbiased, and trustworthy machine learning models as well as for their correct evaluation. Recent work, however, has shown that even popular datasets used to train and evaluate state-of-the-ar…
Lessons Learned from a Citizen Science Project for Natural Language Processing
Many Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsour…
Jan-Christoph Klie, Ji-Ung Lee, Kevin Stowe, Gözde Şahin, Nafise Sadat Moosavi, Luke Bates, Dominic Petrak, Richard Eckart De Castilho, Iryna Gurevych. Proceedings of the 17th Conference of the European Chapter of the Association for Compu…
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that se…
Annotation Curricula to Implicitly Train Non-Expert Annotators
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; …
Erratum: Annotation Curricula to Implicitly Train Non-Expert Annotators
The authors of this work (“Annotation Curricula to Implicitly Train Non-Expert Annotators” by Ji-Ung Lee, Jan-Christoph Klie, and Iryna Gurevych in Computational Linguistics 48:2 https://doi.org/10.1162/coli_a_00436) discovered an incorrec…
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
Domain-specific conceptual bases use key concepts to capture domain scope and relevant information. Conceptual bases serve as a foundation for various downstream tasks, including ontology construction, information mapping, and analysis. Howe…
Human-In-The-Loop Entity Linking for Low Resource Domains
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB). To quickly annotate texts with EL even in low-resource domains and noisy text, we present a novel Human-In-The-Loop EL approach. W…
From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB). It is crucial in a considerable number of fields like humanities, technical writing and biomedical sciences to enrich texts with s…
Towards cross-platform interoperability for machine-assisted text annotation
In this paper we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes t…
A Multi-Platform Annotation Ecosystem for Domain Adaptation
This paper describes an ecosystem consisting of three independent text annotation platforms. To demonstrate their ability to work in concert, we illustrate how to use them to address an interactive domain adaptation task in biomedical enti…
Integrating Knowledge-Supported Search into the INCEpTION Annotation Platform
Beto Boullosa, Richard Eckart de Castilho, Naveen Kumar, Jan-Christoph Klie, Iryna Gurevych. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2018.
Multimodal Frame Identification with Multilingual Evaluation
Teresa Botschen, Iryna Gurevych, Jan-Christoph Klie, Hatem Mousselly-Sergieh, Stefan Roth. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volu…