Stefanie Dipper
Beyond semantics: the challenges of annotating pragmatic and discourse phenomena
The goal of this special issue is to show the challenges faced in reliably annotating abstract semantic and pragmatic information at both the sentence and discourse levels, and how those challenges are being met. Such information is freque…
Cheap Annotation of Complex Information: A Study on the Annotation of Information Status in German TEDx Talks
Annotating Metaphorical Mappings: An Implementation of Steen’s Five Step Method
This paper presents an implementation of the Five Step Method first proposed by Gerard J. Steen. It shows the utility of the method for annotating metaphorical language and the underlying conceptual mappings in religious texts from differe…
Metaphors of Religion
"The CRC studies the role of metaphor in religious meaning-making. In metaphors, meaning is transferred from one semantic domain to another. By adopting conceptual metaphor theory, the CRC seeks to more thoroughly understand this process a…
Reference Corpus of Early New High German (1350–1650)
The "Referenzkorpus Frühneuhochdeutsch" (ReF) is a corpus of diplomatically transcribed and annotated texts of Early New High German (1350–1650). It was created at the three project sites Bochum, Halle, and Potsdam and contains mor…
Towards a broad-coverage graphemic analysis of large historical corpora
This paper presents a method which we are developing to explore graphemic variation in large historical corpora of German. Historical corpora provide an amount of data at the level of graphemics which cannot be handled exhaustively using c…
Frontmatter
Special Issue on Indeterminacies and mismatches in grammatical systems
This issue
The Litkey Corpus: A richly annotated longitudinal corpus of German texts written by primary school children
Compared to early language development, later changes to the language system during orthography and literacy acquisition have not yet been researched in detail. We present a longitudinal corpus of texts on short picture stories written by …
The making of the Litkey Corpus, a richly annotated longitudinal corpus of German texts written by primary school children
To date, corpus and computational linguistic work on written language acquisition has mostly dealt with second language learners who have usually already mastered orthography acquisition in their first language. In this paper, we present t…
Variation between Different Discourse Types: Literate vs. Oral
This paper deals with the automatic identification of literate and oral discourse in German texts. A range of linguistic features is selected and their role in distinguishing between literate- and oral-oriented registers is investigated, u…
Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin
In this paper we describe a dataset of German and Latin ground truth (GT) for historical OCR in the form of printed text line images paired with their transcription. This dataset, called GT4HistOCR, consists of 313,173 li…
GT4HistOCR: Ground Truth for training OCR engines on historical documents in German Fraktur and Early Modern Latin
GT4HistOCR contains ground truth for research in Optical Character Recognition (OCR) technology applied to historical printings in German Fraktur and Early Modern Latin. The ground truth comes in pairs of images of single …
Anaphora With Non-nominal Antecedents in Computational Linguistics: a Survey
This article provides an extensive overview of the literature related to the phenomenon of non-nominal-antecedent anaphora (also known as abstract anaphora or discourse deixis), a type of anaphora in which an anaphor like “that” refers to …
5. Historische Linguistik 2.0
Abstract Pronominal Anaphors And Label Nouns In German And English: Selected Case Studies And Quantitative Investigations
Abstract anaphors refer to abstract referents, such as facts or events. This paper presents a corpus-based comparative study of German and English abstract anaphors. Parallel bi-directional texts from the Europarl Corpus were annotated with functio…
Variance in Historical Data: How bad is it and how can we profit from it for historical linguistics?
Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus
NLP applications for learners often rely on annotated learner corpora. It is therefore important that the annotations are meaningful for the task as well as consistent and reliable. We present a new longitudinal L1 learner corpus for German …
Investigating Diatopic Variation in a Historical Corpus
This paper investigates diatopic variation in a historical corpus of German. Based on equivalent word forms from different language areas, replacement rules and mappings are derived which describe the relations between these word forms. Th…