Andrew Piper
NarraBench: A Comprehensive Framework for Narrative Benchmarking Open
We present NarraBench, a theory-informed taxonomy of narrative-understanding tasks, as well as an associated survey of 78 existing benchmarks in the area. We find significant need for new evaluations covering aspects of narrative understan…
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages Open
People worldwide use language in subtle and complex ways to express emotions. Although emotion recognition--an umbrella term for several NLP tasks--impacts various applications within NLP and beyond, most work in this area has focused on h…
Mini Worldlit: A Dataset of Contemporary Fiction from 13 Countries, Nine Languages, and Five Continents Open
World literature plays a key role in understanding the global diversity of human storytelling. However, datasets suitable for large-scale cross-cultural analysis remain limited. Responding to the increasing digitization of literary texts a…
The Detection and Understanding of Fictional Discourse Open
In this paper, we present a variety of classification experiments related to the task of fictional discourse detection. We utilize a diverse array of datasets, including contemporary professionally published fiction, historical fiction fro…
Where Do People Tell Stories Online? Story Detection Across Online Communities Open
Story detection in online communities is a challenging task as stories are scattered across communities and interwoven with non-storytelling spans within a single text. We address this challenge by building and releasing the StorySeeker to…
Supplementary Data for "What do characters do? The Embodied Agency of Fictional Characters" (JCLS 2023) Open
Supplementary Data and Code for Andrew Piper, "What do characters do? The Embodied Agency of Fictional Characters" (JCLS 2023).
Non-monetary narratives motivate businesses to engage with climate change Open
The dominant narrative to motivate business actors to take climate actions emphasizes opportunities to increase monetary gains, linking sustainability to the financial goals of these organizations. The prevalence of monetary motivations in…
Towards a Data-Driven Theory of Narrativity Open
Pre-print version of published article.
A quantitative study of non-linearity in storytelling Open
In this paper, we present a study of non-linearity in storytelling in a collection of 2,348 books published since 2001 that are divided among 10 different categories. We employ word embeddings to capture the semantic non-linearity of a boo…
Computational Narrative Understanding: A Big Picture Analysis Open
This paper provides an overview of outstanding major research goals for the field of computational narrative understanding. Storytelling is an essential human practice, one that provides a sense of personal meaning, shared sense of communi…
MultiHATHI: A Complete Collection of Multilingual Prose Fiction in the HathiTrust Digital Library Open
This dataset provides detailed metadata on ca. 10.2 million works of fiction and non-fiction written after 1799 in 521 different languages available in the HathiTrust Digital Library. The dataset bolsters the May 2022 Hathifile by supplyin…
A quantitative study of non-linearity in storytelling Open
Data accompanying the article, "A quantitative study of non-linearity in storytelling"
The TRANSCOMP Dataset of Literary Translations from 120 Languages and a Parallel Collection of English-language Originals Open
The TRANSCOMP Dataset of Literary Translations is a collection of document-level word frequencies sampled from 10,631 translations into English of global literary fiction published since 1950, together with a historically matched parallel …
The TRANSCOMP Dataset of Literary Translations from 120 Languages and a Parallel Collection of English-language Originals Open
The TRANSCOMP Dataset of Literary Translations is a collection of document-level word frequencies sampled from over 10,000 translations into English of global literary fiction published since 1950, together with a historically matched para…
The COVID That Wasn't: Counterfactual Journalism Using GPT Open
In this paper, we explore the use of large language models to assess human interpretations of real world events. To do so, we use a language model trained prior to 2020 to artificially generate news articles concerning COVID-19 given the h…
Replication Data for "Biodiversity is not declining in fiction" Open
This repository provides data and code to support the replication of the paper "Biodiversity is not declining in fiction."
Biodiversity is not declining in fiction Open
This paper attempts to replicate the findings of the recent work, “The rise and fall of biodiversity in literature,” by Langer et al. (2021). Using a large corpus from Project Gutenberg (N = ~15,000) and a dictionary-matching method of ove…
Buying the news: A quantitative study of the effects of corporate acquisition on local news Open
Local newspapers are increasingly subject to predatory corporate acquisition—corporate takeovers in which media conglomerates purchase publications in financially precarious states, drastically cut staff, and in certain cases consolidate n…
HATHI 1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust Open
We present a new dataset built on prior work consisting of 1,671,370 randomly sampled pages of English-language prose roughly divided between modes of fictional and non-fictional writing and published between the years 1800 and 2000. In ad…
CONLIT Open
This dataset includes derived data on a collection of ca. 2,700 books in English published between 2001 and 2021 and spanning twelve different genres. The data was manually collected to capture popular writing aimed at a range of different rea…