Antonis Maronikolakis
Temporal Robustness in Hate Speech Detection: Updating German Classifiers with Advanced AI Infrastructures
Over the past two decades, hate speech on social media has surged, causing significant harm and threatening democracies. Initially, research focused on English hate speech, but recent years have seen the development of non-English datasets…
A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities
Hate speech online remains an understudied issue for marginalized communities, particularly in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized commun…
What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain
Large language models (LLMs) are poised to revolutionize the domain of online fashion retail, enhancing customer experience and discovery of fashion online. LLM-powered conversational agents introduce a new way of discovery by directly int…
Politeness Stereotypes and Attack Vectors: Gender Stereotypes in Japanese and Korean Language Models
In efforts to keep up with the rapid progress and use of large language models, gender bias research is becoming more prevalent in NLP. Non-English bias research, however, is still in its infancy with most work focusing on English. In our …
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks
We introduce HATELEXICON, a lexicon of slurs and targets of hate speech for the countries of Brazil, Germany, India and Kenya, to aid training and interpretability of models. We demonstrate how our lexicon can be used to interpret model pr…
Ethical scaling for content moderation: Extreme speech and the (in)significance of artificial intelligence
In this article, we present new empirical evidence to demonstrate the severe limitations of existing machine learning content moderation methods to keep pace with, let alone stay ahead of, hateful language online. Building on the collabora…
This joke is [MASK]: Recognizing Humor and Offense with Prompting
Humor is a magnetic component in everyday human interactions and communications. Computationally modeling humor enables NLP systems to entertain and engage with users. We investigate the effectiveness of prompting, a new transfer learning …
Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes
To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis. When it comes to analysis of bias, previous work has focused predominantly on race. In our work, we further investigate bias in hate…
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)
Warning: This work contains strong and offensive language, sometimes uncensored. To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis. When it comes to analysis of bias, previous work has …
Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments
Building on current work on multilingual hate speech (e.g., Ousidhoum et al. (2019)) and hate speech reduction (e.g., Sap et al. (2020)), we present XTREMESPEECH, a new hate speech dataset containing 20,297 social media passages from Brazi…
Separating Hate Speech and Offensive Language Classes via Adversarial Debiasing
Research to tackle hate speech plaguing online media has made strides in providing solutions, analyzing bias and curating data. A challenging problem is ambiguity between hate speech and offensive language, causing low performance both ove…
Wine is Not v i n. -- On the Compatibility of Tokenizations Across Languages
The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements. Typically, subword tokenization algorithms such as byte pair encoding and WordPiece are us…
Identifying Automatically Generated Headlines using Transformers
False information spread via the internet and social media influences public opinion and user activity, while generative models enable fake content to be generated faster and more cheaply than had previously been possible. In the not so di…
Artificial Intelligence, Extreme Speech and the Challenges of Online Content Moderation
Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
Welcome to the fourth edition of the Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda. This is the second time we are running the workshop virtually, due to the COVID-19 pandemic. The pandemic has had a profou…
Wine is not v i n. On the Compatibility of Tokenizations across Languages
The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements. Typically, subword tokenization algorithms such as byte pair encoding and WordPiece are us…
BERT Cannot Align Characters
In previous work, it has been shown that BERT can adequately align cross-lingual sentences on the word level. Here we investigate whether BERT can also operate as a char-level aligner. The languages examined are English, Fake-English, Germ…
Transformers Are Better Than Humans at Identifying Generated Text.
Fake information spread via the internet and social media influences public opinion and user activity. Generative models enable fake content to be generated faster and more cheaply than had previously been possible. This paper examines the…
Analyzing Political Parody in Social Media
Parody is a figurative device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts. In this paper, we present the first computational study o…