Alyssa Lees
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
On the world wide web, toxic content detectors are a crucial line of defense against potentially hateful and offensive messages. As such, building highly effective classifiers that enable a safer internet is an important research area. Mor…
Lost in Distillation: A Case Study in Toxicity Modeling
In an era of increasingly large pre-trained language models, knowledge distillation is a powerful tool for transferring information from a large model to a smaller one. In particular, distillation is of tremendous benefit when it comes to …
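The abstract's core idea, transferring knowledge from a large teacher model to a smaller student, is most commonly implemented as a KL-divergence loss between temperature-softened output distributions. A minimal sketch of that standard formulation (the generic technique as introduced by Hinton et al., not necessarily the exact loss used in this paper):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T produces a softer distribution.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) over softened distributions, scaled by T^2
    # so gradients keep comparable magnitude across temperatures.
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's predictions
    return (T ** 2) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In practice this term is combined with the ordinary cross-entropy on hard labels; the temperature and mixing weight are hyperparameters.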
SemEval-2022 Task 5: Multimedia Automatic Misogyny Identification
The paper describes the SemEval-2022 Task 5: Multimedia Automatic Misogyny Identification (MAMI), which explores the detection of misogynous memes on the web by taking advantage of available texts and images. The task has been organised in …
Sentence/Table Pair Data from Wikipedia for Pre-training with Distant-Supervision
This is the dataset used for pre-training in "ReasonBERT: Pre-trained to Reason with Distant Supervision", EMNLP'21. There are two files: sentence_pairs_for_pretrain_no_tokenization.tar.gz -> Contains only sentences as evidence, Text-only t…
ReasonBERT: Pre-trained to Reason with Distant Supervision
We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts. Unlike existing pre-training methods that only harvest learning signals…
TURL: Table Understanding through Representation Learning
Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, there has been tremendous progress on a variety of tasks in the area of table understanding. However, existing work generally relies on heav…
Embedding Semantic Taxonomies
A common step in developing an understanding of a vertical domain, e.g. shopping, dining, movies, medicine, etc., is curating a taxonomy of categories specific to the domain. These human created artifacts have been the subject of research …
Jigsaw @ AMI and HaSpeeDe2: Fine-Tuning a Pre-Trained Comment-Domain BERT Model
The Google Jigsaw team produced submissions for two of the EVALITA 2020 (Basile et al. 2020) shared tasks, based in part on the technology that powers the publicly available PerspectiveAPI comment evaluation service. We present a basic des…
What is Fair? Exploring Pareto-Efficiency for Fairness Constrained Classifiers
The potential for learned models to amplify existing societal biases has been broadly recognized. Fairness-aware classifier constraints, which apply equality metrics of performance across subgroups defined on sensitive attributes such as r…
Fairness Sample Complexity and the Case for Human Intervention
With the aim of building machine learning systems that incorporate standards of fairness and accountability, we explore explicit subgroup sample complexity bounds. The work is motivated by the observation that classifier predictions for re…