E. Presani
Introducing v0.5 of the AI Safety Benchmark from MLCommons
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models.…
ROBBIE: Robust Bias Evaluation of Large Generative Language Models
David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Smith. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 20…
As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across mult…
"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
As language models grow in popularity, it becomes increasingly important to clearly measure all possible markers of demographic identity in order to avoid perpetuating existing societal harms. Many datasets for measuring bias currently exi…
Bringing Citations and Usage Metrics Together to Make Data Count
Over the last years, many organizations have been working on infrastructure to facilitate sharing and reuse of research data. This means that researchers now have ways of making their data available, but not necessarily incentives to do so…