Explanipedia

Intent-Aware Schema Generation And Refinement For Literature Review Tables Open

Vishakh Padmakumar, Joseph Chee Chang, Kyle Lo, Doug Downey, Aakanksha Naik · 2025

The increasing volume of academic literature makes it essential for researchers to organize, compare, and contrast collections of documents. Large language models (LLMs) can support this process by generating schemas defining shared aspect…

Ai2 Scholar QA: Organized Literature Synthesis with Attribution Open

Amanpreet Singh, Joseph Chee Chang, Chloe Anastasiades, Dany Haddad, Aakanksha Naik, et al. · 2025

Retrieval-augmented generation is increasingly effective in answering scientific questions from literature, but many state-of-the-art systems are expensive and closed-source. We introduce Ai2 Scholar QA, a free online scientific question a…

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Open

Akari Asai, Jacqueline He, Rulin Shao, Weijia Shi, Amanpreet Singh, et al. · 2024

Computer science

Scientific progress depends on researchers' ability to synthesize the growing body of literature. Can large language models (LMs) assist scientists in this task? We introduce OpenScholar, a specialized retrieval-augmented LM that answers s…

The Semantic Reader Project Open

Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, et al. · 2024

Computer science

Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the greater the need for new technology to support s…

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature Open

David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, et al. · 2024

Computer science Psychology Philosophy

We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following instances for training and evaluation, covering 54 tasks. These tasks span five core scientific literature understan…

TOPICAL: TOPIC Pages AutomagicaLly Open

John Giorgi, Amanpreet Singh, Doug Downey, Sergey Feldman, Lucy Lu Wang · 2024

Computer science

Topic pages aggregate useful information about an entity or concept into a single succinct and accessible article. Automated creation of topic pages would enable their rapid curation as information resources, providing an alternative to tr…

MARG: Multi-Agent Review Generation for Scientific Papers Open

Mike D’Arcy, Tom Hope, Larry Birnbaum, Doug Downey · 2024

Computer science Psychology Political science

We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. By distributing paper text across agents, MARG can co…

CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies Open

Arie Cattan, Tom Hope, Doug Downey, Roy Bar-Haim, Lilach Eden, et al. · 2023

Computer science Mathematics Economics

Various NLP tasks require a complex hierarchical structure over nodes, where each node is a cluster of items. Examples include generating entailment graphs, hierarchical cross-document coreference resolution, annotating event and subevent …

CARE: Extracting Experimental Findings From Clinical Literature Open

Aakanksha Naik, Bailey Kuehl, Erin Bransom, Doug Downey, Tom Hope · 2023

Computer science Mathematics Geography

Extracting fine-grained experimental findings from literature can provide dramatic utility for scientific applications. Prior work has developed annotation schemas and datasets for limited aspects of this problem, failing to capture the re…

A Computational Inflection for Scientific Discovery Open

Tom Hope, Doug Downey, Daniel S. Weld, Oren Etzioni, Eric Horvitz · 2023

Computer science Psychology

Enabling researchers to leverage systems to overcome the limits of human cognitive capacity.

ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews Open

Mike D’Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, et al. · 2023

Computer science Economics

We introduce the task of automatically revising scientific papers based on peer feedback and release ARIES, a dataset of review comments and their corresponding paper edits. The data is drawn from real reviewer-author interactions from com…

Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents Open

Catherine Chen, Zejiang Shen, Dan Klein, Gabriel Stanovsky, Doug Downey, et al. · 2023

Computer science Engineering Mathematics

Recent work has shown that infusing layout features into language models (LMs) improves processing of visually-rich documents such as scientific papers. Layout-infused LMs are often evaluated on documents with familiar layout features (e.g…

SciMON: Scientific Inspiration Machines Optimized for Novelty Open

Qingyun Wang, Doug Downey, Heng Ji, Tom Hope · 2023

Computer science Psychology Biology

We explore and enhance the ability of neural language models to generate novel scientific directions grounded in literature. Work on literature-based hypothesis generation has traditionally focused on binary link prediction--severely limit…

S2abEL: A Dataset for Entity Linking from Scientific Tables Open

Yuze Lou, Bailey Kuehl, Erin Bransom, Sergey Feldman, Aakanksha Naik, et al. · 2023

Computer science Geology Economics

Entity linking (EL) is the task of linking a textual mention to its corresponding entry in a knowledge base, and is critical for many knowledge-intensive NLP applications. When applied to tables in scientific papers, EL is a step toward la…

Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections Open

Srishti Palani, Aakanksha Naik, Doug Downey, Amy X. Zhang, Jonathan Bragg, et al. · 2023

Computer science Psychology Engineering

Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers. As scientific literature grows, this becomes increasingly challenging. Meanwhile, authors summarize p…

CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context Open

Joseph Chee Chang, Amy X. Zhang, Jonathan Bragg, Andrew Head, Kyle Lo, et al. · 2023

Computer science History Psychology

When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of the hundreds of citations encountered du…

Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks Open

Zejiang Shen, Tal August, Pao Siangliulue, Kyle Lo, Jonathan Bragg, et al. · 2023

Computer science Psychology Political science

Large language models have introduced exciting new opportunities and challenges in designing and developing new AI-assisted writing support tools. Recent work has shown that leveraging this new technology can transform writing in many scen…

LIMEADE: From AI Explanations to Advice Taking Open

Benjamin Charles Germain Lee, Doug Downey, Kyle Lo, Daniel S. Weld · 2023

Computer science Psychology Philosophy

Research in human-centered AI has shown the benefits of systems that can explain their predictions. Methods that allow AI to take advice from humans in response to explanations are similarly useful. While both capabilities are well develop…

The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces Open

Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, et al. · 2023

Computer science Political science

Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading p…

The Semantic Scholar Open Data Platform Open

Rodney Kinney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan Bragg, et al. · 2023

Computer science Mathematics

The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field. Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping…

PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents Open

Kyle Lo, Zejiang Shen, Benjamin J. Newman, Joseph Chang, Russell Authur, et al. · 2023

Computer science Art Psychology

Kyle Lo, Zejiang Shen, Benjamin Newman, Joseph Chang, Russell Authur, Erin Bransom, Stefan Candra, Yoganand Chandrasekhar, Regan Huff, Bailey Kuehl, Amanpreet Singh, Chris Wilhelm, Angele Zamarron, Marti A. Hearst, Daniel Weld, Doug Downey…

Embedding Recycling for Language Models Open

Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Mike D’Arcy, Arman Cohan, et al. · 2023

Computer science Biology Philosophy

Real-world applications of neural language models often involve running many different models over the same corpus. The high computational cost of these runs has led to interest in techniques that can reuse the contextualized embeddings pr…

CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies Open

Arie Cattan, Tom Hope, Doug Downey, Roy Bar-Haim, Lilach Eden, et al. · 2023

Computer science Economics

Arie Cattan, Tom Hope, Doug Downey, Roy Bar-Haim, Lilach Eden, Yoav Kantor, Ido Dagan. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2023.

Doug Downey