Marc Marone
mmBERT: A Modern Multilingual Encoder with Annealed Language Learning
Encoder-only language models are frequently used for a variety of standard machine learning tasks, including classification and retrieval. However, there has been a lack of recent research on encoder models, especially with respect to mu…
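The "annealed language learning" schedule can be illustrated with a short sketch (my own illustration, not the paper's released code; the corpus sizes, temperature endpoints, and linear schedule below are assumptions): sample training languages from size-proportional probabilities whose temperature is annealed toward uniform, upweighting low-resource languages late in training.

import random

# Hypothetical token counts per language (illustration only).
corpus_sizes = {"en": 1_000_000, "de": 200_000, "sw": 10_000}

def language_probs(sizes, tau):
    """Temperature-scaled sampling: p_i is proportional to size_i ** tau.
    tau = 1.0 reproduces size-proportional sampling; tau -> 0 approaches
    a uniform distribution over languages."""
    weights = {lang: n ** tau for lang, n in sizes.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}

def sample_language(sizes, step, total_steps, tau_start=1.0, tau_end=0.3):
    # Linearly anneal the temperature so that low-resource languages
    # receive more probability mass later in training.
    tau = tau_start + (tau_end - tau_start) * (step / total_steps)
    probs = language_probs(sizes, tau)
    langs, ps = zip(*probs.items())
    return random.choices(langs, weights=ps, k=1)[0]

print(sample_language(corpus_sizes, step=0, total_steps=100_000))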
Seq vs Seq: An Open Suite of Paired Encoders and Decoders
The large language model (LLM) community focuses almost exclusively on decoder-only language models, since they are easier to use for text generation. However, a large subset of the community still uses encoder-only models for tasks such a…
Certified Mitigation of Worst-Case LLM Copyright Infringement
The exposure of large language models (LLMs) to copyrighted material during pre-training raises concerns about unintentional copyright infringement post-deployment. This has driven the development of "copyright takedown" methods, post-trai…
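For intuition, takedown methods of this kind need a way to detect when a generation reproduces long verbatim spans of protected text. A minimal sketch of such a check (my own illustration, not the paper's certified procedure; the 8-word seed length is an arbitrary assumption):

import re

def words(text):
    # Crude normalization: lowercase, keep word characters only.
    return re.findall(r"[\w']+", text.lower())

def longest_verbatim_overlap(generation, reference, n=8):
    """Approximate length (in words) of the longest run in `generation`
    whose word n-grams all occur verbatim in `reference`."""
    gen, ref = words(generation), words(reference)
    ref_ngrams = {tuple(ref[i:i + n]) for i in range(len(ref) - n + 1)}
    best, i = 0, 0
    while i <= len(gen) - n:
        if tuple(gen[i:i + n]) in ref_ngrams:
            # Extend greedily while successive n-grams also hit.
            j = i
            while j <= len(gen) - n and tuple(gen[j:j + n]) in ref_ngrams:
                j += 1
            best = max(best, j - i + n - 1)
            i = j
        else:
            i += 1
    return best

gen = "he said it was the best of times it was the worst of times and left"
ref = "It was the best of times, it was the worst of times."
print(longest_verbatim_overlap(gen, ref))  # -> 12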
AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees
Large language models (LLMs) are increasingly capable of completing knowledge-intensive tasks by recalling information from a static pretraining corpus. Here we are concerned with LLMs in the context of evolving data requirements. For inst…
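The access-control mechanism suggested by the title can be sketched as a registry of per-group adapters over a frozen base model (an illustration under assumed names such as AdapterRegistry, not the paper's released code):

from dataclasses import dataclass, field

@dataclass
class AdapterRegistry:
    """Illustrative registry mapping data partitions (e.g. access-control
    groups) to independently trained adapter weights."""
    adapters: dict = field(default_factory=dict)

    def register(self, group: str, adapter_weights):
        self.adapters[group] = adapter_weights

    def remove(self, group: str):
        # Honoring a data-removal request only requires dropping (and,
        # if needed, retraining) the one adapter trained on that data;
        # the frozen base model is untouched.
        self.adapters.pop(group, None)

    def adapters_for(self, user_groups):
        # Serve a request using only adapters the user is cleared for.
        return [w for g, w in self.adapters.items() if g in user_groups]

registry = AdapterRegistry()
registry.register("public", {"lora_A": "...", "lora_B": "..."})
registry.register("hr_confidential", {"lora_A": "...", "lora_B": "..."})
print(len(registry.adapters_for({"public"})))  # -> 1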
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
To trust the fluent generations of large language models (LLMs), humans must be able to verify their correctness against trusted, external sources. Recent efforts, such as providing citations via retrieved documents or post-hoc provenance,…
Dated Data: Tracing Knowledge Cutoffs in Large Language Models
Released Large Language Models (LLMs) are often paired with a claimed knowledge cutoff date, i.e., the date at which the training data was gathered. Such information is crucial for applications where the LLM must provide up-to-date information. …
"According to ...": Prompting Language Models Improves Quoting from Pre-Training Data
Large Language Models (LLMs) may hallucinate and generate fake information, despite pre-training on factual data. Inspired by the journalistic device of "according to sources", we propose according-to prompting: directing LLMs to ground re…
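The device is easy to reproduce; below is a minimal sketch (the prompt wording and the crude n-gram overlap metric are illustrative stand-ins, not the paper's exact prompt or metric):

def according_to_prompt(question: str, source: str = "Wikipedia") -> str:
    # Append a grounding directive that steers the model toward quoting.
    return (f"{question} Respond to this question using only "
            f"information that can be attributed to {source}.")

def ngram_overlap(generation: str, corpus_text: str, n: int = 5) -> float:
    """Fraction of the generation's word n-grams found verbatim in the
    corpus -- a crude stand-in for a quoting-precision metric."""
    gen = generation.lower().split()
    grams = [tuple(gen[i:i + n]) for i in range(len(gen) - n + 1)]
    if not grams:
        return 0.0
    corpus = corpus_text.lower().split()
    corpus_grams = {tuple(corpus[i:i + n])
                    for i in range(len(corpus) - n + 1)}
    return sum(g in corpus_grams for g in grams) / len(grams)

print(according_to_prompt("What causes tides?"))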
Data Portraits: Recording Foundation Model Training Data
Foundation models are trained on increasingly immense and opaque datasets. Even as these models become central components of AI systems, it can be difficult to answer a straightforward question: has the model already encountered a given exa…
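A toy version of such a record conveys the idea: index character n-grams of the training text and answer verbatim-membership queries. This sketch uses a plain Python set where a real system would use a space-efficient structure such as a Bloom filter; the gram size and stride are assumptions.

def char_ngrams(text: str, n: int, stride: int):
    # Index non-overlapping character n-grams of the training text.
    return {text[i:i + n] for i in range(0, len(text) - n + 1, stride)}

class ToyPortrait:
    def __init__(self, documents, n=50):
        self.n = n
        self.index = set()
        for doc in documents:
            self.index |= char_ngrams(doc, n, stride=n)

    def contains(self, query: str) -> bool:
        """Does any indexed n-gram of the training data appear
        verbatim in `query`? (Queries slide with stride 1, so any
        copied span of >= 2n characters is guaranteed to hit.)"""
        return any(query[i:i + self.n] in self.index
                   for i in range(len(query) - self.n + 1))

portrait = ToyPortrait(["the quick brown fox jumps over the lazy dog " * 3],
                       n=10)
print(portrait.contains("he quick brown fox jum"))  # -> True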
The Effect of Alignment Correction on Cross-Lingual Annotation Projection
Cross-lingual annotation projection is a practical method for improving performance on low-resource structured prediction tasks. An important step in annotation projection is obtaining alignments between the source and target texts, which …
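The projection step itself is simple to sketch; the toy below assumes alignments are given as (source index, target index) pairs, while correcting noisy alignments, the paper's focus, is the hard part:

def project_labels(source_labels, alignments, target_len):
    """Project token-level labels (e.g. NER tags) from source to target
    through word alignments given as (source_idx, target_idx) pairs."""
    target_labels = ["O"] * target_len
    for s, t in alignments:
        if source_labels[s] != "O":
            target_labels[t] = source_labels[s]
    return target_labels

# English "Paris is beautiful" -> French "Paris est belle",
# with a made-up one-to-one word alignment.
src_labels = ["B-LOC", "O", "O"]
alignment = [(0, 0), (1, 1), (2, 2)]
print(project_labels(src_labels, alignment, target_len=3))
# -> ['B-LOC', 'O', 'O']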
Pretrained Models for Multilingual Federated Learning
Since the advent of Federated Learning (FL), research has applied these methods to natural language processing (NLP) tasks. Despite a plethora of papers on FL for NLP, no previous work has studied how multilingual text impacts FL algorit…
Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
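For context, a minimal sketch of federated averaging, the standard FL aggregation rule such studies start from (toy data, not the paper's experimental code):

import numpy as np

def fedavg(client_weights, client_sizes):
    """One FedAvg aggregation round: average client model parameters,
    weighted by local dataset size (McMahan et al., 2017)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three clients, each holding text in a different language, produce
# locally updated parameter vectors (toy 4-dim "models").
clients = [np.array([1.0, 0.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0, 0.0]),
           np.array([0.0, 0.0, 1.0, 0.0])]
sizes = [100, 50, 50]
print(fedavg(clients, sizes))  # -> [0.5  0.25 0.25 0.  ]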
Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction
Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English. While the advance of pretrained multi…
Character Eyes: Seeing Language through Character-Level Taggers
Character-level models have been used extensively in recent years in NLP tasks as both supplements and replacements for closed-vocabulary token-level word representations. In one popular architecture, character-level LSTMs are used to feed…
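The pattern the abstract describes, building word representations from character-level LSTMs, can be sketched in a few lines of PyTorch (an illustration of the general architecture, not the paper's exact model):

import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Builds a word representation from its characters: embed each
    character, run a BiLSTM, and concatenate the final hidden states."""
    def __init__(self, n_chars=128, char_dim=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim)
        self.lstm = nn.LSTM(char_dim, hidden, bidirectional=True,
                            batch_first=True)

    def forward(self, char_ids):                 # (batch, word_len)
        x = self.embed(char_ids)                 # (batch, word_len, char_dim)
        _, (h, _) = self.lstm(x)                 # h: (2, batch, hidden)
        return torch.cat([h[0], h[1]], dim=-1)   # (batch, 2 * hidden)

enc = CharWordEncoder()
word = torch.tensor([[ord(c) for c in "tagging"]])  # one word, ASCII ids
print(enc(word).shape)  # torch.Size([1, 128])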
Selecting, Planning, and Rewriting: A Modular Approach for Data-to-Document Generation and Translation
In this paper, we report our system submissions to all 6 tracks of the WNGT 2019 shared task on Document-Level Generation and Translation. The objective is to generate a textual document from either structured data (generation task) or a d…