Jamie Kiros
Neuromodulatory Control Networks (NCNs): A Biologically Inspired Architecture for Dynamic LLM Processing
Large Language Models (LLMs) based on the Transformer architecture have achieved remarkable success, yet their core processing mechanisms remain largely static after training. While powerful, this static nature limits their ability to dyna…
Generate, Annotate, and Learn: NLP with Synthetic Text
This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We formulate a general framework called “generate, annotate, and learn (GAL)” to take advantage of synthetic text within knowledge distillation,…
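The GAL recipe is easy to state concretely. Below is a minimal sketch of one round, assuming stand-in callables rather than the paper's actual models: a generator samples synthetic text, a trained teacher annotates it, and a student is fit on the pseudo-labeled pairs.

```python
import random

# A minimal sketch of one "generate, annotate, and learn" round. The
# generator, teacher, and student below are hypothetical stand-ins.
def gal_round(generate, teacher, train_student, n_synthetic=1000):
    synthetic_texts = [generate() for _ in range(n_synthetic)]        # generate
    pseudo_labels = [teacher(x) for x in synthetic_texts]             # annotate
    return train_student(list(zip(synthetic_texts, pseudo_labels)))   # learn

# Toy stand-ins: a fixed-vocabulary sampler, a length-based "teacher",
# and a "student" that simply memorizes the pseudo-labeled pairs.
texts = ["short text", "a somewhat longer synthetic sentence"]
student = gal_round(
    generate=lambda: random.choice(texts),
    teacher=lambda x: int(len(x.split()) > 3),
    train_student=dict,
    n_synthetic=10,
)
print(student)
```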
Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation
Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often hinges on the availability of task-specific unlabeled data. Knowledge distillation (KD) has enabled compressing deep networks and ensembles…
Improving domain adaptation in de-identification of electronic health records through self-training
Objective: De-identification is a fundamental task in electronic health records to remove protected health information entities. Deep learning models have proven to be promising tools to automate de-identification processes. However, when t…
Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels
A channel corresponds to a viewpoint or transformation of an underlying meaning. A pair of parallel sentences in English and French express the same underlying meaning, but through two separate channels corresponding to their languages. In…
Contextual Lensing of Universal Sentence Representations
What makes a universal sentence encoder universal? The notion of a generic encoder of text appears to be at odds with the inherent contextualization and non-permanence of language use in a dynamic world. However, mapping sentences into gen…
An Empirical Study of Generation Order for Machine Translation
In this work, we present an empirical study of generation order for machine translation. Building on recent advances in insertion-based modeling, we first introduce a soft order-reward framework that enables us to train models to follow ar…
Graph Normalizing Flows
We introduce graph normalizing flows: a new, reversible graph neural network model for prediction and generation. On supervised tasks, graph normalizing flows perform similarly to message passing neural networks, but at a significantly red…
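Reversibility is the property that lets a flow-based graph model serve both prediction and generation. A minimal sketch of the kind of exactly invertible step such a model is built from, assuming an additive coupling over node features with a stand-in `mlp` in place of the actual message-passing update:

```python
import numpy as np

# Split node features in half and let one half additively update the other,
# so the step can be inverted exactly. `mlp` is a stand-in for the model's
# message-passing function; the real architecture differs.
def coupling_forward(h, mlp):
    h1, h2 = np.split(h, 2, axis=-1)     # requires an even feature dimension
    return np.concatenate([h1, h2 + mlp(h1)], axis=-1)

def coupling_inverse(h, mlp):
    h1, h2 = np.split(h, 2, axis=-1)
    return np.concatenate([h1, h2 - mlp(h1)], axis=-1)

# Invertibility check: 2 nodes with 4 features each, tanh as a toy update.
h = np.arange(8.0).reshape(2, 4)
mlp = np.tanh
assert np.allclose(coupling_inverse(coupling_forward(h, mlp), mlp), h)
```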
DOM-Q-NET: Grounded RL on Structured Language
Building agents to interact with the web would allow for significant improvements in knowledge understanding and representation learning. However, web navigation tasks are difficult for current deep reinforcement learning (RL) models due t…
ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning
Sparse reward is one of the most challenging problems in reinforcement learning (RL). Hindsight Experience Replay (HER) attempts to address this issue by converting a failed experience to a successful one by relabeling the goals. Despite i…
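The goal-relabeling idea underlying HER, which ACTRCE builds on, fits in a few lines. A toy sketch with illustrative names, using plain states as goals where ACTRCE would use natural-language goals informed by a teacher's advice:

```python
# Every transition of a failed episode is relabeled with a goal the agent
# actually reached, turning the failure into a useful "successful" experience.
def relabel_with_hindsight(trajectory, reward_fn):
    """trajectory: list of (state, action, next_state) tuples."""
    achieved_goal = trajectory[-1][-1]   # the final state becomes the new goal
    return [(s, a, achieved_goal, reward_fn(s2, achieved_goal), s2)
            for s, a, s2 in trajectory]

# Usage: integer states, reward 1.0 only when the relabeled goal is reached.
episode = [(0, "right", 1), (1, "right", 2), (2, "right", 3)]
print(relabel_with_hindsight(episode, lambda s2, g: float(s2 == g)))
```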
Insertion Transformer: Flexible Sequence Generation via Insertion Operations
We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations. Unlike typical autoregressive models which rely on a fixed, often left-to-right ordering of the outpu…
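The insertion operation itself is simple to illustrate. Below is a minimal decoding-loop sketch, assuming a hypothetical `model` that returns a (slot, token) pair; it stands in for the Insertion Transformer network, which this sketch does not implement.

```python
def insertion_decode(model, max_steps=32, end_token="<eos>"):
    seq = []
    for _ in range(max_steps):
        slot, token = model(seq)   # choose one of len(seq) + 1 slots and a token
        if token == end_token:     # the model signals that the sequence is done
            break
        seq.insert(slot, token)
    return seq

# Toy "model" that builds the word "hello" from the middle outward,
# illustrating that generation need not be left-to-right.
script = iter([(0, "l"), (0, "e"), (2, "l"), (0, "h"), (4, "o"), (0, "<eos>")])
print("".join(insertion_decode(lambda seq: next(script))))
```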
Proceedings of The Third Workshop on Representation Learning for NLP
Despite the popularity of word embeddings, the precise way by which they acquire semantic relations between words remains unclear. In the present article, we investigate whether LSA's and word2vec's capacity to identify relevant semantic relatio…
InferLite: Simple Universal Sentence Representations from Natural Language Inference Data
Natural language inference has been shown to be an effective supervised task for learning generic sentence embeddings. In order to better understand the components that lead to effective representations, we propose a lightweight version of…
Illustrative Language Understanding: Large-Scale Visual Grounding with Image Search
We introduce Picturebook, a large-scale lookup operation to ground language via ‘snapshots’ of our physical world accessed through image search. For each word in a vocabulary, we extract the top-k images from Google image search and feed t…
VSE++: Improving Visual-Semantic Embeddings with Hard Negatives
We present a new technique for learning visual-semantic embeddings for cross-modal retrieval. Inspired by hard negative mining, the use of hard negatives in structured prediction, and ranking loss functions, we introduce a simple change to…
VSE++: Improved Visual-Semantic Embeddings
This paper investigates the problem of image-caption retrieval using joint visual-semantic embeddings. We introduce a very simple change to the loss function used in the original formulation by Kiros et al. (2014), which leads to drastic i…
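The loss change referred to is a move from summing hinge terms over all negatives to keeping only the hardest negative in the mini-batch. A NumPy sketch of such a max-of-hinges ranking loss, with illustrative shapes and names:

```python
import numpy as np

# Max-of-hinges ranking loss in the spirit of VSE++: only the hardest
# negative in the batch contributes, instead of a sum over all negatives.
def max_hinge_loss(im, cap, margin=0.2):
    """im, cap: L2-normalized embeddings, both (batch, dim);
    row i of `im` matches row i of `cap`."""
    scores = im @ cap.T                  # scores[i, j] = similarity(im_i, cap_j)
    pos = np.diag(scores)                # similarities of the matching pairs
    cost_cap = np.maximum(0.0, margin + scores - pos[:, None])  # wrong captions
    cost_im = np.maximum(0.0, margin + scores - pos[None, :])   # wrong images
    np.fill_diagonal(cost_cap, 0.0)      # a pair is not its own negative
    np.fill_diagonal(cost_im, 0.0)
    return cost_cap.max(axis=1).mean() + cost_im.max(axis=0).mean()
```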
Joint Embeddings of Scene Graphs and Images
Belilovsky E., Blaschko M., Kiros J.R., Urtasun R., Zemel R., "Joint embeddings of scene graphs and images", 5th International Conference on Learning Representations, Workshop Track (ICLR 2017), 5 pp., April 24-26, 2017, Toulon, France.
Layer Normalization
Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distributi…
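The technique is compact enough to state directly: each example is normalized with statistics computed over its own features rather than over the batch, so the computation is identical at training and test time. A minimal NumPy sketch:

```python
import numpy as np

# Layer normalization: per-example mean and variance over the features,
# followed by a learned per-feature gain and bias.
def layer_norm(x, gain, bias, eps=1e-5):
    """x: (batch, features); gain, bias: learned (features,) parameters."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias
```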
Towards Generalizable Sentence Embeddings
In this work, we evaluate different sentence encoders with emphasis on examining their embedding spaces. Specifically, we hypothesize that a "high-quality" embedding aids in generalization, promoting transfer learning as well as zero-shot a…