Jamie Kiros
Neuromodulatory Control Networks (NCNs): A Biologically Inspired Architecture for Dynamic LLM Processing
Large Language Models (LLMs) based on the Transformer architecture have achieved remarkable success, yet their core processing mechanisms remain largely static after training. While powerful, this static nature limits their ability to dyna…
Generate, Annotate, and Learn: NLP with Synthetic Text
This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We formulate a general framework called “generate, annotate, and learn (GAL)” to take advantage of synthetic text within knowledge distillation,…
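The GAL recipe is easy to state concretely. Below is a minimal sketch of one round, assuming stand-in callables rather than the paper's actual models: a generator samples synthetic text, a trained teacher annotates it, and a student is fit on the pseudo-labeled pairs.

```python
import random

# A minimal sketch of one "generate, annotate, and learn" round. The
# generator, teacher, and student below are hypothetical stand-ins.
def gal_round(generate, teacher, train_student, n_synthetic=1000):
    synthetic_texts = [generate() for _ in range(n_synthetic)]        # generate
    pseudo_labels = [teacher(x) for x in synthetic_texts]             # annotate
    return train_student(list(zip(synthetic_texts, pseudo_labels)))   # learn

# Toy stand-ins: a fixed-vocabulary sampler, a length-based "teacher",
# and a "student" that simply memorizes the pseudo-labeled pairs.
texts = ["short text", "a somewhat longer synthetic sentence"]
student = gal_round(
    generate=lambda: random.choice(texts),
    teacher=lambda x: int(len(x.split()) > 3),
    train_student=dict,
    n_synthetic=10,
)
print(student)
```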
Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation
Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often hinges on the availability of task-specific unlabeled data. Knowledge distillation (KD) has enabled compressing deep networks and ensembles…
Improving domain adaptation in de-identification of electronic health records through self-training
Objective: De-identification is a fundamental task in electronic health records to remove protected health information entities. Deep learning models have proven to be promising tools to automate de-identification processes. However, when t…
Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels
A channel corresponds to a viewpoint or transformation of an underlying meaning. A pair of parallel sentences in English and French express the same underlying meaning, but through two separate channels corresponding to their languages. In…
Contextual Lensing of Universal Sentence Representations
What makes a universal sentence encoder universal? The notion of a generic encoder of text appears to be at odds with the inherent contextualization and non-permanence of language use in a dynamic world. However, mapping sentences into gen…
An Empirical Study of Generation Order for Machine Translation
In this work, we present an empirical study of generation order for machine translation. Building on recent advances in insertion-based modeling, we first introduce a soft order-reward framework that enables us to train models to follow ar…
Graph Normalizing Flows
We introduce graph normalizing flows: a new, reversible graph neural network model for prediction and generation. On supervised tasks, graph normalizing flows perform similarly to message passing neural networks, but at a significantly red…
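Reversibility is the property that lets a flow-based graph model serve both prediction and generation. A minimal sketch of the kind of exactly invertible step such a model is built from, assuming an additive coupling over node features with a stand-in `mlp` in place of the actual message-passing update:

```python
import numpy as np

# Split node features in half and let one half additively update the other,
# so the step can be inverted exactly. `mlp` is a stand-in for the model's
# message-passing function; the real architecture differs.
def coupling_forward(h, mlp):
    h1, h2 = np.split(h, 2, axis=-1)     # requires an even feature dimension
    return np.concatenate([h1, h2 + mlp(h1)], axis=-1)

def coupling_inverse(h, mlp):
    h1, h2 = np.split(h, 2, axis=-1)
    return np.concatenate([h1, h2 - mlp(h1)], axis=-1)

# Invertibility check: 2 nodes with 4 features each, tanh as a toy update.
h = np.arange(8.0).reshape(2, 4)
mlp = np.tanh
assert np.allclose(coupling_inverse(coupling_forward(h, mlp), mlp), h)
```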
DOM-Q-NET: Grounded RL on Structured Language
Building agents to interact with the web would allow for significant improvements in knowledge understanding and representation learning. However, web navigation tasks are difficult for current deep reinforcement learning (RL) models due t…
ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning
Sparse reward is one of the most challenging problems in reinforcement learning (RL). Hindsight Experience Replay (HER) attempts to address this issue by converting a failed experience to a successful one by relabeling the goals. Despite i…
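The goal-relabeling idea underlying HER, which ACTRCE builds on, fits in a few lines. A toy sketch with illustrative names, using plain states as goals where ACTRCE would use natural-language goals informed by a teacher's advice:

```python
# Every transition of a failed episode is relabeled with a goal the agent
# actually reached, turning the failure into a useful "successful" experience.
def relabel_with_hindsight(trajectory, reward_fn):
    """trajectory: list of (state, action, next_state) tuples."""
    achieved_goal = trajectory[-1][-1]   # the final state becomes the new goal
    return [(s, a, achieved_goal, reward_fn(s2, achieved_goal), s2)
            for s, a, s2 in trajectory]

# Usage: integer states, reward 1.0 only when the relabeled goal is reached.
episode = [(0, "right", 1), (1, "right", 2), (2, "right", 3)]
print(relabel_with_hindsight(episode, lambda s2, g: float(s2 == g)))
```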
Insertion Transformer: Flexible Sequence Generation via Insertion Operations
We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations. Unlike typical autoregressive models which rely on a fixed, often left-to-right ordering of the outpu…
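The insertion operation itself is simple to illustrate. Below is a minimal decoding-loop sketch, assuming a hypothetical `model` that returns a (slot, token) pair; it stands in for the Insertion Transformer network, which this sketch does not implement.

```python
def insertion_decode(model, max_steps=32, end_token="<eos>"):
    seq = []
    for _ in range(max_steps):
        slot, token = model(seq)   # choose one of len(seq) + 1 slots and a token
        if token == end_token:     # the model signals that the sequence is done
            break
        seq.insert(slot, token)
    return seq

# Toy "model" that builds the word "hello" from the middle outward,
# illustrating that generation need not be left-to-right.
script = iter([(0, "l"), (0, "e"), (2, "l"), (0, "h"), (4, "o"), (0, "<eos>")])
print("".join(insertion_decode(lambda seq: next(script))))
```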
Proceedings of The Third Workshop on Representation Learning for NLP
Despite the popularity of word embeddings, the precise way by which they acquire semantic relations between words remains unclear. In the present article, we investigate whether LSA's and word2vec's capacity to identify relevant semantic relatio…
InferLite: Simple Universal Sentence Representations from Natural Language Inference Data
Natural language inference has been shown to be an effective supervised task for learning generic sentence embeddings. In order to better understand the components that lead to effective representations, we propose a lightweight version of…
Illustrative Language Understanding: Large-Scale Visual Grounding with Image Search
We introduce Picturebook, a large-scale lookup operation to ground language via ‘snapshots’ of our physical world accessed through image search. For each word in a vocabulary, we extract the top-k images from Google image search and feed t…
VSE++: Improving Visual-Semantic Embeddings with Hard Negatives
We present a new technique for learning visual-semantic embeddings for cross-modal retrieval. Inspired by hard negative mining, the use of hard negatives in structured prediction, and ranking loss functions, we introduce a simple change to…
VSE++: Improved Visual-Semantic Embeddings
This paper investigates the problem of image-caption retrieval using joint visual-semantic embeddings. We introduce a very simple change to the loss function used in the original formulation by Kiros et al. (2014), which leads to drastic i…
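The loss change referred to is a move from summing hinge terms over all negatives to keeping only the hardest negative in the mini-batch. A NumPy sketch of such a max-of-hinges ranking loss, with illustrative shapes and names:

```python
import numpy as np

# Max-of-hinges ranking loss in the spirit of VSE++: only the hardest
# negative in the batch contributes, instead of a sum over all negatives.
def max_hinge_loss(im, cap, margin=0.2):
    """im, cap: L2-normalized embeddings, both (batch, dim);
    row i of `im` matches row i of `cap`."""
    scores = im @ cap.T                  # scores[i, j] = similarity(im_i, cap_j)
    pos = np.diag(scores)                # similarities of the matching pairs
    cost_cap = np.maximum(0.0, margin + scores - pos[:, None])  # wrong captions
    cost_im = np.maximum(0.0, margin + scores - pos[None, :])   # wrong images
    np.fill_diagonal(cost_cap, 0.0)      # a pair is not its own negative
    np.fill_diagonal(cost_im, 0.0)
    return cost_cap.max(axis=1).mean() + cost_im.max(axis=0).mean()
```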
Joint Embeddings of Scene Graphs and Images
Belilovsky E., Blaschko M., Kiros J.R., Urtasun R., Zemel R., "Joint embeddings of scene graphs and images", 5th International Conference on Learning Representations, Workshop Track (ICLR 2017), 5 pp., April 24-26, 2017, Toulon, France.
Layer Normalization
Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distributi…
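The technique is compact enough to state directly: each example is normalized with statistics computed over its own features rather than over the batch, so the computation is identical at training and test time. A minimal NumPy sketch:

```python
import numpy as np

# Layer normalization: per-example mean and variance over the features,
# followed by a learned per-feature gain and bias.
def layer_norm(x, gain, bias, eps=1e-5):
    """x: (batch, features); gain, bias: learned (features,) parameters."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias
```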
Towards Generalizable Sentence Embeddings
In this work, we evaluate different sentence encoders with emphasis on examining their embedding spaces. Specifically, we hypothesize that a "high-quality" embedding aids in generalization, promoting transfer learning as well as zero-shot a…