Explanipedia

Adaptive Accompaniment with ReaLchords Open

Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, et al. · 2025

Jamming requires coordination, anticipation, and collaborative creativity between musicians. Current generative models of music produce expressive output but are not able to generate in an online manner, meaning simultaneously with …

Zero-shot Cross-lingual Voice Transfer for TTS Open

Fadi Biadsy, Youzheng Chen, Isaac Elias, Kyle Kastner, Gary Wang, et al. · 2024

In this paper, we introduce a zero-shot Voice Transfer (VT) module that can be seamlessly integrated into a multi-lingual Text-to-speech (TTS) system to transfer an individual's voice across languages. Our proposed VT module comprises a sp…

Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting Open

Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, et al. · 2024

Computer science Geography

The keyword spotting (KWS) problem requires large amounts of real speech training data to achieve high accuracy across diverse populations. Utilizing large amounts of text-to-speech (TTS) synthesized data can reduce the cost and time assoc…

Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model Open

Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, et al. · 2024

Computer science

This paper explores the use of TTS synthesized training data for the keyword spotting (KWS) task while minimizing development cost and time. Keyword spotting models require a huge amount of training data to be accurate, and obtaining such trai…

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data Open

Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, et al. · 2024

Computer science Philosophy

Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a framework for scaling a multilingual TTS model to 100+ languages using found data …

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings Open

Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, et al. · 2024

Computer science Philosophy

Automatic speech recognition (ASR) systems can suffer from poor recall for various reasons, such as noisy audio, lack of sufficient training data, etc. Previous work has shown that recall can be improved by retrieving rewrite candidates fr…

Understanding Shared Speech-Text Representations Open

Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew E. Rosenberg, et al. · 2023

Computer science Physics Political science

Recently, a number of approaches to train speech models by incorporating text into end-to-end models have been developed, with Maestro advancing state-of-the-art automatic speech recognition (ASR) and Speech Translation (ST) performance. …

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Open

Kyle Kastner, Aaron Courville · 2022

Computer science Mathematics Biology

This paper introduces R-MelNet, a two-part autoregressive architecture with a frontend based on the first tier of MelNet and a backend WaveRNN-style audio decoder for neural text-to-speech synthesis. Taking as input a mixed sequence of cha…

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling Open

Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, et al. · 2021

Computer science Engineering Economics

Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatena…

Deep Learning-Based Point-Scanning Super-Resolution Microscopy Open

Uri Manor, Linjing Fang, Fred Monroe, Sammy Weiser Novak, Lyndsey M. Kirk, et al. · 2021

Computer science Materials science Physics

Data for Point Scanning Super Resolution Imaging (PSSR) Open

Linjing Fang, Fred Monroe, Sammy Weiser Novak, Lyndsey M. Kirk, Cara R. Schiavon, et al. · 2021

Chemistry Physics Biology

This data release contains pretrained models, all training and testing data for the PSSR paper published in Nature Methods: Deep learning-based point-scanning super-resolution imaging (https://dx.doi.org/10.1038/s41592-021-01080-z) Data ar…

Deep Learning-Based Point-Scanning Super-Resolution Imaging Open

Linjing Fang, Fred Monroe, Sammy Weiser Novak, Lyndsey M. Kirk, Cara R. Schiavon, et al. · 2019

Computer science Physics

Point scanning imaging systems (e.g. scanning electron or laser scanning confocal microscopes) are perhaps the most widely used tools for high resolution cellular and tissue imaging. Like all other imaging modalities, the resolution, speed…

Representation Mixing for TTS Synthesis Open

Kyle Kastner, João Felipe Santos, Yoshua Bengio, Aaron Courville · 2019

Computer science Mathematics Physics

Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation. However, the choice between character or phoneme input can create serious limitations for practical d…

Planning in Dynamic Environments with Conditional Autoregressive Models Open

Johanna Hansen, Kyle Kastner, Aaron Courville, Gregory Dudek · 2018

Computer science Mathematics Engineering

We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b) for forward planning with MCTS. In order to test this method, we introduce a new …

Harmonic Recomposition using Conditional Autoregressive Modeling Open

Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville · 2018

Computer science Mathematics Engineering

We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al. (2017). Recomposition (Casal & Casey, 2010) focuses on reworking existing musical pieces, adhering to …

Blindfold Baselines for Embodied QA Open

Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville · 2018

Computer science Philosophy Engineering

We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently navigating in a simulated environment, gathering necessary visual information only …

Learning to discover sparse graphical models Open

Eugene Belilovsky, Kyle Kastner, Gaël Varoquaux, Matthew B. Blaschko · 2017

Computer science

Belilovsky E., Kastner K., Varoquaux G., Blaschko M., ''Learning to discover sparse graphical models'', 5th international conference on learning representations workshop track - ICLR 2017, 13 pp., April 24-26, 2017, Toulon, France.

Structured prediction and generative modeling using neural networks Open

Kyle Kastner · 2016

Computer science

This thesis deals with the use of neural networks for modeling sequential data. The way in which information is ordered and structured is crucial for most data. The words that make up this paragraph …

ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation Open

Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, et al. · 2016

Computer science Chemistry Economics

We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed archit…

Learning to Discover Graphical Model Structures Open

Eugene Belilovsky, Kyle Kastner, Gaël Varoquaux, Matthew B. Blaschko · 2016

Computer science Mathematics

We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures. In th…

Learning to Discover Sparse Graphical Models Open

Eugene Belilovsky, Kyle Kastner, Gaël Varoquaux, Matthew B. Blaschko · 2016

Computer science

We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures. Popul…

Learning to Discover Probabilistic Graphical Model Structures Open

Eugene Belilovsky, Kyle Kastner, Gaël Varoquaux, Matthew B. Blaschko · 2016

Computer science Mathematics

In this work we consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring formulating priors and sophisticated inference procedures.…

ReSeg: A Recurrent Neural Network for Object Segmentation Open

Francesco Visin, Kyle Kastner, Aaron Courville, Yoshua Bengio, Matteo Matteucci, et al. · 2015

Computer science Engineering Chemistry

We propose a structured prediction architecture for images centered around deep recurrent neural networks. The proposed network, called ReSeg, is based on the recently introduced ReNet model for object classification. We modify and extend …

theanets: v0.6.1 Open

Leif Johnson, talbaumel, Kyle Kastner, Yu Yang, С. В. Романов, et al. · 2015

Computer science

Version 0.6.1 of theanets is now live! pip install -U theanets http://pypi.python.org/pypi/theanets http://theanets.readthedocs.org http://github.com/lmjohns3/theanets The biggest change in this release series is a Network/Layer refactor t…

A Recurrent Latent Variable Model for Sequential Data Open

Jun‐Young Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron Courville, et al. · 2015

Computer science Physics

In this paper, we explore the inclusion of latent random variables into the dynamic hidden state of a recurrent neural network (RNN) by combining elements of the variational autoencoder. We argue that through the use of high-level latent r…

ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks Open

Francesco Visin, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, et al. · 2015

Computer science

In this paper, we propose a deep neural network architecture for object recognition based on recurrent neural networks. The proposed network, called ReNet, replaces the ubiquitous convolution+pooling layer of the deep convolutional neural …

theanets: Version 0.5.0 Open

Leif Johnson, talbaumel, Kyle Kastner, Filip Juricek, Eben Olson, et al. · 2015

Computer science

Version 0.5.0 of theanets is now live! pip install -U theanets http://pypi.python.org/pypi/theanets http://theanets.readthedocs.org http://github.com/lmjohns3/theanets Some great new features have been incorporated into this release, but t…

Kyle Kastner