Kyle Kastner
Adaptive Accompaniment with ReaLchords
Jamming requires coordination, anticipation, and collaborative creativity between musicians. Current generative models of music produce expressive output but are not able to generate in an \emph{online} manner, meaning simultaneously with …
Zero-shot Cross-lingual Voice Transfer for TTS
In this paper, we introduce a zero-shot Voice Transfer (VT) module that can be seamlessly integrated into a multi-lingual Text-to-speech (TTS) system to transfer an individual's voice across languages. Our proposed VT module comprises a sp…
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting
The keyword spotting (KWS) problem requires large amounts of real speech training data to achieve high accuracy across diverse populations. Utilizing large amounts of text-to-speech (TTS) synthesized data can reduce the cost and time assoc…
Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model
This paper explores the use of TTS synthesized training data for KWS (keyword spotting) task while minimizing development cost and time. Keyword spotting models require a huge amount of training data to be accurate, and obtaining such trai…
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a framework for scaling a multilingual TTS model to 100+ languages using found data …
High-precision Voice Search Query Correction via Retrievable Speech-text Embedings
Automatic speech recognition (ASR) systems can suffer from poor recall for various reasons, such as noisy audio, lack of sufficient training data, etc. Previous work has shown that recall can be improved by retrieving rewrite candidates fr…
Understanding Shared Speech-Text Representations
Recently, a number of approaches to train speech models by incorporating text into end-to-end models have been developed, with Maestro advancing state-of-the-art automatic speech recognition (ASR) and Speech Translation (ST) performance. …
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
This paper introduces R-MelNet, a two-part autoregressive architecture with a frontend based on the first tier of MelNet and a backend WaveRNN-style audio decoder for neural text-to-speech synthesis. Taking as input a mixed sequence of cha…
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatena…
Deep Learning-Based Point-Scanning Super-Resolution Microscopy
Data for Point Scanning Super Resolution Imaging (PSSR)
This data release contains pretrained models, all training and testing data for the PSSR paper published in Nature Methods: Deep learning-based point-scanning super-resolution imaging (https://dx.doi.org/10.1038/s41592-021-01080-z) Data ar…
Deep Learning-Based Point-Scanning Super-Resolution Imaging
Point scanning imaging systems (e.g. scanning electron or laser scanning confocal microscopes) are perhaps the most widely used tools for high resolution cellular and tissue imaging. Like all other imaging modalities, the resolution, speed…
Representation Mixing for TTS Synthesis
Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation. However, the choice between character or phoneme input can create serious limitations for practical d…
Planning in Dynamic Environments with Conditional Autoregressive Models
We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b) for forward planning with MCTS. In order to test this method, we introduce a new …
Harmonic Recomposition using Conditional Autoregressive Modeling
We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al. (2017). Recomposition (Casal & Casey, 2010) focuses on reworking existing musical pieces, adhering to …
Blindfold Baselines for Embodied QA
We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently navigating in a simulated environment, gathering necessary visual information only …
Learning to discover sparse graphical models
Belilovsky E., Kastner K., Varoquaux G., Blaschko M., "Learning to discover sparse graphical models", 5th International Conference on Learning Representations (ICLR 2017), workshop track, 13 pp., April 24-26, 2017, Toulon, France.
Structured prediction and generative modeling using neural networks
This thesis deals with the use of neural networks for modeling sequential data. The way in which information has been ordered and structured is crucial for most data. The words that make up this paragraph …
ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation
We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNN) to retrieve distant dependencies. The proposed archit…
Learning to Discover Graphical Model Structures
We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures. In th…
Learning to Discover Sparse Graphical Models
We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures. Popul…
Learning to Discover Probabilistic Graphical Model Structures
In this work we consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring formulating priors and sophisticated inference procedures.…
ReSeg: A Recurrent Neural Network for Object Segmentation
We propose a structured prediction architecture for images centered around deep recurrent neural networks. The proposed network, called ReSeg, is based on the recently introduced ReNet model for object classification. We modify and extend …
theanets: v0.6.1
Version 0.6.1 of theanets is now live! pip install -U theanets http://pypi.python.org/pypi/theanets http://theanets.readthedocs.org http://github.com/lmjohns3/theanets The biggest change in this release series is a Network/Layer refactor t…
A Recurrent Latent Variable Model for Sequential Data
In this paper, we explore the inclusion of latent random variables into the dynamic hidden state of a recurrent neural network (RNN) by combining elements of the variational autoencoder. We argue that through the use of high-level latent r…
ReNet: A Recurrent Neural Network Based Alternative to Convolutional Networks
In this paper, we propose a deep neural network architecture for object recognition based on recurrent neural networks. The proposed network, called ReNet, replaces the ubiquitous convolution+pooling layer of the deep convolutional neural …
theanets: Version 0.5.0
Version 0.5.0 of theanets is now live! pip install -U theanets http://pypi.python.org/pypi/theanets http://theanets.readthedocs.org http://github.com/lmjohns3/theanets Some great new features have been incorporated into this release, but t…