Zoubin Ghahramani
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
We introduce RecurrentGemma, a family of open language models which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state…
Gemma: Open Models Based on Gemini Research and Technology
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language unde…
Plex: Towards Reliability using Pretrained Large Model Extensions
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore…
Pre-training helps Bayesian optimization too
Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge…
Neural Diffusion Processes
Neural network approaches for meta-learning distributions over functions have desirable properties such as increased flexibility and a reduced complexity of inference. Building on the successes of denoising diffusion models for generative …
Pre-trained Gaussian Processes for Bayesian Optimization
Bayesian optimization (BO) has become a popular strategy for global optimization of expensive real-world functions. Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge…
Automatic prior selection for meta Bayesian optimization with a case study on tuning deep neural network optimizers
The performance of deep neural networks can be highly sensitive to the choice of a variety of meta-parameters, such as optimizer parameters and model hyperparameters. Tuning these well, however, often requires extensive and costly experime…
Handling incomplete heterogeneous data using VAEs
Deep Neural Networks as Point Estimates for Deep Gaussian Processes
Neural networks and Gaussian processes are complementary in their strengths and weaknesses. Having a better understanding of their relationship comes with the promise to make each method benefit from the strengths of the other. In this wor…
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
Probabilistic circuits (PCs) are a promising avenue for probabilistic modeling, as they permit a wide range of exact and efficient inference routines. Recent "deep-learning-style" implementations of PCs strive for a better scalability, b…
Resource-Efficient Neural Networks for Embedded Systems
While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully c…
Random sum-product networks: A simple and effective approach to probabilistic deep learning
Sum-product networks (SPNs) are expressive probabilistic models with a rich set of exact and efficient inference routines. However, in order to guarantee exact inference, they require specific structural constraints, which complicate learn…
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to r…
One-Network Adversarial Fairness
There is currently a great expansion of the impact of machine learning algorithms on our lives, prompting the need for objectives other than pure performance, including fairness. Fairness here means that the outcome of an automated decisio…
Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received significant attention due to their attractive inference properties. While parameter learning in SPNs is well developed, structure learning leaves something to be…
Antithetic and Monte Carlo kernel estimators for partial rankings
Efficient and Robust Machine Learning for Real-World Systems
While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches require a carefully ch…
Probabilistic Meta-Representations Of Neural Networks
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in …
Probabilistic Deep Learning using Random Sum-Product Networks
The need for consistent treatment of uncertainty has recently triggered increased interest in probabilistic deep learning methods. However, most current approaches have severe limitations when it comes to inference, since many of these mod…
Functional programming for modular Bayesian inference
We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach are modular implementations of existing state-of-the-art inference algorithms…
Branch-recombinant Gaussian processes for analysis of perturbations in biological time series
Motivation: A common class of behaviour encountered in the biological sciences involves branching and recombination. During branching, a statistical process bifurcates, resulting in two or more potentially correlated processes that may under…
Variational Bayesian dropout: pitfalls and fixes
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretati…
Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not…
The Mirage of Action-Dependent Baselines in Reinforcement Learning
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the…
Variational Measure Preserving Flows
Statistical inference methods are fundamentally important in machine learning. Most state-of-the-art inference algorithms are variants of Markov chain Monte Carlo (MCMC) or variational inference (VI). However, both methods struggle with li…
Gaussian Process Behaviour in Wide Deep Neural Networks
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward net…