Zoubin Ghahramani
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
We introduce RecurrentGemma, a family of open language models which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state…
Gemma: Open Models Based on Gemini Research and Technology
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language unde…
Plex: Towards Reliability using Pretrained Large Model Extensions
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore…
Pre-training helps Bayesian optimization too
Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge…
Neural Diffusion Processes
Neural network approaches for meta-learning distributions over functions have desirable properties such as increased flexibility and a reduced complexity of inference. Building on the successes of denoising diffusion models for generative …
Pre-trained Gaussian Processes for Bayesian Optimization
Bayesian optimization (BO) has become a popular strategy for global optimization of expensive real-world functions. Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge…
Automatic prior selection for meta Bayesian optimization with a case study on tuning deep neural network optimizers
The performance of deep neural networks can be highly sensitive to the choice of a variety of meta-parameters, such as optimizer parameters and model hyperparameters. Tuning these well, however, often requires extensive and costly experime…
Handling incomplete heterogeneous data using VAEs
Deep Neural Networks as Point Estimates for Deep Gaussian Processes
Neural networks and Gaussian processes are complementary in their strengths and weaknesses. Having a better understanding of their relationship comes with the promise to make each method benefit from the strengths of the other. In this wor…
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
Probabilistic circuits (PCs) are a promising avenue for probabilistic modeling, as they permit a wide range of exact and efficient inference routines. Recent "deep-learning-style" implementations of PCs strive for a better scalability, b…
Resource-Efficient Neural Networks for Embedded Systems
While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully c…
Random sum-product networks: A simple and effective approach to probabilistic deep learning
Sum-product networks (SPNs) are expressive probabilistic models with a rich set of exact and efficient inference routines. However, in order to guarantee exact inference, they require specific structural constraints, which complicate learn…
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to r…
One-Network Adversarial Fairness
There is currently a great expansion of the impact of machine learning algorithms on our lives, prompting the need for objectives other than pure performance, including fairness. Fairness here means that the outcome of an automated decisio…
Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received significant attention due to their attractive inference properties. While parameter learning in SPNs is well developed, structure learning leaves something to be…
Antithetic and Monte Carlo kernel estimators for partial rankings
Efficient and Robust Machine Learning for Real-World Systems
While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches require a carefully ch…
Probabilistic Meta-Representations Of Neural Networks
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in …
Probabilistic Deep Learning using Random Sum-Product Networks
The need for consistent treatment of uncertainty has recently triggered increased interest in probabilistic deep learning methods. However, most current approaches have severe limitations when it comes to inference, since many of these mod…
Functional programming for modular Bayesian inference
We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach are modular implementations of existing state-of-the-art inference algorithms…
Branch-recombinant Gaussian processes for analysis of perturbations in biological time series
Motivation: A common class of behaviour encountered in the biological sciences involves branching and recombination. During branching, a statistical process bifurcates, resulting in two or more potentially correlated processes that may under…
Variational Bayesian dropout: pitfalls and fixes
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretati…
Discovering Interpretable Representations for Both Deep Generative and Discriminative Models
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not…
The Mirage of Action-Dependent Baselines in Reinforcement Learning
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the…
Variational Measure Preserving Flows
Statistical inference methods are fundamentally important in machine learning. Most state-of-the-art inference algorithms are variants of Markov chain Monte Carlo (MCMC) or variational inference (VI). However, both methods struggle with li…
Gaussian Process Behaviour in Wide Deep Neural Networks
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward net…