Ronald Ortner
On cubic vertex-transitive graphs of given girth
A set of vertices of a graph is distinguishing if the only automorphism that preserves it is the identity. The minimal size of such sets, if they exist, is the distinguishing cost. The distinguishing costs of vertex-transitive cubic graphs…
A Note on the Bias and Kemeny's Constant in Markov Reward Processes with an Application to Markov Chain Perturbation
Given a unichain Markov reward process (MRP), we provide an explicit expression for the bias values in terms of mean first passage times. This result implies a generalization of known Markov chain perturbation bounds for the stationary dis…
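As context for the quantities involved: Kemeny's constant $K_i = \sum_j \pi_j m_{ij}$ (with $m_{ii} = 0$), built from the stationary distribution $\pi$ and the mean first passage times $m_{ij}$, is famously independent of the starting state $i$. A minimal pure-Python check on a two-state chain (the parameters a, b are arbitrary; this illustrates the classical identity, not the paper's result):

```python
# Two-state chain: P = [[1-a, a], [b, 1-b]] (a, b chosen arbitrarily).
a, b = 0.3, 0.5

# Stationary distribution and mean first passage times, in closed form
# for the two-state case; m[i][i] = 0 by convention.
pi = [b / (a + b), a / (a + b)]
m = [[0.0, 1.0 / a],
     [1.0 / b, 0.0]]

# Kemeny's constant K_i = sum_j pi_j * m[i][j]: independent of i.
K = [sum(pi[j] * m[i][j] for j in range(2)) for i in range(2)]
print(K)  # both entries approximately equal 1/(a+b) = 1.25
```

Both entries agree (up to floating-point rounding), as Kemeny's classical result predicts for any ergodic chain.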
Autonomous Exploration for Navigating in MDPs Using Blackbox RL Algorithms
We consider the problem of navigating in a Markov decision process where extrinsic rewards are either absent or ignored. In this setting, the objective is to learn policies to reach all the states that are reachable within a given number o…
When is Cartesian product a Cayley graph?
A graph is said to be a Cayley graph if its automorphism group admits a regular subgroup. Automorphisms of the Cartesian product of graphs are well understood, and it is known that the Cartesian product of Cayley graphs is a Cayley graph. …
Adaptive Algorithms for Meta-Induction
Work in online learning has traditionally considered induction-friendly (e.g., stochastic with a fixed distribution) and induction-hostile (adversarial) settings separately. While algorithms like Exp3 that have been developed for the adversaria…
Predicting Packaging Sizes Using Machine Learning
The increasing rate of e-commerce orders necessitates a faster packaging process, challenging warehouse employees to correctly choose the size of the package needed to pack each order. To speed up the packing process in the Austrian e-comm…
Transfer in Reinforcement Learning via Regret Bounds for Learning Agents
We present an approach for the quantification of the usefulness of transfer in reinforcement learning via regret bounds for a multi-agent setting. Considering a number of $\aleph$ agents operating in the same Markov decision process, howev…
A new heuristic and an exact approach for a production planning problem
We deal with a complex and computationally hard scheduling problem. Two types of products are processed by a heterogeneous resource set, where resources have different operating capabilities and setup times must be taken into account. The processing of the produ…
Regret Bounds for Reinforcement Learning via Markov Chain Concentration
We give a simple optimistic algorithm for which it is easy to derive regret bounds of $O(\sqrt{t_{\mathrm{mix}} SAT})$ after $T$ steps in uniformly ergodic Markov decision processes with $S$ states, $A$ actions, and mixing time parameter $t_{\mathrm{mix}}$. These bounds are the f…
Regret Bounds for Learning State Representations in Reinforcement Learning
We consider the problem of online reinforcement learning when several state representations (mapping histories to a discrete state space) are available to the learning agent. At least one of these representations is assumed to induce a Mar…
Autonomous exploration for navigating in non-stationary CMPs
We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change. For this setting, we propose a performance measure called exploration steps which…
Active Roll-outs in MDP with Irreversible Dynamics
In Reinforcement Learning (RL), regret guarantees scaling with the square root of the time horizon have been shown to hold only for communicating Markov decision processes (MDPs) where any two states are connected. This essentially means t…
Variational Regret Bounds for Reinforcement Learning
We consider undiscounted reinforcement learning in Markov decision processes (MDPs) where both the reward functions and the state-transition probabilities may vary (gradually or abruptly) over time. For this problem setting, we propose an …
A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions
We consider reinforcement learning in changing Markov Decision Processes where both the state-transition probabilities and the reward functions may vary over time. For this problem setting, we propose an algorithm using a sliding window ap…
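The sliding-window idea amounts to forming empirical estimates from only the most recent W observations per state-action pair, so that stale transitions from before a change are forgotten. A minimal sketch (the class name, window size, and data below are illustrative, not the paper's algorithm):

```python
from collections import deque, Counter

class SlidingWindowModel:
    """Empirical next-state frequencies from the last `window` observed
    transitions of each state-action pair (an illustrative sketch)."""
    def __init__(self, window):
        self.window = window
        self.buffers = {}  # (state, action) -> deque of observed next states

    def observe(self, s, a, s_next):
        # deque(maxlen=...) silently drops the oldest sample when full.
        buf = self.buffers.setdefault((s, a), deque(maxlen=self.window))
        buf.append(s_next)

    def estimate(self, s, a):
        # Empirical distribution over next states within the window.
        buf = self.buffers.get((s, a), ())
        n = len(buf)
        return {s2: c / n for s2, c in Counter(buf).items()} if n else {}

model = SlidingWindowModel(window=3)
for s_next in [0, 0, 1, 1, 1]:   # the two old samples fall out of the window
    model.observe(0, "a", s_next)
print(model.estimate(0, "a"))    # {1: 1.0} -- only the last 3 samples count
```

The forgetting is what lets the estimates track a transition kernel that changes over time, at the price of higher variance from the smaller effective sample size.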
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
We introduce SCAL, an algorithm designed to perform efficient exploration-exploitation in any unknown weakly-communicating Markov decision process (MDP) for which an upper bound $c$ on the span of the optimal bias function is known. For an…
Pareto Front Identification from Stochastic Bandit Feedback
We consider the problem of identifying the Pareto front for multiple objectives from a finite set of operating points. Sampling an operating point gives a random vector where each coordinate corresponds to the value of one of the objective…
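The underlying notion of Pareto dominance can be sketched in a few lines. Below is just the deterministic front computation on known mean vectors (the example values are hypothetical), not the bandit algorithm itself, which must handle the sampling noise:

```python
def dominates(u, v):
    """True if u is at least as good as v in every objective (higher is
    better) and strictly better in at least one."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))

def pareto_front(points):
    """Indices of the points not dominated by any other point."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

# Hypothetical mean vectors for four operating points, two objectives each.
means = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8), (0.4, 0.4)]
print(pareto_front(means))  # [0, 1, 2]: (0.4, 0.4) is dominated by (0.5, 0.5)
```

With bandit feedback the means are only observed through noisy samples, which is what makes identifying this front a statistical problem rather than the simple scan above.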
Improved learning complexity in combinatorial pure exploration bandits
We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem. We first construct a new measure of complexity that provably characterizes the learning performance of the algorithms we propose for the f…