Stephen Pasteris
Guidelines for Applying RL and MARL in Cybersecurity Applications
Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL) have emerged as promising methodologies for addressing challenges in automated cyber defence (ACD). These techniques offer adaptive decision-making capabilities in h…
Fairness with Exponential Weights
Motivated by the need to remove discrimination in certain applications, we develop a meta-algorithm that can convert any efficient implementation of an instance of Hedge (or equivalently, an algorithm for discrete Bayesian inference) into …
Online Convex Optimisation: The Optimal Switching Regret for all Segmentations Simultaneously
We consider the classic problem of online convex optimisation. Whereas the notion of static regret is relevant for stationary problems, the notion of switching regret is more appropriate for non-stationary problems. A switching regret is d…
Extraction Propagation
Running backpropagation end to end on large neural networks is fraught with difficulties like vanishing gradients and degradation. In this paper we present an alternative architecture composed of many small neural networks that interact wi…
Bandits with Abstention under Expert Advice
We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner's abstention from play, has no reward or loss on every trial. We propose the CBA algorithm…
A Hierarchical Nearest Neighbour Approach to Contextual Bandits
In this paper we consider the adversarial contextual bandit problem in metric spaces. The paper "Nearest neighbour with bandit feedback" tackled this problem but when there are many contexts near the decision boundary of the comparator pol…
Sum-max Submodular Bandits
Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including b…
Nearest Neighbour with Bandit Feedback
In this paper we adapt the nearest neighbour rule to the contextual bandit problem. Our algorithm handles the fully adversarial setting in which no assumptions at all are made about the data-generation process. When combined with a suffici…
Adversarial Online Collaborative Filtering
We investigate the problem of online collaborative filtering under no-repetition constraints, whereby users need to be served content in an online fashion and a given user cannot be recommended the same content item more than once. We star…
Joint Coreset Construction and Quantization for Distributed Machine Learning
Coresets are small, weighted summaries of larger datasets, aiming at providing provable error bounds for machine learning (ML) tasks while significantly reducing the communication and computation costs. To achieve a better trade-off betwee…
Communication-efficient k-Means for Edge-based Machine Learning
We consider the problem of computing the k-means centers for a large high-dimensional dataset in the context of edge-based machine learning, where data sources offload machine learning computation to nearby edge servers. k-Means computatio…
Online Multitask Learning with Long-Term Memory
We introduce a novel online multitask setting. In this setting each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms…
Online Learning of Facility Locations
In this paper, we provide a rigorous theoretical investigation of an online learning version of the Facility Location problem which is motivated by emerging problems in real-world applications. In our formulation, we are given a set of sit…
Online Matrix Completion with Side Information
We give an online algorithm and prove novel mistake and regret bounds for online binary matrix completion with side information. The mistake bounds we prove are of the form $\tilde{O}(D/\gamma^2)$. The term $1/\gamma^2$ is analogous to the usual mar…
Service Placement with Provable Guarantees in Heterogeneous Edge Computing Systems
Mobile edge computing (MEC) is a promising technique for providing low-latency access to services at the network edge. The services are hosted at various types of edge nodes with both computation and communication capabilities. Due to the …
Multicast-Based Weight Inference in General Network Topologies
Network topology plays an important role in many network operations. However, it is very difficult to obtain the topology of public networks due to the lack of internal cooperation. Network tomography provides a powerful solution t…
MaxHedge: Maximising a Maximum Online with Theoretical Performance Guarantees.
We introduce a new online learning framework where, at each trial, the learner is required to select a subset of actions from a given known action set. Each action is associated with an energy value, a reward and a cost. The sum of the ene…
MaxHedge: Maximising a Maximum Online
We introduce a new online learning framework where, at each trial, the learner is required to select a subset of actions from a given known action set. Each action is associated with an energy value, a reward and a cost. The sum of the ene…
On Similarity Prediction and Pairwise Clustering
We consider the problem of clustering a finite set of items from pairwise similarity information. Unlike what is done in the literature on this subject, we do so in a passive learning setting, and with no specific constraints on the cluste…
On Pairwise Clustering with Side Information
Pairwise clustering, in general, partitions a set of items via a known similarity function. In our treatment, clustering is modeled as a transductive prediction problem. Thus rather than beginning with a known similarity function, the func…