Offline Reinforcement Learning for Bandwidth Estimation in RTC Using a Fast Actor and Not-So-Furious Critic
· 2024
· Open Access
· DOI: https://doi.org/10.1145/3625468.3652184
· OA: W4394881813
The increasing demand for real-time communication (RTC) applications necessitates robust and reliable systems. Seamless media delivery depends on an accurate assessment of network conditions, with bandwidth estimation (BWE) being crucial for maintaining system reliability and achieving a good quality of experience (QoE) for users. BWE poses a significant challenge due to dynamic network conditions, limited information availability, and tight computational budgets. The Second Bandwidth Estimation Challenge, organized within ACM MMSys 2024, aims to enhance RTC user QoE by developing a deep learning-based bandwidth estimator using offline reinforcement learning. This paper presents our solution, which ranked second in the grand challenge. The solution employs an actor-critic approach to achieve accurate real-time BWE, relying solely on observed network statistics. Because of the offline setting of the challenge, the critic network is trained separately from the actor network to estimate action quality without interacting with a real environment. Furthermore, the critic's quality prediction is scaled by a predefined conservation factor to counteract overshooting of the bandwidth estimates. The solution's source code is publicly available at https://github.com/streaminguniversity/FARC.
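A rough sketch of how an actor can be trained against a separately trained, conservatively down-weighted critic is shown below, assuming PyTorch. All names (Actor, Critic, CONSERVATION_FACTOR, OBS_DIM) are illustrative and are not taken from the FARC repository; the behavior-cloning regularizer is one plausible way a conservation factor could temper the critic, in the style of TD3+BC, and may differ from the authors' actual formulation.

```python
# Minimal sketch of the described setup, assuming PyTorch. All names here are
# illustrative and do not come from the FARC codebase; the exact objective in
# the paper may differ.
import torch
import torch.nn as nn

OBS_DIM = 150               # flattened window of observed network statistics (assumed size)
CONSERVATION_FACTOR = 0.9   # predefined factor that down-weights the critic's score

class Actor(nn.Module):
    """Maps observed network statistics to a normalized bandwidth estimate."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid(),   # bandwidth scaled to [0, 1]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class Critic(nn.Module):
    """Scores (observation, action) pairs; trained separately on offline logs."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, action], dim=-1))

actor, critic = Actor(OBS_DIM), Critic(OBS_DIM)
critic.requires_grad_(False)  # freeze the separately trained critic;
                              # gradients still flow through the action input
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

def actor_step(obs: torch.Tensor, logged_action: torch.Tensor) -> float:
    """One actor update against the frozen, pre-trained critic.

    Scaling the critic score by CONSERVATION_FACTOR relative to the
    behavior-cloning term keeps the policy close to the logged bandwidth
    decisions, discouraging overshooting. This trade-off is an assumption,
    not necessarily the authors' method.
    """
    action = actor(obs)
    q = critic(obs, action)                         # critic's quality prediction
    bc = ((action - logged_action) ** 2).mean()     # stay near the offline data
    loss = -(CONSERVATION_FACTOR * q).mean() + bc
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In such a setup only the lightweight actor needs to run in the real-time path at inference, producing one bandwidth estimate per step from the observed statistics, while the critic and the conservative adjustment are used purely at training time.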