S. T. Lai

AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding

Zikun Li, Zhuofu Chen, Remi Delacourt, G. Oliaro, Zeyu Wang, et al. · 2025

Modern large language model (LLM) applications exhibit diverse service-level objectives (SLOs), from low-latency requirements in interactive coding assistants to more relaxed constraints in data wrangling tasks. Existing LLM serving system…