Yipeng Shen
KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows
Large language model (LLM) based agentic workflows have become a popular paradigm for coordinating multiple specialized agents to solve complex tasks. To improve serving efficiency, existing LLM systems employ prefix caching to reuse key-value (KV) cache…
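The mechanism the abstract refers to, reusing KV cache across requests that share a prompt prefix, can be illustrated with a token-level trie lookup. The sketch below is a generic, minimal illustration of prefix caching, not KVFlow's actual design; the names `PrefixCache`, `Node`, and `kv_block` are assumptions made for this example.

```python
# Minimal sketch of prefix caching for LLM serving (illustrative only,
# not KVFlow's implementation). KV-cache handles are stored in a trie
# keyed by token IDs; a new request reuses the longest cached prefix
# and only needs prefill computation for the unmatched suffix.

from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict[int, Node] = field(default_factory=dict)
    kv_block: object | None = None  # handle to cached KV tensors for this prefix


class PrefixCache:
    def __init__(self) -> None:
        self.root = Node()

    def insert(self, tokens: list[int], kv_block: object) -> None:
        """Register a KV-cache handle under the full token prefix."""
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, Node())
        node.kv_block = kv_block

    def longest_prefix(self, tokens: list[int]) -> tuple[int, object | None]:
        """Return (matched_length, kv_block) for the longest cached prefix."""
        node, best_len, best_kv = self.root, 0, None
        for i, t in enumerate(tokens):
            if t not in node.children:
                break
            node = node.children[t]
            if node.kv_block is not None:
                best_len, best_kv = i + 1, node.kv_block
        return best_len, best_kv
```

Serving engines with prefix caching (for example, SGLang's radix cache or vLLM's automatic prefix caching) use a comparable longest-prefix match so the scheduler skips prefill for the matched tokens; in a multi-agent workflow, agents that share a long system prompt benefit repeatedly from the same cached prefix.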