Zenodo (CERN European Organization for Nuclear Research)
AI-Driven Predictive Load Orchestration for Distributed LLM Inference
December 2025 • Revista, Zen, IA, 10
This paper presents a novel framework for AI-driven predictive load orchestration specifically tailored for distributed Large Language Model (LLM) inference. As LLMs scale in size and complexity, deploying them across distributed computing environments becomes essential for meeting high throughput and low latency requirements. Traditional load balancing techniques often struggle with the dynamic and heterogeneous computational demands of LLM inference, leading to suboptimal resource utilization and increased respo…
Orchestration
Computer Science
Security Token
Reinforcement Learning
Machine Learning
Artificial Intelligence
Big Data