Chenshan Ren
FourierCompress: Layer-Aware Spectral Activation Compression for Efficient and Accurate Collaborative LLM Inference
Collaborative large language model (LLM) inference enables real-time, privacy-preserving AI services on resource-constrained edge devices by partitioning computational workloads between client devices and edge servers. However, this paradi…
Objective-Driven Differentiable Optimization of Traffic Prediction and Resource Allocation for Split AI Inference Edge Networks
Split AI inference partitions an artificial intelligence (AI) model into multiple parts, enabling the offloading of computation-intensive AI services. Resource allocation is critical for the performance of split AI inference. The challenge…
Online-Learning-Based Predictive Optimization of Uplink Scheduling for Industrial Internet-of-Things
The industrial Internet of Things (IIoT) operates in dynamic environments where wireless channels are subject to rapid changes, posing significant challenges for reliable data transmission. This paper introduces a novel online learning app…
Mobile Edge Computing for Future Internet-of-Things