Reinforcement Learning-Driven Hybrid Precopy/Postcopy VM Migration for Energy-Efficient Data Centers
· 2025
· Open Access
· DOI: https://doi.org/10.1109/access.2025.3613235
· OA: W4414404588
This study proposes a hybrid precopy/postcopy virtual machine (VM) migration framework that aids an autonomous agent in making migration decisions, continuously balancing migration time, downtime, and energy consumption. The data center state and the resource load (CPU, memory, and network) are represented in the agent’s state space using a two-layer graph neural network (GNN), and the asynchronous advantage actor–critic (A3C) algorithm dynamically determines whether to continue the precopy phase or switch to postcopy, optimizing this trade-off while adhering to the service-level agreement (SLA) constraints. An adaptive host selection policy ensures that VMs are migrated only to underloaded machines, preventing overload and preserving system stability. In a simulation evaluation using VM workloads from the GWA-Bitbrains dataset, the framework achieved a total migration time of 45.5 s, with 30.1 s spent on the precopy phase and 15.4 s spent on the postcopy phase, resulting in a downtime of 15.4 s. Compared with previous approaches, this result represents a 12.5% decrease in total migration time, from 52 s to 45.5 s; a 23% decrease in downtime, from 20 s to 15.4 s; and a 4.4 percentage point increase in energy efficiency, from 87% to 91.4%. SLA compliance remained stable at 92.8%, indicating that service quality was preserved. This study demonstrates the effectiveness of integrating GNN-based embeddings and A3C scheduling for reducing downtime and energy usage while maintaining reliable service delivery in data centers.
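The reported improvements can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the helper name `relative_reduction` and the variable names are illustrative assumptions, not part of the paper. It treats the migration time and downtime changes as relative reductions against the baseline, and the energy efficiency change as a difference in percentage points.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage reduction of `after` relative to the baseline `before`."""
    return (before - after) / before * 100.0


# Figures quoted in the abstract (seconds and percent).
precopy_s = 30.1
postcopy_s = 15.4
total_s = precopy_s + postcopy_s  # total migration time

# Baselines quoted for the previous approaches.
baseline_total_s = 52.0
baseline_downtime_s = 20.0
baseline_efficiency_pct = 87.0
efficiency_pct = 91.4

time_cut_pct = relative_reduction(baseline_total_s, total_s)
downtime_cut_pct = relative_reduction(baseline_downtime_s, postcopy_s)
# Energy efficiency is compared as a difference in percentage points.
efficiency_gain_pp = efficiency_pct - baseline_efficiency_pct

print(f"total migration time: {total_s:.1f} s")
print(f"migration time reduction: {time_cut_pct:.1f}%")
print(f"downtime reduction: {downtime_cut_pct:.1f}%")
print(f"energy efficiency gain: {efficiency_gain_pp:.1f} percentage points")
```

Under these definitions the precopy and postcopy durations sum to the stated total, and the three improvement figures match the abstract. The energy efficiency change is an absolute difference between the two percentages, not a relative one.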