Efficient warp execution in presence of divergence with collaborative context collection
December 2015 • Farzad Khorasani, Rajiv Gupta, Laxmi N. Bhuyan
The GPU's SIMD architecture is a double-edged sword for parallel tasks with control-flow divergence. On the one hand, it provides a high-performance yet power-efficient platform to accelerate applications via massive parallelism; on the other hand, irregularities induce inefficiencies due to the warp's lockstep traversal of all diverging execution paths. In this work, we present a software (compiler) technique named Collaborative Context Collection (CCC) that increases the warp execution efficiency wh…
Computer Science
Parallel Computing
CUDA
Programming Language