arXiv (Cornell University)
STEP: A Distributed Multi-threading Framework Towards Efficient Data Analytics
December 2018 • Yijie Mei, Yanyan Shen, Yanmin Zhu, Linpeng Huang
Various general-purpose distributed systems have been proposed to cope with the highly diverse applications in the Big Data analytics pipeline. Most of them provide simple yet effective primitives to simplify distributed programming. While these rigid primitives are easy for savvy programmers to use, they may compromise performance and limit flexibility in data representation and programming specifications, which are critical properties in real systems. In this paper, we discuss the limitations of…