Multi-Framework Reliability Approach
March 2021 • Bara Abusalah, Derek Schatzlein, Julian James Stephen, Masoud Saeida Ardekani, Patrick Eugster
Despite advances in making datacenters dependable, failures still happen. This is particularly onerous for long-running "big data" applications, where partial failures can lead to significant losses and lengthy recomputations. Big data processing frameworks like Hadoop MapReduce include fault tolerance (FT) mechanisms, but these are commonly targeted at specific system/failure models, and are often redundant between frameworks. This article proposes the paradigm of …