Chiron: Optimizing Fault Tolerance in QoS-aware Distributed Stream Processing Jobs

11 Feb 2021 · Morgan Geldenhuys, Lauritz Thamsen, Odej Kao

Fault tolerance needs deeper consideration for streaming jobs that require high availability and low-latency processing, and that must adhere to Quality-of-Service constraints even when failures occur. Typically, systems achieve fault tolerance and the ability to recover automatically from partial failures by implementing Checkpoint and Rollback Recovery...

Categories

  • Distributed, Parallel, and Cluster Computing