Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes

Colloquium | September 7 | 12:30-1:30 p.m. | Woz, 430 Soda Hall

 Sailesh Krishnamurthy, Amazon Web Services

 Electrical Engineering and Computer Sciences (EECS)

Amazon Aurora is a relational database service for OLTP workloads offered as part of Amazon Web Services (AWS). In this talk, we describe the architecture of Aurora and the design considerations leading to that architecture. We believe the central constraint in high-throughput data processing has moved from compute and storage to the network. Aurora brings a novel architecture to the relational database to address this constraint, most notably by pushing redo processing to a multi-tenant, scale-out storage service purpose-built for Aurora. We describe how doing so not only reduces network traffic, but also allows for fast crash recovery, failovers to replicas without loss of data, and fault-tolerant, self-healing storage. Traditional implementations that leverage distributed storage would use distributed consensus algorithms for commits, reads, replication, and membership changes, and would amplify the cost of the underlying storage. We will describe how Aurora avoids distributed consensus under most circumstances by establishing invariants and leveraging local transient state. These techniques improve performance, reduce variability, and lower costs.

 jmfaleiro@berkeley.edu