Abstract: In this talk, I'll take a deep dive into Amazon Aurora, a new cloud-native database service for OLTP workloads that is designed from the ground up to exploit resource abundance in the cloud, as opposed to traditional designs that optimize for resource scarcity. Aurora offers a novel architecture in the 30-year-old relational database space, moving the monolithic database stack to a service-oriented architecture, starting with pushing the lowest layers of the database into a distributed, multi-tenant, log-structured storage service. It has been under development for four years, with numerous innovations throughout: low-latency read replicas, instant crash recovery, in-place rewind, copy-on-write cloning, zero-downtime patching, etc. I will present an under-the-hood view of some of these key innovations, review performance results and how we achieve them, and share real-life application use cases.
Biography: Debanjan Saha is the General Manager of Amazon Aurora, a massively scalable relational database service re-imagined for the cloud. Amazon Aurora, launched in July 2015, is the fastest-growing service in the history of AWS. Prior to joining Amazon, Debanjan held multiple executive and technical leadership positions at IBM, where he led the development of a virtualizing storage controller and created a $1B/year business. Earlier in his career he was with Tellium, an optical networking pioneer that he helped grow from an early-stage start-up to a public company.
An acclaimed innovator and author, Debanjan has co-authored 50+ US patent applications and 100+ technical articles, including award-winning papers and major Internet standards. Debanjan is a Fellow of the IEEE and was the recipient of the IEEE William Bennett award in 2003 and the IEEE Frederic W. Ellersick award in 2004. He holds MS and PhD degrees from the University of Maryland, and a B.Tech from IIT, all in Computer Science.
Abstract: Datacenter hardware is currently evolving at a rate that we, as software developers, are completely unfamiliar with. These changes, which span rack-scale form factors, emerging I/O technologies, and the deployment of specialized processors, are resulting in performance-dense datacenter environments that look absolutely nothing like the ones that developers were writing code for even five years ago.
In this talk, I'll give a quick overview of a few interesting hardware trends and their consequences for software. I will summarize my own experiences in developing a commercial rack-scale software-based storage system as an example of a subset of this space. Finally, I'll step back and point to (and probably also rant about) some of the assumptions implicit in the software stacks we use today that become a burdensome liability when we try to distill value from the rapid innovation happening in datacenter hardware.
Biography: Andrew Warfield is an associate professor in the computer science department at the University of British Columbia. He is a Sloan Research Fellow, and explores research topics broadly related to software systems including cloud computing, storage, networking, security, and big data. Andrew is the CTO at Coho Data, a Vancouver-based enterprise storage company that provides a scalable data platform based on high-performance nonvolatile memories. Dr. Warfield completed his PhD at the University of Cambridge, and has held technical positions at AT&T Research, Intel Research, XenSource, and Citrix.
Abstract: Concurrency control is still a key challenge in high-performance transaction processing systems. I will present Centiman, a system for high-performance and elastic transaction processing using optimistic concurrency control (OCC). Centiman provides serializability on top of key-value stores using a lightweight OCC-based protocol. Centiman has a loosely coupled distributed architecture, and avoids synchronization wherever possible. I will also discuss how we can use batching to reduce transaction conflicts in scenarios with high data contention. I will then describe how we can use two emerging technologies, Software-Defined Flash and the Precision Time Protocol, to further improve performance for OCC-based transaction processing systems, and I will conclude with open research challenges in this area.
Biography: Johannes Gehrke is a Technical Fellow at Microsoft in the Office Product Group. Johannes' research interests are in the areas of database systems, data science, and distributed systems. Johannes has received an NSF CAREER Award, an Alfred P. Sloan Fellowship, a Humboldt Research Award, the 2011 IEEE Computer Society Technical Achievement Award, and the 2011 Blavatnik Award for Young Scientists from the New York Academy of Sciences, and he is an ACM Fellow. He co-authored the undergraduate textbook Database Management Systems (McGraw-Hill, 2002; currently in its third edition), used at universities all over the world. Johannes was Program co-Chair of KDD 2004, VLDB 2007, ICDE 2012, SOCC 2014, and ICDE 2015.