https://github.com/pingcap/awesome-database-learning
A list of learning materials to understand databases internals
Awesome Database Learning
A list of learning materials to understand databases internals, including but not limited to:
- papers
- blogs
- courses
- talks
Please submit a pull request if there is any material that you think should be included in this collection.
Table of Contents
Recommended Courses and Books
Courses
Books
SQL & Relation Algebra
Courses:
Query Optimizer
Blogs:
Planner Models
Blogs:
Papers:
- 1979, Access Path Selection in a Relational Database Management System, SIGMOD
- 1979, Query Processing in Main Memory Database Management Systems, VLDB
- 1987, Query Optimization by Simulated Annealing, SIGMOD
- 1988, Grammar-like Functional Rules for Representing Query Optimization Alternatives, SIGMOD
- 1993, The Volcano Optimizer Generator- Extensibility and Efficient Search, ICDE
- 1995, The Cascades Framework for Query Optimization, IEEE Data engineering Bulltin
- 1998, An Overview of Query Optimization in Relational Systems, PODS
- 2001, LEO – DB2’s LEarning Optimizer, VLDB
- 2004, Robust Query Processing through Progressive Optimization, SIGMOD
- 2014, Orca: A Modular Query Optimizer Architecture for Big Data, SIGMOD
- 2016, Parallelizing Query Optimization on Shared-Nothing Architectures, VLDB
- 2016, The MemSQL Query Optimizer: A modern optimizer for real-time analytics in a distributed database, VLDB
Subquery Optimization
Blogs:
Papers:
Join Order Optimization
Papers:
Functional Dependency & Physical Properties
Thesis:
Papers:
Cost Model
Papers:
Statistics
Papers:
Books:
Query Execution
Execution Framework
Papers:
Vectorization vs Compilization
Blogs:
Papers:
- 2005, MonetDB/X100: Hyper-Pipelining Query Execution, CIDR
- 2011, Efficiently Compiling Efficient Query Plans for Modern Hardware, VLDB
- 2017, Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last, VLDB
- 2018, Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask, VLDB
- 2018, Adaptive Execution of Compiled Queries, ICDE
Join
Papers:
Hash Table
Blogs:
DDL
Transaction
Isolation Levels
Blogs:
Papers:
Concurrency Control
Courses:
Papers:
Network
Courses:
Papers:
Storage
Buffer Management
Courses:
Papers:
- 1987, The 5 Minute Rule for Trading Memory for Disc Accesses and the 5 Byte Rule for Trading Memory for CPU Time, SIGMOD
- 2008, The Five Minute Rule 20 Years Later and How Flash Memory Changes the Rules, ACM Queue
- 2018, Managing Non-Volatile Memory in Database Systems, SIGMOD
- 2018, LeanStore: In-Memory Data Management Beyond Main Memory, ICDE
- 2020, Umbra: A Disk-Based System with In-Memory Performance, CIDR
Disk IO
Blogs:
Papers:
B-Tree
Courses:
LSM-Tree
Serializing & RPC
Data Partitioning
Replication & Consistency
Consensus
- University of Cambridge Distributed consensus revised, a great paper about Consenssus especially Paxos and Paxos-Related algorithms, by Heidi Howard
Scale & Balance
Blogs:
Benchmark & Testing