202508291335 Status: #idea Tags: #systems_design #software_engineering #databases #architecture #interviews # Systems Design in an Hour **Link:** https://www.youtube.com/watch?v=iYIjJ7utdDI # The Steps of a System Design Interview 1. Requirements - figure out what part of the problem they want to focus on (usually only a subset of most important features need to be built) 2. Back of Envelope Calculation - e.g., estimates of number of reads/writes, whether read path or write path needs to be optimized, etc. 3. API Design - give general API of all endpoints in the system 4. Architectural Design # Database Fundamentals - Database - system that durably stores application state. Data is typically stored on a hard disk so that it is persistent - Sequential reads & writes are always cheaper than random reads & writes, so we want to keep data that is frequently written/read together close together - Indexes allow us to speed up certain read queries, often at the expense of slowing down other write queries - Without an index, we'd just keep a log and write data to the end of it. Every read would then require a full scan over the entire database to find the row. ## Indexes - **Types of indexes:** - Hash Indexes - keep all keys in memory in a hashmap that correspond to their location on disk. - Pros: O(1) reads and O(1) writes - Cons: Keyset needs to fit in memory & poor support for range queries (since data on disk is not ordered by key) ![[Pasted image 20241002162516.png]] - B+ Trees - organize a tree of keys on disk. Follow the tree structure to get to your key and its corresponding data on disk - O(log n) reads and writes since we need to traverse the tree structure - Pros: Data from adjacent keys is stored adjacently on disk allowing for easier range queries - Cons: Slower single key reads than the hash index ![[Pasted image 20241002162500.png]] - LSM Trees + SSTables - writes go to an in-memory binary search tree, which is flushed to immutable sorted tables on disk when it gets too big. Reads for a key are first performed from the tree, and if not present we check each SSTable file from newest to oldest - Binary search trees are inherently sorted, and so are SSTables, so this index also supports range queries - Writes and reads: O(log n) - Pros: Fast writes to memory - Cons: Fairly slow reads due to having to check multiple different places for the value for a given key ![[Pasted image 20241002162537.png]] ## Transactions: - ACID means that writes are atomic and serializable - Atomic - put all writes in a write ahead log before writing them to their actual location on disk, mark them as committed in the write ahead log when the write is completed (so if a write is uncommitted and the database fails, we can replay the log entry) - Serializable - it appears to the writer as if all transactions ran on a single thread, no need to worry about race conditions - Most SQL databases support ACID transaction. Many NoSQL databases do not because not all workloads need them and they make reads & writes slower. - Row vs Column Storage: - We want data that we read together frequently in the same location on disk - Row-oriented storage is nice when we are consistently writing new data on a per-record basis (e.g. Facebook user profile information, where we want to retrieve all features, or columns, associated with a user row) - Column-oriented storage works well for analytical tables, where transformations need to be applied to all (or a large subset of) the data for a particular feature - Serialization - how we turn our data into bytes to be stored on disk in our databases and sent over the network - JSON schemas are flexible, but their flexibility requires specifying every field name in every message - A predefined schema can save space/maximize throughput by omitting field names in place of a more efficient representation ## Replication - Need multiple copies of data to mitigate database failures - Multiple copies of data also allows us to increase read/write throughput - Replication can allow us to: - Withstand hardware failures - Improve our database performance - Replicas send data to one another via a replication log, which is similar to a write ahead log - Synchronous vs Asynchronous Replication - Synchronous Replication - on a write, all replicas must process the data before the write is considered committed - Pros: Any read to a replica will return up-to-date data - Cons: If a single replica is down, we cannot commit any write - Asynchronous Replication (Eventual Consistency) - on a write, only a subset of replicas must process the data before the write is considered committed. Other replicas will be backfilled in the background - Pros: Writes can tolerate some node failures. Writes are faster - Cons: Reading from certain nodes can return stale data - Types of replication - Single leader - all writes go to one "leader" database, which are asynchronously replicated to follower databases. Reads can come from leader or follower - Pros: Simple to implement - Cons: - Write throughput limited to the leader - If the leader fails we need to pick a new leader, which may result in picking a node that is not completely up to date - During failover we cannot complete any writes - Multi-leader - writes for a given key can go to multiple leader databases - Pros: Increased write throughput, especially across multiple data centers - Cons: Write conflicts when the leaders synchronize - How to deal with write conflicts? - Last Write Wins - use the timestamp of each write to decide which one overwrites the other - Pros: very simple to implement, deterministic, works the majority of the time - Cons: - Clocks on different computers are not perfectly synchronized due to clock drift - Concurrent writes will all be deleted except for one with the latest timestamp - Storing Siblings - leaders can identify when they have conflicting concurrent writes and store both of them. Later, a user can manually merge them together back to one copy. ## Related Ideas - [[Think in terms of limits to expose trends or bottlenecks in a system]] - [[Checklists as a simple approach for clipping the left side of a distribution]] - [[The highest order bit in information]] - [[Complexity in software development and design]] - [[A philosophy of software design]] --- # References