202508291335
Status: #idea
Tags: #systems_design #software_engineering #databases #architecture #interviews
# Systems Design in an Hour
**Link:** https://www.youtube.com/watch?v=iYIjJ7utdDI
# The Steps of a System Design Interview
1. Requirements - figure out what part of the problem they want to focus on (usually only a subset of most important features need to be built)
2. Back of Envelope Calculation - e.g., estimates of number of reads/writes, whether read path or write path needs to be optimized, etc.
3. API Design - give general API of all endpoints in the system
4. Architectural Design
# Database Fundamentals
- Database - system that durably stores application state. Data is typically stored on a hard disk so that it is persistent
- Sequential reads & writes are always cheaper than random reads & writes, so we want to keep data that is frequently written/read together close together
- Indexes allow us to speed up certain read queries, often at the expense of slowing down other write queries
- Without an index, we'd just keep a log and write data to the end of it. Every read would then require a full scan over the entire database to find the row.
## Indexes
- **Types of indexes:**
- Hash Indexes - keep all keys in memory in a hashmap that correspond to their location on disk.
- Pros: O(1) reads and O(1) writes
- Cons: Keyset needs to fit in memory & poor support for range queries (since data on disk is not ordered by key)
![[Pasted image 20241002162516.png]]
- B+ Trees - organize a tree of keys on disk. Follow the tree structure to get to your key and its corresponding data on disk
- O(log n) reads and writes since we need to traverse the tree structure
- Pros: Data from adjacent keys is stored adjacently on disk allowing for easier range queries
- Cons: Slower single key reads than the hash index
![[Pasted image 20241002162500.png]]
- LSM Trees + SSTables - writes go to an in-memory binary search tree, which is flushed to immutable sorted tables on disk when it gets too big. Reads for a key are first performed from the tree, and if not present we check each SSTable file from newest to oldest
- Binary search trees are inherently sorted, and so are SSTables, so this index also supports range queries
- Writes and reads: O(log n)
- Pros: Fast writes to memory
- Cons: Fairly slow reads due to having to check multiple different places for the value for a given key
![[Pasted image 20241002162537.png]]
## Transactions:
- ACID means that writes are atomic and serializable
- Atomic - put all writes in a write ahead log before writing them to their actual location on disk, mark them as committed in the write ahead log when the write is completed (so if a write is uncommitted and the database fails, we can replay the log entry)
- Serializable - it appears to the writer as if all transactions ran on a single thread, no need to worry about race conditions
- Most SQL databases support ACID transaction. Many NoSQL databases do not because not all workloads need them and they make reads & writes slower.
- Row vs Column Storage:
- We want data that we read together frequently in the same location on disk
- Row-oriented storage is nice when we are consistently writing new data on a per-record basis (e.g. Facebook user profile information, where we want to retrieve all features, or columns, associated with a user row)
- Column-oriented storage works well for analytical tables, where transformations need to be applied to all (or a large subset of) the data for a particular feature
- Serialization - how we turn our data into bytes to be stored on disk in our databases and sent over the network
- JSON schemas are flexible, but their flexibility requires specifying every field name in every message
- A predefined schema can save space/maximize throughput by omitting field names in place of a more efficient representation
## Replication
- Need multiple copies of data to mitigate database failures
- Multiple copies of data also allows us to increase read/write throughput
- Replication can allow us to:
- Withstand hardware failures
- Improve our database performance
- Replicas send data to one another via a replication log, which is similar to a write ahead log
- Synchronous vs Asynchronous Replication
- Synchronous Replication - on a write, all replicas must process the data before the write is considered committed
- Pros: Any read to a replica will return up-to-date data
- Cons: If a single replica is down, we cannot commit any write
- Asynchronous Replication (Eventual Consistency) - on a write, only a subset of replicas must process the data before the write is considered committed. Other replicas will be backfilled in the background
- Pros: Writes can tolerate some node failures. Writes are faster
- Cons: Reading from certain nodes can return stale data
- Types of replication
- Single leader - all writes go to one "leader" database, which are asynchronously replicated to follower databases. Reads can come from leader or follower
- Pros: Simple to implement
- Cons:
- Write throughput limited to the leader
- If the leader fails we need to pick a new leader, which may result in picking a node that is not completely up to date
- During failover we cannot complete any writes
- Multi-leader - writes for a given key can go to multiple leader databases
- Pros: Increased write throughput, especially across multiple data centers
- Cons: Write conflicts when the leaders synchronize
- How to deal with write conflicts?
- Last Write Wins - use the timestamp of each write to decide which one overwrites the other
- Pros: very simple to implement, deterministic, works the majority of the time
- Cons:
- Clocks on different computers are not perfectly synchronized due to clock drift
- Concurrent writes will all be deleted except for one with the latest timestamp
- Storing Siblings - leaders can identify when they have conflicting concurrent writes and store both of them. Later, a user can manually merge them together back to one copy.
## Related Ideas
- [[Think in terms of limits to expose trends or bottlenecks in a system]]
- [[Checklists as a simple approach for clipping the left side of a distribution]]
- [[The highest order bit in information]]
- [[Complexity in software development and design]]
- [[A philosophy of software design]]
---
# References