1 — Data models & schema evolution (Ch 2 + 4)

| Quick cue | 10-second answer | Typical follow-up |
| --- | --- | --- |
| “Relational vs document?” | Relational wins when many-to-many joins, ad-hoc queries, and multi-row ACID transactions matter; document wins for aggregate-by-key access patterns and when you need schema-on-read flexibility. | “How do you avoid joins on a document DB?” → Denormalize or pre-compute views. |
| “How do you change schemas safely?” | Version the payload (“V1” vs “V2” fields) and migrate with incremental back-fills, not big-bang rewrites; use Avro/Protobuf with a schema registry for forward and backward compatibility. | “How do you clean up old code paths?” → Give each old version a TTL (sunset date) and track usage metrics per version. |
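A minimal sketch of the "version the payload" idea, assuming JSON-like dict payloads. The `User` type, the `schema_version` field, and the `email` field are hypothetical names chosen for illustration. The reader tolerates old records (missing field) and newer records (unknown extra fields), and counts reads per version so you can tell when an old code path is safe to delete.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Reads seen per schema version (hypothetical metric for retiring old versions).
version_reads: Counter = Counter()


@dataclass
class User:
    name: str
    email: Optional[str] = None  # added in the hypothetical V2; absent in V1 records


def decode_user(payload: Dict[str, Any]) -> User:
    """Read any version: missing new fields get defaults, unknown fields are ignored."""
    version = payload.get("schema_version", 1)  # unversioned records count as V1
    version_reads[version] += 1
    return User(name=payload["name"], email=payload.get("email"))


def encode_user(user: User) -> Dict[str, Any]:
    """Always write the newest version; a back-fill job can rewrite old rows with this."""
    return {"schema_version": 2, "name": user.name, "email": user.email}
```

A back-fill then amounts to `encode_user(decode_user(old_row))` applied in small batches, and `version_reads[1]` dropping to zero is the signal that the V1 path can go.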

2 — Storage engines & indexing (Ch 3)

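| Concept | Interview gold-nugget |
| --- | --- |
| B-Tree vs LSM-Tree | B-Tree = read-heavy, point lookups, in-place page updates (MySQL, Postgres). LSM-Tree = write-heavy, sequential appends (Cassandra, RocksDB). Quote: “LSM-Trees pay for fast sequential writes with compaction, which shows up as write amplification; B-Trees pay with page fragmentation, which shows up as wasted space.” Caveat if pressed: B-Trees also write each change at least twice (write-ahead log plus the page). |
| Secondary index on huge table | Explain that every secondary index is itself a key-value structure that must be partitioned and replicated just like the primary data. Tie to the Chapter 6 hot-spot problem. |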

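Mini-quiz (ask yourself): Why do LSM-Trees keep a Bloom filter for every SSTable? (To skip SSTables that definitely do not contain the key, which avoids disk reads, especially for keys that do not exist at all.)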
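To make the quiz answer concrete, here is a toy LSM read path with a per-SSTable Bloom filter. It is an illustrative sketch, not how any particular engine implements it: the `BloomFilter`, `SSTable`, and `lsm_get` names are made up, the "SSTable" is an in-memory dict, and tombstones and compaction are left out.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Stand-in for an immutable on-disk segment."""

    def __init__(self, data: dict) -> None:
        self.data = dict(data)
        self.bloom = BloomFilter()
        for key in self.data:
            self.bloom.add(key)
        self.disk_reads = 0  # counts simulated disk lookups

    def get(self, key: str):
        if not self.bloom.might_contain(key):
            return None  # definitely absent: no disk read needed
        self.disk_reads += 1
        return self.data.get(key)


def lsm_get(memtable: dict, sstables: list, key: str):
    """Check the memtable first, then SSTables from newest to oldest."""
    if key in memtable:
        return memtable[key]
    for table in sstables:
        value = table.get(key)
        if value is not None:
            return value
    return None
```

Without the filter, a lookup for a missing key would have to touch every segment before giving up; with it, most segments are ruled out from memory.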


3 — Replication patterns (Ch 5)

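| Model | Pitch line interviewer loves |
| --- | --- |
| Leader–Follower | “Predictable writes, simple reads; but followers lag, so you need a read-your-own-writes fix.” |
| Multi-Leader | “Great for geo-distributed writes and offline clients (mobile), but conflict resolution moves to the app layer.” |
| Leaderless / Quorum | “High availability, tunable R/W, but quorums are not linearizable: beware stale reads and lost concurrent writes, especially with sloppy quorums and last-write-wins.” |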
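The "tunable R/W" point for leaderless replication usually comes down to the overlap condition w + r > n. A small sketch, assuming each replica response carries a version number; the function names are illustrative only:

```python
from typing import Any, List, Tuple


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read set of size r must intersect every write set of size w."""
    return w + r > n


def read_latest(responses: List[Tuple[int, Any]]) -> Any:
    """Pick the value with the highest version among the replicas that answered.

    responses: (version, value) pairs returned by r replicas.
    """
    return max(responses, key=lambda pair: pair[0])[1]
```

With n = 3, choosing w = 2 and r = 2 overlaps, while w = 1 and r = 1 does not, so a read can miss the latest write entirely. Even with overlap, concurrent writes still need a conflict-resolution rule.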

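Sample whiteboard twist: “Your leader dies just after ack-ing a client write. What happens?” → With asynchronous replication the acked write may not have reached any follower, so it can be lost on failover. Walk through log-based replication, failover election with a higher term, and fencing tokens that stop the old leader from writing if it comes back (split brain).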
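A minimal sketch of the fencing-token part of that answer, assuming each newly elected leader receives a strictly larger token (the election term works for this) and that the storage layer checks it. `FencedStorage` is a hypothetical name, not a real library class.

```python
class FencedStorage:
    """Rejects writes that carry a token older than the newest one seen so far."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.value = None

    def write(self, token: int, value) -> bool:
        if token < self.highest_token:
            return False  # stale leader from an earlier term: fenced off
        self.highest_token = token
        self.value = value
        return True


storage = FencedStorage()
accepted_new = storage.write(token=34, value="from new leader")
# The old leader wakes up after a long pause, still believing it is in charge.
accepted_old = storage.write(token=33, value="from old leader")
```

Here `accepted_new` is `True` and `accepted_old` is `False`, so the resurrected leader cannot overwrite data written under the newer term.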


4 — Partitioning/Sharding (Ch 6)


5 — Transactions & isolation (Ch 7)

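| Isolation level | One-liner |
| --- | --- |
| Read Committed | “No dirty reads, no dirty writes.” |
| Snapshot (often called Repeatable Read) | “Reads from a consistent snapshot; still allows write skew.” |
| Serializable | “Equivalent to some serial order; implemented with actual serial execution, two-phase locking, or optimistic serializable snapshot isolation (SSI).” |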

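Remember the killer diagram from the book showing two on-call doctors who both go off call at the same time under snapshot isolation; use it to explain write skew. Double-booking a meeting room is the book's other example of the same anomaly.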
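Write skew is easier to explain with a tiny simulation. The sketch below fakes snapshot isolation with plain dicts (no real database involved) and uses an on-call roster with the invariant "at least one doctor stays on call". All names are illustrative assumptions.

```python
from typing import Dict


def go_off_call(snapshot: Dict[str, bool], doctor: str) -> Dict[str, bool]:
    """Return the writes this transaction wants to commit, judged only from its snapshot."""
    on_call = sum(1 for is_on in snapshot.values() if is_on)
    if on_call >= 2:  # invariant check: someone else would still be on call
        return {doctor: False}
    return {}


def run_concurrently() -> Dict[str, bool]:
    db = {"alice": True, "bob": True}
    # Both transactions start before either commits, so both see two doctors on call.
    snapshot_a, snapshot_b = dict(db), dict(db)
    writes_a = go_off_call(snapshot_a, "alice")
    writes_b = go_off_call(snapshot_b, "bob")
    # The write sets touch different rows, so a "same row written twice"
    # conflict check finds nothing to abort.
    assert not set(writes_a) & set(writes_b)
    db.update(writes_a)
    db.update(writes_b)
    return db


final_state = run_concurrently()
```

`final_state` ends with nobody on call: each transaction was valid against its own snapshot, but together they break the invariant. Serializable isolation (or an explicit lock on the rows that were read) is what prevents it.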