System Design resources

![[system-design-map.png]]

System Design Interview: A Step-By-Step Guide

https://www.youtube.com/watch?v=i7twT3x5yv8

understand the problem and establish design scope (5 min)

clarify requirements
why are we building this
who are the users
what features do we need to build
- get interviewer buy in on feature list
non-functional
- focus on scale and performance
- do rough calculations
- get general sense of scale
should end with short list of features and a few non-functional requirements to satisfy

propose high level design and get buy in (20 min)

top down, start with APIs
- use REST
- define input parameters and output responses carefully
- verify they satisfy functional requirements
design a diagram
- used to verify design satisfies requirements end to end
- start with load balancer or API gateway
- behind that are the services that satisfy the requirements
- behind that is persistence so introduce data storage
- do this for each requirement
- keep a list of conversation topics for later (scaling, concurrency, failure scenarios)
create data model and schema
- data access patterns and read/write ratio
step back and review the design

design deep dive (15 min)

identify problematic areas and discuss trade offs
determine with the interviewer what to discuss in depth
non-functional requirements go here
ask interviewer if they have any concerns about current design
for each area
- articulate the problems
- come up with 2 solutions
- discuss tradeoffs of the solutions
- pick a solution and discuss

wrap up (5 min)

summarize the design
note parts that are unique to this particular situation

How to Answer System Design Interview Questions

https://www.youtube.com/watch?v=L9TfZdODuFQ

define the problem space

define the scope
ask lots of questions to narrow the scope
clarify functional and non-functional requirements
functional
- what's in and out of scope?
- state assumptions
- is this from scratch?
- who are the clients or users?
- do we need to talk to existing pieces of the system?
- what are those pieces?
non-functional
- business objectives
- user experience
- availability, consistency, speed, security, reliability
- cost
focus on the ones you think are most critical
estimate the amount of data you're dealing with
- storage size
- bandwidth requirements
- this can help you choose components and give an idea of what scaling might look like later
- make some assumptions about user volume and typical behavior
- check if these match interviewer's expectations
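
As a rough illustration of the estimation step above, here is a back-of-envelope sketch. Every input (user count, requests per user, payload size, write fraction, peak factor, retention) is a made-up assumption, and `estimate` is a hypothetical helper rather than anything from the video.

```python
# Back-of-envelope capacity estimate. All inputs are illustrative assumptions.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users, requests_per_user_per_day, bytes_per_write,
             write_fraction, peak_factor=2.0, retention_days=365):
    """Return rough QPS, storage, and write-bandwidth figures."""
    requests_per_day = daily_active_users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    writes_per_day = requests_per_day * write_fraction
    storage_bytes = writes_per_day * bytes_per_write * retention_days
    write_bandwidth = writes_per_day * bytes_per_write / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,          # assumed peak multiplier
        "storage_tb": storage_bytes / 1e12,         # over the retention window
        "write_bandwidth_mb_s": write_bandwidth / 1e6,
    }


result = estimate(
    daily_active_users=10_000_000,
    requests_per_user_per_day=20,
    bytes_per_write=1_000,
    write_fraction=0.1,
)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

With these assumed inputs the average load is a few thousand requests per second and about 7 TB of new data per year, which is the kind of figure worth checking against the interviewer's expectations.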

design the system at a high level

lay out the most fundamental pieces of the system and illustrate how they work together to achieve desired functionality
no nitty-gritty
start by designing APIs
- each system requirement should translate to one or more APIs
- use REST, SOAP, graphql, etc
- consider request parameters and response type
- these form the foundation of the architecture
client/web server communication
create a high level diagram
- show how the data and control flow looks in this system
- no scalability yet

deep dive into the design

examine system components and relationships in more detail
start by talking about how non-functional requirements impact your design choices
this is where you start adding load balancers, database partitioning

identify bottlenecks and scaling opportunities

examine whether the system can operate under various conditions and has room to support growth
is there a single point of failure? what can we do to support robustness and enhance the system's availability?
is the data valuable enough to require replication? how important is it to keep all versions consistent?
is the service global? do we need to provide multi-geo data centers to improve locality?
are there edge cases like peak time usage or hot users that have usage patterns that could deteriorate the performance?
how do we scale the system to support 10x more users?
this is where horizontal sharding, CDNs, caching, rate limiting knowledge is useful

review and wrap up

summarize major decisions with justifications
summarize any trade-offs in space, time, complexity
check that design satisfies all requirements
identify potential areas for improvement

how i mastered system design interviews

https://www.youtube.com/watch?v=l3X1t3kpmwY

what are the requirements of the system? who are the users and how many? what components do we need in our system? how should these components be organized? how do we make the system scalable? how to make the system reliable? how to make it easy to maintain?

key concepts

scalability
- how well a system can handle more users or data without slowing down
- vertical - adding resources like bigger hard drive, more memory
- horizontal - adding more machines to the system to handle the load
performance
- how fast your system works
- latency - time it takes for a single task
- throughput - how many tasks your system can handle in a certain time
availability
- making sure the system is up and running when users need it without significant downtime
reliability
- system is doing what it's supposed to be doing even when things go wrong
- replication
- redundancy
- failover mechanisms
consistency
- all users see the same data at the same time no matter which part of the system they interact with
- can slow down performance
- eventual consistency - the data may not be up to date immediately but will be after a specific time
CAP theorem
- in a distributed system you can only have 2 of these 3 things:
  - consistency
  - availability
  - partition tolerance
- you need to make trade offs based on what the system needs the most
data storage and retrieval
- choosing the database
- designing the schema
- partitioning, sharding, replication
- for optimal storage and retrieval
ACID transactions
- atomicity
- consistency
- isolation
- durability
- a way to make sure everything we do in a database is done right and reliably
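
A minimal sketch of atomicity using Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are assumptions made up for illustration. The credit is applied first on purpose, so that a failed debit shows the whole transaction being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, "alice", "bob", 30)    # succeeds
transfer(conn, "alice", "bob", 500)   # fails, and the credit to bob is undone
print(balances(conn))
```
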
consistent hashing
- used to spread data across a group of servers
- makes it easier to add and remove servers with minimal disruptions
- load balancing, scalability
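
A small sketch of a hash ring, assuming MD5 as the hash function and 100 virtual nodes per server. Both choices, and the class name `ConsistentHashRing`, are illustrative assumptions. A key is owned by the first ring position clockwise from its hash, so adding a server only takes over the keys that now fall just before its positions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server (assumed value)
        self._keys = []        # sorted ring positions
        self._ring = {}        # ring position -> server name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._ring[pos] = node
            bisect.insort(self._keys, pos)

    def remove_node(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            del self._ring[pos]
            self._keys.remove(pos)

    def get_node(self, key):
        if not self._keys:
            raise ValueError("ring is empty")
        # First position after the key's hash, wrapping around the ring.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]


ring = ConsistentHashRing(["server-a", "server-b", "server-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("server-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a server")
```

Every key that moves goes to the new server; the rest keep their original owner, which is the "minimal disruption" property.
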
rate limiting
- controls the rate clients can make requests to the system
- prevent abuse
- protect against DDOS attacks
- ensure fair use of resources
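
One common way to implement this is a token bucket; the sketch below is a single-process illustration, not a production limiter. The `TokenBucket` name and the injectable `clock` parameter are assumptions added to make the example testable.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity      # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill based on elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                # caller would typically respond with HTTP 429


fake_now = [0.0]                    # fake clock so the demo is deterministic
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]   # third request exceeds the burst
fake_now[0] = 1.0                            # one second later: one new token
after_wait = bucket.allow()
print(burst, after_wait)
```
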
networking and communication
- how different parts of a system communicate
- network protocols
- APIs
- message queues
- event-driven architecture
security and privacy
- putting in place methods to keep important data safe and stop unwanted access
- authentication
- authorization
- encryption

building blocks

application servers
- computers that handle the business logic and processing required by the application
load balancers
- distribute incoming requests to different servers to ensure no single server gets overwhelmed
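
A toy sketch of two common selection algorithms, round-robin and least connections. The class names and server labels are hypothetical; real load balancers also handle health checks, weights, and concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by the order the servers were given.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


rr = RoundRobinBalancer(["a", "b", "c"])
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(["a", "b", "c"])
lc_picks = [lc.acquire() for _ in range(3)]
lc.release("b")              # "b" finishes its request first
next_pick = lc.acquire()     # so "b" is now the least loaded
print(rr_picks, lc_picks, next_pick)
```
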
databases
- data storage
- there are different types to serve different needs
- common: SQL and NoSQL
caching
- store frequently accessed data in a fast access storage to reduce load on the primary data source and improve response times
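
A minimal sketch of the cache-aside pattern with a time-to-live, one of several possible caching strategies. `CacheAside`, `slow_lookup`, and the 60-second TTL are assumptions for illustration; the fake clock only keeps the demo deterministic.

```python
import time


class CacheAside:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}   # key -> (value, expires_at)

    def get(self, key, load_from_source):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                   # cache hit
        value = load_from_source(key)         # cache miss: go to the source
        self._entries[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key):
        # Call this when the underlying data changes.
        self._entries.pop(key, None)


calls = []


def slow_lookup(key):
    """Stand-in for a database query; records each call."""
    calls.append(key)
    return key.upper()


fake_now = [0.0]
cache = CacheAside(ttl_seconds=60, clock=lambda: fake_now[0])
cache.get("a", slow_lookup)   # miss, hits the source
cache.get("a", slow_lookup)   # hit, served from the cache
print(len(calls))
```
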
message queues
- enable asynchronous communication between system components
- decouple sender and receiver and allow them to work independently at different rates
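
An in-process sketch of the same idea using the standard library's `queue.Queue` and a worker thread. A real system would use a broker; `process_async` and the doubling "work" are placeholders made up for this example.

```python
import queue
import threading


def process_async(items):
    """Producer puts items on a queue; a consumer thread drains it."""
    q = queue.Queue(maxsize=10)   # bounded: a slow consumer applies backpressure
    results = []

    def consumer():
        while True:
            item = q.get()
            if item is None:      # sentinel: no more work
                break
            results.append(item * 2)   # placeholder for real processing

    worker = threading.Thread(target=consumer)
    worker.start()
    for item in items:            # the producer never calls the consumer directly
        q.put(item)
    q.put(None)
    worker.join()
    return results


print(process_async([1, 2, 3]))
```
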
storage
- store and retrieve data such as files, images or videos
- local file systems
- distributed file systems
- object storage systems (S3)
proxy server
- acts as an intermediary between client and server
- can be used for things like load balancing, caching, security, or content filtering
CDN
- content delivery network distributed globally
- stores copies of website content
- serves up local instance of data

system design interview

clarify requirements
- functional and non-functional
estimate the capacity the system is dealing with
choose the right database and define the schema
design APIs and request/response pattern
- define endpoints and parameters
sketch out a high level block diagram
- identify major components
deep dive into key components and discuss how components interact
- Common Areas for Deep Dives:
  - Databases: How would you handle a massive increase in data volume? Discuss sharding (splitting data across multiple databases), replication (read/write replicas).
  - Web Servers/Application Servers: How do you add more servers behind the load balancer for increased traffic?
  - Load Balancers: Which Load Balancing techniques and algorithms to use (e.g., round-robin, least connections).
  - Caching: Where do you add more cache layers (in front of web servers? in the application layer?), and how do you deal with cache invalidation?
  - Single Points of Failure: Identify components whose failure would take down the system and discuss how to address it.
  - Authentication/Authorization: How do you manage user access and permissions securely?
  - Rate Limiting: How do you prevent excessive use or abuse of your APIs?
discuss how system will scale under load
- sharding
- replication
- partitioning
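
To make sharding concrete, here is a sketch of two common ways to pick a shard for a record: by key range and by hash. The boundaries, shard counts, and function names are assumptions chosen for the example.

```python
import bisect
import hashlib

# Assumed layout: shard 0 holds ids below 1,000,000, shard 1 holds ids
# below 2,000,000, and shard 2 holds everything else.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000]


def range_shard(user_id):
    """Range-based: contiguous id ranges map to the same shard."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)


def hash_shard(key, shard_count):
    """Hash-based: spreads keys evenly but loses range locality."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


print(range_shard(999_999), range_shard(1_000_000), range_shard(2_500_000))
print(hash_shard("user:42", 4))
```

Range sharding keeps neighbouring keys together (good for range scans, but prone to hot shards), while plain modulo hashing reshuffles most keys when `shard_count` changes.
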
discuss tradeoffs
- sql vs nosql
discuss caching strategies and where they can be added
discuss strategies for handling failures
- replicas
- fallbacks
- retries
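
A sketch of retries with exponential backoff and jitter, plus an optional fallback once retries are exhausted. `call_with_retries`, its defaults, and the `flaky` operation are hypothetical; the `sleep` parameter is injectable only so the demo does not actually wait.

```python
import random
import time


def call_with_retries(operation, attempts=3, base_delay=0.1,
                      fallback=None, sleep=time.sleep):
    """Try `operation` up to `attempts` times (assumed >= 1)."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Delay doubles each attempt; jitter avoids synchronized retries.
                delay = base_delay * (2 ** attempt)
                sleep(delay + random.uniform(0, delay))
    if fallback is not None:
        return fallback()      # e.g. serve stale or default data
    raise last_error


state = {"calls": 0}


def flaky():
    """Fails twice, then succeeds."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
outcome = call_with_retries(flaky, attempts=5, sleep=delays.append)
print(outcome, state["calls"], len(delays))
```
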

System Design Concepts Course and Interview Prep

https://www.youtube.com/watch?v=F2FmTdLtb_4

Good design

scalability
maintainability
efficiency
reliability

key elements

moving data
- ensure data can move between parts of the system
- user requests or database transfers
- optimize for speed and security
storing data
- not just sql or nosql
- access patterns
- indexing strategies
- backup solutions
- optimize for security and availability
transforming data
- turning data into meaningful information

CAP or Brewer's theorem

you can only have two of the three at the same time, so pick the best solution for the specific use case: where can we afford to compromise?

consistency
- the same data is available to every user
availability
- the system is always available to requests
- bulwarks
  - reliability
  - fault tolerance
  - redundancy
partition tolerance
- the system is able to function even when a partition occurs

speed

throughput
- server - RPS
- database - QPS
- data - B/s
latency
- how long it takes to handle a single request
lower latency generally means higher throughput and vice versa
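
One way to see how the two relate: with a fixed number of requests in flight, throughput is roughly concurrency divided by latency (a form of Little's law). The helper below is a sketch under that assumption, and the numbers are invented.

```python
def throughput_rps(concurrency, latency_ms):
    """Requests per second sustained by `concurrency` requests in flight,
    each taking `latency_ms` milliseconds."""
    return concurrency * 1000 / latency_ms


slow = throughput_rps(concurrency=10, latency_ms=50)
fast = throughput_rps(concurrency=10, latency_ms=25)
print(slow, fast)   # halving latency doubles throughput at equal concurrency
```

This only holds while concurrency stays fixed; techniques such as batching can raise throughput while making individual requests slower.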

API design

defining inputs (product details from a seller)
defining outputs (information when user queries for a product)
CRUD
- create - post
- read - get
- update - put
- delete - delete
paradigms
- REST
  - stateless
  - can result in over/under fetching data
- graphQL
  - strongly typed, get only what you need
  - queries can impact server performance
  - requests are typically sent as POST
- gRPC
  - efficient
  - less human readable
ensure backwards compatibility
set rate limiter and CORS
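
To illustrate the CRUD-to-HTTP mapping above, here is a framework-free sketch of a REST-style resource. `ProductAPI`, the `/products` path, and the status codes chosen are assumptions for the example, not a prescribed design.

```python
import itertools


class ProductAPI:
    """In-memory stand-in for a /products REST resource."""

    def __init__(self):
        self._products = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "products" or len(parts) > 2:
            return 404, {"error": "not found"}
        if len(parts) == 1:                      # collection: /products
            if method == "POST":                 # create
                pid = next(self._ids)
                self._products[pid] = {"id": pid, **(body or {})}
                return 201, self._products[pid]
            if method == "GET":                  # read (list)
                return 200, list(self._products.values())
            return 405, {"error": "method not allowed"}
        if not parts[1].isdigit() or int(parts[1]) not in self._products:
            return 404, {"error": "not found"}
        pid = int(parts[1])                      # item: /products/<id>
        if method == "GET":                      # read
            return 200, self._products[pid]
        if method == "PUT":                      # update (full replacement)
            self._products[pid] = {"id": pid, **(body or {})}
            return 200, self._products[pid]
        if method == "DELETE":                   # delete
            del self._products[pid]
            return 204, None
        return 405, {"error": "method not allowed"}


api = ProductAPI()
print(api.handle("POST", "/products", {"name": "lamp", "price": 20}))
print(api.handle("GET", "/products/1"))
```
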

Caching

browser
server
database
CDN

Databases

types
- relational
  - tables
  - SQL query language
  - great for transactions, complex queries, integrity
  - ACID compliant
- NoSQL
  - often relaxes ACID guarantees (typically consistency) in exchange for scalability
  - it's flexible, can add and remove in any order
  - no schema
  - ideal for scalability, quick iteration, simple queries
- in memory
  - fast data retrieval
  - caching, session storage
scaling
- vertical "scale up"
  - increase CPU power
  - add more RAM
  - add more disk storage
  - upgrade network
  - limited by how many resources you can add
- horizontal "scale out"
  - add more machines
  - database sharding
    - vertical or horizontal
    - breaks data into smaller chunks and spreads it out across the system
    - range, directory, geographical
  - data replication
    - keep copies of data on multiple servers
    - master (read write) / slave (read only)
    - master master (both read and write)
performance
- caching
  - cache data or frequent queries
- indexing
  - boosts performance by indexing frequently accessed columns
- query optimization
  - minimize joins
  - use analyzers to understand query performance
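
A small sketch of indexing and query analysis together, using SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module. The `users` table, the data, and the index name are made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column is the detail text


query = "SELECT name FROM users WHERE email = ?"
args = ("user42@example.com",)

before = plan(conn, query, args)                 # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, query, args)                  # lookup through the index

print("before:", before)
print("after: ", after)
```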

Google system design interview: Design Spotify

https://www.youtube.com/watch?v=_K-eupuDVEc

constrain the problem

reduce the scope to something that can be designed within the hour

calculate the amount of data you're dealing with

specify some key metrics to help high level decision making

lay out the basic components of the design

device -> load balancer -> webserver -> database & storage (metadata vs blobs)

do basic database design

split out DBs for different types of data

add a cache where appropriate

analytics to ensure popular content is readily available via CDN
how does caching apply at different levels?

load balancing

make sure server isn't overloaded
look at different metrics to apply the right approach

talk through use cases

check your design
where are there bottlenecks?

refine the design

scaling and replication
- geolocated for more performant access
wrap up by outlining how it meets requirements
think big and add a new dimension
