High Level Design

Index

  1. Steps of Design
    • Time management
    • Framework and mindset
    • Steps with the actual process
  2. Design Rate-limiter
  3. Design Key-value store

HLD – Process Steps / Memory Map of System Design

Time management – 45 minutes or an hour is not enough to cover the entire design. The following is a very rough guide to distributing your time in a 45-minute interview session:

  1. Step 1 Understand the problem and establish design scope: 3 – 10 minutes
  2. Step 2 Propose high-level design and get buy-in: 10 – 15 minutes
  3. Step 3 Design deep dive: 10 – 25 minutes
  4. Step 4 Wrap: 3 – 5 minutes

A FRAMEWORK FOR SYSTEM DESIGN INTERVIEWS


These questions are largely about the process rather than the ultimate design. A great system design interview is open-ended, and there is no one-size-fits-all solution. However, there are four steps and common ground to cover in every system design interview.

The four steps are:

  1. Understand the problem & establish the design scope
    • Aim for both:
      1. functional and
      2. non-functional requirements
  2. Propose high-level design and get buy-in/Draw the Major Components
  3. Design deep dive
  4. Wrap up

  1. Understand the problem & establish the design scope
    1. Scoping the problem is important because you want to ensure that you’re building what the interviewer wants, and because this might be something the interviewer is specifically evaluating.
    2. There might be details the interviewer left out (intentionally or unintentionally).
    3. Answering without a thorough understanding of the requirements is a huge red flag.
    4. Make Reasonable Assumptions
      1. It’s okay to make some assumptions (when necessary), but they should be reasonable.
      2. For example,
        1. Unreasonable – to assume that your system only needs to process 100 users per day, or to assume that you have infinite memory available.
        2. Reasonable – to design for a max of one million new URLs per day. Making this assumption can help you calculate how much data your system might need to store.
      3. Assumptions might take some “product sense” (Non-functional requirement)
        • For example : Data staleness
          1. Is it okay for the data to be stale by a max of ten minutes? That all depends. If it takes ten minutes for a just-entered URL to work, that’s a deal-breaking issue; people usually want these URLs to be active immediately.
          2. However, if the statistics are ten minutes out of date, that might be okay. Talk to your interviewer about these sorts of assumptions.
  2. Draw the Major Components – we aim to develop a high-level design and reach an agreement with the interviewer on the design.
    • Draw box diagrams with key components on the whiteboard or paper. This might include clients (mobile/web), APIs, web servers, data stores, cache, CDN, message queue, etc.
    • Back-of-the-envelope required? – Think out loud, and check with your interviewer whether back-of-the-envelope calculations are necessary before diving in. Use them to evaluate whether your blueprint fits the scale constraints (a worked estimate follows this walkthrough).
    • API endpoints required? – Should we include API endpoints and a database schema here? It depends on the problem.
      For a large design problem like “Design Google search engine”, this is a bit too low-level.
      For a problem like designing the backend for a multi-player poker game, it is fair game.
      Communicate with your interviewer.
      • Walk through your system from end to end to provide a flow. A user enters a new URL; then what? It may help here to ignore major scalability challenges and just pretend that the simple, obvious approaches will be okay. You’ll handle the big issues in Step 3.
  3. Design deep dive and redesign for the key issues (if required)
    • Once you have a basic design in mind, focus on the key issues. What will be the bottlenecks or major challenges in the system?
    • Depending on the problem, the deep dive may focus on trade-offs or on specific components (e.g., the load balancer).
      • For example, if you were designing TinyURL, one situation you might consider is that while some URLs will be infrequently accessed, others can suddenly peak. This might happen if a URL is posted on Reddit or another popular forum. You don’t necessarily want to constantly hit the database. Your interviewer might provide some guidance here. If so, take this guidance and use it.
  4. Wrap-up
    • Error cases (server failure, network loss, etc.) are interesting to talk about.
    • Operational issues are worth mentioning: how do you monitor metrics and error logs? How do you roll out the system?
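
Before moving on, here is what the back-of-the-envelope step above might look like in practice. This is a minimal sketch: the one-million-URLs-per-day figure comes from the assumption earlier, while the record size and retention period are illustrative numbers you would agree on with the interviewer.

```python
# Back-of-the-envelope storage estimate for a URL shortener.
# Assumed inputs (illustrative -- confirm with your interviewer):
#   - 1 million new URLs per day (from the assumption above)
#   - ~500 bytes per record (long URL + short key + metadata)
#   - data retained for 5 years

new_urls_per_day = 1_000_000
bytes_per_record = 500
retention_years = 5

records = new_urls_per_day * 365 * retention_years   # ~1.8 billion rows
storage_bytes = records * bytes_per_record

print(f"records: {records:,}")                        # 1,825,000,000
print(f"storage: {storage_bytes / 10**12:.2f} TB")    # ~0.91 TB
```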

Algorithms that Scale: Step-By-Step

In some cases, you’re not being asked to design an entire system; you’re just being asked to design a single feature or algorithm, but you have to do it in a scalable way. Or, there might be one algorithm part that is the “real” focus of a broader design question.
In these cases, try the following approach.


Poking holes in your own solution is a fantastic way to demonstrate this.


Key Concepts

  1. Horizontal vs. Vertical Scaling
    • A system can be scaled one of two ways.
      1. Vertical scaling means increasing the resources of a specific node. For example, you might add additional memory to a server to improve its ability to handle load changes.
      2. Horizontal scaling means increasing the number of nodes.
        • For example, you might add additional servers, thus decreasing the load on any one server. Vertical scaling is generally easier than horizontal scaling, but it’s limited. You can only add so much memory or disk space.
  2. Load Balancer
    • Typically, some frontend parts of a scalable website will be thrown behind a load balancer. This allows a system to distribute the load evenly so that one server doesn’t crash and take down the whole system. To do so, of course, you have to build out a network of cloned servers that all have essentially the same code and access to the same data.
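
A minimal sketch of the round-robin policy a load balancer might use to spread requests across those cloned servers; the server addresses here are hypothetical.

```python
import itertools

# A pool of cloned app servers, all running the same code and
# reading the same data (hypothetical addresses).
servers = ["app-1:8080", "app-2:8080", "app-3:8080"]
rotation = itertools.cycle(servers)

def route() -> str:
    """Round-robin: hand each incoming request to the next server
    in rotation so no single node absorbs all the load."""
    return next(rotation)

for i in range(6):
    print(f"request {i} -> {route()}")  # app-1, app-2, app-3, app-1, ...
```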
  3. Database Denormalization and NoSQL
    • Joins in a relational database such as SQL can get very slow as the system grows bigger. For this reason, you would generally avoid them.
    • Denormalization is one part of this. Denormalization means adding redundant information into a database to speed up reads.
      • For example, imagine a database describing projects and tasks (where a project can have multiple tasks). You might need to get the project name and the task information. Rather than doing a join across these tables, you can store the project name within the task table (in addition to the project table). Or, you can go with a NoSQL database. A NoSQL database does not support joins and might structure data in a different way. It is designed to scale better.
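
To make the projects/tasks example concrete, here is a sketch of the same record in normalized and denormalized form; the field names are made up for illustration.

```python
# Normalized: reading a task's project name requires a join
# between the task table and the project table.
project = {"project_id": 7, "name": "Website Redesign"}
task = {"task_id": 42, "project_id": 7, "title": "Fix header"}

# Denormalized: the project name is copied into the task row, so a
# read needs no join -- at the cost of having to update two places
# whenever the project is renamed.
task_denormalized = {
    "task_id": 42,
    "project_id": 7,
    "project_name": "Website Redesign",  # redundant copy for fast reads
    "title": "Fix header",
}
print(task_denormalized["project_name"])  # served without a join
```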
  4. Database Partitioning (Sharding)
    • Sharding vs partitioning
      • Partitioning:
        1. A more generic term for dividing data into smaller, more manageable units.
        2. Can be done within a single database server (vertical partitioning) or across multiple servers (horizontal partitioning).
      • Sharding:
        1. A specific type of horizontal partitioning where data is spread across multiple database servers (shards), while ensuring you have a way of figuring out which data is on which machine.
    • A few common ways of partitioning include:
      1. Vertical Partitioning: This is basically partitioning by feature. For example, if you were building a social network, you might have one partition for tables relating to profiles, another one for messages, and so on. One drawback of this is that if one of these tables gets very large, you might need to repartition that database (possibly using a different partitioning scheme).
      2. Key-Based (or Hash-Based) Partitioning: This uses some part of the data (for example, an ID) to partition it. A very simple way to do this is to allocate N servers and put the data on mod(key, n). One issue with this is that:
        1. the number of servers you have is effectively fixed. Adding additional servers means reallocating all the data, which is a very expensive task.
      3. Directory-Based Partitioning: In this scheme, you maintain a lookup table for where the data can be found. This makes it relatively easy to add additional servers, but it comes with two major drawbacks.
        1. First, the lookup table can be a single point of failure.
        2. Second, constantly accessing this table impacts performance.
    • Many architectures actually end up using multiple partitioning schemes.
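
Here is a small sketch of key-based partitioning and the resharding problem it creates; the keys and server counts are arbitrary, and in production you would use a stable hash rather than Python’s built-in hash, which varies across processes.

```python
def shard_for(key: str, n_servers: int) -> int:
    """Key-based partitioning: data for `key` lives on shard
    hash(key) mod N."""
    return hash(key) % n_servers

keys = ["alice", "bob", "carol", "dave", "erin"]
before = {k: shard_for(k, 4) for k in keys}   # 4 servers
after = {k: shard_for(k, 5) for k in keys}    # a 5th server is added

# With mod-N placement, adding one server changes the shard of most
# keys, forcing an expensive data reshuffle -- the problem that
# consistent hashing is designed to soften.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys must move:", moved)
```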
  5. Caching
    • An in-memory cache can deliver very rapid results. It is a simple key-value pairing and typically sits between your application layer and your data store. When an application requests a piece of information, it first tries the cache. If the cache does not contain the key, it will then look up the data in the data store. (At this point, the data might—or might not—be stored in the data store.) When you cache, you might cache a query and its results directly. Or, alternatively, you can cache the specific object (for example, a rendered version of a part of the website, or a list of the most recent blog posts).
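
The read path described above is often called cache-aside; here is a minimal sketch with a plain dict standing in for a cache such as Redis or Memcached.

```python
cache = {}                                # in-memory key-value cache
datastore = {"post:1": "Hello, world"}    # stand-in for the data store

def get(key: str):
    """Cache-aside read: try the cache first, then the data store,
    populating the cache on a miss so the next read is fast."""
    if key in cache:
        return cache[key]                 # cache hit
    value = datastore.get(key)            # cache miss: go to the store
    if value is not None:
        cache[key] = value                # fill the cache for next time
    return value

print(get("post:1"))  # miss -> reads the data store, fills the cache
print(get("post:1"))  # hit  -> served straight from the cache
```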
  6. Asynchronous Processing & Queues
    • Slow operations should ideally be done asynchronously; otherwise, a user might get stuck waiting for a process to complete. In some cases, we can do this in advance (i.e., we can pre-process). For example,
      1. we might have a queue of jobs to be done that update some part of the website. If we were running a forum, one of these jobs might be to re-render a page that lists the most popular posts and the number of comments. That list might end up being slightly out of date, but that’s perhaps okay. It’s better than a user stuck waiting on the website to load simply because someone added a new comment and invalidated the cached version of this page.
      2. In other cases, we might tell the user to wait and notify them when the process is done. You’ve probably seen this on websites before. Perhaps you enabled some new part of a website and it says it needs a few minutes to import your data, but you’ll get a notification when it’s done.
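
A minimal sketch of the pre-processing queue idea using only the standard library; the “re-render” job is a placeholder for whatever slow work the site needs.

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()

def worker() -> None:
    """Background worker: drains slow jobs so the user-facing
    request path never blocks on them."""
    while True:
        job = jobs.get()
        if job is None:                      # sentinel: stop the worker
            break
        print(f"re-rendering: {job}")        # stand-in for the slow work
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# The web request only enqueues the job and returns immediately;
# the rendered page may be slightly stale, which is often fine.
jobs.put("most-popular-posts page")
jobs.join()      # for this demo only: wait until the job is processed
jobs.put(None)   # shut the worker down
```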
  7. Networking Metrics
    • Some of the most important metrics around networking include:
      1. Bandwidth: This is the maximum amount of data that can be transferred in a unit of time. It is typically expressed in bits per second (or some similar unit, such as gigabytes per second).
      2. Throughput: Whereas bandwidth is the maximum data that can be transferred in a unit of time, throughput is the actual amount of data that is transferred.
      3. Latency: This is how long it takes data to go from one end to the other; that is, the delay between the sender sending information (even a very small chunk of data) and the receiver receiving it.
        • Standard p95 latency targets – the values below are for near-real-time systems; non-real-time systems can tolerate roughly 2x higher values.
          1. UI updates: Below 50ms (ideal), below 100ms (acceptable)
          2. Database queries: Below 100ms (ideal), below 500ms (acceptable)
          3. API responses: Below 200ms (ideal), below 1 second (acceptable)
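
A small worked example tying the three metrics together; the figures are illustrative.

```python
# Rough time-to-serve model for one response:
#   total ~= latency + payload_size / throughput
latency_s = 0.050                    # 50 ms network latency (assumed)
throughput_bps = 100 * 10**6         # 100 Mbit/s achieved throughput
payload_bits = 10 * 10**6 * 8        # a 10 MB response body

total_s = latency_s + payload_bits / throughput_bps
print(f"~{total_s:.2f} s")           # ~0.85 s for this request
```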
  8. Considerations
    • In addition to the earlier concepts to learn, you should consider the following issues when designing a system.
      1. Failures: Essentially any part of a system can fail. You’ll need to plan for many or all of these failures.
        • Failure Detection
        • Handling temporary failure
        • Handling permanent failure
      2. Availability and Reliability: Availability is a function of the percentage of time the system is operational. Reliability is a function of the probability that the system is operational for a certain unit of time.
      3. Read-heavy vs. Write-heavy: Whether an application will do a lot of reads or a lot of writes impacts the design. If it’s write-heavy, you could consider queuing up the writes (but think about potential failure here!). If it’s read-heavy, you might want to cache. Other design decisions could change as well.
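
To make the availability definition concrete, here is the usual “nines” arithmetic, showing how much downtime each target permits per year.

```python
# Availability = fraction of time the system is operational.
minutes_per_year = 365 * 24 * 60     # 525,600

for target in (0.99, 0.999, 0.9999):
    downtime_min = minutes_per_year * (1 - target)
    print(f"{target:.2%} available -> ~{downtime_min:,.0f} min down/year")
```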

Design Rate-limiter


Objective:

Iteration 1 –

  1. Rate-limiting algorithm trade-offs
  2. Handling distributed rate limiting

Iteration 2 –

  1. Put and get workflow
  2. redash use case working

Solution –

Step 1 – Understand the problem and establish design scope

Re-iterate the requirements to confirm your understanding.

Questions –

  1. What is the scale of the system we are expecting? Is it built for a startup or a big company with a large user base?
  2. Specific –
    • Do we need to inform users who are throttled?
      1. Yes!
      2. Where to put the rate limiter – client side, server side, or middleware (e.g., an API gateway)?

If not given, summarise the requirements of the system –

  1. Functional
    1. Accurately limit excessive requests
    2. Does the rate limiter throttle API requests based on IP, user ID, or other properties?
      • Interviewer: The rate limiter should be flexible enough to support different sets of throttle rules.
  2. Non-functional
    1. Low latency. The rate limiter should not slow down HTTP response time.
    2. Use as little memory as possible.
    3. High fault tolerance. If there are any problems with the rate limiter (for example, a cache server goes offline), it does not affect the entire system.
    4. Distributed rate limiting. The rate limiter can be shared across multiple servers or processes.
    5. Exception handling. Show clear exceptions to users when their requests are throttled.

Here is a list of popular rate-limiting algorithms:

  1. Token bucket (see the sketch below)
  2. Leaking bucket
  3. Fixed window counter
  4. Sliding window log
  5. Sliding window counter

For how each algorithm works, with pros and cons, refer to – https://techglot.tech.blog/2022/01/26/high-level-design/3/
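
As a taste of the first algorithm above, here is a minimal single-process token bucket sketch; the capacity and refill rate are example parameters, and a production limiter would also need per-client buckets and thread safety.

```python
import time

class TokenBucket:
    """Token bucket: tokens drip in at a fixed rate up to a burst
    capacity; each request spends one token or gets throttled."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last
        self.last = now
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                 # request should be throttled (429)

bucket = TokenBucket(capacity=4, refill_per_sec=2)
print([bucket.allow() for _ in range(6)])  # burst of 4 allowed, then throttled
```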

For distributed rate limiting, use a shared store such as Redis as the central rate checker, so that every server instance enforces the same limit; one common approach is sketched below.
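
This is a fixed-window counter kept in Redis, sketched with the redis-py client. The key format, window, and limit are illustrative; a real deployment would typically wrap the increment-and-expire pair in a Lua script or pipeline so the two steps stay atomic.

```python
import redis  # pip install redis

# One Redis instance (or cluster) shared by every app server, so the
# counter -- and therefore the limit -- is global, not per process.
r = redis.Redis(host="localhost", port=6379)

def allowed(client_id: str, limit: int = 10, window_s: int = 60) -> bool:
    """Fixed-window check: at most `limit` requests per `window_s`
    seconds for this client, across all servers."""
    key = f"rate:{client_id}:{window_s}"
    count = r.incr(key)              # atomic increment; creates key at 1
    if count == 1:
        r.expire(key, window_s)      # first hit starts the window
    return count <= limit

# In a request handler: serve if allowed("user-42"), else return HTTP 429.
```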


Design Key-value store


Requirements

Functional

  1. Able to store a key and its value
  2. Able to get the value corresponding to a key
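
A minimal single-node sketch that covers just these two functional requirements; replication, partitioning, and persistence (the non-functional concerns) are deliberately omitted.

```python
class KeyValueStore:
    """In-memory key-value store: put(key, value) and get(key)."""

    def __init__(self) -> None:
        self._data: dict = {}

    def put(self, key: str, value) -> None:
        """Store (or overwrite) the value for a key."""
        self._data[key] = value

    def get(self, key: str):
        """Return the value for a key, or None if it is absent."""
        return self._data.get(key)

store = KeyValueStore()
store.put("name", "alice")
print(store.get("name"))   # alice
```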

Non-Functional

Leave a comment