Thursday, July 25, 2024

Learn System Design For Interviews

How Does Geohashing Work

Machine Learning System Design Interview – Valerii Babushkin

Geohash is a hierarchical spatial index that uses a Base-32 alphabet encoding. The first character of a geohash identifies the location as one of 32 cells, and each of those cells is in turn divided into 32 smaller cells. To represent a point, the world is therefore recursively divided into smaller and smaller cells with each additional character until the desired precision is reached. The precision (the length of the geohash) also determines the size of the cell.

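Geohashing guarantees that two points whose geohashes share a prefix lie in the same cell, and the longer the shared prefix, the smaller that cell is. Likewise, the more characters in the string, the more precise the location. For example, geohashes 9q8yy9mf and 9q8yy9vx are spatially close because they share the prefix 9q8yy9. The converse does not always hold: two points on opposite sides of a cell boundary can be very close and still share only a short prefix.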

Geohashing can also provide a degree of anonymity: we don’t need to expose the exact location of the user, because depending on the length of the geohash we only know that they are somewhere within an area.
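The recursive subdivision and prefix behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of the commonly used geohash encoding (bits alternate between longitude and latitude, and every five bits become one Base-32 character), not a production library. The helper names `geohash_encode` and `common_prefix_len` are made up for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid  # keep the eastern half
            else:
                lon_hi = mid  # keep the western half
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid  # keep the northern half
            else:
                lat_hi = mid  # keep the southern half
        value = (value << 1) | bit
        bit_count += 1
        use_lon = not use_lon
        if bit_count == 5:  # five bits make one Base-32 character
            chars.append(BASE32[value])
            value, bit_count = 0, 0
    return "".join(chars)


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two geohashes share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


print(geohash_encode(57.64911, 10.40744, 4))       # u4pr
print(common_prefix_len("9q8yy9mf", "9q8yy9vx"))   # 6
```

Truncating a geohash gives the enclosing, larger cell, so `geohash_encode(lat, lon, 8)[:5]` equals `geohash_encode(lat, lon, 5)`. That is what makes shorter hashes useful for coarse, privacy-preserving locations.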

The cell sizes of the geohashes of different lengths are as follows:

Geohash length

SAML Vs OAuth 2.0 And OpenID Connect

There are many differences between SAML, OAuth, and OIDC. SAML uses XML to pass messages, while OAuth and OIDC use JSON. OAuth provides a simpler experience, while SAML is geared towards enterprise security.

OAuth and OIDC use RESTful communication extensively, which is why mobile and modern web applications find that OAuth and OIDC offer a better experience for the user. SAML, on the other hand, drops a session cookie in a browser that allows a user to access certain web pages. This is great for short-lived workloads.

OIDC is developer-friendly and simpler to implement, which broadens the use cases for which it might be implemented. It can be implemented from scratch pretty fast, via freely available libraries in all common programming languages. SAML can be complex to install and maintain, which only enterprise-size companies can handle well.

OpenID Connect is essentially a layer on top of the OAuth framework. Therefore, it can offer a built-in layer of permission that asks a user to agree to what the service provider might access. Although SAML can also support a consent flow, this has to be hard-coded by a developer rather than being part of the protocol.
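To make the built-in consent step more concrete, here is a minimal sketch of how a client might build an OpenID Connect authorization request using the authorization code flow. The requested `scope` values are what the identity provider typically shows the user on its consent screen. The endpoint URL, client ID, and redirect URI below are placeholders invented for this example, not values from any real provider.

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes):
    """Return (url, state) for an OIDC authorization code request."""
    state = secrets.token_urlsafe(16)  # random value to tie the response to this request
    params = {
        "response_type": "code",       # authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),     # must include "openid" for OIDC
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


# Hypothetical values, for illustration only.
url, state = build_authorization_url(
    "https://idp.example.com/authorize",
    client_id="my-client-id",
    redirect_uri="https://app.example.com/callback",
    scopes=["openid", "profile", "email"],
)
print(url)
```

After the user agrees, the provider redirects back to `redirect_uri` with a `code` and the same `state`, which the application should compare against the value it generated.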

Both of these authentication protocols are good at what they do. As always, a lot depends on our specific use cases and target audience.

What To Do After You Set A Goal

Alexey: Okay. So we do this, and then you also mentioned A/B tests. We define a metric, and then we say how exactly we are going to measure this metric. What do we do next?

Valerii: Let’s say we know what we would like to do. We know how we can try to optimize it in this way. What does that mean? That means that if my model improves, there is a high chance that my metric of interest will be better. Now, I need to think about the labels, but that’s obvious, right? There’s a proxy metric; you can say it’s a label. I will construct my labels. You can say that labels are y’s; now we need to think about the x’s.

Valerii: What are the features? Okay, what features do we have? We have this, this, and that feature? They might make sense, right? We have x and y, now we need a model. What kind of model? We have a target, we have labels. What about the loss function? Can we just put in the loss function directly or not? Now let’s come back to the features. We have basic features; do we think they interact with each other? Do we need to do some pre-processing? Okay, think about that. Now let’s say we can put the model, we have x, we have y, we can train it, right? So what happens here? Let’s do that.

Alexey: Perhaps if you cover all these parts during your system design interview, you’re already in quite a good position. Right?

Alexey: By crazy do you mean it outputs random stuff?

Top 5 Distributed System Design Patterns

Distributed applications are a staple of the modern software development industry. They’re pivotal to cloud storage services and allow web applications of massive scale to stay reactive. As programmers build these systems, they need fundamental building blocks they can use as a starting point and to communicate in a shared vocabulary.

This is where distributed system design patterns become invaluable. While sometimes overused, design patterns are a key skill recruiters are looking for and are essential to stand out in advanced system design interviews.

Today, we’ll explore 5 of the top distributed system design patterns to help you learn their advantages, disadvantages, and when to use them.

Here’s what we’ll cover today:

What Is The Purpose Of A System Design Interview


To start things off, let’s cover the basics. What’s the point of a system design interview, anyway? At its highest level, here’s an outline of what your interview should convey to an interviewer:

  • Given an ambiguous problem, can you create concrete technical requirements? It takes some experience to understand that system design interview questions are ambiguous, but that’s one of the challenges.
  • Given these technical requirements, can you give a high-level description of how to implement each specific requirement? Descriptions come in the form of text, diagrams, voice-over, etc.
  • Can you defend your design and explain why it’s built to last? Can you create one cohesive system that is durable, fits into the existing flow, and is scalable?

    Why Do We Need Distributed Transactions

    Unlike an ACID transaction on a single database, a distributed transaction involves altering data on multiple databases. Consequently, distributed transaction processing is more complicated, because the database must coordinate the committing or rollback of the changes in a transaction as a self-contained unit.

    In other words, all the nodes must commit, or all must abort and the entire transaction rolls back. This is why we need distributed transactions.
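The all-or-nothing rule above can be sketched with a toy two-phase commit coordinator. Two-phase commit is one well-known approach to this problem; the source does not say which solutions it goes on to discuss, so treat this as an illustrative assumption. The `Participant` class and `two_phase_commit` function are hypothetical names, and the sketch ignores timeouts, crashes, and durable logging, which a real implementation must handle.

```python
class Participant:
    """A stand-in for one database node taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local changes can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit on every node, or roll back on every node."""
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                    # phase 2a: everyone commits
            p.commit()
        return True
    for p in participants:                        # phase 2b: everyone aborts
        p.rollback()
    return False


nodes = [Participant("orders"), Participant("payments", can_commit=False)]
print(two_phase_commit(nodes), [p.state for p in nodes])
# False ['aborted', 'aborted']
```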

    Now, let’s look at some popular solutions for distributed transactions:

    Why Use A CDN

    A content delivery network (CDN) increases content availability and redundancy while reducing bandwidth costs and improving security. Serving content from a CDN can significantly improve performance, because users receive content from data centers close to them and our servers do not have to serve the requests that the CDN fulfills.

    High Availability Vs Fault Tolerance

    Both high availability and fault tolerance apply to methods for providing high uptime levels. However, they accomplish the objective differently.

    A fault-tolerant system has no service interruption but a significantly higher cost, while a highly available system has minimal service interruption. Fault tolerance requires full hardware redundancy: if the main system fails, another system must take over with no loss in uptime.

    Example Question: Design Reddit

    Let’s pretend that you are the lead engineer tasked with building a platform like Reddit from scratch.

    Assume you’re given the following requirements:

    • Users can make posts in different forums,
    • Users can include images in their posts,
    • Users can upvote or downvote posts they like or dislike,
    • Users can leave comments on posts,
    • The system has a news feed of posts sorted by both ranking and recency of posts.

    The question would come with the following constraint:

    • The system must be able to support large volumes of users viewing and posting content simultaneously.

    Design Reddit Interview Question: Our Answer

    There’s no denying this is a broad system design question. We will follow a similar approach as we did for the previous question.

    First, we will define the problem space. This, again, means clarifying the requirements. In this sample system design interview question, the interviewer gave you a concrete list of necessary features.

    Still, you should first dig deeper into the requirements for additional clarification and detail. For instance, you can ask questions like:

    • Does our system need to support users on mobile and the web, or just one or the other?
    • Will the images be uploaded to the system itself, or will they be links to another hosting service?
    • Are there any necessary performance-related features that would influence our system design and require load balancing?

    Let’s imagine your interviewer stated you only need to be concerned with web users.
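As a starting point for discussion, the listed requirements can be turned into a small data model and a feed function. The sketch below is an assumption-heavy illustration: the `Post` fields, the `feed_score` formula (net votes decayed by age), and the `gravity` parameter are all invented for this example and are not Reddit’s actual ranking algorithm.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Post:
    post_id: int
    forum: str
    author: str
    title: str
    created_at: datetime
    image_url: Optional[str] = None       # images stored elsewhere, referenced by URL
    upvotes: int = 0
    downvotes: int = 0
    comments: List[str] = field(default_factory=list)


def feed_score(post: Post, now: datetime, gravity: float = 1.5) -> float:
    """Combine ranking (net votes) and recency (age) into one number."""
    age_hours = (now - post.created_at).total_seconds() / 3600
    return (post.upvotes - post.downvotes) / (age_hours + 2) ** gravity


def build_feed(posts: List[Post], now: datetime, limit: int = 25) -> List[Post]:
    """Return the highest-scoring posts first."""
    return sorted(posts, key=lambda p: feed_score(p, now), reverse=True)[:limit]


now = datetime(2024, 7, 25, 12, 0)
posts = [
    Post(1, "python", "ann", "Old but popular", now - timedelta(hours=48), upvotes=100),
    Post(2, "python", "bob", "Fresh post", now - timedelta(hours=1), upvotes=10),
]
print([p.title for p in build_feed(posts, now)])
# ['Fresh post', 'Old but popular']
```

In an interview, the interesting follow-up is where this computation runs: scoring every post on each read will not survive the stated scale constraint, so precomputing or caching feeds is a natural next topic.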

    Design For The Right Scale

    The same feature set requires a very different approach at different scales. It’s important to determine the scale so that you know whether your data can fit on one machine or whether you need to scale the reads. You might ask:

    • What is the expected read-to-write ratio?
    • How many concurrent requests should we expect?
    • What’s the average expected response time?
    • What’s the limit of the data we allow users to provide?

    Different answers require very different designs, so getting the scale right is key to success.
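Once the interviewer answers these questions, a quick back-of-the-envelope calculation shows which regime you are in. The function below is a rough sketch. Every input in the usage example (daily users, requests per user, payload size, peak factor) is a made-up number to be replaced with whatever the interviewer tells you.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(daily_active_users, reads_per_user, writes_per_user,
                  bytes_per_write, peak_factor=2.0):
    """Turn interview answers into rough traffic and storage figures."""
    reads_per_day = daily_active_users * reads_per_user
    writes_per_day = daily_active_users * writes_per_user
    return {
        "read_write_ratio": reads_per_user / writes_per_user,
        "avg_read_qps": reads_per_day / SECONDS_PER_DAY,
        "avg_write_qps": writes_per_day / SECONDS_PER_DAY,
        "peak_read_qps": peak_factor * reads_per_day / SECONDS_PER_DAY,
        "storage_gb_per_year": writes_per_day * bytes_per_write * 365 / 1e9,
    }


# Hypothetical answers: 1M daily users, 50 reads and 2 writes each, 1 KB per write.
for name, value in estimate_load(1_000_000, 50, 2, 1_000).items():
    print(f"{name}: {value:,.1f}")
```

With these assumed inputs the system is read-heavy (25 reads per write) and stores roughly 730 GB per year, which already hints at read replicas or caching rather than a single machine.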

    My System Design Interviews And The Genesis Of This Book

    By Zhiyong Tan, author of Acing the System Design Interview

    In this article, Zhiyong Tan explores aspects of his career development that led to success and planted the seeds for his book, which is available in the Manning Early Access Program now.

    Read this if you want to learn tips, tricks, and experiences regarding high-level software engineering and system design interviews.

    Take 25% off Acing the System Design Interview by entering fcctan into the discount code box at checkout at

    When I graduated with a master’s in electrical engineering in 2013, I barely knew how to code. My only experience writing more than 20 lines of code in a single file was scripting MATLAB to process some field measurements or draw simple 3D diagrams, and the C++ assignments of sophomore courses I took as a grad student in a semi-misguided effort to learn coding fundamentals. I had bought a copy of the 5th edition of Gayle Laakmann McDowell’s Cracking the Coding Interview, learned Java to follow it, and had only just learned basic data structures in the past year.

    They may know thou shalt guide me with thy counsel.

    -Psalm 73:24

    What is a system design interview?

    Why system design?

    System design and your powers

    I wish you the best in your system design learning experience, and in making choices of what you do with your new abilities.

    If you’re interested in learning more, you can find my book here.

    Manning Publications

    Use The Star Response Technique

    Formatting your responses using the STAR interview response technique is a strategy to help you craft answers that illustrate your knowledge and qualifications through specific experiences. STAR is an acronym for Situation, Task, Action and Result. Using the STAR method, discuss an applicable situation, identify the task you needed to complete, outline the actions you took and reveal the results of your efforts to demonstrate your skills to the interviewer.

    Applying ML Systems To Real

    Alexey: Thank you. Let’s go to the next one: “How to make machine learning algorithms work with other parts of systems to solve real world problems?” I guess the question is more about: okay, we have this model that we just discussed, this model for classifying images. How do we integrate it with the rest of the system, and what do we need to do for that?

    Valerii: The model is nothing by itself. That’s why you have a machine learning engineer. That’s why. I don’t like the job title data scientist. Because what is a data scientist? The person who does something in Jupyter Notebook? Who needs that? Yeah, people need a model integrated in the system. That’s why they need machine learning engineers. That’s why at Facebook, there are machine learning engineers; you engineer.

    Valerii: You’re accounting for the software engineer plus machine learning. So yes, the company needs machine learning engineers. Then again, what was the first task for us? Understand what we want to achieve. As soon as you understand what you would like to achieve, it’s much easier to achieve that. Without understanding, of course, randomly, you might achieve a desired goal, but the chances are not high.

    Alexey: So the most important thing, when we start with building a machine learning system, is to think about the goal. This is something that was first in your checklist. Then the rest will come.

    How Does A CDN Work

    In a CDN, the origin server contains the original versions of the content while the edge servers are numerous and distributed across various locations around the world.

    To minimize the distance between the visitors and the website’s server, a CDN stores a cached version of its content in multiple geographical locations known as edge locations. Each edge location contains several caching servers responsible for content delivery to visitors within its proximity.

    Once the static assets are cached on all the CDN servers for a particular location, all subsequent website visitor requests for static assets will be delivered from these edge servers instead of the origin, thus reducing the origin load and improving scalability.

    For example, when someone in the UK requests our website, which might be hosted in the USA, they will be served from the closest edge location, such as the London edge location. This is much quicker than having the visitor make a complete request to the origin server, which would increase latency.
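The origin/edge interaction described above can be modelled with a toy in-memory CDN. This is only a sketch of the idea: the class names, the latency table, and its numbers are invented for illustration, and real CDNs route requests with DNS or anycast and handle expiry, invalidation, and much more.

```python
class Origin:
    """Origin server holding the original copy of every asset."""

    def __init__(self, assets):
        self.assets = dict(assets)
        self.requests_served = 0

    def fetch(self, path):
        self.requests_served += 1
        return self.assets[path]


class EdgeLocation:
    """Edge location that caches assets the first time they are requested."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:                  # cache miss: ask the origin once
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]                     # cache hit: origin is not involved


def closest_edge(edges, latency_ms):
    """Pick the edge with the lowest latency for this visitor."""
    return min(edges, key=lambda edge: latency_ms[edge.name])


origin = Origin({"/logo.png": b"png-bytes"})
edges = [EdgeLocation("London", origin), EdgeLocation("Virginia", origin)]
uk_visitor_latency_ms = {"London": 8, "Virginia": 85}   # made-up numbers

edge = closest_edge(edges, uk_visitor_latency_ms)
edge.get("/logo.png")
edge.get("/logo.png")
print(edge.name, origin.requests_served)   # London 1
```

The second request never reaches the origin, which is the reduction in origin load that the text describes.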

    Parking Lot System Design Interview Question

    This is a nice YouTube video explaining how to solve a popular parking lot system design interview question.

    This video tutorial covers the following use cases:

    1. Give the user a ticket when they enter.
    2. Generate the price when the user exits.

    They also discuss APIs, database models and database choice, how to make the system distributed, and concurrency, which are key system design concepts and are often asked about during coding interviews.
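The two use cases (issue a ticket on entry, compute a price on exit) can be sketched as a small in-memory class. This is an illustrative assumption, not the design presented in the video: the `ParkingLot` API, the per-started-hour pricing rule, and the rate are all hypothetical.

```python
import math
from datetime import datetime
from itertools import count


class ParkingLot:
    def __init__(self, capacity, hourly_rate):
        self.capacity = capacity
        self.hourly_rate = hourly_rate
        self._active = {}            # ticket_id -> (plate, entry time)
        self._ticket_ids = count(1)

    def enter(self, plate, now):
        """Use case 1: give the user a ticket when they enter."""
        if len(self._active) >= self.capacity:
            raise RuntimeError("parking lot is full")
        ticket_id = next(self._ticket_ids)
        self._active[ticket_id] = (plate, now)
        return ticket_id

    def exit(self, ticket_id, now):
        """Use case 2: generate the price when the user exits."""
        _plate, entered_at = self._active.pop(ticket_id)
        hours = math.ceil((now - entered_at).total_seconds() / 3600)
        return max(hours, 1) * self.hourly_rate   # bill every started hour


lot = ParkingLot(capacity=2, hourly_rate=2.0)
ticket = lot.enter("ABC-123", datetime(2024, 7, 25, 9, 0))
print(lot.exit(ticket, datetime(2024, 7, 25, 11, 30)))   # 6.0
```

In a distributed version, the `_active` dictionary would become a database table, and issuing tickets against a shared capacity is where the concurrency discussion starts.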

    Here is the full video you can watch to learn how to solve this popular system design interview question.

    Secure Software Design Specialization By University Of Colorado

    • It is a specialization offered by the University of Colorado made to help students learn how to design and maintain secure software.
    • It comprises four courses, namely Software Design as an Element of the Software Development Lifecycle, Software Design as an Abstraction, Software Design Methods and Tools, and Software Design Threats and Mitigations.
    • You will gain various skills including UML, Unit Testing and Ethics, Effective User Interfaces, and Database Design.
    • This course has a flexible schedule with an approximate course completion time of 7 months.
    • This specialization is suitable for beginner-level programmers, as there are no prerequisites for starting it.

    Learn From The Tech Giants

    This is probably not going to help in the short term. But in the long term, to become an expert in system design, it’s best to look at the tech blogs of various tech companies and see how they are solving various technical problems.

    This would paint a clear picture of the real problems that they face and how innovatively they solve them. Understanding these things would help you become better at system design and also keep you up to date with the latest innovations in tech.

    Some of the best blogs to follow are:

    Push Or Pull Delivery

    Most message queues provide both push and pull options for retrieving messages. Pull means continuously querying the queue for new messages. Push means that a consumer is notified when a message is available. We can also use long-polling to allow pulls to wait a specified amount of time for new messages to arrive.
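The difference between the delivery styles is easy to see with Python’s standard `queue` module. In this sketch, `long_poll` is a pull that waits up to a timeout for a message, and `PushChannel` is a tiny hypothetical stand-in for push delivery via callbacks. Neither is the API of any particular message broker.

```python
import queue
import threading


def long_poll(q, timeout_s):
    """Pull with long-polling: wait up to timeout_s for a message, else return None."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return None


class PushChannel:
    """Push delivery: consumers register a callback and are notified on publish."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


q = queue.Queue()
print(long_poll(q, 0.1))                       # nothing arrives in time: None

# A producer publishes shortly after the consumer starts waiting.
threading.Timer(0.05, q.put, args=["order-created"]).start()
print(long_poll(q, 2.0))                       # returns as soon as the message arrives

channel = PushChannel()
channel.subscribe(lambda msg: print("pushed:", msg))
channel.publish("order-shipped")
```

Long-polling sits between the two extremes: the consumer still initiates the request, but it avoids the wasted round trips of tight polling.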

    Why Do We Need Containers

    Let’s discuss some advantages of using containers:

    Separation of responsibility

    Containerization provides a clear separation of responsibility, as developers focus on application logic and dependencies, while operations teams can focus on deployment and management.

    Workload portability

    Containers can run virtually anywhere, greatly easing development and deployment.

    Application isolation

    Containers virtualize CPU, memory, storage, and network resources at the operating system level, providing developers with a view of the OS logically isolated from other applications.

    Agile development

    Containers allow developers to move much more quickly by avoiding concerns about dependencies and environments.

    Efficient operations

    Containers are lightweight and allow us to use just the computing resources we need.

    Why Is System Design Important

    A software engineer who is proficient in both high-level and low-level design is more productive. This proficiency helps an engineer make architectural decisions that improve the efficiency and scalability of a software system while lowering an organization’s costs. It helps you distinguish between the various databases available, such as MySQL, PostgreSQL, and others. It also allows you to find the appropriate tool for a given engineering problem, such as message queues, message brokers, caches, storage options such as CDNs and columnar data stores, load balancers, and so on.

    A content delivery network, or CDN, is a globally distributed network of proxy servers that delivers content to end users from sites close to them. Static files such as HTML, CSS, and JS files, pictures, and videos are typically served from CDNs. Caching is the process of saving the parts of your data that will be used the most in a location that is faster than the storage/DB.
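The caching idea in the last sentence can be illustrated with a small least-recently-used (LRU) cache placed in front of a slower data source. LRU is just one common eviction policy, chosen here as an assumption. The `LRUCache` class and the `load` callback (standing in for the storage/DB) are hypothetical names for this sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently used items in memory; load the rest on demand."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self._load = load              # slow lookup, e.g. a database query
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._load(key)                # fall back to the slower store
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)    # evict the least recently used item
        return value


db_reads = []

def slow_db_lookup(key):
    db_reads.append(key)
    return f"value-for-{key}"

cache = LRUCache(capacity=2, load=slow_db_lookup)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)
print(cache.hits, cache.misses, db_reads)
# 1 4 ['a', 'b', 'c', 'b']
```

Only the second read of `"a"` is served from memory here; `"b"` is evicted when `"c"` arrives, which shows why cache size and access pattern matter as much as the cache itself.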

    Software systems built with system design principles in mind will be of higher quality. Proper system design knowledge can reduce a company’s long-term engineering costs. It also helps in building software solutions that can handle changing product requirements as well as large-scale operations.

    When creating a software system, you must consider the following factors to become a master of systems design:
