Dropbox Interview – Design Hit Counter

It starts with a simple question – if you are building a website, how do you count the number of visitors for the past 1 minute?

“Design hit counter” problem has recently been asked by many companies including Dropbox and the question is harder than it seems to be. This week, we’ll uncover all the mysteries of the problem. A couple of topics are discussed including basic data structures design, various optimization, concurrency and distributed counter.

What’s special about this problem?

I always like to tell our readers why we select this question to analyze so that you’ll know exactly whether it’s worth your time to read. As an interviewer, I have a strong preference for questions that are not hard to solve in the simplest case but the discussion can go deeper and deeper by removing/adding specific conditions. And this question is exactly the case.

Also, the question doesn’t come from nowhere, but has real use cases. For many systems today, we need a system to track not only users numbers, but different types of request numbers in real time.

If you haven’t thought about this problem, spend some time working on it before reading following sections.

Simple case

Forget about all the hard problems like concurrency and scalability issue, let’s say we only have a single machine with no concurrent requests, how would you get the number of visitors for the past 1 minute?

Apparently, the simplest solution is to store all the visitors with the timestamps in the database. When someone asks for visitor number of the past minute, we just go over the database and do the filtering and counting. A little bit optimization is to order users by timestamp so that we won’t scan the whole table.

The solution is not efficient as the time complexity is O(N) where N is the number of visitors. If the website has a large volume, the function won’t be able to return the number immediately.

Let’s optimize

A couple of ways to think about this problem. Since the above approach returns not only visitor numbers, but also visitors for the past minute, which is not needed in the question. And this is something we can optimize. From a different angle, we only need numbers for the past minute instead of any time range, which is another area that we can improve potentially. In a nutshell, by removing unnecessary features, we can optimize our solution.

A straightforward idea is to only keep users from the past minute and as time passes by, we keep updating the list and its length. This allows us to get the number instantly. In essence, we reduce the cost of fetching the numbers, but have to keep updating the list.

We can use a queue or linked list to store only users from the past minute. We keep all the element in order and when the last user (the earliest user) has the time more than a minute, just remove it from the list and update the length.

Space optimization

There’s little room to improve the speed as we can return the visitor number in O(1) time. However, storing all the users from the past minute can be costly in terms of space. A simple optimization is to only keep the user timestamp in the list rather than the user object, which can save a lot of space especially when the user object is large.

If we want to further reduce the space usage, what approach would you take?

A good way to think about this is that to improve space complexity, what should we sacrifice? Since we still want to keep the time complexity O(1), one thing we can compromise is accuracy. If we can’t guarantee to return the most accurate number, can we use less space?

Instead of tracking users from the past minute, we can only track users from the past second. By doing this, we know exactly how many visitors are from the last second. To get visitor numbers for the past minute, we keep a queue/linked list of 60 spots representing the past 60 seconds. Each spot stores the visitor number of that second. So every second, we remove the last (the earliest) spot from the list and add a new one with the visitor number of past second. Visitor number of the past minute is the sum of the 60 spots.

The minute count can be off by the request of the past second. And you can control the trade-off between accuracy and space by adjusting the unit, e.g. you can store users from past 2 seconds and have 30 spots in the list.

How about concurrent requests?

In production systems, concurrency is the most common problems people face. If there can be multiple users visiting the site simultaneously, does the previous approach still work?

Part of. Apparently, the basic idea still holds. However, when two requests update the list simultaneously, there can be race conditions. It’s possible that the request that updated the list first may not be included eventually.

The most common solution is to use a lock to protect the list. Whenever someone wants to update the list (by either adding new elements or removing the tail), a lock will be placed on the container. After the operation finishes, the list will be unlocked.

This works pretty well when you don’t have a large volume of requests or performance is not a concern. Placing a lock can be costly at some times and when there are too many concurrent requests, the lock may potentially block the system and becomes the performance bottleneck.

Distribute the counter

When a single machine gets too many traffic and performance becomes an issue, it’s the perfect time to think of distributed solution. Distributed system significantly reduces the burden of a single machine by scaling the system to multiple nodes, but at the same time adding complexity.

Let’s say we distribute visit requests to multiple machines equally. I’d like to emphasize the importance of equal distribution first. If particular machines get much more traffic than the rest machines, the system doesn’t get to its full usage and it’s very important to take this into consideration when designing the system. In our case, we can get a hash of users email and distribute by the hash (it’s not a good idea to use email directly as some letter may appear much more frequent than the others).

To count the number, each machine works independently to count its own users from the past minute. When we request the global number, we just need to add all counters together.

Summary

One of the reasons I like this question is that the simplest solution can be a coding question and to solve concurrency and scalability issue, it becomes a system design question. Also, the question itself has a wide usage in production systems.

Again, the solution itself is not the most important thing in the post. What we’re focused on is to illustrate how to analyze the problem. for instance, trade-off is a great concept to be familiar with and when we try to optimize one area, think about what else should be sacrificed. By thinking like this, it opens up a lot of doors for you.

By the way, if you want to have more guidance from experienced interviewers, you can check Gainlo that allows you to have mock interview with engineers from Google, Facebook etc..

The post is written by Gainlo - a platform that allows you to have mock interviews with employees from Google, Amazon etc..

转载于:https://www.cnblogs.com/vicky-project/p/9177014.html

Dropbox Interview – Design Hit Counter相关推荐

  1. 362. Design Hit Counter

    这个傻逼题..我没弄明白 you may assume that calls are being made to the system in chronological order (ie, the ...

  2. 继续过中等难度.0309

      .   8  String to Integer (atoi)    13.9% Medium   . 151 Reverse Words in a String      15.7% Mediu ...

  3. 边工作边刷题:70天一遍leetcode: day 97-2

    Design Hit Counter 要点:因为是second granularity,所以可以用以秒为单位的circular buffer方法.这题简单在只需要count过去300秒的,增加难度可以 ...

  4. Leetcode重点250题

    LeetCode重点250题 这个重点题目是把LeetCode前400题进行精简.精简方法如下: 删除不常考,面试低频出现题目 删除重复代码题目(例:链表反转206题,代码在234题出现过) 删除过于 ...

  5. LeetCode All in One 题目讲解汇总(持续更新中...)

    原文地址:https://www.cnblogs.com/grandyang/p/4606334.html 终于将LeetCode的大部分题刷完了,真是漫长的第一遍啊,估计很多题都忘的差不多了,这次开 ...

  6. LEETCODE-刷题个人笔记 Python(1-400)-TAG标签版本(二)

    前面一篇由于文字太多,不给编辑,遂此篇出炉 LEETCODE-刷题个人笔记 Python(1-400)-TAG标签版本(一) DFS&BFS (262)200. Number of Islan ...

  7. taoqick 搜索自己CSDN博客

    L1 L2正则化和优化器的weight_decay参数 kaiming初始化的推导 Pytorch动态计算图 Pytorch自动微分机制 PyTorch中在反向传播前为什么要手动将梯度清零? 通俗讲解 ...

  8. dropbox_询问操作方法:加快“开始”菜单搜索,停止自动旋转Android屏幕以及由Dropbox驱动的Torrenting...

    dropbox This week we take a look at tweaking the Window's start menu search for fast and focused sea ...

  9. vvvvvvvvvvvvvvvvvvvvvvvvv

    Java Concurrency In Practice Brian Göetz Tim Peierls Joshua Bloch Joseph Bowbeer David Holmes Doug L ...

最新文章

  1. gRPC简介及简单使用(C++)
  2. oracle分区和锁的难,oracle使用三(锁和表分区)
  3. python学习(八)定制类和枚举
  4. (四)孪生神经网络介绍及pytorch实现
  5. 如何在Python中实现尾递归优化
  6. Activity传递数据
  7. 轻松转换矢量图的小工具Vector Magic
  8. matlab神经网络训练图解释,matlab实现神经网络算法
  9. 浅析大数据与人工智能
  10. 三峡学院计算机调剂,重庆三峡学院2019考研预调剂公告
  11. Git初始化项目设置向导(CSDN)
  12. 小龙虾,这个“三流入侵者”竟成“钻石网红”?
  13. [散分] 眼见为实?_眼见为实
  14. [1.2.0新功能系列:二] Apache Doris 1.2.0 JDBC外表 及 Mutil Catalog
  15. 2021年企业十大科技趋势预测
  16. python图片合成视频
  17. shouldoverrideurlloading为什么有时候不走_心理学:为什么很多看似不般配的人,往往都能走到最后?...
  18. 关于 LambdaMART 的六个疑惑
  19. 微信小程序测试过程中的各个要点(干货)
  20. “低头走路”与“抬头看天”

热门文章

  1. linux 修改ramdisk内容,在Linux下使用RamDisk
  2. un3.0服务器文档,unturned3.0服务器指令是什么?
  3. Spring Cloud Alibaba基础教程:Nacos的数据持久化 1
  4. 解决Visual Studio 2008 下,打开.dbml(LINQ) 文件时,提示The operation could not be completed. 的问题。...
  5. 从文件系统迁移到ASM上
  6. ubuntu16.04 安装apache2报错 解决方案
  7. 答题获得思科T-shirt
  8. oracle中to_date函数详解
  9. 用python批量下载modis数据的速度怎么样_MODIS数据的简介和下载(五)——应用密钥的Python脚本下载...
  10. bootstrap mysql分页_bootstrap实现分页