About the author:

Jay Kreps is a former senior engineer at LinkedIn and the current CEO of Confluent. He introduces himself as follows:

I am the co-founder of Confluent and also the co-creator of Apache Kafka as well as various other open source projects.

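About this article:

In this article, Jay Kreps gives a simple, direct analysis of why write calls sometimes take a long time, using the ext4 file system. The whole piece flows smoothly, without a wasted sentence.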

Why do write calls sometimes block for a long time in Linux?

Update: Explanation at the bottom.

Anyone know why fwrite() calls sometimes block?

Here is a test I did. I sequentially append 4096-byte chunks to 10 files in a round-robin fashion. Throughput is artificially limited to a fixed rate set at about ¼ the maximum throughput of the drives, which I accomplish by sleeping at subsecond intervals in between my writes. I time each call to write. My expectation is that writes go immediately to pagecache and are asynchronously written out to disk (unless I were to call fsync, which I don’t).
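A minimal sketch of a test along these lines, in Python. This is not the author's original program: the function name `run_append_test`, its parameters, the file names, and the 1 ms sleep used for throttling are all assumptions made for illustration.

```python
import os
import time

CHUNK = b"x" * 4096  # 4096-byte payload, as in the test described above


def run_append_test(directory, num_files=10,
                    target_bytes_per_sec=100 * 1024 * 1024,
                    duration_sec=10.0):
    """Append CHUNK to num_files files round robin, throttled, timing each write.

    Returns a list of (seconds_since_start, latency_seconds), one per write.
    All names and defaults here are hypothetical.
    """
    fds = [
        os.open(os.path.join(directory, f"append-{i}.dat"),
                os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        for i in range(num_files)
    ]
    samples = []
    written = 0
    count = 0
    start = time.perf_counter()
    try:
        while True:
            elapsed = time.perf_counter() - start
            if elapsed >= duration_sec:
                break
            # Throttle: sleep briefly whenever we are ahead of the target rate.
            if written > target_bytes_per_sec * elapsed:
                time.sleep(0.001)
                continue
            fd = fds[count % num_files]  # round robin over the files
            t0 = time.perf_counter()
            written += os.write(fd, CHUNK)  # no fsync is ever issued
            t1 = time.perf_counter()
            samples.append((t0 - start, t1 - t0))
            count += 1
    finally:
        for fd in fds:
            os.close(fd)
    return samples
```

Calling `run_append_test` on a directory that lives on the filesystem under test returns one latency sample per write call, which can then be inspected for outliers.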

In fact this is usually what happens: the average time is just a few microseconds, but sometimes the write calls block for somewhere between 400 ms and a few seconds. I am using Linux 2.6.32 (RHEL 6) with ext4. I am otherwise using the default configuration (no changes to any of the /proc/sys/vm settings, and fiddling with those parameters doesn’t seem to help).
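For readers who want to record which writeback settings were in effect during such a test, here is a small sketch that reads a few of them. The article does not say which /proc/sys/vm parameters were tried; the four names below are standard Linux writeback tunables chosen as an assumption, and `read_vm_settings` is a hypothetical helper.

```python
import os

# Assumed set of knobs; the article only refers to "the /proc/sys/vm stuff".
WRITEBACK_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_settings(base="/proc/sys/vm", names=WRITEBACK_KNOBS):
    """Return {name: int value, or None if the file is missing or unreadable}."""
    settings = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings
```

Printing `read_vm_settings()` before each run makes it easier to tell apart results taken with default and non-default settings.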

Here is a trace of average and max latencies taken over 1 second intervals. Note the regular spikes. What is the cause of this? Locking between the flush threads and the write thread? Is there anything you can do to mitigate it? This affects anything that does logging, i.e. almost everything.

I understand why this would happen if I exceeded the throughput that Linux’s background flush threads can handle, but that is clearly not the case here as the drive can sustain 400MB/sec writes over a long period.

I tried this on a few different machines, some with RAID, some with a single disk and all showed the same behavior to varying degrees.

Throughput (MB/sec)  Avg. Latency (ms)  Max Latency (ms)
99.915               0.000              0.054
99.973               0.000              0.087
100.005              0.000              0.057
100.089              0.000              0.041
99.965               0.000              0.071
99.977               0.000              0.098
99.999               0.000              0.076
99.995               0.000              0.104
99.961               0.000              0.057
100.016              0.000              0.226
56.977               0.000              756.174
99.966               0.000              0.100
99.925               0.000              0.093
100.001              0.000              3.070
100.074              0.000              0.084
100.193              0.000              0.054
100.207              0.000              0.053
99.998               0.000              0.055
99.908               0.000              107.069
99.980               0.000              117.124
99.985               0.000              0.054
99.948               0.000              0.061
99.991               0.000              0.090
99.973               0.000              0.046
99.989               0.000              11.923
100.035              0.000              0.041
100.355              0.000              2.698
99.999               0.000              0.052
100.000              0.000              0.055
99.963               0.000              12.534
99.975               0.000              0.058
100.351              0.000              0.044
99.990               0.000              2.889
100.284              0.000              0.042
99.931               0.000              0.042
100.218              0.000              0.056
99.992               0.000              0.065
100.191              0.000              0.057
100.023              0.000              0.401
99.918               0.000              1.957
100.004              0.000              61.265
99.938               0.000              0.092
100.179              0.000              0.057
99.996               0.000              0.062
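Rows of this shape can be derived from raw per-write timings by bucketing them into fixed intervals. The sketch below is one way to do that, not the author's code: `summarize` and its parameters are hypothetical, and it assumes that "MB" means 1024 * 1024 bytes and that each sample is a `(seconds_since_start, latency_seconds)` pair for one fixed-size write.

```python
from collections import defaultdict


def summarize(samples, chunk_bytes=4096, interval_sec=1.0):
    """Bucket (timestamp_sec, latency_sec) samples into fixed intervals.

    Returns one (throughput_mb_per_sec, avg_latency_ms, max_latency_ms)
    tuple per non-empty interval, in time order.
    """
    buckets = defaultdict(list)
    for timestamp, latency in samples:
        buckets[int(timestamp // interval_sec)].append(latency)

    rows = []
    for key in sorted(buckets):
        latencies = buckets[key]
        mb_per_sec = len(latencies) * chunk_bytes / (1024 * 1024) / interval_sec
        avg_ms = 1000.0 * sum(latencies) / len(latencies)
        max_ms = 1000.0 * max(latencies)
        rows.append((mb_per_sec, avg_ms, max_ms))
    return rows
```

With this kind of bucketing, a single write that stalls for most of an interval shows up twice: as a large maximum latency and as a drop in that interval's throughput, because fewer writes complete while the call is blocked.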

Update

A number of people gave interesting and helpful suggestions, such as “this is your punishment for not using Solaris.” The best suggestion, from Mark Wolfe, was to install latencytop and measure it. To do this on Red Hat you need to install their debug kernel and reboot into it; latencytop will then capture the highest latency kernel operations for each process. This gives a great deal of insight into what is going on.

For those who are curious here are a few of the traces that pop up as causing hundreds of ms of latency:

vfs_write()
do_sync_write()
ext4_file_write()
generic_file_aio_write()
ext4_da_write_begin() [in this case da means delayed allocation]
block_write_begin()
__block_prepare_write()
ext4_da_get_block_prep()
ext4_get_block_prep()
ext4_get_blocks()
call_rw_sem_down_read_failed()

This trace seems to be due to delayed allocation. Turning off delayed allocation makes it go away, though probably at the cost of some throughput.
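The article does not say how delayed allocation was turned off. ext4 documents a `nodelalloc` mount option for this, so one way to confirm the setting on a given mount is to look for that option in the contents of /proc/mounts. The sketch below assumes that approach; `delalloc_disabled` and the `/data` mount point in the usage note are hypothetical.

```python
def delalloc_disabled(mount_point, mounts_text):
    """Check the options listed for mount_point in /proc/mounts-style text.

    Returns True if 'nodelalloc' is among the options, False if the mount
    is listed without it, and None if the mount point is not listed.
    When a mount point appears more than once, the last entry wins.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            result = "nodelalloc" in fields[3].split(",")
    return result
```

For example, reading /proc/mounts into a string and calling `delalloc_disabled("/data", text)` reports whether the filesystem mounted at `/data` is running without delayed allocation.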

Here is another one; this one seems to be related to journalling.

sys_write()
vfs_write()
do_sync_write()
ext4_file_write()
generic_file_aio_write()
__generic_file_aio_write()
file_update_time()
__mark_inode_dirty()
ext4_dirty_inode()
ext4_journal_start_sb()
jbd2_journal_start()
start_this_handle()

You can check out the details of the code in one of these nice Linux kernel code browsers. My takeaway from all this was that maybe it is time to look at XFS, since that allegedly also has better file locking for fsync, which is my other problem.
