Uninterruptible Sleep

Nov 16, 2015

One of the curious features of Unix systems (including Linux) is the “uninterruptible sleep” state. This is a state that a process can enter when making certain system calls. In this state, the process is blocked performing a system call, and it cannot be interrupted (or killed) until the system call completes. Most of these uninterruptible system calls are effectively instantaneous, meaning that you never observe their uninterruptible nature. In rare cases (often because of buggy kernel drivers, but possibly for other reasons) the process can get stuck in this uninterruptible state. This is very similar to the zombie process state in the sense that you cannot kill a process in this state, although it’s worth noting that the two cases happen for different reasons. Typically when a process is wedged in the uninterruptible sleep state your only recourse is to reboot the system, because there is literally no way to kill the process.
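On Linux, a process in this state is reported with the state code `D` by `ps` and in the state field of `/proc/<pid>/stat`. The following is a minimal sketch for spotting such processes, assuming a Linux-style `/proc` layout; the function names `parse_state` and `find_uninterruptible` are made up for this illustration.

```python
import os
from typing import List


def parse_state(stat_line: str) -> str:
    """Return the one-letter state code from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the LAST ')' in the line.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def find_uninterruptible(proc_root: str = "/proc") -> List[int]:
    """List PIDs whose state is 'D' (uninterruptible sleep) right now."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, encoding="utf-8", errors="replace") as f:
                if parse_state(f.read()) == "D":
                    pids.append(int(entry))
        except (OSError, IndexError):
            # The process exited, is unreadable, or the line was malformed.
            continue
    return sorted(pids)
```

Calling `find_uninterruptible()` returns a snapshot only. Seeing a PID in the list once is normal, since most uninterruptible sleeps are very short; a PID that stays in the list across repeated checks is the wedged case described above.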

One infamous example of this has been Linux with NFS. For historical reasons certain local I/O operations are not interruptible. For instance, the mkdir(2) system call is not interruptible, which you can verify from its man page by observing that this system call cannot return EINTR. On a normal system the worst case for mkdir would be a few disk seeks, which isn’t exactly fast but isn’t the end of the world either. On a networked filesystem like NFS this operation can involve network RPCs that can block, potentially forever. This means that if you get the right kind of horkage under NFS, a program that calls mkdir(2) can get stuck in the dreaded uninterruptible sleep state forever. When this happens there’s no way to kill the process, and the operator has to either live with this zombie-like process or reboot the system. The Linux kernel programmers could “fix” this by making the mkdir(2) system call interruptible so that mkdir(2) could return EINTR. However, Unix systems since the dawn of time haven’t returned EINTR for this system call, so Linux adopts the same convention.

This was actually a big problem for us at Yelp, my first job out of college. At the time we had just taken the radical step of moving images out of MySQL tables, where the raw image data was stored in a BLOB column, and into NFS served from cheap, unreliable NFS appliances. Under certain situations the NFS servers would lock up, and processes accessing NFS would start entering uninterruptible sleep as they did various I/O operations. When this happened, very quickly (e.g. in a minute or two) every single Apache worker would service a request handler doing one of these I/O operations, and thus 100% of the Apache workers would become stuck in the uninterruptible sleep state. This would quite literally bring down the entire site until we rebooted everything. We eventually “solved” this problem by dropping the NFS dependency and moving things to S3.

Another fun fact about the uninterruptible sleep state is that occasionally it may not be possible to strace a process in this state. The man page for the ptrace system call notes that under rare circumstances attaching to a process using the ptrace system call can cause the traced process to be interrupted. If the process is in uninterruptible sleep then it can’t be interrupted, which will cause the strace process itself to hang forever. Remarkably, it appears that the ptrace(2) system call is itself uninterruptible, which means that if this happens you may not be able to kill the strace process!

Tonight I learned about a “new” feature in Linux: the TASK_KILLABLE state. This is sort of a compromise between processes in interruptible sleep and processes in uninterruptible sleep. A process in the TASK_KILLABLE state still cannot be interrupted in the usual sense (i.e. you can’t force the system call to return EINTR); however, processes in this state can be killed. This means that, for instance, processes doing I/O over NFS can be killed if they get into a wedged state. Not all system calls implement this state, so it’s still possible to get stuck, unkillable processes for some system calls, but it’s certainly an improvement over the previous situation. As usual, LWN has a great article on the subject, including information about the historical semantics of uninterruptible sleep on Linux.
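The difference between the three sleep states can be summarized as a small decision table. The sketch below is a toy model of the semantics described above, not kernel code: the names `Sleep`, `Outcome`, and `deliver_signal` are invented for illustration, and “fatal” is simplified to a boolean meaning “this signal would kill the process” (SIGKILL being the obvious case).

```python
from enum import Enum, auto


class Sleep(Enum):
    INTERRUPTIBLE = auto()
    KILLABLE = auto()         # models TASK_KILLABLE
    UNINTERRUPTIBLE = auto()


class Outcome(Enum):
    SYSCALL_INTERRUPTED = auto()  # the system call is interrupted (EINTR)
    PROCESS_KILLED = auto()
    STAYS_ASLEEP = auto()         # signal stays pending until the call completes


def deliver_signal(state: Sleep, fatal: bool) -> Outcome:
    """Toy model: what happens when a signal arrives during a blocked syscall."""
    if state is Sleep.UNINTERRUPTIBLE:
        # Nothing wakes the process, not even a fatal signal.
        return Outcome.STAYS_ASLEEP
    if fatal:
        # Both interruptible and killable sleeps respond to fatal signals.
        return Outcome.PROCESS_KILLED
    if state is Sleep.INTERRUPTIBLE:
        return Outcome.SYSCALL_INTERRUPTED
    # Killable sleep ignores non-fatal signals, just like uninterruptible sleep.
    return Outcome.STAYS_ASLEEP


# Print the whole decision table.
for state in Sleep:
    for fatal in (False, True):
        print(state.name, "fatal" if fatal else "non-fatal",
              "->", deliver_signal(state, fatal).name)
```

The only row where the killable state differs from plain uninterruptible sleep is the fatal-signal one, which is exactly the compromise: the system call never has to return EINTR to userspace, because the process that would have seen the error no longer exists.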

https://eklitzke.org/uninterruptible-sleep
