Kernel space lock contention配置及其使用
概述
本文涉及到的内容如下:
- kernel lock相关debug方式,比如lock耗时,拿不到lock,lock依赖等等
- trace的使用
1.开启lock contenttion涉及到的config配置
config LOCKDEP bool depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT select STACKTRACE select FRAME_POINTER if !MIPS && !PPC && !ARM_UNWIND && !S390 && !MICROBLAZE && !ARC && !SCORE && !X86 select KALLSYMS select KALLSYMS_ALL config LOCKDEP_SMALL bool config LOCK_STAT bool "Lock usage statistics" depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT select LOCKDEP select DEBUG_SPINLOCK select DEBUG_MUTEXES select DEBUG_RT_MUTEXES if RT_MUTEXES select DEBUG_LOCK_ALLOC default n help This feature enables tracking lock contention points For more details, see Documentation/locking/lockstat.txt This also enables lock events required by "perf lock", subcommand of perf. If you want to use "perf lock", you also need to turn on CONFIG_EVENT_TRACING. CONFIG_LOCK_STAT defines "contended" and "acquired" lock events. (CONFIG_LOCKDEP defines "acquire" and "release" events.)
上面的config默认是关闭状态。
2 menuconfig开启config配置
kernel hacking —> Lock Debugging (spinlock, mutexs, etc…) —>进入之后勾选
Lock Debugging: detect incorrect freeing of live locks 和Lcok usage statistics,就会把依赖项自动勾选.
保存退出之后,可以查看kernel下面新产生的config diff文件:
# Lock Debugging (spinlocks, mutexes, etc...) #
-# CONFIG_DEBUG_RT_MUTEXES is not set
-# CONFIG_DEBUG_SPINLOCK is not set
-# CONFIG_DEBUG_MUTEXES is not set
+CONFIG_DEBUG_RT_MUTEXES=y
+CONFIG_DEBUG_SPINLOCK=y
+CONFIG_DEBUG_MUTEXES=y # CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
-# CONFIG_DEBUG_LOCK_ALLOC is not set
+CONFIG_DEBUG_LOCK_ALLOC=y # CONFIG_PROVE_LOCKING is not set
-# CONFIG_LOCK_STAT is not set
+CONFIG_LOCKDEP=y
+CONFIG_LOCK_STAT=y
+# CONFIG_DEBUG_LOCKDEP is not set # CONFIG_DEBUG_ATOMIC_SLEEP is not set # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set # CONFIG_LOCK_TORTURE_TEST is not set
这样开启lock trace event了.。编译boot并刷机即可!
3 确定是否开启trace lock content feature
上面步骤二中开启之后,在adb shell里面多了如下几个接口:
- 增加了lock trace event, 有四个lock trace event,具体做什么使用,下面在详细讲解
- 增加了lock的统计信息, 节点信息实现源码kernel/locking/lockdep_proc.c文件中:
● lock_stat: 统计各种类型lock的时延
● lockdep: 表示锁的的深度,即一个lock里面调用了多少个lock,并将这些lock的调用全部显示出来
● lockdep_stats: 即锁深度里面包含了哪些lock以及频次
● locks:锁的状态,这个实现在fs/locks.c文件里面
4 如何使用lock trace event和proc node信息
4.1 lock trace event
从第三部分可以知道,lock trace event包含四个event,分别讲解如下:
四个event定义在include/trace/event/lock.h里面, 使用在kernel/locking/lockdep.c文件,
- lock_acquire: 获取lock,比如mutex lock ,rcu read lock 或者spinlock等等lock
- lock_acquired: 表示已经获取lock了
- lock_release: lock释放的event
- lock_contended: lock被谁hold了
那么如何抓取lock相关的trace event呢?
可以使用下面的脚本抓取(必须添加lock event ),即正常的抓取trace 命令:
● echo 40000 > buffer_size_kb && echo irq sched_switch sched_wakeup sched_waking cpu_frequency cpu_idle lock > set_event && cat set_event && echo > trace && echo 1 > tracing_on && sleep 10 && echo 0 > tracing_on && cat trace > /data/trace.txt
脚本执行完毕之后, pull出data目录下的trace.txt文件,vim打开即可看到下面类似的信息:
┊ ┊ sleep-2875 [004] d.s2 748.861607: sched_waking: comm=rcu_sched pid=9 prio=120 target_cpu=000
┊ ┊ sleep-2875 [004] d.s3 748.861608: lock_acquire: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861609: lock_contended: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861662: lock_acquired: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861663: lock_acquire: 000000000ff04d65 read tk_core.seq
┊ ┊ sleep-2875 [004] d.s3 748.861664: lock_release: 000000000ff04d65 tk_core.seq
┊ ┊ sleep-2875 [004] d.s3 748.861666: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s3 748.861667: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s3 748.861668: lock_release: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s2 748.861669: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861670: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861671: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861672: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861673: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861674: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861675: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861676: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861677: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861678: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861679: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s2 748.861683: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s3 748.861684: lock_acquire: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861685: lock_contended: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861719: lock_acquired: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s4 748.861720: lock_acquire: 00000000cc2f375c &rq->lock
┊ ┊ sleep-2875 [004] d.s4 748.861720: lock_acquired: 00000000cc2f375c &rq->lock
┊ ┊ sleep-2875 [004] d.s4 748.861721: lock_acquire: 000000000ff04d65 read tk_core.seq
┊ ┊ sleep-2875 [004] d.s4 748.861722: lock_release: 000000000ff04d65 tk_core.seq
┊ ┊ sleep-2875 [004] d.s4 748.861724: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s4 748.861725: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s4 748.861726: lock_acquire: 00000000cc726398 read rcu_read_lock
┊ ┊ sleep-2875 [004] d.s4 748.861727: lock_release: 00000000cc726398 rcu_read_lock
┊ ┊ sleep-2875 [004] d.s4 748.861728: lock_release: 00000000d4eff177 &rq->lock
┊ ┊ sleep-2875 [004] d.s3 748.861730: lock_release: 00000000cc2f375c &rq->lock
我们可以看到如下的信息:
- rcu_read_lock 是lock_acquire 之后直接lock_release. 持锁时间非常的短
- &rq->lock, 是一个spinlock类型. 通过lock_acquire→ lock_acquired→ lock_release的过程. 在lock_acquire过程中出现了lock_contended trace event, 表示此时的&rq->lock spinlock这个lock存在contention或者contended,表示此时有lock的竞争.
- 如果出现锁竞争,就会统计当前进程获取lock的等待时间等等统计信息,详细在4.2节讲解.
- 可以明显的看到持锁时间, 或者获取锁的时间.
4.2 lock proc node信息
有三个主要的节点信息,都在shell proc目录下:
cat /proc/lock_stat:
/proc # cat lock_stat | head -n 40
lock_stat version 0.4
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- class name con-bounces contentions waittime-min waittime-max waittime-total waittime-avg acq-bounces acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- &(&n->list_lock)->rlock: 441821 441946 1.81 80.23 1891806.60 4.28 2488861 10314115 0.00 137.08 64789787.05 6.28 ----------------------- &(&n->list_lock)->rlock 114751 [<0000000051f28fcf>] ___slab_alloc+0x194/0x5ec &(&n->list_lock)->rlock 105155 [<000000000913f330>] deactivate_slab+0x2ec/0x540 &(&n->list_lock)->rlock 169288 [<00000000fb58074e>] free_debug_processing+0x34/0x26c &(&n->list_lock)->rlock 12316 [<00000000f00be1d8>] kfree+0x30c/0x55c ----------------------- &(&n->list_lock)->rlock 111558 [<0000000051f28fcf>] ___slab_alloc+0x194/0x5ec &(&n->list_lock)->rlock 89847 [<000000000913f330>] deactivate_slab+0x2ec/0x540 &(&n->list_lock)->rlock 192366 [<00000000fb58074e>] free_debug_processing+0x34/0x26c &(&n->list_lock)->rlock 12305 [<00000000f00be1d8>] kfree+0x30c/0x55c ............................................................................................................................................................................................................................. &rq->lock: 351895 351959 1.88 564.00 9605797.10 27.29 1569868 5892437 0.00 1061.00 172708414.01 29.31 --------- &rq->lock 5739 [<00000000807dcaa6>] task_rq_lock+0x74/0xc4 &rq->lock 7268 [<00000000f033900c>] pick_next_task_fair+0x378/0x7e4 &rq->lock 85000 [<0000000043e94103>] try_to_wake_up+0x1bc/0x554 &rq->lock 46881 [<0000000078183a31>] try_to_wake_up+0x3b8/0x554 --------- &rq->lock 24883 [<00000000f57857d0>] update_blocked_averages+0x50/0xc44 &rq->lock 3577 [<00000000807dcaa6>] task_rq_lock+0x74/0xc4 &rq->lock 14525 [<00000000f033900c>] pick_next_task_fair+0x378/0x7e4 &rq->lock 5 [<0000000007411a3e>] walt_set_window_start+0xd4/0x134 ............................................................................................................................................................................................................................. &sg_policy->update_lock: 259980 259986 1.88 212.12 1681170.67 6.47 1729740 3292457 1.88 270.81 28368055.35 8.62 ----------------------- &sg_policy->update_lock 259986 [<00000000077def88>] sugov_update_shared+0x5c/0x1b0 ----------------------- &sg_policy->update_lock 259986 [<00000000077def88>] sugov_update_shared+0x5c/0x1b0 .............................................................................................................................................................................................................................
上面表示从开机到现在cat这个节点的时间内,lock的所有状态信息,包括
- lock的名字
- contention次数
- lock等待的时间,最大时间,最小时间,平均时间以及等待的总时间
- lock持有时间,最大,最小,平均以及持有的总时间
- 获取这个lock的次数
等等…
cat /proc/lockdep:
目的是将class_lock上面所有相关联的lock stack全部输出
可能的样式如下:
/proc # cat lockdep | head -n 100
all lock classes:
0000000044beef8b ....: logbuf_lock 00000000c80448bf ....: (console_sem).lock 00000000700ad619 ....: console_lock 00000000ef505732 ....: cgroup_mutex 0000000042291e92 ....: console_owner_lock 000000002e29cf8c ....: console_owner 000000004e6f50fd ....: devtree_lock 000000003c9defa6 ....: resource_lock 00000000947b85f2 ....: pm_mutex 0000000010004418 ....: primary_crng.lock 0000000020d31ff4 ....: input_pool.lock 00000000588a9d5b ....: "warn_unseeded_randomness".lock 00000000e13be140 ....: lock 0000000005822fb5 ....: cpu_hotplug_lock.rw_sem 00000000b0cf55a3 ....: cpuhp_state_mutex
cat /proc/lockdep_stats:
信息如下,主要是一些上下文里面的信息:
/proc # cat lockdep_stats lock-classes: 1851 [max: 8191] direct dependencies: 0 [max: 32768] indirect dependencies: 0 all direct dependencies: 0 in-hardirq chains: 0 in-softirq chains: 0 in-process chains: 0 stack-trace entries: 20581 [max: 524288] combined max dependencies: 1 hardirq-safe locks: 0 hardirq-unsafe locks: 0 softirq-safe locks: 0 softirq-unsafe locks: 0 irq-safe locks: 0 irq-unsafe locks: 0 hardirq-read-safe locks: 0 hardirq-read-unsafe locks: 0 softirq-read-safe locks: 0 softirq-read-unsafe locks: 0 irq-read-safe locks: 0 irq-read-unsafe locks: 0 uncategorized locks: 1851 unused locks: 0 max locking depth: 19 debug_locks: 1
lock event信息还是对于lock debug还是非常有帮助的。
实战在另一篇博文:pr_emerg耗时,影响性能原理排查
Kernel space lock contention配置及其使用相关推荐
- Oracle死锁问题: enq: TX - row lock contention
前言 这篇文章也是记录近期遇到的问题以及从中学到的知识 ,近期一直在救火,有些问题自认为还是挺有代表性的,有兴趣的话再继续向下看 问题现象 线上反馈,执行批量处理EXCEL数据时,系统一直卡在进度滚动 ...
- 【故障处理】队列等待之enq: TX - row lock contention
[故障处理]队列等待之enq: TX - row lock contention 1 BLOG文档结构图 2 前言部分 2.1 导读和注意事项 各位技术爱好者,看完本文后,你可以掌握如下的技能, ...
- 等待事件 enq:TX - row lock contention分析与解决
6月30日,数据库发生了大量锁表.大概持续1小时,并且越锁越多.后来通过业务人员停掉程序,并kill掉会话后解决. 几天后再EM上查看CPU占用: CPU发生了明显等待. 主要是由于enq:TX - ...
- optee:kernel space调用user space进程时候的硬件行为
文章目录 1.先看kernel调用userspace所需的的参数 2.先看kernel mode切换user mode的寄存器行为 ★★★ 链接 : 个人博客导读首页-点击此处 ★★★ 我们以opte ...
- 一种User Mode下访问物理内存及Kernel Space的简单实现
一种User Mode下访问物理内存及Kernel Space的实现 一.背景 WinCE发展到6.0之后,内存结构和管理方法进行了完善.对应用程序影响比较大的有Virtual Memory Layo ...
- 详述由ENQ:TX – ROW LOCK CONTENTION引起的全局死锁处理过程
墨墨导读:ENQ: TX - row lock contention等待事件,是Oracle数据库里最常见的等待事件之一,一般是由于不同的session更新或删除同一条记录.唯一索引有重复索引.位图索 ...
- 11 User Space, Kernel Space, and the System Call API(用户空间,核空间,系统api)
1 The Operating System as a Resource 操作系统提供了两项基础功能 抽象:提供了统一接口,而不必关系底层的硬件,提供操作系统API 隔离:不同的程序相互隔离,互不影响 ...
- 十六.linux开发之Kernel移植——内核的配置和编译原理
有道云笔记地址: 详情看这里链接,记录太多,就不一一排版了. http://note.youdao.com/noteshare?id=d25dbce79566963e3699574a74048154& ...
- 队列等待之enq: TX - row lock contention
[性能优化]队列等待之enq: TX - row lock contention 问题背景: 客户反映某条sql DELETE SHAREINNERDOC WHERE SOURCEID=:B1< ...
最新文章
- 内核中的内存申请:kmalloc、vmalloc、kzalloc、kcalloc、get_free_pages【转】
- 动态获得资源的路径大小写问题
- python vs java的rsa加密
- SQL Server根据访问历史日志分析提供优化
- 【mysql】SCOPE_IDENTITY 和 @@IDENTITY的区别
- 泛型--定制泛型接口、泛型类--介绍篇
- 逆向 EasyBase64
- CATia对计算机配置要求,【2人回答】求CATIA对电脑的详细配置要求-3D溜溜网
- css 向左白色箭头,带CSS的工具提示左侧的箭头
- net自带二进制序列化,XML序列化和ProtoBuf序列化的压缩对比
- JimuReport积木报表 — API数据源报表带参制作
- K8s 集群搭建过程中遇到的问题的解决方法
- 随想录(软件逆向与OllyDbg)
- Java时间处理方法
- 程序员如何应对双十一购物的大流量冲击?
- [译]介绍Spark2.4的用于处理复杂数据类型的新内置函数与高阶函数
- Tensorflow(r1.4)API--tf.nn.dropout
- linux下mariadb的服务启动异常,centos7 – CentOS 7 MariaDB错误“无法启动mariadb.service:找不到单位.”...
- itools 安装 ipa
- WCF基础教程(三)——WCF通信过程及配置文件解析
热门文章
- 机器学习的三大主要任务
- (转)Linux——pv、vg和lv的概念
- linux 无法定位程序,三步搞定无法定位程序输入点 于动态链接库上
- Spring Security | 轻松搞定认证授权~
- JavaScript: JSON基本概念带题解
- 嵌入式Linux MIPI接口LCD调试-关于DRM显示与应用调试的干货浓缩
- SpringBoot2基础篇
- python excel 饼图 简书_Python实现绘画多个饼图
- linux 查看内存 udimm rdimm,服务器UDIMM、LRDIMM、RDIMM三种内存如何区别
- 关于内存类型 UDIMM、RDIMM、LRDIMM 的学习结论(转)