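Common SystemTap errors fall roughly into two categories.
I. Errors produced during the parse and semantic-analysis stages
These errors occur while SystemTap parses the stp script and translates it into C code.
Examples
1. Parse error. It looks like this: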

parse error: expected foo, saw bar

For example, a probe with no handler block causes a parse error.

[root@db-172-16-3-150 share]# stap -e 'probe vfs.read
probe vfs.write'
parse error: expected one of '. , ( ? ! { = +='
saw: keyword at <input>:2:1
source: probe vfs.write
^
parse error: expected one of '. , ( ? ! { = +='
saw: <input> EOF
2 parse errors.
Pass 1: parse failed.  [man error::pass1]

Adding the handlers fixes the error:

[root@db-172-16-3-150 share]# stap -e 'probe vfs.read {}
probe vfs.write {}'

2. Privilege error

parse error: embedded code in unprivileged script

For example, using embedded C code (%{ ... %}) in a script without passing the -g option to stap triggers this error.

[root@db-172-16-3-150 share]# stap -e '
function square:long (i:long) %{
STAP_RETVALUE = STAP_ARG_i * STAP_ARG_i;
%}
probe begin {
i=square(9)
println(i)
exit()
}'
parse error: embedded code in unprivileged script; need stap -g
saw: embedded-code at <input>:2:31
source: function square:long (i:long) %{
^
1 parse error.
Pass 1: parse failed.  [man error::pass1]

Running stap with the -g (guru mode) option fixes the error.

[root@db-172-16-3-150 share]# stap -g -e '
function square:long (i:long) %{
STAP_RETVALUE = STAP_ARG_i * STAP_ARG_i;
%}
probe begin {
i=square(9)
println(i)
exit()
}'
81

3. Type mismatch error

semantic error: type mismatch for identifier 'foo' ... string vs. long

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
a = 10
a = execname()
println("a is:", a)
exit()
}'
semantic error: type mismatch (long vs. string): identifier 'a' at <input>:3:3
source:   a = 10
^
semantic error: type was first inferred here (string): identifier 'a' at <input>:3:3
source: a = 10
^
Pass 2: analysis failed. [man error::pass2]

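a is first assigned 10, which makes it a long; it is then assigned execname(), which returns a string, so the types conflict.
Using one type consistently fixes the error.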

[root@db-172-16-3-150 share]# stap -e '
probe begin {
a = 10
a = pid()
println("a is:", a)
exit()
}'
a is:23014

4. This error is reported when the type of a variable cannot be inferred.

semantic error: unresolved type for identifier 'foo'

For example, println is called with a variable that was never assigned.

[root@db-172-16-3-150 share]# stap -e '
probe begin {
println("v is:", v)
exit()
}'
WARNING: never-assigned local variable 'v' : identifier 'v' at <input>:3:20
source:   println("v is:", v)
^
semantic error: unresolved type : identifier 'v' at <input>:3:20
source:   println("v is:", v)
^
semantic error: unresolved type : identifier 'println' at <input>:3:3
source: println("v is:", v)
^
Pass 2: analysis failed. [man error::pass2]

Assigning the variable a value first resolves it:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
v = 100
println("v is:", v)
exit()
}'
v is:100

5. The following error is reported when the target of an assignment is not a valid variable or array element.

semantic error: Expecting symbol or array index expression.

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
println("hello") = 1
exit()
}'
semantic error: Expecting symbol or array index expression: identifier 'println' at <input>:3:3
source:   println("hello") = 1
^
Pass 2: analysis failed. [man error::pass2]

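6. The number of arguments passed in a function call does not match the number of parameters the function declares,
or an array is used with an inconsistent number of indexes.

while searching for arity N function, semantic error: unresolved function call

For example:
Mismatched number of function arguments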

[root@db-172-16-3-150 share]# stap -e '
function add:long (a:long, b:long) {
return a+b
}
global arr
probe begin {
println("add(10): ", add(10))
exit()
}'
WARNING: mismatched arity-2 function found: identifier 'add' at <input>:2:10
source: function add:long (a:long, b:long) {
^
semantic error: unresolved arity-1 function: identifier 'add' at <input>:7:24
source:   println("add(10): ", add(10))
^
Pass 2: analysis failed. [man error::pass2]

Mismatched number of array indexes

[root@db-172-16-3-150 share]# stap -e '
global arr
probe begin {
arr[1,2,3]="hello"
println("arr: ", arr[1,2])
exit()
}'
semantic error: inconsistent arity (3 vs 2): identifier 'arr' at <input>:5:20
source:   println("arr: ", arr[1,2])
^
semantic error: arity 3 first inferred here: identifier 'arr' at <input>:4:3
source: arr[1,2,3]="hello"
^
Pass 2: analysis failed. [man error::pass2]

7. An array that has not been declared as a global variable causes this error:

semantic error: array locals not supported, missing global declaration?

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
arr[1,2]= "hello"
exit()
}'
semantic error: unresolved arity-2 global array arr, missing global declaration?: identifier 'arr' at <input>:3:3
source:   arr[1,2]= "hello"
^
Pass 2: analysis failed. [man error::pass2]

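8. An array must not be modified inside a foreach loop that iterates over it; doing so is an error. The restriction keeps each handler fast and avoids the performance cost of supporting changes during iteration.

semantic error: variable foo modified during foreach iteration

For example: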

[root@db-172-16-3-150 share]# stap -e '
global arr
probe begin {
arr[1]="a"
arr[2]="b"
foreach(idx in arr)
arr[idx]="new"
exit()
}'
semantic error: variable 'arr' modified during 'foreach' iteration: identifier 'arr' at <input>:7:5
source:     arr[idx]="new"
^
Pass 2: analysis failed. [man error::pass2]
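
One possible way around this restriction, sketched here on the assumption that the goal is simply to rewrite every element, is to make two passes: first record the indexes in a helper array, then loop over the helper array while updating the original. The helper array keys and the file name foreach_fix.stp are illustrative names and are not part of the original example.

```shell
# Write a two-pass variant of the failing script to a file.
cat > foreach_fix.stp <<'EOF'
global arr, keys
probe begin {
  arr[1] = "a"
  arr[2] = "b"
  # Pass 1: only read arr, remembering each index in the helper array.
  foreach (idx in arr)
    keys[idx] = 1
  # Pass 2: iterate over keys, so arr itself may be modified.
  foreach (idx in keys)
    arr[idx] = "new"
  # Print the result in ascending index order.
  foreach (idx+ in arr)
    printf("%d => %s\n", idx, arr[idx])
  exit()
}
EOF
# Run as root on a machine with SystemTap installed:
#   stap foreach_fix.stp
```

The second loop never iterates over arr, which is why the assignment to arr[idx] should pass the semantic check.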

9. The following error is reported when the event does not exist or cannot be found in the tapset library.

semantic error: probe point mismatch at position N, while resolving probe point foo

For example:

[root@db-172-16-3-150 share]# stap -e '
probe test {
}'
semantic error: while resolving probe point: identifier 'test' at <input>:2:7
source: probe test {
^
semantic error: probe point mismatch(alternatives: __nd_syscall __nfs __scheduler __signal __tcpmib __vm _linuxmib _nfs _signal _sunrpc _syscall _vfs begin begin(number) end end(number) error error(number) generic ioblock ioblock_trace ioscheduler ioscheduler_trace ipmib irq_handler java(number) java(string) kernel kprobe kprocess linuxmib module(string) nd_syscall netdev netfilter never nfs nfsd perf process process(number) process(string) procfs procfs(string) scheduler scsi signal socket softirq stap staprun sunrpc syscall tcp tcpmib timer tty udp vfs vm workqueue): identifier 'test' at <input>:2:7
source: probe test {
^
Pass 2: analysis failed. [man error::pass2]

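10. The following error is reported when the function named in the probe does not exist, for example kernel.function("test") when the kernel has no function called test.

semantic error: no match for probe point, while resolving probe point foo

For example: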

[root@db-172-16-3-150 share]# stap -e '
probe kernel.function("test") {
}'
semantic error: while resolving probe point: identifier 'kernel' at <input>:2:7
source: probe kernel.function("test") {
^
semantic error: no match (similar functions: bs, del, dget, dput, eat)
Pass 2: analysis failed. [man error::pass2]

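11. Reading the value of a context variable (target variable) at the probe point inside a handler can fail because the value is not accessible, the variable does not exist, and so on:

semantic error: unresolved target-symbol expression

For example, $$vars lists the target variables available at the probe point; here ret has no readable value and is shown as ?: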

[root@db-172-16-3-150 share]# stap -e '
probe vfs.read {
println($$vars)
exit()
}'
file=0xffff8818169bc140 buf=0x7fff453edb70 count=0x2004 pos=0xffff88141aa27f48 ret=?

Reading a target variable that does not exist is an error:

[root@db-172-16-3-150 share]# stap -e '
probe vfs.read {
println($abc)
exit()
}'
semantic error: unable to find local 'abc', [man error::dwarf] dieoffset 0x125bd59 in kernel, near pc 0xffffffff81181610 in vfs_read fs/read_write.c (alternatives: $file $buf $count $pos $ret): identifier '$abc' at <input>:3:11
source:   println($abc)
^
Pass 2: analysis failed. [man error::pass2]

The variable may also exist but have no value that can be read at the probe address.

[root@db-172-16-3-150 share]# stap -e '
probe vfs.read {
println($ret)
exit()
}'
semantic error: not accessible at this address [man error::dwarf] (0xffffffff81181610, dieoffset: 0x125bdbd): identifier '$ret' at <input>:3:11
source:   println($ret)
^
Pass 2: analysis failed. [man error::pass2]

This error may also be a result of compiler optimization of the generated code.
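
When a script has to run across kernels where a target variable may or may not be resolvable, SystemTap's @defined() test can guard the access. The following is a minimal sketch, assuming that @defined($ret) evaluates to 0 at this probe point on the system shown above; the file name target_guard.stp is illustrative.

```shell
# Write a script that reads $ret only when it can be resolved.
cat > target_guard.stp <<'EOF'
probe vfs.read {
  if (@defined($ret))
    printf("ret=%d\n", $ret)
  else
    printf("ret is not available at this probe point\n")
  exit()
}
EOF
# Run as root on a machine with SystemTap and matching debuginfo:
#   stap target_guard.stp
```

Because the unresolvable branch is dropped during translation, the script should get past Pass 2 instead of failing with the error above.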

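12. Errors like the following can occur when the installed kernel-debuginfo package does not match the running kernel version, or when a probe needs the debuginfo of some package and the installed debuginfo package is a different version.

semantic error: libdwfl failure

For example: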

[root@db-172-16-3-150 share]# uname -r
2.6.32-358.el6.x86_64

[root@db-172-16-3-150 share]# rpm -qa|grep kernel-debuginfo
kernel-debuginfo-2.6.32-358.23.2.el6.centos.plus.x86_64
kernel-debuginfo-common-x86_64-2.6.32-358.23.2.el6.centos.plus.x86_64

[root@db-172-16-3-150 share]# stap -e '
probe vfs.read {
println($$vars)
exit()
}'
semantic error: while resolving probe point: identifier 'kernel' at /opt/systemtap/share/systemtap/tapset/linux/vfs.stp:768:18
source: probe vfs.read = kernel.function("vfs_read")
^
semantic error: missing x86_64 kernel/module debuginfo [man warning::debuginfo] under '/lib/modules/2.6.32-358.el6.x86_64/build'
semantic error: while resolving probe point: identifier 'vfs' at <input>:2:7
source: probe vfs.read {
^
semantic error: no match
Pass 2: analysis failed. [man error::pass2]


Installing the kernel-debuginfo package that matches the running kernel fixes it.

[root@db-172-16-3-150 share]# yum install -y kernel-debuginfo-2.6.32-358.el6.x86_64

Likewise, the example in item 13 below reports a similar error if a debuginfo package of a different version is installed.

rpm -ivh coreutils-debuginfo.x86_64 0:8.4-19.el6_4.2
[root@db-172-16-3-150 share]# rpm -qa|grep coreutils
coreutils-debuginfo-8.4-19.el6_4.2.x86_64
coreutils-libs-8.4-19.el6.x86_64
coreutils-8.4-19.el6.x86_64
policycoreutils-2.0.83-19.30.el6.x86_64
[root@db-172-16-3-150 share]# stap -d /bin/ls --ldd -e 'probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}' -c "ls /"
WARNING: cannot find module /bin/ls debuginfo: No DWARF information found [man warning::debuginfo]
semantic error: while resolving probe point: identifier 'process' at <input>:1:7
source: probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}
^
semantic error: no match
Pass 2: analysis failed. [man error::pass2]


13. An error like the following is produced when a probe needs the debuginfo of a package but that debuginfo package is not installed.

semantic error: cannot find foo debuginfo

For example:

[root@db-172-16-3-150 pg93]# stap -d /bin/ls --ldd -e 'probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}' -c "ls /"
WARNING: cannot find module /bin/ls debuginfo: No DWARF information found [man warning::debuginfo]
semantic error: while resolving probe point: identifier 'process' at <input>:1:7
source: probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}
^
semantic error: no match
Pass 2: analysis failed.  [man error::pass2]

Installing the matching debuginfo package resolves it.
Find the package that owns /bin/ls:

[root@db-172-16-3-150 pg93]# rpm -qf /bin/ls
coreutils-8.4-19.el6.x86_64

Install the debuginfo package that corresponds to coreutils.

[root@db-172-16-3-150 pg93]# yum install -y coreutils-debuginfo-8.4-19.el6.x86_64

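II. Errors and warnings produced while the generated module runs in the kernel
These errors occur at run time, while staprun interacts with the kernel through the module and collects data.
Examples
1. A summary of how many errors occurred and how many probes were skipped during execution.

WARNING: Number of errors: N, skipped probes: M

For example: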

[root@db-172-16-3-150 share]# stap -e '
probe begin {
error("1.error funn\n")
}
probe end {
printf("2.end probe\n")
}
probe error {
printf("3.error probe\n")
}'
ERROR: 1.error funn
3.error probe
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

2. Division by zero

division by 0

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
println(10/0)
exit()
}'
ERROR: division by 0 near operator '/' at <input>:3:13
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

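3. When a statistics (aggregate) variable holds no elements, applying any extractor other than @count and @sum (that is, @avg, @min, or @max) reports the following error.

aggregate element not found

For example, @count and @sum both return 0 on an empty aggregate: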

[root@db-172-16-3-150 share]# /usr/bin/stap -e '
global s
probe begin {
println(@count(s))
exit()
}'
WARNING: never assigned global variable 's' : identifier 's' at <input>:2:8
source: global s
^
0
[root@db-172-16-3-150 share]# /usr/bin/stap -e '
global s
probe begin {
println(@sum(s))
exit()
}'
WARNING: never assigned global variable 's' : identifier 's' at <input>:2:8
source: global s
^
0
@avg, @min, and @max fail:
[root@db-172-16-3-150 share]# /usr/bin/stap -e '
global s
probe begin {
println(@avg(s))
exit()
}'
WARNING: never assigned global variable 's' : identifier 's' at <input>:2:8
source: global s
^
ERROR: empty aggregate near identifier '@avg' at <input>:4:11
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  Try again with another '--vp 00001' option.
[root@db-172-16-3-150 share]# /usr/bin/stap -e '
global s
probe begin {
println(@min(s))
exit()
}'
WARNING: never assigned global variable 's' : identifier 's' at <input>:2:8
source: global s
^
ERROR: empty aggregate near identifier '@min' at <input>:4:11
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  Try again with another '--vp 00001' option.
[root@db-172-16-3-150 share]# /usr/bin/stap -e '
global s
probe begin {
println(@max(s))
exit()
}'
WARNING: never assigned global variable 's' : identifier 's' at <input>:2:8
source: global s
^
ERROR: empty aggregate near identifier '@max' at <input>:4:11
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  Try again with another '--vp 00001' option.
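
Since @count is safe on an empty aggregate, it can serve as a guard in front of the other extractors. A minimal sketch follows; the file name safe_avg.stp and the message text are illustrative choices.

```shell
# Write a script that checks @count before using @avg, @min and @max.
cat > safe_avg.stp <<'EOF'
global s
probe begin {
  if (@count(s) > 0)
    printf("avg=%d min=%d max=%d\n", @avg(s), @min(s), @max(s))
  else
    println("no samples recorded")
  exit()
}
EOF
# Run as root on a machine with SystemTap installed:
#   stap safe_avg.stp
```

With no values ever added to s, the else branch is the one expected to run, so the empty-aggregate error should not be raised.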

4. An error is reported when the number of elements stored in an array exceeds the size the array was declared with.

aggregation overflow
Array overflow

For example:

[root@db-172-16-3-150 share]# stap -e '
global arr[10]
probe timer.ms(1) {
arr[gettimeofday_ms()] <<< gettimeofday_ms()
}
probe timer.s(1) {
foreach (i in arr) {
println(@count(arr[i]))
}
}'
ERROR: Array overflow, check size limit (10) near identifier 'arr' at <input>:4:3
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

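To fix it, pass -D MAXMAPENTRIES=n to raise the default size limit for arrays declared without an explicit size, or declare the array with a larger size using global arr[n].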
5. The function call nesting depth exceeds the limit

MAXNESTING exceeded

For example:

[root@db-172-16-3-150 share]# stap -e '
function fibonacci(i) {
    if (i < 1) error ("bad number")
    if (i == 1) return 1
    if (i == 2) return 2
    return fibonacci (i-1) + fibonacci (i-2)
}
probe begin {
  println(fibonacci(10))
  exit()
}
'
89
[root@db-172-16-3-150 share]# stap -e '
function fibonacci(i) {
if (i < 1) error ("bad number")
if (i == 1) return 1
if (i == 2) return 2
return fibonacci (i-1) + fibonacci (i-2)
}
probe begin {
println(fibonacci(100))
exit()
}
'
ERROR: MAXNESTING exceeded near identifier 'fibonacci' at <input>:2:10
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

To fix it, pass -D MAXNESTING=n to allow deeper nesting.
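
Raising the limit is not the only option. A recursive function can often be rewritten as a loop, which needs no nesting at all. The sketch below keeps the same definition as the example above (fibonacci(1) = 1, fibonacci(2) = 2); the names fib_iter, prev, cur, tmp and the file name fib_iter.stp are illustrative.

```shell
# Write an iterative version of the fibonacci example.
cat > fib_iter.stp <<'EOF'
function fib_iter:long (n:long) {
  if (n < 1) error("bad number")
  if (n == 1) return 1
  prev = 1   # value for n-1
  cur = 2    # value for n
  for (i = 3; i <= n; i++) {
    tmp = prev + cur
    prev = cur
    cur = tmp
  }
  return cur
}
probe begin {
  println(fib_iter(10))
  exit()
}
EOF
# Run as root on a machine with SystemTap installed:
#   stap fib_iter.stp
```

For an argument of 10 this should print 89, the same value the recursive version produced. An argument as large as 100 would still be a problem, because the result no longer fits in SystemTap's 64-bit integers, and a longer loop counts against the MAXACTION statement limit.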
6. An error is reported when a handler executes more statements than the limit allows

MAXACTION exceeded

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
  for(i=0;i<10000;i++) {
  }
  exit()
}'
ERROR: MAXACTION exceeded near keyword at <input>:3:3
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

To fix it, pass -D MAXACTION=n to raise the limit.
7. The address does not exist, or reading data at the specified address fails for some other reason.

kernel/user string copy fault at ADDR

For example:

[root@db-172-16-3-150 share]# stap -e '
probe begin {
  println(user_string(123))
  exit()
}'
ERROR: user string copy fault -14 at 000000000000007b near identifier 'user_string_n' at /opt/systemtap/share/systemtap/tapset/uconversions.stp:120:10
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

[root@db-172-16-3-150 share]# stap -e '
probe begin {
println(kernel_string(123))
exit()
}'
ERROR: kernel string copy fault at 0x000000000000007b near identifier 'kernel_string' at /opt/systemtap/share/systemtap/tapset/linux/conversions.stp:18:10
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

[root@db-172-16-3-150 share]# stap -e '
probe begin {
println(kernel_int(123))
exit()
}'
ERROR: kernel int copy fault at 0x000000000000007b near identifier 'kernel_int' at /opt/systemtap/share/systemtap/tapset/linux/conversions.stp:198:10
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /opt/systemtap/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]
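
If a script should keep running after such a fault, the risky read can be wrapped in SystemTap's try/catch statement. This is a minimal sketch reusing the kernel_string(123) call from above; the variable name msg and the file name copy_fault_catch.stp are illustrative.

```shell
# Write a script that catches the copy fault instead of aborting.
cat > copy_fault_catch.stp <<'EOF'
probe begin {
  try {
    println(kernel_string(123))
  } catch (msg) {
    # msg holds the text of the run-time error that was raised.
    printf("caught: %s\n", msg)
  }
  exit()
}
EOF
# Run as root on a machine with SystemTap installed:
#   stap copy_fault_catch.stp
```

A caught error is handled inside the probe, so the run is expected to end normally rather than with "Pass 5: run failed".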


8. Error when dereferencing a pointer, such as a context (target) variable.

pointer dereference fault
There was a fault encountered during a pointer dereference operation such as a target variable evaluation.

[References]

1. https://sourceware.org/systemtap/SystemTap_Beginners_Guide/errors.html
2. https://sourceware.org/systemtap/SystemTap_Beginners_Guide/runtimeerror.html
3. https://sourceware.org/systemtap/wiki/TipExhaustedResourceErrors
