Every Exadata database server and storage server node ships with the sundiag.sh script installed (MOS: 761868.1).

Let's run it:

[root@erpdb01 ~]# find /opt -name sundiag.sh
/opt/oracle.SupportTools/sundiag.sh
[root@erpdb01 ~]#

[root@erpdb01 oracle.SupportTools]# sh sundiag.sh
Oracle Exadata Database Machine - Diagnostics Collection Tool
Gathering Linux information

Skipping ILOM collection. Use the ilom or snapshot options, or login to ILOM
over the network and run Snapshot separately if necessary.

/tmp/sundiag_erpdb01_1338NML05G_2015_12_21_16_03
Generating diagnostics tarball and removing temp directory

==============================================================================
Done. The report files are bzip2 compressed in /tmp/sundiag_erpdb01_1338NML05G_2015_12_21_16_03.tar.bz2
==============================================================================
[root@erpdb01 oracle.SupportTools]# pwd
/opt/oracle.SupportTools
[root@erpdb01 oracle.SupportTools]#

The file /tmp/sundiag_erpdb01_1338NML05G_2015_12_21_16_03.tar.bz2 is normally only about 4 MB, and a little over 10 MB once extracted, so it is not large.

Note: this bz2 archive contains many directories and files. You can search by file name for whatever information you need.
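Searching the archive by file name can also be done without unpacking it. The following is a minimal sketch using Python's standard tarfile and fnmatch modules; the function name find_members is a hypothetical helper, not part of sundiag.

```python
import fnmatch
import tarfile


def find_members(archive_path, pattern):
    """Return names of archive members whose base name matches a glob pattern.

    find_members is a hypothetical helper for illustration only.
    """
    matches = []
    # "r:bz2" opens a bzip2-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive.getmembers():
            # Compare only the last path component, so the pattern
            # does not need to know the directory layout.
            base_name = member.name.rsplit("/", 1)[-1]
            if fnmatch.fnmatch(base_name, pattern):
                matches.append(member.name)
    return matches
```

For example, find_members("/tmp/sundiag_erpdb01_1338NML05G_2015_12_21_16_03.tar.bz2", "megacli64*") would list the MegaCLI output files in the tarball produced above.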

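messages: this is a copy of the system's /var/log/messages file. That file is maintained by the syslog daemon and contains important information about operating system activity and health.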

Dec 19 19:25:06 erpdb01 last message repeated 4 times
Dec 19 19:26:10 erpdb01 last message repeated 4 times
Dec 19 19:27:14 erpdb01 last message repeated 4 times
Dec 19 19:28:18 erpdb01 last message repeated 4 times
Dec 19 19:29:22 erpdb01 last message repeated 4 times
Dec 19 19:30:26 erpdb01 last message repeated 4 times
Dec 19 19:31:30 erpdb01 last message repeated 4 times
Dec 19 19:32:34 erpdb01 last message repeated 4 times
Dec 19 19:33:38 erpdb01 last message repeated 4 times
Dec 19 19:34:08 erpdb01 last message repeated 2 times
Dec 19 19:53:36 erpdb01 auditd[9004]: Audit daemon rotating log files
Dec 20 00:24:56 erpdb01 ntpd[9339]: synchronized to LOCAL(0), stratum 10
Dec 20 00:50:44 erpdb01 ntpd[9339]: synchronized to 10.9.3.79, stratum 3
Dec 20 18:27:59 erpdb01 auditd[9004]: Audit daemon rotating log files
Dec 21 14:40:29 erpdb01 ntpd[9339]: synchronized to LOCAL(0), stratum 10
Dec 21 14:58:40 erpdb01 ntpd[9339]: synchronized to 10.9.3.79, stratum 3
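In the excerpt above, ntpd twice falls back to LOCAL(0) at stratum 10 before resynchronizing to 10.9.3.79, which is the kind of pattern worth pulling out of a long messages file. A minimal sketch of how that could be done, assuming the classic "Mon DD HH:MM:SS host program[pid]: message" layout shown here; the names SYSLOG_LINE and ntp_sync_events are hypothetical.

```python
import re

# Classic syslog layout: "Mon DD HH:MM:SS host program[pid]: message".
# Lines without a "program:" tag (such as "last message repeated")
# simply do not match and are skipped.
SYSLOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s+"
    r"(?P<message>.*)$"
)
SYNC_MESSAGE = re.compile(
    r"synchronized to (?P<source>\S+), stratum (?P<stratum>\d+)"
)


def ntp_sync_events(lines):
    """Yield (timestamp, source, stratum) for each ntpd sync message."""
    for line in lines:
        parsed = SYSLOG_LINE.match(line)
        if not parsed or parsed.group("program") != "ntpd":
            continue
        hit = SYNC_MESSAGE.search(parsed.group("message"))
        if hit:
            yield (
                parsed.group("stamp"),
                hit.group("source"),
                int(hit.group("stratum")),
            )
```

Passing an open handle on the messages copy to ntp_sync_events yields one tuple per synchronization change, which makes it easy to see when the node lost its upstream time source.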

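dmesg: this file is created by the dmesg command and contains kernel diagnostic messages from the kernel ring buffer. The ring buffer holds messages sent to and received from the system's peripheral devices (disk drives, keyboards, and so on).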

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.39-400.128.21.el5uek (mockbuild@ca-build56.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)) #1 SMP Thu Apr 2 15:13:06 PDT 2015
Command line: root=LABEL=DBSYS bootarea=dbsys bootfrom=BOOT ro loglevel=7 panic=60 debug console=ttyS0,115200n8 console=tty1 pci=noaer log_buf_len=1m nmi_watchdog=0 nomce transparent_hugepage=never audit=1 crashkernel=380M@128M numa=off processor.max_cstate=1 intel_idle.max_cstate=0
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
 BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000100000000 - 0000004080000000 (usable)
NX (Execute Disable) protection: active
SMBIOS 2.7 present.
DMI: Oracle Corporation SUN FIRE X4170 M3     /ASSY,MOTHERBOARD,1U   , BIOS 17100400 04/04/2014
e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
No AGP bridge found

lspci: this file lists all PCI buses on the system and the devices attached to them

00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07)
00:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3a in PCI Express Mode (rev 07)
00:03.2 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3c (rev 07)
00:04.7 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel 7 (rev 07)
00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management (rev 07)
00:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control Status and Global Errors (rev 07)
00:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
00:11.0 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Virtual Root Port (rev 06)
00:1a.0 USB controller: Intel Corporation C600/X79 series chipset USB2 Enhanced Host Controller #2 (rev 06)
40:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
50:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
61:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 02)
62:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 21)
7f:0a.0 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 0 (rev 07)
7f:0a.3 System peripheral: Intel Corporation Xeon E5/Core i7 Power Control Unit 3 (rev 07)
7f:0b.0 System peripheral: Intel Corporation Xeon E5/Core i7 Interrupt Control Registers (rev 07)

lsscsi: this file lists all SCSI devices on the system

[0:2:0:0]    disk    LSI      MR9261-8i        2.13  /dev/sda

fdisk-l.out: lists the partitions on every disk in the system

Disk /dev/sda: 896.9 GB, 896998047744 bytes
255 heads, 63 sectors/track, 109053 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          65      522081   83  Linux
/dev/sda2              66      109053   875446110   8e  Linux LVM
Disk /dev/dm-0: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-1: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-2: 25.7 GB, 25769803776 bytes
255 heads, 63 sectors/track, 3133 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/dm-3: 544.3 GB, 544387104768 bytes
255 heads, 63 sectors/track, 66184 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
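The "Disk ..." header lines in this output carry the exact size of each device in bytes. A minimal sketch for extracting them, assuming the header format shown above; DISK_LINE and disk_sizes are hypothetical names.

```python
import re

# Matches header lines such as
# "Disk /dev/sda: 896.9 GB, 896998047744 bytes".
DISK_LINE = re.compile(
    r"^Disk (?P<device>/dev/\S+): [\d.]+ [KMGT]?B, (?P<bytes>\d+) bytes"
)


def disk_sizes(text):
    """Map each device named on a 'Disk ...' line to its size in bytes."""
    sizes = {}
    for line in text.splitlines():
        found = DISK_LINE.match(line.strip())
        if found:
            sizes[found.group("device")] = int(found.group("bytes"))
    return sizes
```

Applied to the listing above, this returns one entry for /dev/sda and one for each of the dm-0 to dm-3 devices.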

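fdisk-l.err: error messages from fdisk. The messages below are expected, because the dm-* devices are LVM logical volumes and carry no partition table.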

Disk /dev/dm-0 doesn't contain a valid partition table
Disk /dev/dm-1 doesn't contain a valid partition table
Disk /dev/dm-2 doesn't contain a valid partition table
Disk /dev/dm-3 doesn't contain a valid partition table

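service_-status-all.out: the results of the service status check.

Interestingly, this output happened to reveal directly why the cluster had once failed to start.

Pure coincidence!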

acpid (pid 9225) is running...
anacron is stopped
auditd (pid  9004) is running...
exadata_mon_hw_asr.pl (pid 9372) is running...
crond (pid  9508) is running...
2015-12-21 16:03:56 +0800  [INFO] New logging session is started
2015-12-21 16:03:56 +0800  [INFO] Command line: /etc/init.d/exachkcfg status
2015-12-21 16:03:56 +0800  The exachkcfg was already run
sudo exist
Usage: /opt/OracleHomes/agent_home/core/12.1.0.3.0/install/unix/scripts/agentstup {start|stop}
Mon Dec 21 16:03:57 CST 2015
 --- Please choice Database env ---
                                   
 1: EBS Database
 2: Peoplesoft Database
                                  
 ---------------------------------
Please Input 1 or 2:
!!! You do not set the ORACLE env. !!!
sudo exist
Usage: /u01/em12c/core/12.1.0.3.0/install/unix/scripts/agentstup {start|stop}
hald is stopped
Usage: /etc/init.d/init.tfa {stop|start|shutdown|restart}
Firewall is stopped.
ipmi_msghandler module loaded.
ipmi_si module loaded.
ipmi_devintf module loaded.
/dev/ipmi0 exists.
Firewall is stopped.
irqbalance (pid 9083) is running...
iscsid is stopped
iscsid is stopped
Kdump is operational
lsidiagd is stopped
lsi_mrdsnmpagent (pid 9284 9268) is running...
mcstransd is stopped
mdadm is stopped
mdmpd is stopped
dbus-daemon is stopped
ERROR: mlx4_vnic module is not loaded
multipathd is stopped
usage: /etc/init.d/netbackup { start | stop | start_msg | stop_msg }
netconsole module not loaded
netplugd is stopped
Configured devices:
lo bondeth0 bondib0 eth0 eth1 eth2 eth3 eth4 eth5 ib0 ib1
Currently active devices:
lo eth0 eth4 eth5 ib0 ib1 bondeth0 bondib0
rpc.mountd is stopped
nfsd is stopped
rpc.rquotad is stopped
rpc.statd is stopped
nscd is stopped
ntpd (pid  9339) is running...
Low level hardware support loaded:
mlx4_ib mlx4_core

Upper layer protocol modules:
rds_rdma rds ib_ipoib

User space access modules:
rdma_ucm ib_ucm ib_uverbs ib_umad

Connection management modules:
rdma_cm ib_cm iw_cm

Configured IPoIB interfaces: none
Currently active IPoIB interfaces: ib0 ib1 bondib0 
portmap is stopped
Process accounting is disabled.
rdisc is stopped
rngd is stopped
rpc.idmapd is stopped
rsyslogd is stopped
saslauthd is stopped
sendmail is stopped
smartd is stopped
snmpd (pid  9250) is running...
snmptrapd is stopped
openssh-daemon (pid  10147) is running...
syslogd (pid  9066) is running...
klogd (pid  9069) is running...
Xvnc is stopped
Symantec Backup Exec Remote Agent for Linux/Unix Servers
Usage: VRTSralus.init { start | stop | restart }
Symantec Private Branch Exchange is running
2015-12-21 16:11:23 +0800 [INFO] No eth interfaces need this work around.
xinetd is stopped
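Because this output mixes many formats, a quick way to get an overview is to pick out only the lines that follow the standard "is running..." and "is stopped" patterns. A minimal sketch under that assumption; classify_services is a hypothetical helper, and lines in any other format (usage messages, module listings, multi-word service names) are ignored.

```python
import re

# "acpid (pid 9225) is running..." or, with several pids,
# "lsi_mrdsnmpagent (pid 9284 9268) is running..."
RUNNING = re.compile(r"^(?P<name>\S+) \(pid\s+[\d ]+\) is running\.\.\.$")
# "anacron is stopped" or "Firewall is stopped."
STOPPED = re.compile(r"^(?P<name>\S+) is stopped\.?$")


def classify_services(lines):
    """Return (running, stopped) name lists, in first-seen order."""
    running, stopped = [], []
    for line in lines:
        line = line.strip()
        match = RUNNING.match(line)
        if match:
            if match.group("name") not in running:
                running.append(match.group("name"))
            continue
        match = STOPPED.match(line)
        if match and match.group("name") not in stopped:
            stopped.append(match.group("name"))
    return running, stopped
```

Comparing the stopped list from a healthy node with that of a problem node is one simple way to spot a service that should have been running.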

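megacli64*: there are quite a few of these files, because the megacli64 command is run with a variety of options to query the MegaRAID controller and collect configuration and status information for the controllers and disks.

Let's look at a few of them: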

megacli64-AdpAllInfo.out

Adapter #0
==============================================================================
                    Versions
                ================
Product Name    : LSI MegaRAID SAS 9261-8i
Serial No       : SV31502020
FW Package Build: 12.12.0-0178
                    Mfg. Data
                ================
Mfg. Date       : 04/06/13
Rework Date     : 00/00/00
Revision No     : 27B
Battery FRU     : N/A

Image Versions in Flash:
                ================
FW Version         : 2.130.373-2809
BIOS Version       : 3.29.00_4.14.05.00_0x05270000
Preboot CLI Version: 04.04-020:#%00009
WebBIOS Version    : 6.0-51-e_47-Rel
NVDATA Version     : 2.09.03-0047
Boot Block Version : 2.02.00.00-0000
BOOT Version       : 09.250.01.219

megacli64-BbuCmd.out

BBU status for Adapter: 0

BatteryType: iBBU08
Voltage: 3851 mV
Current: 0 mA
Temperature: 27 C
Battery State: Optimal
Design Mode  : 48+ Hrs retention with a non-transparent learn cycle and moderate service life.

BBU Firmware Status:

Charging Status              : None
  Voltage                                 : OK
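Most MegaCLI output consists of "Key : Value" lines, so a generic parser is enough for a quick health check such as reading the battery state. A minimal sketch, assuming that layout; parse_key_values is a hypothetical helper. Because some keys repeat (Voltage appears both as a reading and as a status), it keeps the first occurrence of each key.

```python
def parse_key_values(text):
    """Collect 'Key : Value' lines into a dict.

    Lines without a value (section headings such as
    "BBU Firmware Status:") are skipped. When a key repeats,
    the first occurrence is kept.
    """
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if separator and key and value:
            fields.setdefault(key, value)
    return fields
```

With the excerpt above, parse_key_values(text).get("Battery State") gives "Optimal"; any other value would be a reason to look at the battery more closely.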

megacli64-CfgDsply.out:

==============================================================================
Adapter: 0
Product Name: LSI MegaRAID SAS 9261-8i
Memory: 512MB
BBU: Present
Serial No: SV31502020
==============================================================================
Number of DISK GROUPS: 1

DISK GROUP: 0
Number of Spans: 1
SPAN: 0
Span Reference: 0x00
Number of PDs: 4
Number of VDs: 1
Number of dedicated Hotspares: 0
Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :DBSYS
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 835.394 GB
Sector Size         : 512
Is VD emulated      : No
Parity Size         : 278.464 GB
State               : Optimal
Strip Size          : 1.0 MB
Number Of Drives    : 4
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Is VD Cached: No
Physical Disk Information:
Physical Disk: 0
Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 11
WWN: 5000CCA022688FB3
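The sizes reported for virtual drive 0 can be sanity-checked with simple arithmetic. In a single-span RAID 5 set of n drives, one drive's worth of capacity holds parity and the other n - 1 hold data, so the parity size should equal the usable size divided by n - 1. A minimal sketch under that assumption; raid5_parity_size is a hypothetical helper.

```python
def raid5_parity_size(usable_size_gb, drive_count):
    """Return one drive's worth of parity for a single-span RAID 5 set.

    Assumes all drives contribute equal capacity, so parity occupies
    usable_size / (drive_count - 1).
    """
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return usable_size_gb / (drive_count - 1)
```

For the values above, raid5_parity_size(835.394, 4) comes to roughly 278.46 GB, in line with the reported Parity Size of 278.464 GB.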

The remaining files include:

megacli64-FwTermLog.out

megacli64-GetEvents-all.out

megacli64-LdInfo.out

megacli64-LdPdInfo.out

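Besides the diagnostic files above, sundiag also collects the storage nodes' configuration information, alerts, and other special log files that do not exist on the database servers.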
