Oracle process killed by mistake when the system ran out of memory: terminating the instance due to error 822...
Today I received an alert email: the Oracle process no longer existed.
Alarm Time:2015-09-21 17:45:38
Trigger: Alive xyxdb_oa
Trigger status: PROBLEM
Trigger severity: High
Trigger URL:
Item values:
1. Alive (x.x.x.x:alive): 0
2. *UNKNOWN* (x.x.x.x :*UNKNOWN*): *UNKNOWN*
Original event ID: 760121
The alert log showed:
System state dump requested by (instance=1, osid=2044 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xyxdbp/xyxdb/trace/xyxdb_diag_2062_20150921174417.trc
Mon Sep 21 17:44:18 2015
PMON (ospid: 2044): terminating the instance due to error 822
Dumping diagnostic data in directory=[cdmp_20150921174417], requested by (instance=1, osid=2044 (PMON)), summary=[abnormal instance termination].
Instance terminated by PMON, pid = 2044
Mon Sep 21 17:46:39 2015
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 64 KB
Total Shared Global Region in Large Pages = 0 KB (0%)
Large Pages used by this instance: 0 (0 KB)
Large Pages unused system wide = 0 (0 KB)
Large Pages configured system wide = 0 (0 KB)
Large Page size = 2048 KB
RECOMMENDATION:
Total System Global Area size is 3282 MB. For optimal performance,
prior to the next instance restart:
1. Increase the number of unused large pages by
at least 1641 (page size 2048 KB, total size 3282 MB) system wide to
get 100% of the System Global Area allocated with large pages
2. Large pages are automatically locked into physical memory.
Increase the per process memlock (soft) limit to at least 3290 MB to lock
100% System Global Area's large pages into physical memory
********************************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 6
Number of processor cores in the system is 6
Number of processor sockets in the system is 1
CELL communication is configured to use 0 interface(s):
CELL IP affinity details:
NUMA status: non-NUMA system
cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
Grp 0:
[root@OA01-1-24 scripts]# cat /proc/50966/oom_
oom_adj oom_score oom_score_adj
[root@OA01-1-24 scripts]# cat /proc/50966/oom_adj
0
[root@OA01-1-24 scripts]# vim oomscore.sh
[root@OA01-1-24 scripts]# chmod u+x oomscore.sh
Checking the system log:
Sep 21 17:44:15 OA01-1-24 kernel: [39519] 500 39519 900699 5848 2 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [39521] 500 39521 900699 5877 5 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [42514] 500 42514 900846 10963 1 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [42578] 500 42578 900706 9012 1 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [43519] 0 43519 24998 1489 5 0 0 sshd
Sep 21 17:44:15 OA01-1-24 kernel: [43533] 0 43533 14309 550 5 0 0 sftp-server
Sep 21 17:44:15 OA01-1-24 kernel: [43557] 0 43557 14432 671 5 0 0 sftp-server
Sep 21 17:44:15 OA01-1-24 kernel: [44331] 89 44331 20234 861 2 0 0 pickup
Sep 21 17:44:15 OA01-1-24 kernel: [44491] 0 44491 1107908 148835 4 0 0 java
Sep 21 17:44:15 OA01-1-24 kernel: [44684] 500 44684 900015 4658 0 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45199] 500 45199 900699 5525 3 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45201] 500 45201 900699 5548 4 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45203] 500 45203 900704 8184 5 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45211] 500 45211 900699 5506 0 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45213] 500 45213 900699 5504 4 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45901] 0 45901 1051478 117538 2 0 0 java
Sep 21 17:44:15 OA01-1-24 kernel: [45943] 500 45943 900956 7194 0 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45945] 500 45945 900315 5444 1 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [45947] 500 45947 900315 5423 5 0 0 oracle
Sep 21 17:44:15 OA01-1-24 kernel: [46232] 0 46232 25226 152 4 0 0 sleep
Sep 21 17:44:15 OA01-1-24 kernel: Out of memory: Kill process 2074 (oracle) score 125 or sacrifice child
Sep 21 17:44:15 OA01-1-24 kernel: Killed process 2074, UID 500, (oracle) total-vm:3600064kB, anon-rss:3444kB, file-rss:1510892kB
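This usually happens when applications request a large amount of memory at some moment and the system runs short. That typically triggers the Linux kernel's Out of Memory (OOM) killer, which kills a process to free memory for the system so that it does not crash immediately.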
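To check whether the OOM killer has struck on a host, the kill records can be filtered out of the system log. The following is a minimal sketch, assuming log lines in the same format as the excerpt above; the function name oom_kills and the log path in the usage comment are illustrative assumptions, not something taken from the original session.

```shell
#!/bin/bash
# oom_kills: read syslog lines on stdin and print "month day time pid name"
# for every "Killed process" record written by the kernel OOM killer.
oom_kills() {
    awk '/kernel: Killed process/ {
        pid = ""
        for (i = 1; i < NF; i++)
            if ($i == "process") { pid = $(i + 1); sub(/,$/, "", pid); break }
        name = ""
        if (match($0, /\([^)]*\)/))
            name = substr($0, RSTART + 1, RLENGTH - 2)
        print $1, $2, $3, pid, name
    }'
}

# Usage (the path is an assumption; it varies by distribution):
#   oom_kills < /var/log/messages
```

Comparing the timestamps it prints with the termination time in the alert log is a quick way to confirm that the two events belong together.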
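I later found that developers had started two Tomcat applications on this DB server, and a program fault made them consume a large amount of memory.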
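The task dump that the kernel prints just before the kill already points at the heavy consumers. Below is a small sketch for ranking those entries by resident memory. It assumes that the dump columns end with "total_vm rss cpu oom_adj oom_score_adj name" (so rss is the fifth field from the end, counted in pages) and that pages are 4 KB; neither is stated in the log excerpt itself, and top_rss is a hypothetical name.

```shell
#!/bin/bash
# top_rss [N]: read syslog lines on stdin and print the N (default 5) largest
# task-dump entries as "rss_kb pid name".
# Assumptions: rss is the 5th field from the end, in 4 KB pages.
top_rss() {
    local n="${1:-5}"
    awk '/kernel: \[ *[0-9]+\]/ { print $(NF - 4) * 4, $(NF - 6), $NF }' |
        sort -nr | head -n "$n"
}

# Usage (the path is an assumption; it varies by distribution):
#   top_rss 10 < /var/log/messages
```

Under those assumptions, the two java entries in the dump above (148835 and 117538 pages) come to roughly 581 MB and 459 MB. The per-process figure counts shared memory for every process that maps it, so the values for the oracle processes should not simply be added up.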
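Oracle has some documentation on this.
We can adjust the kernel's per-process OOM settings under /proc to keep a process from being killed.
First, use a script to find the processes most likely to be killed: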
# vi oomscore.sh
#!/bin/bash
# Print the 10 processes with the highest oom_score: score, PID, command line.
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat "$proc/oom_score")" \
        "$(basename "$proc")" \
        "$(tr '\0' ' ' < "$proc/cmdline" | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10
[root@OA01-1-24 scripts]# ./oomscore.sh
63 37608 /usr/bin/java -Djava.util.logging.config.file=/usr
31 51010 ora_mman_xyxdb
20 37579 /usr/bin/java -Djava.util.logging.config.file=/usr
16 51938 /usr/java/jdk1.7.0_79/jre/bin/java -Djava.util.log
14 51496 oraclexyxdb (LOCAL=NO)
13 51026 ora_smon_xyxdb
8 51167 oraclexyxdb (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROT
7 51034 ora_mmon_xyxdb
7 51014 ora_dbw0_xyxdb
6 51480 oraclexyxdb (LOCAL=NO)
[root@OA01-1-24 scripts]# cat /proc/51034/oom_score
7
[root@OA01-1-24 scripts]# cat /proc/51034/oom_score_adj
0
[root@OA01-1-24 scripts]# echo -15 >/proc/51034/oom_adj
[root@OA01-1-24 scripts]# cat /proc/51034/oom_score
1
[root@OA01-1-24 scripts]# cat /proc/51026/oom_adj
0
[root@OA01-1-24 scripts]# cat /proc/51026/oom_score
13
[root@OA01-1-24 scripts]# echo -15 >/proc/51026/oom_adj
[root@OA01-1-24 scripts]# cat /proc/51026/oom_adj
-15
[root@OA01-1-24 scripts]# cat /proc/51026/oom_score
1
[root@OA01-1-24 scripts]# echo -15 >/proc/51010/oom_adj
[root@OA01-1-24 scripts]# cat /proc/51010/oom_score
1
[root@OA01-1-24 scripts]# ./oomscore.sh
63 37608 /usr/bin/java -Djava.util.logging.config.file=/usr
20 37579 /usr/bin/java -Djava.util.logging.config.file=/usr
16 51938 /usr/java/jdk1.7.0_79/jre/bin/java -Djava.util.log
14 51496 oraclexyxdb (LOCAL=NO)
8 51167 oraclexyxdb (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROT
7 51014 ora_dbw0_xyxdb
6 51480 oraclexyxdb (LOCAL=NO)
5 52007 oraclexyxdb (LOCAL=NO)
5 51474 oraclexyxdb (LOCAL=NO)
5 51466 oraclexyxdb (LOCAL=NO)
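The adjustments above were typed one PID at a time. As a rough sketch, they could be wrapped in a loop over every background process of the instance. The function name protect_instance and the PROC_ROOT override are hypothetical helpers introduced only for illustration; the value -15 and the oom_adj file are the ones used in the session above. oom_adj is the older interface, and newer kernels prefer oom_score_adj (range -1000 to 1000), so treat this as a starting point rather than a finished tool.

```shell
#!/bin/bash
# PROC_ROOT is a hypothetical override so the function can be tried on a fake tree.
PROC_ROOT="${PROC_ROOT:-/proc}"

# protect_instance SID [ADJ]: write ADJ (default -15, as in the session above)
# to oom_adj of every background process named ora_<name>_<SID>.
protect_instance() {
    local sid="$1" adj="${2:--15}" dir name
    for dir in "$PROC_ROOT"/[0-9]*; do
        [ -r "$dir/cmdline" ] || continue
        # cmdline is NUL-separated; keep only the first word (the process name).
        name="$(tr '\0' ' ' < "$dir/cmdline")"
        name="${name%% *}"
        case "$name" in
            ora_*_"$sid")
                printf '%s\n' "$adj" > "$dir/oom_adj" &&
                    printf 'set %s (%s) to %s\n' "${dir##*/}" "$name" "$adj"
                ;;
        esac
    done
}

# Usage as root (not executed here):
#   protect_instance xyxdb
```

The setting belongs to a PID, so it is lost when the instance restarts and the background processes get new PIDs; it would have to be reapplied after every startup.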
I later found another problem, this one with the swap configuration:
[root@OA01-1-24 scripts]# cat /proc/sys/vm/swappiness
0
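A value of 0 tells the kernel to avoid swapping for as long as it possibly can, so swap is effectively unused.
The system engineer overlooked this when changing the setting; for Oracle it is better not to turn swap off.
I changed it back: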
[root@OA01-1-24 scripts]# cat /proc/sys/vm/swappiness
60
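The session does not show how the value was changed back. One common way, sketched here as an assumption rather than a record of what was actually run, is to set it at runtime with sysctl and also store it in a sysctl configuration file so that it survives a reboot. The helper name set_sysctl_conf is hypothetical.

```shell
#!/bin/bash
# set_sysctl_conf FILE KEY VALUE: replace an existing "KEY = ..." line in FILE,
# or append one if the key is not present yet. Uses GNU sed (-i, -E).
set_sysctl_conf() {
    local file="$1" key="$2" value="$3" pat
    pat="${key//./\\.}"   # escape dots so the key is matched literally
    if grep -Eq "^[[:space:]]*${pat}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i -E "s|^[[:space:]]*${pat}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage as root (not executed here):
#   sysctl -w vm.swappiness=60                          # takes effect immediately
#   set_sysctl_conf /etc/sysctl.conf vm.swappiness 60   # kept across reboots
```

Writing directly to /proc/sys/vm/swappiness has the same immediate effect as sysctl -w, but without the configuration-file entry the value reverts to the configured or default one at the next boot.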
Summary: keep DB servers dedicated to the database wherever possible; otherwise all sorts of unexpected things will happen.