Today I received an alert email: the Oracle process was no longer running.

Alarm Time:2015-09-21 17:45:38

Trigger: Alive xyxdb_oa

Trigger status: PROBLEM

Trigger severity: High

Trigger URL:

Item values:

1. Alive (x.x.x.x:alive): 0

2. *UNKNOWN* (x.x.x.x :*UNKNOWN*): *UNKNOWN*

Original event ID: 760121

The alert log showed the following:

System state dump requested by (instance=1, osid=2044 (PMON)), summary=[abnormal instance termination].

System State dumped to trace file /u01/app/oracle/diag/rdbms/xyxdbp/xyxdb/trace/xyxdb_diag_2062_20150921174417.trc

Mon Sep 21 17:44:18 2015

PMON (ospid: 2044): terminating the instance due to error 822

Dumping diagnostic data in directory=[cdmp_20150921174417], requested by (instance=1, osid=2044 (PMON)), summary=[abnormal instance termination].

Instance terminated by PMON, pid = 2044

Mon Sep 21 17:46:39 2015

Starting ORACLE instance (normal)

************************ Large Pages Information *******************

Per process system memlock (soft) limit = 64 KB

Total Shared Global Region in Large Pages = 0 KB (0%)

Large Pages used by this instance: 0 (0 KB)

Large Pages unused system wide = 0 (0 KB)

Large Pages configured system wide = 0 (0 KB)

Large Page size = 2048 KB

RECOMMENDATION:

Total System Global Area size is 3282 MB. For optimal performance,

prior to the next instance restart:

1. Increase the number of unused large pages by

at least 1641 (page size 2048 KB, total size 3282 MB) system wide to

get 100% of the System Global Area allocated with large pages

2. Large pages are automatically locked into physical memory.

Increase the per process memlock (soft) limit to at least 3290 MB to lock

100% System Global Area's large pages into physical memory

********************************************************************

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Initial number of CPU is 6

Number of processor cores in the system is 6

Number of processor sockets in the system is 1

CELL communication is configured to use 0 interface(s):

CELL IP affinity details:

NUMA status: non-NUMA system

cellaffinity.ora status: N/A

CELL communication will use 1 IP group(s):

Grp 0:

[root@OA01-1-24 scripts]# cat /proc/50966/oom_

oom_adj        oom_score      oom_score_adj

[root@OA01-1-24 scripts]# cat /proc/50966/oom_adj

0

[root@OA01-1-24 scripts]# vim oomscore.sh

[root@OA01-1-24 scripts]# chmod u+x oomscore.sh

Checking the system log:

Sep 21 17:44:15 OA01-1-24 kernel: [39519]   500 39519   900699     5848   2       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [39521]   500 39521   900699     5877   5       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [42514]   500 42514   900846    10963   1       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [42578]   500 42578   900706     9012   1       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [43519]     0 43519    24998     1489   5       0             0 sshd

Sep 21 17:44:15 OA01-1-24 kernel: [43533]     0 43533    14309      550   5       0             0 sftp-server

Sep 21 17:44:15 OA01-1-24 kernel: [43557]     0 43557    14432      671   5       0             0 sftp-server

Sep 21 17:44:15 OA01-1-24 kernel: [44331]    89 44331    20234      861   2       0             0 pickup

Sep 21 17:44:15 OA01-1-24 kernel: [44491]     0 44491  1107908   148835   4       0             0 java

Sep 21 17:44:15 OA01-1-24 kernel: [44684]   500 44684   900015     4658   0       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45199]   500 45199   900699     5525   3       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45201]   500 45201   900699     5548   4       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45203]   500 45203   900704     8184   5       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45211]   500 45211   900699     5506   0       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45213]   500 45213   900699     5504   4       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45901]     0 45901  1051478   117538   2       0             0 java

Sep 21 17:44:15 OA01-1-24 kernel: [45943]   500 45943   900956     7194   0       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45945]   500 45945   900315     5444   1       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [45947]   500 45947   900315     5423   5       0             0 oracle

Sep 21 17:44:15 OA01-1-24 kernel: [46232]     0 46232    25226      152   4       0             0 sleep

Sep 21 17:44:15 OA01-1-24 kernel: Out of memory: Kill process 2074 (oracle) score 125 or sacrifice child

Sep 21 17:44:15 OA01-1-24 kernel: Killed process 2074, UID 500, (oracle) total-vm:3600064kB, anon-rss:3444kB, file-rss:1510892kB

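This usually happens when applications suddenly request a large amount of memory and the system runs short. That triggers the Out of Memory (OOM) killer in the Linux kernel, which kills a process to free memory for the system so that the whole machine does not crash immediately.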

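I later found that the developers had started two Tomcat applications on this DB server, and an application fault had caused them to consume a large amount of memory.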
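The task dump that the kernel writes just before an OOM kill can be totalled per process name to see who was holding memory. The sketch below is only an illustration: the function name `oom_rss_by_name` is made up, the column layout (`total_vm rss cpu oom_adj oom_score_adj name`, counted from the end of each line) is an assumption because the header line is not part of the excerpt, and the 4 KB page size is assumed as well.

```shell
#!/bin/bash
# Sum resident pages per process name from an OOM-killer task dump.
# Assumed column layout, counted from the end of each line:
#   ... total_vm rss cpu oom_adj oom_score_adj name
oom_rss_by_name() {
    awk '
        # Only task-dump rows such as "kernel: [44491] 0 44491 ... java".
        /kernel: \[ *[0-9]+\]/ && $(NF-4) ~ /^[0-9]+$/ {
            pages[$NF] += $(NF-4)
        }
        END {
            for (name in pages)
                # Convert pages to MB, assuming 4 KB pages.
                printf "%s %d\n", name, pages[name] * 4 / 1024
        }
    ' "$@" | sort -k2,2nr
}
```

It would be called with the path of the system log as its argument. In the excerpt above, the two `java` rows alone add up to 148835 + 117538 = 266373 pages, roughly 1040 MB under the 4 KB assumption, while each `oracle` row is in the range of a few thousand pages.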

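Oracle has some documentation on this topic.

We can adjust the kernel's per-process OOM settings (the files under /proc/<pid>/) to keep a process from being killed.

First, use a script to find the processes most likely to be killed: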

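# vi oomscore.sh

#!/bin/bash
# List the 10 processes with the highest oom_score.
# Output columns: oom_score, PID, first 50 bytes of the command line.
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat "$proc/oom_score")" \
        "$(basename "$proc")" \
        "$(cat "$proc/cmdline" | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 10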

[root@OA01-1-24 scripts]# ./oomscore.sh

63 37608 /usr/bin/java -Djava.util.logging.config.file=/usr

31 51010 ora_mman_xyxdb

20 37579 /usr/bin/java -Djava.util.logging.config.file=/usr

16 51938 /usr/java/jdk1.7.0_79/jre/bin/java -Djava.util.log

14 51496 oraclexyxdb (LOCAL=NO)

13 51026 ora_smon_xyxdb

8 51167 oraclexyxdb (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROT

7 51034 ora_mmon_xyxdb

7 51014 ora_dbw0_xyxdb

6 51480 oraclexyxdb (LOCAL=NO)

[root@OA01-1-24 scripts]# cat /proc/51034/oom_score

7

[root@OA01-1-24 scripts]# cat /proc/51034/oom_score_adj

0

[root@OA01-1-24 scripts]# echo -15 >/proc/51034/oom_adj

[root@OA01-1-24 scripts]# cat /proc/51034/oom_score

1

[root@OA01-1-24 scripts]# cat /proc/51026/oom_adj

0

[root@OA01-1-24 scripts]# cat /proc/51026/oom_score

13

[root@OA01-1-24 scripts]# echo -15 >/proc/51026/oom_adj

[root@OA01-1-24 scripts]# cat /proc/51026/oom_adj

-15

[root@OA01-1-24 scripts]# cat /proc/51026/oom_score

1

[root@OA01-1-24 scripts]# echo -15 >/proc/51010/oom_adj

[root@OA01-1-24 scripts]# cat /proc/51010/oom_score

1

[root@OA01-1-24 scripts]# ./oomscore.sh

63 37608 /usr/bin/java -Djava.util.logging.config.file=/usr

20 37579 /usr/bin/java -Djava.util.logging.config.file=/usr

16 51938 /usr/java/jdk1.7.0_79/jre/bin/java -Djava.util.log

14 51496 oraclexyxdb (LOCAL=NO)

8 51167 oraclexyxdb (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROT

7 51014 ora_dbw0_xyxdb

6 51480 oraclexyxdb (LOCAL=NO)

5 52007 oraclexyxdb (LOCAL=NO)

5 51474 oraclexyxdb (LOCAL=NO)

5 51466 oraclexyxdb (LOCAL=NO)
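The session above writes to `oom_adj`, the older interface (range -17 to 15). The same /proc/<pid>/ directory also exposes `oom_score_adj` (range -1000 to 1000), which newer kernels prefer. A minimal sketch of applying it to several PIDs at once follows; the function name `set_oom_score_adj`, the `PROC_ROOT` override for dry runs, and the value -500 are all hypothetical choices for illustration, not something from the original setup.

```shell
#!/bin/bash
# Write an OOM score adjustment for each PID given.
# Usage: set_oom_score_adj VALUE PID...
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
set_oom_score_adj() {
    local value="$1"; shift
    local root="${PROC_ROOT:-/proc}"
    local pid
    for pid in "$@"; do
        # Skip PIDs that have exited in the meantime.
        [ -e "$root/$pid/oom_score_adj" ] || continue
        printf '%s\n' "$value" > "$root/$pid/oom_score_adj" && echo "$pid -> $value"
    done
}

# Example (needs root): lower the score of the instance's background processes.
# The pattern follows the ora_*_xyxdb process names listed above.
# set_oom_score_adj -500 $(pgrep -f 'ora_.*_xyxdb')
```

These values belong to the running process: lowering them needs root, and they are gone once the process exits, so after an instance restart the adjustment has to be applied again.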

Later I found one more problem, this time in the swap configuration:

[root@OA01-1-24 scripts]# cat /proc/sys/vm/swappiness

0

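A value of 0 here tells the kernel to avoid swapping as far as possible, so swap is effectively not used.

The system engineer had not been careful when changing this setting. For Oracle it is best not to turn swap off.

I changed it back: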

[root@OA01-1-24 scripts]# cat /proc/sys/vm/swappiness

60
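A small sketch for checking the setting, with the commands for changing it shown as comments. The function name `check_swappiness` and its optional file argument are made up for illustration (the argument only exists so the check can be tried against a test file); `sysctl -w` and /etc/sysctl.conf are the standard ways to set the value at runtime and across reboots.

```shell
#!/bin/bash
# Report the current vm.swappiness value.
# The optional file argument is a hypothetical hook for dry runs.
check_swappiness() {
    local file="${1:-/proc/sys/vm/swappiness}"
    local value
    value=$(cat "$file") || return 1
    if [ "$value" -eq 0 ]; then
        echo "vm.swappiness=0: swap is avoided as far as possible"
    else
        echo "vm.swappiness=$value"
    fi
}

# To change the value (as root):
#   sysctl -w vm.swappiness=60                        # takes effect immediately
#   echo 'vm.swappiness = 60' >> /etc/sysctl.conf     # kept across reboots
```

Running `check_swappiness` without arguments reads the live value from /proc/sys/vm/swappiness.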

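Summary: keep the DB server dedicated to the database as far as possible; otherwise all sorts of unexpected things will happen.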
