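Oracle Research Center study notes: an article on managing nodes in an Oracle RAC environment, covering the conditions under which RAC nodes are evicted and how the related settings are managed.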

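Unless marked as a repost, articles on this site are original. Reposted from loveOracleо wife & love life, Roger's Oracle technical blog.

Link to this article: On the eviction of Oracle RAC nodes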

Here I mainly want to discuss the case in RAC where a node is evicted and reboots as a result.

First, the heartbeats. Oracle Clusterware uses two kinds of heartbeat:

1. Disk heartbeat (voting device) - IOT

2. Network heartbeat (across the interconnect) - misscount

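The disk heartbeat here is the voting disk heartbeat. We all know the voting disk is the quorum disk, but what exactly does it do? The Oracle documentation describes it as follows: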

votedisk:

The Voting Disk is used by the Oracle cluster manager in various layers. The Node Monitor (NM) uses the Voting Disk for the Disk Heartbeat, which is essential in the detection and resolution of cluster "split brain". NM monitors the Voting Disk for other competing sub-clusters and uses it for the eviction phase. Hence the availability from the Voting Disk is critical for the operation of the Oracle Cluster Manager.

The shared volumes created for the OCR and the voting disk should be configured using RAID to protect against media failure. This requires the use of an external cluster volume manager, cluster file system, or storage hardware that provides RAID protection.

Disk heartbeat:

Each node writes a disk heartbeat to each voting disk once per second. Each node reads its kill block once per second; if the kill block is overwritten, the node commits suicide. During reconfig (join or leave), CSSD monitors all nodes and determines whether a node has a disk heartbeat, including those with no network heartbeat. If there is no disk heartbeat within the I/O timeout (MissCount during cluster reconfiguration), the node is declared dead.

The voting disk needs to be mirrored; should it become unavailable, the cluster will come down.

If an I/O error is reported immediately on access to the vote disk, we immediately mark the vote disk as offline so it isn't at the party anymore. So now we have (in our case) just two voting disks available. We do keep retrying access to that dead disk, and if it becomes available again and the data is uncorrupted we mark it online again. If a second vote disk suffers an I/O error in the window during which the first disk is marked offline, we no longer have quorum. Bang, reboot.

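The network heartbeat needs little explanation: it is the heartbeat over the RAC private interconnect. The following passage is taken from the web: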

Voting files are used by CSS to ensure data integrity of the database by detecting and resolving network problems that could lead to a split-brain, so must be accessible at all times. There are other techniques used by other cluster managers, like quorum server, and quorum disks which function differently, but serve the same purpose.

Note that a majority of vote disks, i.e. N/2 + 1, must be accessible by each node to ensure that all pairs of nodes have at least one voting file that they both see, which allows proper resolution of network issues; this is to address the possible complaint that 2 voting files provide redundancy, so a third should not be necessary.

During normal processing, each node writes a disk heartbeat once per second and also reads its kill block once per second. When the kill block indicates that the node has been evicted, the node exits, causing a node reboot. As long as we have enough voting disks online, the node can survive, but when the number of offline voting disks is greater than or equal to the number of online voting disks, the Cluster Communication Service daemon will fail, resulting in a reboot. The rationale for this is that as long as each node is required to have a majority of voting disks online, there is guaranteed to be one voting disk that both nodes in a 2 node pair can see.

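As noted above, when a node in the cluster fails, the number of offline voting disks (quorum disks) must be smaller than the number of online voting disks; otherwise the surviving nodes will reboot.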
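The majority rule quoted above can be written down as a small check. This is only a sketch of the stated rule (offline voting disks must be strictly fewer than online ones); the function name `has_voting_quorum` is made up for illustration and is not part of any Oracle tool.

```python
def has_voting_quorum(total_disks: int, online_disks: int) -> bool:
    """Return True if a node still sees a strict majority of voting disks."""
    if not 0 <= online_disks <= total_disks:
        raise ValueError("online_disks must be between 0 and total_disks")
    offline_disks = total_disks - online_disks
    # The node survives only while offline disks are strictly fewer than online ones.
    return offline_disks < online_disks


if __name__ == "__main__":
    # Three voting disks: losing one is survivable, losing two is not.
    for online in (3, 2, 1):
        print(f"3 disks, {online} online -> quorum: {has_voting_quorum(3, online)}")
    # Two voting disks: losing either one already breaks the majority.
    print(f"2 disks, 1 online -> quorum: {has_voting_quorum(2, 1)}")
```

The two-disk case shows why the quoted text insists on a third voting file: with two disks, a single failure leaves offline equal to online, which does not count as a majority.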

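Next, a few important parameters.

misscount: the maximum time, in seconds, for which the network heartbeat may be missed.

The default misscount differs by platform and version, as shown in the following table: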

OS         10g (R1 & R2)   11g
Linux      60              30
Unix       30              30
VMS        30              30
Windows    30              30

In addition, when third-party clusterware is used, misscount defaults to 600s, and split-brain resolution is then also handled by the third-party clusterware.
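For quick reference, the defaults above can be kept in a lookup table. The sketch below only restates the values given in this article (including the 600s vendor clusterware default); the names `DEFAULT_MISSCOUNT_S` and `default_misscount` are hypothetical, and the numbers should be checked against the MOS notes for the exact release in use.

```python
# Default misscount in seconds, keyed by (OS, release), as listed in this article.
DEFAULT_MISSCOUNT_S = {
    ("Linux", "10g"): 60,
    ("Linux", "11g"): 30,
    ("Unix", "10g"): 30,
    ("Unix", "11g"): 30,
    ("VMS", "10g"): 30,
    ("VMS", "11g"): 30,
    ("Windows", "10g"): 30,
    ("Windows", "11g"): 30,
}

# Default when third-party (vendor) clusterware is present, per the article.
VENDOR_CLUSTERWARE_MISSCOUNT_S = 600


def default_misscount(os_name: str, release: str, vendor_clusterware: bool = False) -> int:
    """Look up the default misscount for a platform and release."""
    if vendor_clusterware:
        return VENDOR_CLUSTERWARE_MISSCOUNT_S
    try:
        return DEFAULT_MISSCOUNT_S[(os_name, release)]
    except KeyError:
        raise ValueError(f"no default listed for {os_name} {release}") from None


if __name__ == "__main__":
    print(default_misscount("Linux", "10g"))
    print(default_misscount("Linux", "11g", vendor_clusterware=True))
```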

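The disk heartbeat timeout is disktimeout; from 10.2.0.1 onward (with patch 4896338 applied) its default is 200s.

disktimeout is also abbreviated as DTO, but the documentation further divides DTO into two kinds:

-- SDTO, short disk timeout: the time allowed while the cluster reconfigures, that is, when a node is added or removed.

-- LDTO: the time allowed for voting disk I/O to complete during normal RAC operation.

reboottime: defaults to 3s in both 10g and 11g. When RAC hits a split brain or a node is evicted, that node is rebooted within reboottime.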

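The documentation notes that from 10.2.0.1 onward, eviction on the disk heartbeat is no longer decided by misscount but by disktimeout. By default, misscount is smaller than disktimeout.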

So under what conditions is a node evicted? As follows:

· Node is not pinging via the network heartbeat

· Node is not pinging the Voting disk

· Node is hung/busy and is unable to perform either of the earlier tasks

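Based on the description in the documentation, the node reboot conditions are:

Network Ping                   Disk Ping                                                   Reboot
Completes within misscount     Completes within misscount                                  N
Completes within misscount     Takes longer than misscount but less than disktimeout       N
Completes within misscount     Takes longer than disktimeout                               Y
Takes longer than misscount    Completes within misscount                                  Y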
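The reboot conditions can be condensed into one predicate. The sketch below assumes the reading suggested by the eviction conditions listed earlier: a node reboots when its network ping takes longer than misscount or its disk ping takes longer than disktimeout. The function name `node_reboots` is hypothetical, and the default arguments (30s and 200s) are the 11g misscount and the patched disktimeout defaults mentioned in this article.

```python
def node_reboots(network_ping_s: float, disk_ping_s: float,
                 misscount_s: float = 30, disktimeout_s: float = 200) -> bool:
    """Return True if the node would be rebooted under the assumed rule."""
    # Network heartbeat is judged against misscount.
    network_lost = network_ping_s > misscount_s
    # Disk heartbeat is judged against disktimeout, not misscount.
    disk_lost = disk_ping_s > disktimeout_s
    return network_lost or disk_lost


if __name__ == "__main__":
    cases = [
        (1, 1),     # both within misscount
        (1, 60),    # disk slower than misscount but within disktimeout
        (1, 250),   # disk slower than disktimeout
        (45, 1),    # network slower than misscount
    ]
    for net, disk in cases:
        print(f"network={net}s disk={disk}s -> reboot: {node_reboots(net, disk)}")
```

Note the second case: a disk ping that exceeds misscount but stays under disktimeout does not by itself cause a reboot, which is the practical effect of decoupling the two timeouts.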

The parameters above are changed as follows:

$ORA_CRS_HOME/bin/crsctl set css misscount <n>

where <n> is the maximum i/o latency to the voting disk +1 second

On 10.2.0.1+ with patch 4896338 applied, the following steps are also needed:

$CRS_HOME/bin/crsctl set css reboottime <r> [-force]  (<r> is seconds)

$CRS_HOME/bin/crsctl set css disktimeout <d> [-force]  (<d> is seconds)
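When these settings are scripted, it can help to assemble the command line in one place and validate the arguments first. The sketch below only builds the argument list for the `crsctl set css` commands shown above and does not run anything; the helper name `build_css_set_command` and the example home `/u01/crs` are hypothetical, and allowing `-force` only for reboottime and disktimeout simply mirrors how the commands are written in this article.

```python
import posixpath
from typing import List

# Parameters shown in the article; the value says whether "[-force]" appears for it.
CSS_PARAMETERS = {"misscount": False, "reboottime": True, "disktimeout": True}


def build_css_set_command(parameter: str, seconds: int,
                          force: bool = False,
                          crs_home: str = "$CRS_HOME") -> List[str]:
    """Build (but do not execute) a 'crsctl set css <parameter> <seconds>' command."""
    if parameter not in CSS_PARAMETERS:
        raise ValueError(f"unsupported CSS parameter: {parameter}")
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    if force and not CSS_PARAMETERS[parameter]:
        raise ValueError(f"-force is not shown for {parameter} in the article")
    command = [posixpath.join(crs_home, "bin", "crsctl"),
               "set", "css", parameter, str(seconds)]
    if force:
        command.append("-force")
    return command


if __name__ == "__main__":
    # /u01/crs is a made-up Clusterware home used only for illustration.
    print(" ".join(build_css_set_command("disktimeout", 200, force=True,
                                         crs_home="/u01/crs")))
```

Returning a list rather than a string keeps the result ready for `subprocess.run` without shell quoting, should the command later be executed by an administrator with the appropriate privileges.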

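Finally, note that once third-party (vendor) clusterware is in use, Oracle no longer recommends changing misscount, because doing so can trigger latent errors. Oracle explains it as follows: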

Do not change default misscount values if you are running Vendor Clusterware along with Oracle Clusterware. The default values for misscount should not be changed when using vendor clusterware. Modifying misscount in this environment may cause clusterwide outages and potential corruptions.

For the information above, see the following MOS notes:

10g RAC- Steps To Increase CSS Misscount- Reboottime and Disktimeout

CSS Timeout Computation in Oracle Clusterware

Reconfiguring the CSS disktimeout of 10gR2 Clusterware for Proper LUN Failover of the Dell MD3000i iSCSI Storage [ID 462616.1]

How to start/stop the 10g CRS ClusterWare [ID 309542.1]
