Oracle Cluster Heartbeat Mechanisms:
       How does an Oracle cluster maintain cluster consistency? Cluster consistency means that every member of the cluster knows the state of the other members, and that every member has the same view of the other nodes' states and of the cluster membership list. This is the most basic requirement of a cluster.

Oracle implements cluster consistency through three mechanisms:
Network heartbeat: determines connectivity between nodes, so that each node knows the state of the others.
Disk heartbeat: stores node connectivity information in one or more shared locations, so that when the cluster needs to be reconfigured it can make the correct decision and record the latest cluster state.
Local heartbeat: a self-monitoring mechanism on each node, so that a node that runs into trouble can leave the cluster on its own and avoid introducing inconsistency.

1. Network Heartbeat

The ocssd process sends a network heartbeat every second over the cluster's private network to every other node in the cluster.
      For example, in a 4-node cluster, each node sends a network heartbeat to the other three nodes every second, which also means that every second each node receives the heartbeats sent by the other nodes. Since the nodes exchange heartbeats with each other, there must be a mechanism to determine connectivity between nodes, and a procedure for handling the situation when network heartbeats go missing.

The network heartbeat is implemented mainly by the following ocssd.bin threads:
      Sending thread: sends a network heartbeat to all nodes in the cluster every second.
      Analysis thread: analyzes and processes the heartbeats received; if it finds that a node keeps missing network heartbeats, it notifies the cluster to perform a reconfiguration.
      Cluster reconfiguration thread: carries out the cluster reconfiguration.
      Dispatch thread: receives messages from remote nodes and hands them to the appropriate thread according to the message type.

How they work together:
1. The sending thread sends a network heartbeat to the other remote nodes every second.
2. The dispatch thread receives the network heartbeats sent by the remote nodes.
3. The analysis thread processes the heartbeats received by the dispatch thread and determines node connectivity.

For example, when the analysis thread detects a connectivity problem, that is, no network heartbeat has been received from one or more nodes for a continuous period of time, the cluster performs a reconfiguration. The usual outcome of such a reconfiguration is that one or more nodes leave the cluster, which is why private-network problems between nodes break cluster consistency.
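A quick way to check the membership and connectivity view that CSS has built from these heartbeats is the olsnodes utility, or a cluster-wide health check with crsctl. This is only a minimal sketch; node names and output depend on your cluster:

# list each node with its node number and status (Active/Inactive) as seen by CSS
[grid@node1 ~]$ olsnodes -n -s

# verify that CRS, CSS and EVM are healthy on all nodes
[grid@node1 ~]$ crsctl check cluster -all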

2. Disk Heartbeat
      The main purpose of the disk heartbeat is to help decide how to resolve a split-brain situation when one occurs.
Explanation:
      Every node of an Oracle cluster registers its local disk heartbeat on all of the voting files (VF) every second, and at the same time writes into the voting files the list of other cluster nodes it can currently reach. When a split brain occurs, the CSS reconfiguration thread can use the information in the voting files to understand the connectivity between nodes, and decide into how many sub-clusters the cluster has split, which nodes each sub-cluster contains, and the state of each node.

Examples:

A two-node cluster (node1, node2) is configured with three voting files (VF1, VF2, VF3). node1 cannot access VF1 and node2 cannot access VF2, so both nodes can still access VF3. However, if a node were allowed to lose access to a majority of the voting files (⌊N/2⌋+1), it could happen that no single voting file is accessible to all nodes of the cluster, and then there would be no way to decide which nodes should leave the cluster and which should stay.

The same two-node cluster (node1, node2) with three voting files (VF1, VF2, VF3): suppose node1 cannot access VF1 and VF2, and node2 cannot access VF3. Then no voting file is visible to both nodes, so when a network problem occurs the cluster cannot obtain a consistent view of all node states from the voting files and cannot complete the reconfiguration. This is why the rule that every node must be able to access at least ⌊N/2⌋+1 voting files is enforced: it guarantees that at least one voting file is accessible to all nodes.
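The voting files currently in use can be listed on any node with crsctl; the file names and disk group in your output will of course differ:

# lists each voting file with its state, File Universal Id, path and (if ASM-based) disk group
[grid@node1 ~]$ crsctl query css votedisk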

3. Local Heartbeat
      The local heartbeat monitors the ocssd.bin process and the state of the local node.
      cssdagent and cssdmonitor monitor the local node's ocssd.bin process and the state of the local node itself. The monitoring of ocssd.bin is implemented through the local heartbeat: every second, at the same time as it sends the network heartbeat, ocssd.bin reports its own status (the local heartbeat) to cssdagent and cssdmonitor. As long as the local heartbeat is fine, cssdagent considers ocssd.bin to be healthy. If ocssd.bin keeps missing its local heartbeat (for the misscount interval), cssdagent concludes that the local ocssd.bin process has a problem and reboots the node.
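The three processes involved in the local heartbeat can be seen on a running node (a minimal sketch; the exact paths depend on where the Grid home is installed):

# ocssd.bin sends the heartbeats; cssdagent and cssdmonitor watch it and will reboot the node if it hangs
[grid@node1 ~]$ ps -ef | egrep 'ocssd|cssdagent|cssdmonitor' | grep -v grep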

Split brain:
      The network heartbeat is lost but the disk heartbeat is still normal. When a split brain occurs, the cluster splits into several sub-clusters and must be reconfigured.
      Basic reconfiguration rule: the sub-cluster with more nodes survives; if the sub-clusters contain the same number of nodes, the sub-cluster that contains the lowest-numbered node survives.

4. Querying and Setting the Network Heartbeat misscount and the Disk Heartbeat disktimeout
misscount defines the timeout for the cluster network heartbeat; the default value is 30 seconds. When one or more nodes keep missing network heartbeats for longer than misscount, the cluster must be reconfigured and one or more nodes must leave the cluster. In an 11gR2 cluster this value is also the local heartbeat timeout, because the local heartbeat and the network heartbeat are sent by the same thread.

Query the network heartbeat (NHB) misscount:
[root@node1 ~]# crsctl get css misscount;
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.

Query the disk heartbeat (DHB) disktimeout:
[root@node1 ~]# crsctl get css disktimeout;
CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services.

Change the network heartbeat (NHB) misscount:
[root@node1 ~]# crsctl set css misscount 50;

CRS-4684: Successful set of parameter misscount to 50 for Cluster Synchronization Services.

[root@node1 ~]# crsctl set css misscount 30;
CRS-4684: Successful set of parameter misscount to 30 for Cluster Synchronization Services.

Change the disk heartbeat disktimeout:

crsctl set css disktimeout 300
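The new value can be verified with the corresponding get command (note that these CSS parameters are normally changed only under the guidance of Oracle Support):

[root@node1 ~]# crsctl get css disktimeout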

----- Can the Oracle interconnect (heartbeat) be a direct cable connection?

1. What the RAC interconnect (heartbeat) is used for

It checks the network health between the cluster nodes, and is also used for cache synchronization and global resource maintenance. The interconnect also carries Cache Fusion data block transfers between instances, so its traffic volume is considerable; a gigabit network is the usual minimum, and 10GbE is of course better.

2. Can the RAC interconnect use a crossover cable?

A crossover cable limits RAC to two nodes, and direct cabling is unstable; Oracle does not provide support for bugs or technical issues caused by it.

For Oracle's official explanation, see RAC: Frequently Asked Questions [ID 220970.1], which states:

Is crossover cable supported as an interconnect with RAC on any platform ?

NO. CROSS OVER CABLES ARE NOT SUPPORTED. The requirement is to use a switch:

Detailed Reasons:

1) cross-cabling limits the expansion of RAC to two nodes

2) cross-cabling is unstable:

a) Some NIC cards do not work properly with it. They are not able to negotiate the DTE/DCE clocking, and will thus not function. These NICS were made cheaper by assuming that the switch was going to have the clock. Unfortunately there is no way to know which NICs do not have that clock.

b) Media sense behaviour on various OS's (most notably Windows) will bring a NIC down when a cable is disconnected. Either of these issues can lead to cluster instability and lead to ORA-29740 errors (node evictions).

Due to the benefits and stability provided by a switch, and their affordability ($200 for a simple 16 port GigE switch), and the expense and time related to dealing with issues when one does not exist, this is the only supported configuration.

From a purely technology point of view Oracle does not care if the customer uses cross over cable or router or switches to deliver a message. However, we know from experience that a lot of adapters misbehave when used in a crossover configuration and cause a lot of problems for RAC. Hence we have stated on certify that we do not support crossover cables to avoid false bugs and finger pointing amongst the various parties: Oracle, Hardware vendors, Os vendors etc...

3. High availability for the RAC interconnect

High availability for the RAC interconnect can be achieved with NIC bonding at the operating-system level. The two common bonding modes are load balancing and active-backup. Load balancing can in principle provide twice the bandwidth (in practice less, just somewhat faster), but from a reliability point of view active-backup mode is recommended. In active-backup mode, when one network interface fails (for example the primary switch loses power), there is no network interruption: the system continues to work over the surviving interface and the machine can still provide service, so the bond acts as failover protection. A configuration sketch is given after the mode list below.

Supplementary information:

Linux bonding mode parameters (mode=4 is recommended when the switch supports LACP, as it offers better performance and stability):

0 - round-robin: traffic is spread evenly across the bonded NICs using a round-robin algorithm.

1 - active-backup (high availability): only one NIC is used at a time and the others act as standbys; recommended when the load does not exceed the bandwidth of a single NIC.

2 - hash-based load balancing (balance-xor): traffic is distributed according to a hash computed per the xmit_hash_policy setting, so that traffic from a given source is handled on the same NIC as far as possible.

3 - broadcast: every bonded NIC receives the same data; used only for very special network requirements, such as sending identical data to two switches that are not connected to each other.

4 - 802.3ad (LACP) load balancing: requires the switch to also support 802.3ad; in theory, when both server and switch support this mode the bandwidth can double (e.g. from 1 Gbps to 2 Gbps).

5 - adaptive transmit load balancing (balance-tlb): outgoing traffic is distributed across all bonded NICs, while incoming traffic is received on a single NIC; if the receiving NIC fails, another takes over. The NICs and their drivers must report speed information via ethtool.

6 - adaptive load balancing (balance-alb): extends mode 5 by also balancing incoming traffic; besides ethtool speed support, it requires the ability to change the NIC MAC address dynamically.
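As an illustration of the active-backup (mode=1) setup recommended above, a typical Linux bonding configuration looks roughly like the following. This is only a sketch: the interface names, IP address and file locations are assumptions and vary between distributions.

# /etc/sysconfig/network-scripts/ifcfg-bond0  -- the private interconnect address lives on the bond
DEVICE=bond0
IPADDR=192.168.10.1
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=1 miimon=100 primary=eth1"

# /etc/sysconfig/network-scripts/ifcfg-eth1  -- repeat for eth2, the second slave
DEVICE=eth1
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes

# after restarting the network, check which slave is currently active
cat /proc/net/bonding/bond0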

4. Feasibility of dual interconnects for RAC

With OS-level bonding, the interconnect has a single private address in a single VLAN, runs in active-backup mode, and the two cables go to two different switches; all of this is done at the operating-system level. If instead the interconnect uses two private VLANs, there are two private heartbeat addresses, and the load balancing or failover between the two paths is handled by the Oracle software itself (no bonding is configured at the OS level). Oracle supports this starting with 11.2.0.2; because this HAIP feature had bugs when it was first released, 11.2.0.4 is recommended as the more stable version. Oracle's own examples target configurations with multiple database instances that need high interconnect bandwidth.
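With HAIP, the additional private networks are registered with the cluster through oifcfg instead of being bonded at the operating-system level. A hedged sketch (the interface names and subnets are only examples):

# show the interfaces currently registered with the cluster
[grid@node1 ~]$ oifcfg getif

# register a second private interconnect interface so HAIP can use it
[grid@node1 ~]$ oifcfg setif -global eth2/192.168.2.0:cluster_interconnect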

For the official documentation, see http://docs.oracle.com/database/121/RACAD/admin.htm#RACAD7295

MOS document ID 1210883.1 describes HAIP in detail; its description of HAIP is as follows:

Redundant Interconnect without any 3rd-party IP failover technology (bond, IPMP or similar) is supported natively by Grid Infrastructure starting from 11.2.0.2.  Multiple private network adapters can be defined either during the installation phase or afterward using the oifcfg.  Oracle Database, CSS, OCR, CRS, CTSS, and EVM components in 11.2.0.2 employ it automatically.

Grid Infrastructure can activate a maximum of four private network adapters at a time even if more are defined. The ora.cluster_interconnect.haip resource will start one to four link local  HAIP on private network adapters for interconnect communication for Oracle RAC, Oracle ASM, and Oracle ACFS etc.

Grid automatically picks free link local addresses from reserved 169.254.*.* subnet for HAIP. According to RFC-3927, link local subnet 169.254.*.* should not be used for any other purpose. With HAIP, by default, interconnect traffic will be load balanced across all active interconnect interfaces, and corresponding HAIP address will be failed over transparently to other adapters if one fails or becomes non-communicative.

The number of HAIP addresses is decided by how many private network adapters are active when Grid comes up on the first node in the cluster .  If there's only one active private network, Grid will create one; if two, Grid will create two; and if more than two, Grid will create four HAIPs. The number of HAIPs won't change even if more private network adapters are activated later, a restart of clusterware on all nodes is required for the number to change, however, the newly activated adapters can be used for fail over purpose.
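Whether HAIP is active on a node, and which 169.254.*.* link-local addresses it has assigned, can be checked as follows (a sketch; the resource name comes from the note quoted above, and the address output depends on the system):

# the HAIP resource belongs to the lower (ohasd) stack, hence the -init flag
[grid@node1 ~]$ crsctl stat res ora.cluster_interconnect.haip -init -t

# the HAIP link-local addresses appear on the private interfaces
[grid@node1 ~]$ ip addr | grep 169.254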

5. Does each business system's RAC interconnect need its own VLAN?

Oracle does not state this explicitly. VLAN isolation can be used for specific security requirements, but a large number of small VLANs adds management and configuration overhead.

-------

1. The NIC on node 1 fails, so node 1 cannot receive heartbeats from the other nodes.
Node 2 can still receive node 3's heartbeat, and node 3 can still receive node 2's heartbeat.
Node 1 reports to the voting disk: "Only I am alive!"
Nodes 2 and 3 report to the voting disk: "Nodes 2 and 3 are both alive."
A kill block is written into node 1's area of the voting disk; node 1 reads it and evicts (reboots) itself.
(Rule: the sub-cluster with the largest number of nodes is kept.)

2. Node 1 can reach voting disks 1, 2 and 3, while node 2 can only reach voting disk 3.
A kill block is written into node 2's area of the voting disks; node 2 reads it and evicts itself.
(A node can survive when the number of voting disks it can access is greater than the number it cannot access; when it can access fewer than it cannot, the node does not survive.)
 

3. In a two-node RAC, the NIC on node 1 or node 2 fails and the nodes cannot communicate. Node 2 is evicted.
(When the two halves of the split brain contain the same number of nodes, the node with the smaller instance number survives.)

4. All nodes lose their connection to the voting disks, but the network heartbeats between the nodes are all fine. In that case all nodes will reboot.

-------------------------------------Oracle Clusterware Software Concepts and Requirements

Oracle Clusterware uses voting disk files to provide fencing and cluster node membership determination. OCR provides cluster configuration information. You can place the Oracle Clusterware files on either Oracle ASM or on shared common disk storage. If you configure Oracle Clusterware on storage that does not provide file redundancy, then Oracle recommends that you configure multiple locations for OCR and voting disks. The voting disks and OCR are described as follows:

·         Voting Disks

Oracle Clusterware uses voting disk files to determine which nodes are members of a cluster. You can configure voting disks on Oracle ASM, or you can configure voting disks on shared storage.

If you configure voting disks on Oracle ASM, then you do not need to manually configure the voting disks. Depending on the redundancy of your disk group, an appropriate number of voting disks are created.

If you do not configure voting disks on Oracle ASM, then for high availability, Oracle recommends that you have a minimum of three voting disks on physically separate storage. This avoids having a single point of failure. If you configure a single voting disk, then you must use external mirroring to provide redundancy.

You should have at least three voting disks, unless you have a storage device, such as a disk array that provides external redundancy. Oracle recommends that you do not use more than five voting disks. The maximum number of voting disks that is supported is 15. (In other words, 1, 3 or 5 voting disks are all acceptable configurations.)

·         Oracle Cluster Registry

Oracle Clusterware uses the Oracle Cluster Registry (OCR) to store and manage information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, virtual IP addresses (VIPs), and services and any applications. OCR stores configuration information in a series of key-value pairs in a tree structure. To ensure cluster high availability, Oracle recommends that you define multiple OCR locations. In addition:

o    You can have up to five OCR locations

o    Each OCR location must reside on shared storage that is accessible by all of the nodes in the cluster

o    You can replace a failed OCR location online if it is not the only OCR location

o    You must update OCR through supported utilities such as Oracle Enterprise Manager, the Server Control Utility (SRVCTL), the OCR configuration utility (OCRCONFIG), or the Database Configuration Assistant (DBCA)
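The configured OCR locations and their integrity can be inspected with the ocrcheck utility; run as root it also performs the logical corruption check:

# reports each configured OCR location, used/available space and the integrity check result
[root@racnode1 ~]# ocrcheck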

See Also:

Chapter 2, "Administering Oracle Clusterware" for more information about voting disks and OCR

Oracle Clusterware Network Configuration Concepts
Oracle Clusterware enables a dynamic Grid Infrastructure through the self-management of the network requirements for the cluster. Oracle Clusterware 11g release 2 (11.2) supports the use of dynamic host configuration protocol (DHCP) for all private interconnect addresses, as well as for most of the VIP addresses. DHCP provides dynamic configuration of the host's IP address, but it does not provide an optimal method of producing names that are useful to external clients.

When you are using Oracle RAC, all of the clients must be able to reach the database. This means that the VIP addresses must be resolved by the clients. This problem is solved by the addition of the Oracle Grid Naming Service (GNS) to the cluster. GNS is linked to the corporate domain name service (DNS) so that clients can easily connect to the cluster and the databases running there. Activating GNS in a cluster requires a DHCP service on the public network. (Author's note: in practice it is often best to use neither GNS nor DHCP.)

Implementing GNS
To implement GNS, you must collaborate with your network administrator to obtain an IP address on the public network for the GNS VIP. DNS uses the GNS VIP to forward requests for access to the cluster to GNS. The network administrator must delegate a subdomain in the network to the cluster. The subdomain forwards all requests for addresses in the subdomain to the GNS VIP.

GNS and the GNS VIP run on one node in the cluster. The GNS daemon listens on the GNS VIP using port 53 for DNS requests. Oracle Clusterware manages the GNS and the GNS VIP to ensure that they are always available. If the server on which GNS is running fails, then Oracle Clusterware fails GNS over, along with the GNS VIP, to another node in the cluster.

With DHCP on the network, Oracle Clusterware obtains an IP address from the server along with other network information, such as what gateway to use, what DNS servers to use, what domain to use, and what NTP server to use. Oracle Clusterware initially obtains the necessary IP addresses during cluster configuration and it updates the Oracle Clusterware resources with the correct information obtained from the DHCP server.

Single Client Access Name (SCAN)
Oracle RAC 11g release 2 (11.2) introduces the Single Client Access Name (SCAN). The SCAN is a single name that resolves to three IP addresses in the public network. When using GNS and DHCP, Oracle Clusterware configures the VIP addresses for the SCAN name that is provided during cluster configuration.

The node VIP and the three SCAN VIPs are obtained from the DHCP server when using GNS. If a new server joins the cluster, then Oracle Clusterware dynamically obtains the required VIP address from the DHCP server, updates the cluster resource, and makes the server accessible through GNS.

Example 1-1 shows the DNS entries that delegate a domain to the cluster.

Example 1-1 DNS Entries

# Delegate to gns on mycluster
mycluster.example.com NS myclustergns.example.com
#Let the world know to go to the GNS vip
myclustergns.example.com. 10.9.8.7
See Also:

Oracle Grid Infrastructure Installation Guide for details about establishing resolution through DNS

Configuring Addresses Manually
Alternatively, you can choose manual address configuration, in which you configure the following:

·         One public host name for each node.

·         One VIP address for each node.

You must assign a VIP address to each node in the cluster. Each VIP address must be on the same subnet as the public IP address for the node and should be an address that is assigned a name in the DNS. Each VIP address must also be unused and unpingable from within the network before you install Oracle Clusterware.

·         Up to three SCAN addresses for the entire cluster.

Note:

The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends that you configure the SCAN to resolve to three addresses.
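A DNS-registered SCAN can be checked from any client with an ordinary name lookup; the SCAN name below is hypothetical:

# the SCAN name should resolve to (up to) three addresses, returned in round-robin order
$ nslookup mycluster-scan.example.com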

See Also:

Your platform-specific Oracle Grid Infrastructure Installation Guide installation documentation for information about system requirements and configuring network addresses

Overview of Oracle Clusterware Platform-Specific Software Components
When Oracle Clusterware is operational, several platform-specific processes or services run on each node in the cluster. This section describes these various processes and services.

----nodeapps

----cluster resource

The Oracle Clusterware Stack
Oracle Clusterware consists of two separate stacks: an upper stack anchored by the Cluster Ready Services (CRS) daemon (crsd) and a lower stack anchored by the Oracle High Availability Services daemon (ohasd). These two stacks have several processes that facilitate cluster operations. The following sections describe these stacks in more detail: (What about cssd? It is part of the CRS stack described below.)

·         The Cluster Ready Services Stack

·         The Oracle High Availability Services Stack

The Cluster Ready Services Stack
The list in this section describes the processes that comprise CRS. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.

·         Cluster Ready Services (CRS): The primary program for managing high availability operations in a cluster.

The CRS daemon (crsd) manages cluster resources based on the configuration information that is stored in OCR for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes. When you have Oracle RAC installed, the crsd process monitors the Oracle database instance, listener, and so on, and automatically restarts these components when a failure occurs.

·         Cluster Synchronization Services (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information.

The cssdagent process monitors the cluster and provides I/O fencing. This service formerly was provided by Oracle Process Monitor Daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure may result in Oracle Clusterware restarting the node.   nodeapps

·         Oracle ASM: Provides disk management for Oracle Clusterware and Oracle Database.

·         Cluster Time Synchronization Service (CTSS): Provides time management in a cluster for Oracle Clusterware.

·         Event Management (EVM): A background process that publishes events that Oracle Clusterware creates.

·         Oracle Notification Service (ONS): A publish and subscribe service for communicating Fast Application Notification (FAN) events.

·         Oracle Agent (oraagent): Extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g release 1 (11.1).

·         Oracle Root Agent (orarootagent): A specialized oraagent process that helps crsd manage resources owned by root, such as the network, and the Grid virtual IP address.

The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Notification Services (ONS) components communicate with other cluster component layers on other nodes in the same cluster database environment. These components are also the main communication links between Oracle Database, applications, and the Oracle Clusterware high availability components. In addition, these background processes monitor and manage database operations.

css ons evmd

The Oracle High Availability Services Stack
This section describes the processes that comprise the Oracle High Availability Services stack. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.

·         Cluster Logger Service (ologgerd): Receives information from all the nodes in the cluster and persists in a CHM Repository-based database. This service runs on only two nodes in a cluster.

·         System Monitor Service (osysmond): The monitoring and operating system metric collection service that sends the data to the cluster logger service. This service runs on every node in a cluster.

·         Grid Plug and Play (GPNPD): Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile. OLR?

·         Grid Interprocess Communication (GIPC): A support daemon that enables Redundant Interconnect Usage.

·         Multicast Domain Name Service (mDNS): Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.

·         Oracle Grid Naming Service (GNS): Handles requests sent by external DNS servers, performing name resolution for names defined by the cluster.

2. Viewing the OHASD Resources

Oracle High Availability Services Daemon (OHASD) :This process anchors the lower part of the Oracle Clusterware stack, which consists of processes that facilitate cluster operations.

When starting CRS in 11gR2, a message tells you that ohasd has already been started. So which resources does OHASD actually manage? We can check with the following command:

[grid@racnode1 ~]$ crsctl stat res -init -t

---------------------------------------------------------

NAME           TARGET  STATE    SERVER    STATE_DETAILS

---------------------------------------------------------

Cluster Resources

---------------------------------------------------------

ora.asm

1        ONLINE  ONLINE   racnode1  Started

ora.crsd

1        ONLINE  ONLINE   racnode1

ora.cssd

1        ONLINE  ONLINE   racnode1

ora.cssdmonitor

1        ONLINE  ONLINE   racnode1

ora.ctssd

1        ONLINE  ONLINE   racnode1  OBSERVER

ora.diskmon

1        ONLINE  ONLINE   racnode1

ora.drivers.acfs

1        ONLINE  UNKNOWN  racnode1

ora.evmd

1        ONLINE  ONLINE   racnode1

ora.gipcd

1        ONLINE  ONLINE   racnode1

ora.gpnpd

1        ONLINE  ONLINE   racnode1

ora.mdnsd

1        ONLINE  ONLINE   racnode1

Let's go through these processes one by one:

(1) ora.asm: the ASM instance. In 10g the OCR and voting disks were stored on other shared devices; in 11gR2 they are placed in ASM by default. Clusterware has to read this information during startup, so the ASM instance must be started as part of cluster startup.

(2) ora.crsd, ora.cssd and ora.evmd:

These are the three most important Clusterware processes.

In 10g, the final stage of the clusterware installation asks you to run the root.sh script on every node; this script appends these three processes to the end of /etc/inittab, so that Clusterware starts automatically on every system boot. If EVMD or CRSD fails, the system automatically restarts the process; if CSSD fails, the system reboots immediately.

In 11gR2, only ohasd is written to /etc/inittab.

[grid@racnode1 init.d]$ cat /etc/inittab

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null

So the commands commonly used in 10g, such as /etc/init.d/init.crs, no longer exist; only the /etc/init.d/init.ohasd entry remains.

OCSSD: this is the most critical Clusterware process; if it fails, the node reboots. It provides the CSS (Cluster Synchronization Service). CSS monitors the cluster state in real time through the heartbeat mechanisms described above and provides fundamental cluster services such as split-brain protection.

CRSD: the main process that implements high availability (HA); the service it provides is called CRS (Cluster Ready Service). Every component that needs high availability is registered in the OCR as a CRS resource during installation and configuration, and the crsd process uses the contents of the OCR to decide which processes to monitor, how to monitor them, and what to do when a problem occurs. In other words, crsd is responsible for monitoring the state of CRS resources and for starting, stopping, monitoring and failing over those resources. By default, CRS tries to restart a failed resource 5 times before giving up.

CRS resources include GSD (Global Services Daemon), ONS (Oracle Notification Service), VIP, Database, Instance and Service. The nodeapps resources are local resources.

EVMD: publishes the events generated by CRS. These events can be delivered to clients in two ways: through ONS or through callout scripts.

For more details on the roles of these three processes, see the article "RAC 的一些概念性和原理性的知识":

http://www.cndba.cn/Dave/article/1021

(3)Grid Plug and Play (GPNPD):

Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.

(4)Grid Interprocess Communication (GIPC):

A support daemon that enables Redundant Interconnect Usage.

(5) ora.mdnsd (Multicast Domain Name Service, mDNS):

Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.

(6)Cluster Time Synchronization Service (CTSS):

Provides time management in a cluster for Oracle Clusterware. In the query output above we can see that CTSS is in the OBSERVER state, i.e. observer mode.

In 11gR2, time synchronization for a RAC installation can be provided in two ways: NTP or CTSS. If the installer finds that NTP is not active, the Cluster Time Synchronization Service is installed in active mode and synchronizes the time across all nodes. If NTP is found to be configured, CTSS is started in observer mode and Oracle Clusterware does not perform active time synchronization within the cluster.
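The current CTSS mode can be confirmed on any node (a minimal sketch):

# reports whether CTSS is running in active or observer mode, and the current time offset
[grid@racnode1 ~]$ crsctl check ctss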

(7)Automatic Storage Management Cluster File System (Oracle ACFS):

Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is a multi-platform, scalable file system, and storage management technology that extends Oracle Automatic Storage Management (Oracle ASM) functionality to support customer files maintained outside of Oracle Database. Oracle ACFS supports many database and application files, including executables, database trace files, database alert logs, application reports, BFILEs, and configuration files. Other supported files are video, audio, text, images, engineering drawings, and other general-purpose application file data.

An Oracle ACFS file system is a layer on Oracle ASM and is configured with Oracle ASM storage, as shown in Figure 5-1. Oracle ACFS leverages Oracle ASM functionality that enables:

·         Oracle ACFS dynamic file system resizing

·         Maximized performance through direct access to Oracle ASM disk group storage

·         Balanced distribution of Oracle ACFS across Oracle ASM disk group storage for increased I/O parallelism

·         Data reliability through Oracle ASM mirroring protection mechanisms

For more details, see:

http://download.oracle.com/docs/cd/E11882_01/server.112/e16102/asmfilesystem.htm#OSTMG31000

3. Viewing the CRS Resources

In 11.2, the resources managed by CRSD are divided into two categories: Local Resources and Cluster Resources. (The OHASD resources shown in the previous section are queried separately with the -init option.)

[grid@racnode1 ~]$ crsctl stat res -t

---------------------------------------------------------

NAME           TARGET  STATE    SERVER     STATE_DETAILS

---------------------------------------------------------

Local Resources

---------------------------------------------------------

ora.CRS.dg

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.DATA.dg

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.FRA.dg

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.LISTENER.lsnr

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.asm

ONLINE  ONLINE   racnode1   Started

ONLINE  ONLINE   racnode2   Started

ora.eons

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.gsd

OFFLINE OFFLINE  racnode1

OFFLINE OFFLINE  racnode2

ora.net1.network

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.ons

ONLINE  ONLINE   racnode1

ONLINE  ONLINE   racnode2

ora.registry.acfs

ONLINE  UNKNOWN  racnode1

ONLINE  ONLINE   racnode2

---------------------------------------------------------

Cluster Resources

---------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1        ONLINE  ONLINE   racnode2

ora.oc4j

1        OFFLINE OFFLINE

ora.racdb.db

1        ONLINE  ONLINE   racnode1   Open

2        ONLINE  ONLINE   racnode2   Open

ora.racnode1.vip

1        ONLINE  ONLINE   racnode1

ora.racnode2.vip

1        ONLINE  ONLINE   racnode2

ora.scan1.vip

1        ONLINE  ONLINE   racnode2

[grid@racnode1 ~]$

As the query output above shows, in 11gR2 the network, disk groups, eons and ASM are also managed as resources.

One more thing to note: the gsd and oc4j resources are OFFLINE. The explanation is as follows:

ora.gsd  is OFFLINE by default if there is no 9i database in the cluster.

ora.oc4j is OFFLINE in 11.2.0.1 as Database Workload Management (DBWLM) is unavailable. These can be ignored in 11gR2 RAC.

The resources can also be listed with the crs_stat command:

[root@racnode1 ~]# crs_stat -t

Name           Type           Target    State     Host

------------------------------------------------------------

ora.CRS.dg     ora....up.type     ONLINE    ONLINE    racnode1

ora.DATA.dg    ora....up.type    ONLINE    ONLINE    racnode1

ora.FRA.dg     ora....up.type     ONLINE    ONLINE    racnode1

ora....ER.lsnr      ora....er.type       ONLINE    ONLINE    racnode1

ora....N1.lsnr      ora....er.type       ONLINE    ONLINE    racnode2

ora.asm        ora.asm.type    ONLINE    ONLINE    racnode1

ora.eons       ora.eons.type     ONLINE    ONLINE    racnode1

ora.gsd        ora.gsd.type     OFFLINE   OFFLINE

ora....network     ora....rk.type       ONLINE    ONLINE    racnode1

ora.oc4j       ora.oc4j.type       OFFLINE   OFFLINE

ora.ons        ora.ons.type      ONLINE    ONLINE    racnode1

ora.racdb.db    ora....se.type       ONLINE    ONLINE    racnode1

ora....SM1.asm   application       ONLINE    ONLINE    racnode1

ora....E1.lsnr       application       ONLINE    ONLINE    racnode1

ora....de1.gsd     application       OFFLINE   OFFLINE

ora....de1.ons      application       ONLINE    ONLINE    racnode1

ora....de1.vip      ora....t1.type       ONLINE    ONLINE    racnode1

ora....SM2.asm   application       ONLINE    ONLINE    racnode2

ora....E2.lsnr       application       ONLINE    ONLINE    racnode2

ora....de2.gsd     application       OFFLINE   OFFLINE

ora....de2.ons      application       ONLINE    ONLINE    racnode2

ora....de2.vip      ora....t1.type       ONLINE    ONLINE    racnode2

ora....ry.acfs        ora....fs.type       ONLINE    ONLINE    racnode2

ora.scan1.vip     ora....ip.type       ONLINE    ONLINE    racnode1

ora.scan2.vip     ora....ip.type       ONLINE    ONLINE    racnode2

[root@racnode1 ~]#

4. Viewing the Dependencies Between Resources

For example, a disk group resource depends on ASM, and a VIP depends on its network resource. These dependencies can be seen in the detailed attributes of each resource:

[root@racnode1 ~]# crsctl stat res ora.DATA.dg -p

NAME=ora.DATA.dg

TYPE=ora.diskgroup.type

ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--

ACTION_FAILURE_TEMPLATE=

ACTION_SCRIPT=

AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%

ALIAS_NAME=

AUTO_START=never

CHECK_INTERVAL=300

CHECK_TIMEOUT=600

DEFAULT_TEMPLATE=

DEGREE=1

DESCRIPTION=CRS resource type definition for ASM disk group resource

ENABLED=1

LOAD=1

LOGGING_LEVEL=1

NLS_LANG=

NOT_RESTARTING_TEMPLATE=

OFFLINE_CHECK_INTERVAL=0

PROFILE_CHANGE_TEMPLATE=

RESTART_ATTEMPTS=5

SCRIPT_TIMEOUT=60

START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)

START_TIMEOUT=900

STATE_CHANGE_TEMPLATE=

STOP_DEPENDENCIES=hard(intermediate:ora.asm)

STOP_TIMEOUT=180

UPTIME_THRESHOLD=1d

USR_ORA_ENV=

USR_ORA_OPI=false

USR_ORA_STOP_MODE=

VERSION=11.2.0.1.0

[grid@racnode1 ~]$ crsctl stat res ora.racnode1.vip -p

NAME=ora.racnode1.vip

TYPE=ora.cluster_vip_net1.type

ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x

ACTION_FAILURE_TEMPLATE=

ACTION_SCRIPT=

ACTIVE_PLACEMENT=1

AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%

AUTO_START=restore

CARDINALITY=1

CHECK_INTERVAL=1

DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=vip)

DEGREE=1

DESCRIPTION=Oracle VIP resource

ENABLED=1

FAILOVER_DELAY=0

FAILURE_INTERVAL=0

FAILURE_THRESHOLD=0

HOSTING_MEMBERS=racnode1

LOAD=1

LOGGING_LEVEL=1

NLS_LANG=

NOT_RESTARTING_TEMPLATE=

OFFLINE_CHECK_INTERVAL=0

PLACEMENT=favored

PROFILE_CHANGE_TEMPLATE=

RESTART_ATTEMPTS=0

SCRIPT_TIMEOUT=60

SERVER_POOLS=*

START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network)

START_TIMEOUT=0

STATE_CHANGE_TEMPLATE=

STOP_DEPENDENCIES=hard(ora.net1.network)

STOP_TIMEOUT=0

UPTIME_THRESHOLD=1h

USR_ORA_ENV=

USR_ORA_VIP=racnode1-vip

VERSION=11.2.0.1.0
