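Oracle RAC 11g + DG: Deploying a High-Availability Disaster Recovery Solution

Before writing this post I read Dave's blog post at http://blog.csdn.net/tianlesoftware/article/details/8212349 closely and learned a lot from it.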

Software environment

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

PL/SQL Release 11.2.0.4.0 - Production

CORE 11.2.0.4.0 Production

TNS for Linux: Version 11.2.0.4.0 - Production

NLSRTL Version 11.2.0.4.0 - Production

more /etc/redhat-release

System environment

Linux rac1.localdomain 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

Preparation before building the cluster database

Network configuration

Format: <IP address> <fully qualified name> <alias>

rac1

10.37.4.120 rac1.localdomain rac1

192.168.56.120 rac1-pri.localdomain rac1-pri

10.37.4.122 rac1-vip.localdomain rac1-vip

rac2

10.37.4.121 rac2.localdomain rac2

192.168.56.121 rac2-pri.localdomain rac2-pri

10.37.4.124 rac2-vip.localdomain rac2-vip

SCAN IP

10.37.4.131 rac-scan.localdomain rac-scan
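Before moving on, it can help to confirm that every planned name really has an entry on each node. The following is a minimal sketch, not part of the original procedure: the function name missing_hosts is hypothetical, and it only checks that a name appears in a hosts-format file, not that the address is reachable.

```shell
#!/bin/bash
# Print every required name that has no entry in the given hosts-format file.
# Usage: missing_hosts <hosts_file> <name>...   (hypothetical helper)
missing_hosts() {
    local hosts_file=$1; shift
    local name
    for name in "$@"; do
        # Strip comments, then look for the name as a whole field after the IP column.
        if ! sed 's/#.*//' "$hosts_file" |
             awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                               END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Example: check the names planned above against the live /etc/hosts.
# missing_hosts /etc/hosts rac1 rac1-pri rac1-vip rac2 rac2-pri rac2-vip
```

An empty result means every listed name was found. The SCAN name is left out of the example on the assumption that it is resolved through DNS rather than /etc/hosts.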

**Problem:

1. Error: No suitable device found: no device found for connection 'System eth0'. [FAILED]

Fixes:

1. Delete /etc/udev/rules.d/70-persistent-net.rules and reboot.

2. If that does not help, check whether HWADDR in ifcfg-eth0 is correct and fix it if not.

Kernel parameter configuration

1.

vi /etc/sysctl.conf

fs.aio-max-nr = 1048576

fs.file-max = 6815744

kernel.shmmni = 4096

kernel.sem = 250 32000 100 128

net.ipv4.ip_local_port_range = 9000 65500

net.core.rmem_default = 262144

net.core.rmem_max = 4194304

net.core.wmem_default = 262144

net.core.wmem_max = 1048576

sysctl -p
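A typo in /etc/sysctl.conf is easy to miss, so a small check of the file against the intended values can be useful. This is a sketch under assumptions: check_min is a hypothetical helper, it treats the values listed above as minimums, and it handles single-valued keys only (not kernel.sem or net.ipv4.ip_local_port_range).

```shell
#!/bin/bash
# Report whether a single-valued key in a sysctl.conf-style file meets a minimum.
# Usage: check_min <file> <key> <minimum>; returns 0 when the value is >= minimum.
check_min() {
    local file=$1 key=$2 min=$3 value
    # Take the last uncommented assignment of the key, which is the one that wins.
    value=$(awk -F= -v k="$key" '
        /^[[:space:]]*#/ { next }
        { name = $1; gsub(/[[:space:]]/, "", name) }
        name == k { v = $2; gsub(/[[:space:]]/, "", v) }
        END { print v }' "$file")
    if [ -z "$value" ]; then
        echo "$key: not set"; return 1
    elif [ "$value" -lt "$min" ]; then
        echo "$key: $value is below $min"; return 1
    fi
    echo "$key: $value ok"
}

# Example:
# check_min /etc/sysctl.conf fs.file-max 6815744
# check_min /etc/sysctl.conf net.core.rmem_max 4194304
```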

2.

vi /etc/security/limits.conf

oracle soft nproc 2047

oracle hard nproc 16384

oracle soft nofile 1024

oracle hard nofile 65536

oracle soft stack 10240

grid soft nproc 2047

grid hard nproc 16384

grid soft nofile 1024

grid hard nofile 65536

grid soft stack 10240
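To double-check what was written to limits.conf for a given user, the file can be queried directly. The helper below is a sketch only; get_limit is a hypothetical name, and it reads the file as plain text without consulting /etc/security/limits.d or the running session.

```shell
#!/bin/bash
# Print the value configured for <user> <soft|hard> <item> in a limits.conf-style file.
# Usage: get_limit <file> <user> <type> <item>; returns 1 when no entry exists.
get_limit() {
    awk -v u="$2" -v t="$3" -v i="$4" '
        /^[[:space:]]*#/ { next }
        $1 == u && $2 == t && $3 == i { v = $4 }
        END { if (v == "") exit 1; print v }' "$1"
}

# Example:
# get_limit /etc/security/limits.conf grid hard nofile
```

After logging in as the user, ulimit -n and ulimit -u show the limits that are actually in effect.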

Users and permissions

User and group planning (remove any existing accounts and groups first, then recreate them):

userdel grid

userdel oracle

groupdel oinstall

groupdel dba

groupdel oper

groupdel asmadmin

groupdel asmoper

groupdel asmdba

groupadd -g 501 oinstall

groupadd -g 502 dba

groupadd -g 503 oper

groupadd -g 504 asmadmin

groupadd -g 505 asmoper

groupadd -g 506 asmdba

useradd -u 5002 -g oinstall -G dba,oper,asmadmin,asmoper,asmdba -d /home/grid grid

useradd -u 5001 -g oinstall -G dba,asmdba -d /home/oracle oracle

Edit each user's profile file:

-- oracle

PATH=$PATH:$HOME/bin

export PATH

export PS1="`/bin/hostname -s`-> "

export TMP=/tmp

export TMPDIR=$TMP

export ORACLE_HOSTNAME=node2.localdomain

export ORACLE_SID=jhdb1

export ORACLE_BASE=/u01/app/oracle

export ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1

export ORACLE_UNQNAME=devdb

export TNS_ADMIN=$ORACLE_HOME/network/admin

export ORACLE_TERM=xterm

export PATH=/usr/sbin:$PATH

export PATH=$ORACLE_HOME/bin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib

export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib

export EDITOR=vi

export LANG=en_US

export NLS_LANG=american_america.AL32UTF8

export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'

umask 022

-- grid

PATH=$PATH:$HOME/bin

export PATH

export PS1="`/bin/hostname -s`-> "

export TMP=/tmp

export TMPDIR=$TMP

export ORACLE_SID=+ASM1

export ORACLE_BASE=/u01/app/grid

export ORACLE_HOME=/u01/app/11.2.0/grid

export ORACLE_TERM=xterm

export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'

export TNS_ADMIN=$ORACLE_HOME/network/admin

export PATH=/usr/sbin:$PATH

export PATH=$ORACLE_HOME/bin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib

export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib

export EDITOR=vi

export LANG=en_US

export NLS_LANG=american_america.AL32UTF8

umask 022

-- Directory permissions:

mkdir -p /u01/app/grid

mkdir -p /u01/app/11.2.0/grid

mkdir -p /u01/app/oracle

chown -R oracle:oinstall /u01

chown -R grid:oinstall /u01/app/grid

chown -R grid:oinstall /u01/app/11.2.0

chmod -R 775 /u01

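System service configuration

1. Stop the NTP service

Grid Infrastructure ships its own Cluster Time Synchronization Service (CTSS, the ctssd daemon). To let it run in active mode, stop NTP and remove its configuration file.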

service ntpd stop

chkconfig ntpd off

cp /etc/ntp.conf /etc/ntp.conf.bak

rm -rf /etc/ntp.conf

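2. Enable the vsftpd service

3. Disable the firewall and SELinux

service iptables stop

setenforce 0

vi /etc/selinux/config (set SELINUX=disabled so the change survives a reboot)

4. Enable the VNC service

Enabling VNC on CentOS:

List the vncserver configuration files:

rpm -qc tigervnc-server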

SSH user equivalence

Configure SSH user equivalence.

On both nodes:

rm -rf ~/.ssh

mkdir ~/.ssh

ssh-keygen -t rsa

ssh-keygen -t dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

On rac1:

scp ~/.ssh/authorized_keys rac2:~/.ssh/keys

On rac2:

cat ~/.ssh/keys >> ~/.ssh/authorized_keys

scp ~/.ssh/authorized_keys rac1:~/.ssh/authorized_keys

Verify user equivalence (each command should print the date without asking for a password):

ssh rac1 date

ssh rac2 date
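The same check can be scripted so that it fails instead of hanging at a password prompt. This is a minimal sketch: failed_nodes is a hypothetical helper, and it relies on the standard OpenSSH option BatchMode=yes, which disables interactive password prompts.

```shell
#!/bin/bash
# Print each node that cannot be reached over SSH without a password prompt.
# Usage: failed_nodes <node>...   (hypothetical helper)
failed_nodes() {
    local node
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ! ssh -o BatchMode=yes "$node" date >/dev/null 2>&1; then
            echo "$node"
        fi
    done
}

# Example, run as grid and again as oracle on every node:
# failed_nodes rac1 rac2 rac1-pri rac2-pri
```

No output means every listed node accepted the key. A host that has never been contacted may still fail until its host key has been accepted once.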

DNS server configuration

1. Packages:

bind-9.3.6-4.Pl.el5_4.2.x86_64.rpm

bind-chroot-9.3.6-4.Pl.el5_4.2.x86_64.rpm

caching-nameserver-9.3.6-4.Pl.el5_4.2.x86_64.rpm

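2. Configure /var/named/chroot/etc/named.conf

The file is created by copying the caching-nameserver template:

cp -p named.caching-nameserver.conf named.conf

Change 127.0.0.1 to "any;" so that every IP address is allowed to query the server.

3. Configure the zones

Edit /var/named/chroot/etc/named.rfc1912.zones.

The main purpose is to make the SCAN IP resolve correctly.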

Forward zone:

zone "localdomain" IN {

type master;

file "localdomain.zone";

allow-update { none; };

};

scan:

192.168.56.140 rac-scan.localdomain rac-scan

Reverse zone:

zone "56.168.192.in-addr.arpa" IN {

type master;

file "56.168.192.in-addr.arpa";

allow-update { none; };

};

Configure the forward and reverse zone data files in:

/var/named/chroot/var/named

Forward zone data file, localdomain.zone:

rac-scan IN A 192.168.56.140

Reverse zone data file (created with cp -p named.local 56.168.192.in-addr.arpa):

140 IN PTR rac-scan.localdomain.

Start the DNS server: /etc/init.d/named start

Verification

rac1:

Configure /etc/resolv.conf:

search localdomain

nameserver 192.168.56.120

rac2:

Configure /etc/resolv.conf:

search localdomain

nameserver 192.168.56.120

Verify with nslookup rac-scan, nslookup rac-scan.localdomain and nslookup 10.37.4.173.

DNS configuration on CentOS 6:

1. yum -y install bind-chroot.x86_64 bind.x86_64

2. vi /etc/named.conf

Change 127.0.0.1 and localhost to any.

// reverse lookup zone

zone "4.37.10.in-addr.arpa" IN {

type master;

file "4.37.10.in-addr.arpa.zone";

allow-update { none; };

};

// forward lookup zone

zone "localdomain" IN {

type master;

file "named.localhost";

allow-update { none; };

};

3.

// forward zone data file

vi /var/named/named.localhost

$TTL 86400

@ IN SOA @ root.localdomain. (

42 ; serial

3H ; refresh

15M ; retry

15W ; expire

1D ) ; minimum

NS @

A 10.37.4.170

rac-scan IN A 10.37.4.173

// reverse zone data file

vi /var/named/4.37.10.in-addr.arpa.zone

$TTL 86400

@ IN SOA 4.37.10.in-addr.arpa. localhost.localdomain. (

0 ; serial

1D ; refresh

1H ; retry

1W ; expire

3H ) ; minimum

NS @

A 10.37.4.170

173 IN PTR rac-scan.localdomain.

Restart the DNS server

service named restart

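**Fix for /etc/resolv.conf changes being lost after a restart on CentOS

Note that editing /etc/resolv.conf directly does not stick: when the network service restarts, it regenerates the file from /etc/sysconfig/network-scripts/ifcfg-eth0. If ifcfg-eth0 defines no DNS server, resolv.conf is overwritten and ends up empty.

Fix: add a line "DNS1=10.37.4.170" to /etc/sysconfig/network-scripts/ifcfg-eth0.

Check the listening ports: netstat -ltunp | grep named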

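ASM shared storage configuration

Before 11g, RAC could be installed directly on raw devices.

From 11g Release 2 on, the installer no longer accepts raw devices for a new installation and ASM is used instead, which raises the question of device name persistence. There are three ways to make device names persistent:

(1) udev

(2) multipath

(3) Oracle ASMLib

Binding raw disks with udev:

Attaching the raw disks:

Here the disks are shared over iSCSI and attached with the iSCSI initiator (client):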

1. Check whether the iSCSI initiator is installed.

Building the iSCSI driver is done against the kernel, so the kernel source and a compiler are needed. Make sure the following software is present on the Linux system: kernel-source, kernel, gcc, perl, Apache. Open a terminal and check:

[root@rac1 mnt]# rpm -qa | grep iscsi

iscsi-initiator-utils-6.2.0.872-6.el5

2. Look up the initiator IQN and register it on the storage array.

[root@rac1 mnt]# more /etc/iscsi/initiatorname.iscsi

InitiatorName=iqn.1994-05.com.redhat:e9cbe2cae7e4 (enter this name on the iSCSI target)

3. Start the iSCSI service.

[root@rac1 mnt]# service iscsid start

Starting iSCSI daemon:

[ OK ]

4. Discover the iSCSI targets.

# iscsiadm -m discovery -t sendtargets -p ipaddress (the IP address of the storage array)

[root@rac1 mnt]# iscsiadm -m discovery -t sendtargets -p 10.37.55.2

5. Log in to the target.

# iscsiadm -m node -T targetname -p ipaddress --login (ipaddress is the IP address of the storage array)

# iscsiadm -m node -T iqn.1991-05.com.microsoft:jhzg-cc01-data-55.2-target -p 10.37.55.2:3260 -l

6. Check the iSCSI nodes.

[root@rac1 mnt]# iscsiadm -m node -p 10.37.55.2 -l

7. Make the iSCSI target log in automatically at boot.

# iscsiadm -m node -T iqn.1991-05.com.microsoft:jhzg-cc01-data-55.2-target -p 10.37.55.2:3260 --op update -n node.startup -v automatic

(iqn.1991-05.com.microsoft:jhzg-cc01-4.100-target)

Check that the iscsi service starts at boot: chkconfig --list | grep iscsi

8. Reboot, then run fdisk -l to see the disks attached over iSCSI.

mkfs.ext3 /dev/sdj

mount -t ext3 /dev/sdj /mnt

vi /etc/fstab

/dev/sdj /mnt ext3 defaults 0 0

Detaching:

1: Delete everything under /var/lib/iscsi/nodes and /var/lib/iscsi/sendtargets.

/var/lib/iscsi/nodes holds the IQNs of all attached iSCSI targets.

/var/lib/iscsi/sendtargets holds the IP addresses of the attached storage.

2: Reboot the server.

[root@node1 /]# reboot

3: To detach without rebooting, log out of the targets:

iscsiadm -m node -T iqn.1991-05.com.microsoft:jhzg-cc01-datastore-55.2-target -u

iscsiadm -m node -T iqn.1991-05.com.microsoft:jhzg-cc01-data-55.2-target -u

General form of the logout command:

iscsiadm -m node -T <target IQN> -u

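udev binding

(Open question: can scsi_id return a UUID only for iSCSI devices?)

udev binds a device name to the device's physical ID, which makes the name persistent.

udev configuration files

The main udev configuration file is /etc/udev/udev.conf. It is usually short: a few comment lines starting with #, followed by a few options:

udev_root="/dev/"

udev_rules="/etc/udev/rules.d/"

udev_log="err"

The second line matters most: it names the directory that holds the udev rules, stored in files ending in .rules. Each file contains a set of rules that tell udev how to name the device files of devices the kernel detects.

/etc/udev/rules.d may contain several rules files; some are installed by the udev package, others are generated by other hardware or software packages.

RHEL 5: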

for i in b c d e f ; do

echo "KERNEL==\"sd*\",BUS==\"scsi\",PROGRAM==\"/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\",RESULT==\"`/sbin/scsi_id -g -u -s /dev/sd$i`\",NAME=\"asm-disk$i\",OWNER=\"grid\",GROUP=\"asmadmin\",MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules

done

The generated rule looks like this:

KERNEL=="sd*",BUS=="scsi",PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name",RESULT=="36001438009b01c320000300000500000",NAME="asm-arch1",OWNER="grid",GROUP="asmadmin",MODE="0660"

RHEL 6:

for i in b c d e f ; do

echo "KERNEL==\"sd*\",PROGRAM==\"/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\",RESULT==\"`/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i`\",NAME=\"asm-disk$i\",MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules

done

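Disk paths can change after a reboot, which would break any application that reads them. To avoid this, each disk is identified by its RESULT code and given a fixed name.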
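Building a rule with echo means escaping every inner quote by hand, which is where mistakes creep in. As an alternative, the sketch below formats one rule with printf. It is only an illustration: asm_rule is a hypothetical helper, and the rule layout simply mirrors the sample rule shown above (grid as owner, asmadmin as group).

```shell
#!/bin/bash
# Print one udev rule that names the disk with the given SCSI ID /dev/asm-disk<suffix>.
# Usage: asm_rule <suffix> <scsi_id>   (hypothetical helper)
asm_rule() {
    local suffix=$1 id=$2
    # Single quotes keep $name literal so that udev, not the shell, expands it.
    printf 'KERNEL=="sd*",BUS=="scsi",PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name",RESULT=="%s",NAME="asm-disk%s",OWNER="grid",GROUP="asmadmin",MODE="0660"\n' "$id" "$suffix"
}

# Possible use on a node (needs root and the scsi_id syntax of the installed release):
# for i in b c d; do
#     asm_rule "$i" "$(/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i)"
# done >> /etc/udev/rules.d/99-oracle-asmdevices.rules
```

Because the ID is passed in as an argument, the rule text can be reviewed before anything is written to /etc/udev/rules.d.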

(2) Binding disk partitions

On Red Hat Enterprise Linux 5 use:

KERNEL=="sd?[1-2]",PROGRAM=="/sbin/scsi_id -g -u -s %p",RESULT=="1ATA_VBOX_HARDDISK_VBaef9fa71-c32978c8",NAME="asm-ocr%n",GROUP="asmdba",MODE="0660"

On Red Hat Enterprise Linux 6 use:

KERNEL=="sd?[1-2]",PROGRAM=="/sbin/scsi_id -g -u /dev/$name",MODE="0660"

KERNEL=="sdb1",PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent",RESULT=="1ATA_VBOX_HARDDISK_VB8383313d-441fd502",NAME="asm-crs1",MODE="0660"

more /etc/udev/rules.d/99-oracle-asmdevices.rules shows:

KERNEL=="sd*",RESULT=="360003ff44dc75adc9f3304ea673307c8",NAME="asm-diskc",MODE="0660"

KERNEL=="sd*",RESULT=="360003ff44dc75adc8055e84e9f6c5362",NAME="asm-diskd",MODE="0660"

KERNEL=="sd*",RESULT=="360003ff44dc75adc8479b2efc6d42b45",NAME="asm-diskb",MODE="0660"

start_udev automatically loads /etc/udev/rules.d/99-oracle-asmdevices.rules.

On SLES10:

# /etc/init.d/boot.udev stop

/etc/init.d/boot.udev start

On RHEL5/OEL5/OL5:

# /sbin/udevcontrol reload_rules

/sbin/start_udev

On RHEL6/OL6:

# /sbin/udevadm control --reload-rules

/sbin/start_udev

ls -al /dev/asm* // verify that the udev rules took effect

brw-rw----. 1 grid asmadmin 8,64 Apr 25 09:00 /dev/asm-diskb

brw-rw----. 1 grid asmadmin 8,80 Apr 25 09:00 /dev/asm-diskc

brw-rw----. 1 grid asmadmin 8,96 Apr 25 09:00 /dev/asm-diskd

The udev disk binding is now complete.

If Oracle does not provide an ASMLib package for your kernel, use the raw disks bound above to create the ASM disk groups.

Oracle ASMLib

Option three: Oracle ASMLib

7.1 Installing the ASMLib driver packages

7.1.1 Determine the OS release

[root@rac2 2.6.18-238.el5]# cat /etc/redhat-release

CentOS release 6.6 (Final)

7.1.2 Determine the kernel version

[root@rac2 2.6.18-238.el5]# uname -r

2.6.32-504.el6.x86_64

7.1.3 Install the required ASMLib packages

kmod-oracleasm-2.0.8-4.el6_6.x86_64

oracleasmlib-2.0.12-1.el6.x86_64

oracleasm-support-2.1.8-1.el5.x86_64

7.1.4 Partition the disk: fdisk /dev/sdb

7.1.5 Initialize ASMLib

/etc/init.d/oracleasm configure

7.1.6 Create the ASM disk; run this on one node only

/etc/init.d/oracleasm createdisk VOL2 /dev/sdc1

Problem 1: On Linux, oracleasm configure -i actually edits a configuration file, /etc/sysconfig/oracleasm. You can change the settings either with oracleasm configure -i or by editing the file directly; using oracleasm configure -i is the safer choice.

Problem 2: Device "/dev/sdc1" is already labeled for ASM disk "VOL2"

Fix: wipe the disk header: dd if=/dev/zero of=/dev/sdc1 bs=1024 count=100

Grid Infrastructure software installation

1. Configure yum:

mount -t iso9660 /dev/cdrom /mnt

mount /dev/cdrom /mnt

vi /etc/yum.repos.d/rhel-source.repo

mount -o loop CentOS-6.5-x86_64-bin-DVD1.iso /mnt // mount an ISO image instead of the DVD

[rhel-source]

name=Red Hat Enterprise Linux $releasever - Source

baseurl=file:///mnt/Server

enabled=1

gpgcheck=0

2. Install the packages:

yum -y install binutils compat-libstdc++-33 elfutils-libelf elfutils-libelf-devel gcc gcc-c++ glibc-2.5 glibc-common glibc-devel glibc-headers ksh libaio libaio-devel libgcc libstdc++ libstdc++-devel make sysstat

**Note

[root@rac1 oracle]# rpm -Uvh pdksh-5.2.14-37.el5_8.1.x86_64.rpm

warning: pdksh-5.2.14-37.el5_8.1.x86_64.rpm: Header V3 DSA/SHA1 Signature,key ID e8562897: NOKEY

error: Failed dependencies:

pdksh conflicts with ksh-20120801-21.el6.x86_64

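Online sources attribute the NOKEY warning to an outdated GPG key. The installation itself fails because pdksh conflicts with the ksh package that is already installed.

Workaround:

rpm -ivh pdksh-5.2.14-37.el5_8.1.x86_64.rpm --force --nodeps

--force: force the installation

--nodeps: ignore dependencies

rpm -e oracleasm-2.6.18-238.el5-2.0.5-1.el5.x86_64 --nodeps

(forced removal)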

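3. Verify the environment before installing grid:

*Note:

If an earlier grid installation failed, remove it completely before the second attempt; otherwise the second installation will be very difficult.

You can also delete /etc/oracle/*, /etc/oratab and /etc/oraInst.loc.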

su - grid

./runcluvfy.sh stage -pre crsinst -n rac1,rac2 -fixup -verbose

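User equivalence unavailable on all the specified nodes

This means SSH user equivalence between the nodes could not be verified.

*Errors and fixes:

1. Multicast check failures: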

Checking multicast communication...

Checking subnet "10.37.4.0" for multicast communication with multicast group "230.0.1.0"...

PRVG-11134 : Interface "10.37.4.170" on node "rac2" is not able to communicate with interface "10.37.4.170" on node "rac2"

PRVG-11134 : Interface "10.37.4.170" on node "rac2" is not able to communicate with interface "10.37.4.160" on node "rac1"

PRVG-11134 : Interface "10.37.4.160" on node "rac1" is not able to communicate with interface "10.37.4.170" on node "rac2"

Checking subnet "10.37.4.0" for multicast communication with multicast group "224.0.0.251"...

PRVG-11134 : Interface "10.37.4.170" on node "rac2" is not able to communicate with interface "10.37.4.170" on node "rac2"

PRVG-11134 : Interface "10.37.4.170" on node "rac2" is not able to communicate with interface "10.37.4.160" on node "rac1"

PRVG-11134 : Interface "10.37.4.160" on node "rac1" is not able to communicate with interface "10.37.4.170" on node "rac2"

Checking subnet "192.168.56.0" for multicast communication with multicast group "230.0.1.0"...

PRVG-11134 : Interface "192.168.56.140" on node "rac2" is not able to communicate with interface "192.168.56.140" on node "rac2"

PRVG-11134 : Interface "192.168.56.140" on node "rac2" is not able to communicate with interface "192.168.56.130" on node "rac1"

PRVG-11134 : Interface "192.168.56.130" on node "rac1" is not able to communicate with interface "192.168.56.140" on node "rac2"

Checking subnet "192.168.56.0" for multicast communication with multicast group "224.0.0.251"...

PRVG-11134 : Interface "192.168.56.140" on node "rac2" is not able to communicate with interface "192.168.56.140" on node "rac2"

PRVG-11134 : Interface "192.168.56.140" on node "rac2" is not able to communicate with interface "192.168.56.130" on node "rac1"

PRVG-11134 : Interface "192.168.56.130" on node "rac1" is not able to communicate with interface "192.168.56.140" on node "rac2"

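Fix: turn off the firewall.

2. While installing Oracle grid 11.2.0.3/4, the installer reports PRVF-5636: The DNS response time for an unreachable node exceeded "15000" ms on following nodes. It means that a lookup of a nonexistent host name did not return within the allowed time.

Fix:

Edit /etc/named.conf on the DNS server and add file "/dev/null"; to the root zone.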

zone "." IN {

type hint;

// file "named.ca";

file "/dev/null";

};

On each RAC node, add the following settings:

[root@rac2 ~]# vi /etc/resolv.conf

search prudentwoo.com

nameserver 192.168.7.51

nameserver 192.168.7.52

options rotate

options timeout:2

options attempts:5

service named restart

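Then rerun the grid installer. (You must close the installer window and start it again; clicking re-check is not enough.)

3. Errors during the ASM device check when using raw devices: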

Device Checks for ASM

- This is a pre-check to verify if the specified devices meet the requirements for configuration through the Oracle Universal Storage Manager Configuration Assistant.

Error:

rac2:Unable to determine the shareability of device /dev/sdb on nodes:

The problem occurred on nodes: rac2

Cause: cause Of Problem Not Available

Action: user Action Not Available

rac1:Unable to determine the shareability of device /dev/sde on nodes:

The problem occurred on nodes: rac1

Cause: cause Of Problem Not Available

Action: user Action Not Available

Check Failed on Nodes: [rac2,rac1]

Verification result of failed node: rac2

details:

Unable to determine the shareability of device /dev/asm-diskb on nodes: [rac2,rac1]

Cause: cause Of Problem Not Available

Action: user Action Not Available

PRVF-9802 : Attempt to get udev info from node "rac2" failed

Cause: Attempt to read the udev permissions file failed, probably due to missing permissions directory, missing or invalid permissions file, or permissions file not accessible to use account running the check.

Action: Make sure that the udev permissions directory is created, the udev permissions file is available, and it has correct read permissions for access by the user running the check.

Verification result of failed node: rac1

details:

Unable to determine the shareability of device /dev/asm-diskb on nodes: [rac2,rac1]

Cause: cause Of Problem Not Available

Action: user Action Not Available

PRVF-9802 : Attempt to get udev info from node "rac1" failed

Cause: Attempt to read the udev permissions file failed,and it has correct read permissions for access by the user running the check.

Fix 1:

Cause

There’s multiple likely causes:

Cause 1. OCR and Voting Disk devices are not managed by UDEV,thus UDEV rules doesn’t have proper configuration for OCR and Voting disk devices

Cause 2. Due to unpublished bug 8726128,even UDEV is configured properly,the error still showed up on local node on RHEL4/OEL4

Solution

To keep OCR and Voting Disk device name persistent across node reboot with proper ownership and permission,they need to be managed by UDEV; if you have other tool for the same purpose,you can safely ignore the warning.

Unpublished bug 8726128 will be fixed in 11.2.0.2,if “ls -l name_of_ocr” command shows expected result,the error can be ignored

Note: its recommended to migrate OCR and Voting Disk to ASM once upgrade is finished; if Voting Disk is migrated to ASM,it will inherit underlying diskgroup redundance:

To migrate OCR to ASM:

$GRID_HOME/bin/ocrconfig -add +DiskGroupName

$GRID_HOME/bin/ocrconfig -delete Current_OCR_NAME

Example:

$GRID_HOME/bin/ocrconfig -add +DGOCW

$GRID_HOME/bin/ocrconfig -delete /dev/raw/raw1

To migrate Voting Disk to ASM:

$GRID_HOME/bin/crsctl replace votedisk +DiskGroupName

Example:

$GRID_HOME/bin/crsctl replace votedisk +DGOCW

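Fix 2:

When disk paths are bound with udev, make sure the disk names are identical on both nodes; only then can the storage be shared.

This problem is actually a bug and can be ignored.

PRVF-10208 : Following error occurred during the VIP subnet configuration check PRCT-1302 : The "OCR," has an invalid IP address format

The network interface names differ between the nodes, which causes an error when selecting interfaces in the GUI installer.

Fix: edit /etc/udev/rules.d/70-persistent-net.rules and rename the interfaces according to their HWaddr.

4.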

ERROR:

Unable to obtain network interface list from Oracle Clusterware

PRCT-1011 : Failed to run "oifcfg". Detailed error: null

Verification cannot proceed

Issue#1 - Wrong ORA_NLS10 Setting

Issue#2 - Incorrect Network Setting in OCR

Issue#1:

Either unset ORA_NLS10 or set it to correct location before starting OUI:

unset ORA_NLS10

export ORA_NLS10=$GRID_HOME/nls/data

Refer to Note 1050472.1 for more details.

Issue#2:

In this example, eth1 is the private network and eth3 is the public network.

2.1. As root, execute the following to generate an ocrdump file:

$GRID_HOME/bin/ocrdump /tmp/dump.ocr1

2.2. As root, find out all network information in the OCR:

grep 'css.interfaces' /tmp/dump.ocr1 | awk -F ']' '{print $1}' | awk -F '.' '{print $5}' | sort -u

eth1

eth3

2.3. As root, remove network info from OCR:

$GRID_HOME/bin/oifcfg delif -global eth1 -force

$GRID_HOME/bin/oifcfg delif -global eth3 -force

2.4. As grid user, find out the network information at OS level:

$GRID_HOME/bin/oifcfg iflist -p -n

..

eth1 10.1.0.128 PRIVATE 255.255.255.128

eth1 169.254.0.0 UNKNOWN 255.255.192.0

..

eth3 10.1.0.0 PRIVATE 255.255.255.128

Note: HAIP (169.254.x.x) should be ignored in the output

2.5. As grid user, set network info in OCR:

$GRID_HOME/bin/oifcfg setif -global eth1/10.1.0.128:cluster_interconnect

$GRID_HOME/bin/oifcfg setif -global eth3/10.1.0.0:public

2.6. As grid user, confirm the change:

$GRID_HOME/bin/oifcfg getif

eth1 10.1.0.128 global cluster_interconnect

eth3 10.1.0.0 global public

4. Install grid

./runInstaller

[root@rac1 etc]# sh /u01/app/oraInventory/orainstRoot.sh

Changing permissions of /u01/app/oraInventory.

Adding read,write permissions for group.

Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.

The execution of the script is complete.

[root@rac1 etc]# sh /u01/app/11.2.0/grid/root.sh

Performing root user operation for Oracle 11g

The following environment variables are set as:

ORACLE_OWNER= grid

ORACLE_HOME= /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:

Copying dbhome to /usr/local/bin …

Copying oraenv to /usr/local/bin …

Copying coraenv to /usr/local/bin …

Creating /etc/oratab file…

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params

Creating trace directory

User ignored Prerequisites during installation

Installing Trace File Analyzer

OLR initialization - successful

root wallet

root wallet cert

root cert export

peer wallet

profile reader wallet

pa wallet

peer wallet keys

pa wallet keys

peer cert request

pa cert request

peer cert

pa cert

peer root cert TP

profile reader root cert TP

pa root cert TP

peer pa cert TP

pa peer cert TP

profile reader pa cert TP

profile reader peer cert TP

peer user cert

pa user cert

Adding Clusterware entries to inittab

CRS-2672: Attempting to start ‘ora.mdnsd’ on ‘rac1’

CRS-2676: Start of ‘ora.mdnsd’ on ‘rac1’ succeeded

CRS-2672: Attempting to start ‘ora.gpnpd’ on ‘rac1’

CRS-2676: Start of ‘ora.gpnpd’ on ‘rac1’ succeeded

CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘rac1’

CRS-2672: Attempting to start ‘ora.gipcd’ on ‘rac1’

CRS-2676: Start of ‘ora.cssdmonitor’ on ‘rac1’ succeeded

CRS-2676: Start of ‘ora.gipcd’ on ‘rac1’ succeeded

CRS-2672: Attempting to start ‘ora.cssd’ on ‘rac1’

CRS-2672: Attempting to start ‘ora.diskmon’ on ‘rac1’

CRS-2676: Start of ‘ora.diskmon’ on ‘rac1’ succeeded

CRS-2676: Start of ‘ora.cssd’ on ‘rac1’ succeeded

ASM created and started successfully.

Disk Group DATA created successfully.

clscfg: -install mode specified

Successfully accumulated necessary OCR keys.

Creating OCR keys for user ‘root’,privgrp ‘root’..

Operation successful.

CRS-4256: Updating the profile

Successful addition of voting disk 1776e78be0cf4fb8bfb0836f7921566b.

Successfully replaced voting disk group with +DATA.

CRS-4256: Updating the profile

CRS-4266: Voting file(s) successfully replaced

STATE File Universal Id File Name Disk group

– —– —————– ——— ———

1. ONLINE 1776e78be0cf4fb8bfb0836f7921566b (ORCL:DATA1) [DATA]

Located 1 voting disk(s).

CRS-2672: Attempting to start ‘ora.asm’ on ‘rac1’

CRS-2676: Start of ‘ora.asm’ on ‘rac1’ succeeded

CRS-2672: Attempting to start ‘ora.DATA.dg’ on ‘rac1’

CRS-2676: Start of ‘ora.DATA.dg’ on ‘rac1’ succeeded

Configure Oracle Grid Infrastructure for a Cluster … succeeded

Verify the RAC environment:

[root@rac1 bin]# ./crsctl stop cluster

CRS-2673: Attempting to stop ‘ora.crsd’ on ‘rac1’

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on ‘rac1’

CRS-2673: Attempting to stop ‘ora.ARCHIVE.dg’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.BACKUP.dg’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.OCR.dg’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.registry.acfs’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.DATA.dg’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.FLASH.dg’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.LISTENER.lsnr’ on ‘rac1’

CRS-2677: Stop of ‘ora.LISTENER.lsnr’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.rac1.vip’ on ‘rac1’

CRS-2677: Stop of ‘ora.registry.acfs’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.rac1.vip’ on ‘rac1’ succeeded

CRS-2672: Attempting to start ‘ora.rac1.vip’ on ‘rac2’

CRS-2676: Start of ‘ora.rac1.vip’ on ‘rac2’ succeeded

CRS-2677: Stop of ‘ora.ARCHIVE.dg’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.FLASH.dg’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.BACKUP.dg’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.OCR.dg’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.DATA.dg’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.asm’ on ‘rac1’

CRS-2677: Stop of ‘ora.asm’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.ons’ on ‘rac1’

CRS-2677: Stop of ‘ora.ons’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.net1.network’ on ‘rac1’

CRS-2677: Stop of ‘ora.net1.network’ on ‘rac1’ succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on ‘rac1’ has completed

CRS-2677: Stop of ‘ora.crsd’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.ctssd’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.evmd’ on ‘rac1’

CRS-2673: Attempting to stop ‘ora.asm’ on ‘rac1’

CRS-2677: Stop of ‘ora.evmd’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.ctssd’ on ‘rac1’ succeeded

CRS-2677: Stop of ‘ora.asm’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.cluster_interconnect.haip’ on ‘rac1’

CRS-2677: Stop of ‘ora.cluster_interconnect.haip’ on ‘rac1’ succeeded

CRS-2673: Attempting to stop ‘ora.cssd’ on ‘rac1’

CRS-2677: Stop of ‘ora.cssd’ on ‘rac1’ succeeded

[root@rac1 bin]#

Errors when running root.sh:

1：Failed to create keys in the OLR,rc = 127,Message:

/u01/app/11.2.0/grid/bin/clscfg.bin: error while loading shared libraries: libcap.so.1: cannot open shared object file: No such file or directory

Failed to create keys in the OLR at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 7660.

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

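**The libcap.so.1 library is clearly missing.

A workaround found online is to locate the installed libcap and, in the directory where it lives, link it to the expected name:

find / -name 'libcap*'

ln -s libcap.so.2.16 libcap.so.1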

2：

Disk Group DATA creation failed with the following message:

ORA-15018: diskgroup cannot be created

ORA-15020: discovered duplicate ASM disk “DATA_0000”

Cause: an earlier grid installation failed and was run a second time, so the disk "DATA_0000" was being reused.

3: The CRS-4000 error while building Oracle 11.2.0.4 RAC

NO KEYS WERE WRITTEN. Supply -force parameter to override.

-force is destructive and will destroy any previous cluster

configuration.

Failed to create voting files on disk group DATA.

Change to configuration failed,but was successfully rolled back.

CRS-4000: Command Replace failed,or completed with errors.

Voting file add failed

Failed to add voting disks at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 6930.

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

Edit /etc/sysconfig/oracleasm so that oracleasm uses the logical block size:

Change ORACLEASM_USE_LOGICAL_BLOCK_SIZE=false to true.

Uninstalling grid:

./runInstaller -silent -force -ignorePrereq -ignoreSysPreReqs -responseFile /home/oracle/db.rsp

Device Checks for ASM - This is a pre-check to verify if the specified devices meet the requirements for configuration through the Oracle Universal Storage Manager Configuration Assistant.

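The Oracle ASM disks were not detected, so the disks are unavailable.

4: CRS-4046 and CRS-4000: root.sh had been run before, so the old configuration has to be cleared before running it again.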

CRS-4046: Invalid Oracle Clusterware configuration.

CRS-4000: Command Create failed,or completed with errors.

Failure initializing entries in /etc/oracle/scls_scr/rac1

/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

Clear the CRS configuration with /u01/app/11.2.0/grid/crs/install/rootcrs.pl -verbose -deconfig -force

Problems encountered:

1.

WARNING: [WARNING] [INS-32091] Software installation was successful. But some configuration assistants failed,were cancelled or skipped.

ACTION: Refer to the logs or contact Oracle Support Services.

Approach: the message alone does not reveal the cause, so check the log /u01/app/oraInventory/logs/installActions2016-04-25_10-42-31AM.log, which contains:

INFO: Check: Node connectivity for interface “eth1”

INFO: WARNING:

INFO: Make sure IP address “eth1 : 10.37.4.122 [10.37.4.0] ” is up and is a valid IP address on node “rac1”

INFO: ERROR:

INFO: PRVF-7616 : Node connectivity failed for subnet “10.37.4.0” between “rac2 - eth1 : 10.37.4.121” and “rac1 - eth1 : 10.37.4.122”

INFO: ERROR:

INFO: PRVF-7616 : Node connectivity failed for subnet “10.37.4.0” between “rac2 - eth1 : 10.37.4.124” and “rac1 - eth1 : 10.37.4.122”

INFO: ERROR:

INFO: PRVF-7616 : Node connectivity failed for subnet “10.37.4.0” between “rac1 - eth1 : 10.37.4.120” and “rac1 - eth1 : 10.37.4.122”

INFO: ERROR:

INFO: PRVF-7616 : Node connectivity failed for subnet “10.37.4.0” between “rac1 - eth1 : 10.37.4.122” and “rac1 - eth1 : 10.37.4.131”

INFO: Node connectivity failed for interface “eth1”

INFO: ERROR:

INFO: PRVF-7617 : Node connectivity between “rac2 : 10.37.4.121” and “rac1 : 10.37.4.122” failed

INFO: TCP connectivity check failed for subnet “10.37.4.0”

Fix: none found yet.

./asmca (creates the Oracle ASM disk groups)

Oracle software installation

yum -y install

binutils-2.19

gcc-4.3

gcc-c++-4.3

glibc-2.9

glibc-devel-2.9

ksh-93t

libstdc++33-3.3.3

libstdc++43-4.3.3_20081022

libstdc++43-devel-4.3.3_20081022

libaio-0.3.104

libaio-devel-0.3.104

libgcc43-4.3.3_20081022

libstdc++-devel-4.3

make-3.81

sysstat-8.1.5

./runInstaller

Example case:

CRS-5017: The resource action “ora.orcl.db start” encountered the following error:

ORA-01078: failure in processing system parameters

ORA-17510: Attempt to do i/o beyond file size

. For details refer to “(:CLSN00107:)” in “/u01/app/11.2.0/grid/log/rac2/agent/crsd/oraagent_oracle/oraagent_oracle.log”.

Oracle instance creation

./dbca

PRVF-10208 : Following error occurred during the VIP subnet configuration check PRCT-1302 : The “OCR,” has an invalid IP address format ? Cause:?An error occurred while performing the VIP subnet check. ? Action:?Look at the accompanying messages for details on the cause of failure.

RAC testing:

CRS-related commands

crs_stat -t -v ; // show resource status

crsctl check crs; // check CRS (Cluster Ready Services)

srvctl status nodeapps -n rac1; // show the node applications on a node

crsctl start|stop crs; // start or stop the clusterware stack on the local node

srvctl start|stop|status nodeapps -n rac1

srvctl start|stop listener -n rac1 ;

crsctl check cluster;

crsctl start resource ora.oc4j

As the oracle user:

crs_start -all; // start all CRS resources

crs_stop -all; // stop all CRS resources

$ORACLE_HOME/bin/crsctl start resource|server|type;

crsctl check ctss; // check the Cluster Time Synchronization Service

olsnodes -i -n -s; // show node information

crsctl query crs activeversion;

crsctl query css votedisk;

crsctl query dns -servers;

crsctl stop cluster -all; // stop the cluster resources on all nodes

crsctl stop crs, /etc/init.d/init.crs stop, or crs_stop -all; // stop CRS

crsctl start crs or /etc/init.d/init.crs start; // start CRS

/etc/init.d/init.crs disable|enable; // disable|enable automatic clusterware startup after a reboot

SRVCTL controls the instances, listeners and services of a RAC database. It is normally run as the oracle user.

srvctl status database -d jhdb ; // show database status

srvctl status nodeapps -n rac1; // show node resources (VIP, network, GSD, ONS)

srvctl config scan;

srvctl config scan_listener;

srvctl status|start|stop database -d jhdb; // query, start or stop jhdb

srvctl status|start|stop instance -d jhdb -i jhdb1; // query, start or stop a database instance

srvctl config database -d jhdb

srvctl status|start|stop nodeapps -n rac1 ; // query, start or stop the node applications

srvctl start|status|stop asm -n rac1 [-i +ASM1 ] [-o ]

srvctl getenv database -d jhdb [-i jhdb1 [] ]

srvctl setenv database -d jhdb -t LANG=en;

srvctl remove database -d jhdb ; // remove the database's configuration from the OCR

srvctl remove instance -d jhdb -i rac1

srvctl stop asm -n rac1 -f

srvctl stop vip -n rac1 -f

srvctl stop oc4j ;

srvctl stop scan -f ;

srvctl stop diskgroup -g ARCH -n rac1,rac2 -f ; // stop the disk group

srvctl add database -d jhdb -o $ORACLE_HOME [-m ][-p][-A
