Corosync:

corosync: votes

corosync: votequorum

cman+corosync

cman+rgmanager, cman+pacemaker

corosync+pacemaker

Prerequisites

1) This setup uses two test nodes, hadoop1.abc.com and hadoop2.abc.com; as configured below, their IP addresses are 192.168.1.3 and 192.168.1.4 respectively;

2) The cluster service is Apache's httpd;

3) The address providing the web service, i.e. the VIP, is 192.168.1.12;

4) The OS is CentOS 6.4 64-bit.

1. Preparation

To configure a Linux host as an HA cluster node, the following preparation is usually required:

1) Hostname resolution must work for every node, and each node's hostname must match the output of "uname -n". Therefore, make sure /etc/hosts on both nodes contains the following:

192.168.1.3   hadoop1.abc.com hadoop1
192.168.1.4   hadoop2.abc.com hadoop2

To keep these hostnames across reboots, also run commands like the following on each node:

Node1:

# sed -i 's@\(HOSTNAME=\).*@\1hadoop1.abc.com@g'  /etc/sysconfig/network
# hostname hadoop1.abc.com

Node2:

# sed -i 's@\(HOSTNAME=\).*@\1hadoop2.abc.com@g' /etc/sysconfig/network
# hostname hadoop2.abc.com
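The sed substitution above can be tried safely first; this sketch reproduces it on a temporary file instead of the real /etc/sysconfig/network:

```shell
# Demonstrate the HOSTNAME substitution on a temp file rather than
# the real /etc/sysconfig/network.
tmpfile=$(mktemp)
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\n' > "$tmpfile"
sed -i 's@\(HOSTNAME=\).*@\1hadoop1.abc.com@g' "$tmpfile"
grep HOSTNAME "$tmpfile"    # prints: HOSTNAME=hadoop1.abc.com
rm -f "$tmpfile"
```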

2. Install pacemaker

[root@hadoop1 corosync]# yum install pacemaker
[root@hadoop2 corosync]# yum install pacemaker

3. Configure corosync

[root@hadoop1 ~]# yum install corosync
[root@hadoop1 ~]# cd /etc/corosync/
[root@hadoop1 corosync]# ll
total 16
-rw-r--r--. 1 root root 2663 Oct 15  2014 corosync.conf.example
-rw-r--r--. 1 root root 1073 Oct 15  2014 corosync.conf.example.udpu
drwxr-xr-x. 2 root root 4096 Oct 15  2014 service.d
drwxr-xr-x. 2 root root 4096 Oct 15  2014 uidgid.d
[root@hadoop1 corosync]# cp corosync.conf.example corosync.conf
[root@hadoop1 corosync]# vim corosync.conf
Then edit corosync.conf and append the following, which makes corosync start pacemaker automatically on startup:

service {
        ver:    0
        name:   pacemaker
        # use_mgmtd: yes
}
aisexec {
        user:   root
        group:  root
}

Also set the IP address after bindnetaddr in this file to the network address of the network your NIC is on. Our two nodes are on the 192.168.1.0 network, so set it as follows:

bindnetaddr: 192.168.1.0
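bindnetaddr must be the network address of the subnet, not a host address. A small sketch of deriving it from a node IP and netmask (192.168.1.3 and 255.255.255.0 assumed here, matching the hosts file above; adjust for your network):

```shell
# Compute the network address (IP AND netmask) for bindnetaddr.
ip=192.168.1.3
mask=255.255.255.0
oldIFS=$IFS; IFS=.
set -- $ip;   i1=$1; i2=$2; i3=$3; i4=$4
set -- $mask; m1=$1; m2=$2; m3=$3; m4=$4
IFS=$oldIFS
echo "bindnetaddr: $((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
# prints: bindnetaddr: 192.168.1.0
```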

4. Install crmsh

Since 6.4, RHEL no longer ships crmsh, the command-line cluster configuration tool, having switched to pcs. If you are used to the crm command, you can download the relevant packages and install them yourself. crmsh depends on pssh, so download that as well.

[root@hadoop1 ~]# cd /etc/yum.repos.d/
[root@hadoop1 yum.repos.d]# wget http://download.opensuse.org/repositories/network:ha-clustering:Stable/RedHat_RHEL-6/network:ha-clustering:Stable.repo
[root@hadoop1 yum.repos.d]# yum install crmsh
[root@hadoop1 yum.repos.d]# yum install pssh
[root@hadoop1 corosync]# ll
total 28
-rw-r--r--. 1 root root  989 Jul 14 19:05 \
-r--------. 1 root root  128 Jul 14 19:30 authkey          // the authkey file has been generated
-rw-r--r--. 1 root root 2811 Jul 14 19:15 corosync.conf
-rw-r--r--. 1 root root 2663 Oct 15  2014 corosync.conf.example
-rw-r--r--. 1 root root 1073 Oct 15  2014 corosync.conf.example.udpu
drwxr-xr-x. 2 root root 4096 Oct 15  2014 service.d
drwxr-xr-x. 2 root root 4096 Oct 15  2014 uidgid.d

Copy corosync.conf and authkey to hadoop2:

[root@hadoop1 corosync]# scp -p authkey corosync.conf hadoop2:/etc/corosync/
authkey                                       100%  128     0.1KB/s   00:00
corosync.conf                                 100% 2811     2.8KB/s   00:00
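The scp above succeeds without a password prompt because key-based SSH trust between the nodes is assumed; the original does not show that step, but a typical sketch looks like this:

```
# ssh-keygen -t rsa -P ''
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@hadoop2
```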

5. Start corosync

[root@hadoop1 corosync]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@hadoop1 corosync]# ssh hadoop2 'service corosync start'
Starting Corosync Cluster Engine (corosync): [  OK  ]
[root@hadoop1 corosync]#

Check that the corosync engine started correctly:

[root@hadoop1 cluster]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Jul 14 19:36:33 corosync [MAIN  ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Jul 14 19:36:33 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Check that the initial membership notifications were sent out correctly:

[root@hadoop1 cluster]# grep  TOTEM  /var/log/cluster/corosync.log
Jul 14 19:36:33 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Jul 14 19:36:33 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jul 14 19:36:33 corosync [TOTEM ] The network interface [192.168.1.3] is now up.
Jul 14 19:36:33 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

Check whether any errors occurred during startup. The error messages below indicate that pacemaker will soon no longer run as a corosync plugin, and that cman is recommended as the cluster infrastructure service instead; they can safely be ignored here.

[root@hadoop1 cluster]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Jul 14 19:36:33 corosync [pcmk  ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Jul 14 19:36:33 corosync [pcmk  ] ERROR: process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN

Check that pacemaker started correctly:

[root@hadoop1 cluster]# grep pcmk_startup /var/log/cluster/corosync.log
Jul 14 19:36:33 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Jul 14 19:36:33 corosync [pcmk  ] Logging: Initialized pcmk_startup
Jul 14 19:36:33 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Jul 14 19:36:33 corosync [pcmk  ] info: pcmk_startup: Service: 9
Jul 14 19:36:33 corosync [pcmk  ] info: pcmk_startup: Local hostname: hadoop1.abc.com

If all of the commands above ran without problems, corosync on hadoop2 can be started with the following command:

[root@hadoop1 ~]# ssh hadoop2 -- /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [  OK  ]

Note: starting corosync on hadoop2 should be done from hadoop1 with the command above; do not start it directly on hadoop2. Below are the related log entries on hadoop1.

[root@hadoop1 ~]# tail /var/log/cluster/corosync.log
Jul 15 15:44:28 [1771] hadoop1.abc.com    pengine:     info: determine_online_status:  Node hadoop2.abc.com is online
Jul 15 15:44:28 [1771] hadoop1.abc.com    pengine:   notice: stage6:  Delaying fencing operations until there are resources to manage
Jul 15 15:44:28 [1772] hadoop1.abc.com       crmd:     info: do_state_transition:  State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Jul 15 15:44:28 [1772] hadoop1.abc.com       crmd:     info: do_te_invoke:  Processing graph 6 (ref=pe_calc-dc-1436946268-37) derived from /var/lib/pacemaker/pengine/pe-input-14.bz2
Jul 15 15:44:28 [1772] hadoop1.abc.com       crmd:   notice: run_graph:  Transition 6 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-14.bz2): Complete
Jul 15 15:44:28 [1772] hadoop1.abc.com       crmd:     info: do_log:  FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Jul 15 15:44:28 [1772] hadoop1.abc.com       crmd:   notice: do_state_transition:  State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Jul 15 15:44:28 [1771] hadoop1.abc.com    pengine:   notice: process_pe_message:  Calculated Transition 6: /var/lib/pacemaker/pengine/pe-input-14.bz2
Jul 15 15:44:28 [1771] hadoop1.abc.com    pengine:   notice: process_pe_message:  Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
Jul 15 15:44:33 [1767] hadoop1.abc.com        cib:     info: cib_process_ping:  Reporting our current digest to hadoop1.abc.com: 24973b4c6ef4c32f7c580bdd07cc1753 for 0.5.28 (0x277e390 0)

If crmsh is installed, the cluster nodes' startup status can be checked with:

[root@hadoop1 ~]# crm status
Last updated: Wed Jul 15 15:49:09 2015
Last change: Wed Jul 15 15:37:07 2015
Stack: classic openais (with plugin)
Current DC: hadoop1.abc.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
0 Resources configured

Online: [ hadoop1.abc.com hadoop2.abc.com ]

6. Configure cluster properties: disable stonith

corosync enables stonith by default, but the current cluster has no stonith device, so this default configuration is not yet usable. That can be verified with:

[root@hadoop1 ~]# crm_verify -L -V
   error: unpack_resources:  Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:  Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:  NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid

stonith can be disabled first with the following commands (the change must be committed to take effect):

[root@hadoop1 ~]# crm configure

crm(live)configure# property stonith-enabled=false
crm(live)configure# commit

View the current configuration with:

[root@hadoop1 ~]# crm configure show
node hadoop1.abc.com
node hadoop2.abc.com
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false

7. Add resources to the cluster

corosync supports resource agents of the heartbeat, LSB, OCF, and other classes; LSB and OCF are currently the most commonly used, while the stonith class is dedicated to configuring stonith devices.

The resource agent classes supported by the current cluster can be viewed with:

[root@hadoop1 ~]# crm ra
crm(live)ra# help

cd             Navigate the level structure
classes        List classes and providers
help           Show help (help topics for list of topics)
info           Show meta data for a RA
list           List RA for a class (and provider)
ls             List levels and commands
providers      Show providers for a RA and a class
quit           Exit the interactive shell
up             Go back to previous level

List the classes:

crm(live)ra# classes
lsb
ocf / heartbeat pacemaker
service
stonith

To list all resource agents under a given class, use commands like:

# crm ra list lsb
# crm ra list ocf heartbeat
# crm ra list ocf pacemaker
# crm ra list stonith

crm(live)ra# list ocf
CTDB                ClusterMon          Delay               Dummy
Filesystem          HealthCPU           HealthSMART         IPaddr
IPaddr2             IPsrcaddr           LVM                 MailTo
Route               SendArp             Squid               Stateful
SysInfo             SystemHealth        VirtualDomain       Xinetd
apache              conntrackd          controld            db2
dhcpd               ethmonitor          exportfs            iSCSILogicalUnit
mysql               named               nfsnotify           nfsserver
pgsql               ping                pingd               postfix
remote              rsyncd              symlink             tomcat
# crm ra info [class:[provider:]]resource_agent

For example:

crm(live)ra# info ocf:heartbeat:IPaddr

8. Create an IP address resource for the web cluster we are about to build; it will be used when providing the web service through the cluster. This can be done as follows:

Syntax:
primitive <rsc> [<class>:[<provider>:]]<type>
          [params attr_list]
          [operations id_spec]
            [op op_type [<attribute>=<value>...] ...]

op_type :: start | stop | monitor

Example:

primitive apcfence stonith:apcsmart \
        params ttydev=/dev/ttyS0 hostlist="node1 node2" \
        op start timeout=60s \
        op monitor interval=30m timeout=60s

In practice:

crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.12
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node hadoop1.abc.com
node hadoop2.abc.com
primitive webip IPaddr \
        params ip=192.168.1.12
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false
[root@hadoop1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:50:3b:a4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.3/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.12/24 brd 192.168.1.255 scope global secondary eth0
    inet6 fe80::20c:29ff:fe50:3ba4/64 scope link
       valid_lft forever preferred_lft forever
[root@hadoop2 ~]# ssh hadoop1 '/etc/init.d/corosync stop'
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.[  OK  ]
[root@hadoop2 ~]# crm status
Last updated: Wed Jul 15 23:07:07 2015
Last change: Wed Jul 15 21:53:01 2015
Stack: classic openais (with plugin)
Current DC: hadoop1.abc.com - partition WITHOUT quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured

Online: [ hadoop2.abc.com ]
OFFLINE: [ hadoop1.abc.com ]

The output above shows that hadoop1.abc.com is offline, yet the resource webip failed to start on hadoop2.abc.com. This is because the cluster state is "WITHOUT quorum": quorum has been lost, and the cluster no longer satisfies the conditions for normal operation, which is unreasonable for a two-node cluster. We can therefore tell the cluster to ignore the quorum check with the following commands:

[root@hadoop2 ~]# crm
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit

Once hadoop1.abc.com is started normally again, the resource webip will most likely move from hadoop2.abc.com back to hadoop1.abc.com. Every such back-and-forth move makes the resource briefly unavailable, so after a resource has failed over to another node, we sometimes want to prevent it from moving back even when the original node recovers. This is done by defining resource stickiness, which can be specified when a resource is created or afterwards.

Resource stickiness values and their effect:
0: the default. The resource is placed at the most suitable position in the system, meaning it moves when a "better" or worse-loaded node becomes available. This is essentially automatic failback, except that the resource may move to a node other than the previously active one;
greater than 0: the resource prefers to stay where it is, but will move if a more suitable node becomes available. Higher values mean a stronger preference for staying put;
less than 0: the resource prefers to move away from its current location. Higher absolute values mean a stronger preference to leave;
INFINITY: unless the resource is forced off (node shutdown, node standby, migration-threshold reached, or configuration change), it always stays where it is. This effectively disables automatic failback;
-INFINITY: the resource always moves away from its current location.
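As an illustrative sketch (the resource name and values here are hypothetical, not part of this cluster's configuration), stickiness can be attached to a resource as a meta attribute when it is defined:

```
crm(live)configure# primitive someip ocf:heartbeat:IPaddr \
        params ip=192.168.1.14 \
        meta resource-stickiness=100
```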

9. Using the IP address resource configured above, turn this cluster into an active/passive web (httpd) service cluster

To use this cluster as a web (httpd) server cluster, first install httpd on each node and configure each node to serve its own local test page.

[root@hadoop1 ~]# echo "<h1>hadoop1</h1>" > /var/www/html/index.html
[root@hadoop1 ~]# service httpd stop
Stopping httpd:                                            [FAILED]
[root@hadoop1 ~]# service httpd start
Starting httpd:                                            [  OK  ]
[root@hadoop1 ~]# service httpd stop
Stopping httpd:                                            [  OK  ]
[root@hadoop1 ~]# chkconfig httpd off
crm(live)configure# primitive webserver lsb:httpd
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node hadoop1.abc.com
node hadoop2.abc.com
primitive webip IPaddr \
        params ip=192.168.1.12
primitive webserver lsb:httpd
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false

Next, add the httpd service as a cluster resource. Two resource agent classes are available for httpd: lsb and ocf:heartbeat. For simplicity, we use the lsb class here.

First, view the syntax of the lsb httpd resource with:

crm(live)# ra info lsb:httpd
start and stop Apache HTTP Server (lsb:httpd)
The Apache HTTP Server is an efficient and extensible \
server implementing the current HTTP standards.

Operations' defaults (advisory minimum):

    start         timeout=15
    stop          timeout=15
    status        timeout=15
    restart       timeout=15
    force-reload  timeout=15
    monitor       timeout=15 interval=15

Next, create a new resource WebSever (note that this duplicates the webserver resource defined earlier; it is stopped and removed again below):

crm(live)# configure primitive WebSever lsb:httpd
crm(live)# configure
crm(live)configure# verify
crm(live)configure# commit
INFO: apparently there is nothing to commit
INFO: try changing something first
crm(live)configure# show
node hadoop1.abc.com
node hadoop2.abc.com
primitive WebSever lsb:httpd
primitive webip IPaddr \
        params ip=192.168.1.12
primitive webserver lsb:httpd
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false

Stop it first, then delete it:

[root@hadoop1 ~]# crm status
Last updated: Thu Jul 16 01:15:18 2015
Last change: Thu Jul 16 01:11:16 2015
Stack: classic openais (with plugin)
Current DC: hadoop2.abc.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ hadoop1.abc.com hadoop2.abc.com ]

 webip	(ocf::heartbeat:IPaddr):	Started hadoop1.abc.com
 webserver	(lsb:httpd):	Started hadoop2.abc.com
 WebSever	(lsb:httpd):	Started hadoop2.abc.com
crm(live)resource# stop WebSever
crm(live)resource# status WebSever
resource WebSever is NOT running
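The delete itself happens at the configure level; a sketch of the remaining steps (commands assumed, not shown in the original):

```
crm(live)resource# cd ..
crm(live)# configure
crm(live)configure# delete WebSever
crm(live)configure# commit
```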

Verify:

[root@hadoop1 ~]# crm status
Last updated: Thu Jul 16 05:11:27 2015
Last change: Thu Jul 16 05:08:50 2015
Stack: classic openais (with plugin)
Current DC: hadoop2.abc.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured

Online: [ hadoop1.abc.com hadoop2.abc.com ]

 webip	(ocf::heartbeat:IPaddr):	Started hadoop1.abc.com
 webserver	(lsb:httpd):	Started hadoop2.abc.com

Make hadoop1 the standby node:

crm(live)node# standby hadoop1.abc.com
crm(live)node# cd ..
crm(live)# status
Last updated: Thu Jul 16 05:28:46 2015
Last change: Thu Jul 16 05:28:32 2015
Stack: classic openais (with plugin)
Current DC: hadoop2.abc.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Node hadoop1.abc.com: standby
Online: [ hadoop2.abc.com ]

 webip	(ocf::heartbeat:IPaddr):	Started hadoop2.abc.com
 webserver	(lsb:httpd):	Started hadoop2.abc.com

At this point, browsing to 192.168.1.12 shows the page from hadoop2.
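The same check can be done from a shell (a sketch; the page body assumes hadoop2's index.html was created the same way as hadoop1's above):

```
# curl -s http://192.168.1.12
<h1>hadoop2</h1>
```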

10. Define a colocation constraint

Note that a score of -inf: forces the two resources apart, i.e. webserver must never run on the same node as webip; to keep them together, which a web cluster normally requires, the score would be inf:.

crm(live)configure# colocation webserver_with_webip -inf: webserver webip
crm(live)configure# show xml
<?xml version="1.0" ?>
<cib num_updates="2" dc-uuid="hadoop2.abc.com" crm_feature_set="3.0.9" validate-with="pacemaker-2.0" epoch="20" admin_epoch="0" cib-last-written="Thu Jul 16 05:42:29 2015" have-quorum="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-97629de"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="classic openais (with plugin)"/>
        <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
        <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="hadoop2.abc.com" uname="hadoop2.abc.com">
        <instance_attributes id="nodes-hadoop2.abc.com">
          <nvpair id="nodes-hadoop2.abc.com-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
      <node id="hadoop1.abc.com" uname="hadoop1.abc.com">
        <instance_attributes id="nodes-hadoop1.abc.com">
          <nvpair id="nodes-hadoop1.abc.com-standby" name="standby" value="on"/>
        </instance_attributes>
      </node>
    </nodes>
    <resources>
      <primitive id="webip" class="ocf" provider="heartbeat" type="IPaddr">
        <instance_attributes id="webip-instance_attributes">
          <nvpair name="ip" value="192.168.1.12" id="webip-instance_attributes-ip"/>
        </instance_attributes>
      </primitive>
      <primitive id="webserver" class="lsb" type="httpd"/>
    </resources>
    <constraints>
      <rsc_colocation id="webserver_with_webip" score="-INFINITY" rsc="webserver" with-rsc="webip"/>
    </constraints>
  </configuration>

The <rsc_colocation> element under <constraints> is the colocation constraint just defined.

11. Define an ordering constraint

Start webip first, then webserver:

crm(live)configure# order webip_befor_webserver  Mandatory: webip:start webserver
crm(live)configure# show xml
<?xml version="1.0" ?>
<cib num_updates="1" dc-uuid="hadoop2.abc.com" crm_feature_set="3.0.9" validate-with="pacemaker-2.0" epoch="22" admin_epoch="0" cib-last-written="Thu Jul 16 06:03:57 2015" have-quorum="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-97629de"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="classic openais (with plugin)"/>
        <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
        <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="hadoop2.abc.com" uname="hadoop2.abc.com">
        <instance_attributes id="nodes-hadoop2.abc.com">
          <nvpair id="nodes-hadoop2.abc.com-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
      <node id="hadoop1.abc.com" uname="hadoop1.abc.com">
        <instance_attributes id="nodes-hadoop1.abc.com">
          <nvpair id="nodes-hadoop1.abc.com-standby" name="standby" value="on"/>
        </instance_attributes>
      </node>
    </nodes>
    <resources>
      <primitive id="webip" class="ocf" provider="heartbeat" type="IPaddr">
        <instance_attributes id="webip-instance_attributes">
          <nvpair name="ip" value="192.168.1.12" id="webip-instance_attributes-ip"/>
        </instance_attributes>
      </primitive>
      <primitive id="webserver" class="lsb" type="httpd"/>
    </resources>
    <constraints>
      <rsc_order id="webip_befor_webserver" kind="Mandatory" first="webip" first-action="start" then="webserver"/>
      <rsc_colocation id="webserver_with_webip" score="-INFINITY" rsc="webserver" with-rsc="webip"/>
    </constraints>
  </configuration>
</cib>

12. Prefer running the resource on the hadoop2 node

crm(live)configure# location webserver_hadoop2 webserver 200: hadoop2.abc.com
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd ..

13. Resource default properties can also be defined
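The original leaves this section empty. As a sketch (commands assumed, not from the original), a cluster-wide default such as the stickiness discussed earlier can be set with rsc_defaults:

```
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# verify
crm(live)configure# commit
```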

14. Define monitoring

crm(live)resource# stop webip
crm(live)resource# stop webserver
crm(live)resource# status
 webip	(ocf::heartbeat:IPaddr):	Stopped
 webserver	(lsb:httpd):	Stopped

Because the resources had been shut down irregularly, finish with a cleanup:

crm(live)resource# cleanup webip
Cleaning up webip on hadoop1.abc.com
Cleaning up webip on hadoop2.abc.com
Waiting for 2 replies from the CRMd.. OK
crm(live)resource# cleanup webserver
Cleaning up webserver on hadoop1.abc.com
Cleaning up webserver on hadoop2.abc.com
Waiting for 2 replies from the CRMd.. OK
crm(live)resource# cd ..
crm(live)# configure
crm(live)configure# help monitor
crm(live)configure#
crm(live)configure# monitor webserver 20s:10s
crm(live)configure# verify
WARNING: webserver: specified timeout 10s for monitor is smaller than the advised 15
crm(live)configure# edit
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
crm(live)configure# cd ..
crm(live)# status
Last updated: Thu Jul 16 23:13:42 2015
Last change: Thu Jul 16 22:43:40 2015
Stack: classic openais (with plugin)
Current DC: hadoop2.abc.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured

Online: [ hadoop1.abc.com hadoop2.abc.com ]
crm(live)# resource
crm(live)resource# start webip
crm(live)resource# start webserver
[root@hadoop2 ~]# ss -tnl | grep 80
LISTEN     0      128                      :::80                      :::*
[root@hadoop2 ~]# service httpd stop
Stopping httpd:                                            [  OK  ]
[root@hadoop2 ~]# tail -f /var/log/cluster/corosync.log
Jul 16 23:23:03 [7736] hadoop2.abc.com        cib:     info: cib_perform_op:  Diff: --- 0.61.25 2
Jul 16 23:23:03 [7736] hadoop2.abc.com        cib:     info: cib_perform_op:  Diff: +++ 0.61.26 (null)
Jul 16 23:23:03 [7736] hadoop2.abc.com        cib:     info: cib_perform_op:  +  /cib:  @num_updates=26
Jul 16 23:23:03 [7736] hadoop2.abc.com        cib:     info: cib_perform_op:  +  /cib/status/node_state[@id='hadoop2.abc.com']/lrm[@id='hadoop2.abc.com']/lrm_resources/lrm_resource[@id='webserver']/lrm_rsc_op[@id='webserver_monitor_30000']:  @transition-key=1:45:0:5c125c03-7d52-4d11-b5ee-ec4bc424ed07, @transition-magic=0:0;1:45:0:5c125c03-7d52-4d11-b5ee-ec4bc424ed07, @call-id=68, @last-rc-change=1437060183, @exec-time=31
Jul 16 23:23:03 [7736] hadoop2.abc.com        cib:     info: cib_process_request:  Completed cib_modify operation for section status: OK (rc=0, origin=hadoop2.abc.com/crmd/278, version=0.61.26)
Jul 16 23:23:03 [7741] hadoop2.abc.com       crmd:     info: match_graph_event:  Action webserver_monitor_30000 (1) confirmed on hadoop2.abc.com (rc=0)
Jul 16 23:23:03 [7741] hadoop2.abc.com       crmd:   notice: run_graph:  Transition 45 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-95.bz2): Complete
Jul 16 23:23:03 [7741] hadoop2.abc.com       crmd:     info: do_log:  FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Jul 16 23:23:03 [7741] hadoop2.abc.com       crmd:   notice: do_state_transition:  State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Jul 16 23:23:08 [7736] hadoop2.abc.com        cib:     info: cib_process_ping:  Reporting our current digest to hadoop2.abc.com: ae8ef3d1bb7af4518c2c6ce7c4db1f08 for 0.61.26 (0x1b35dc0 0)
[root@hadoop2 ~]# ss -tnl
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
LISTEN     0      128                      :::34476                   :::*
LISTEN     0      128                      :::111                     :::*
LISTEN     0      128                       *:111                      *:*
LISTEN     0      128                      :::80                      :::*
LISTEN     0      128                      :::22                      :::*
LISTEN     0      128                       *:22                       *:*
LISTEN     0      128               127.0.0.1:631                      *:*
LISTEN     0      128                     ::1:631                     :::*
LISTEN     0      100                     ::1:25                      :::*
LISTEN     0      100               127.0.0.1:25                       *:*
LISTEN     0      128                       *:42907                    *:*

Port 80 is listening again: the monitor operation detected that httpd had been stopped outside the cluster and restarted it.

Define one more VIP address

crm(live)# configure
crm(live)configure# primitive vip ocf:heartbeat:IP
ocf:heartbeat:IPaddr     ocf:heartbeat:IPaddr2    ocf:heartbeat:IPsrcaddr
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.13 monitor interval=30s timeout=15s
ERROR: syntax in primitive: Unknown arguments: monitor interval=30s timeout=15s near <monitor> parsing 'primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.13 monitor interval=30s timeout=15s'
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.13 op monitor interval=30s timeout=15s
crm(live)configure# verify
WARNING: vip: specified timeout 15s for monitor is smaller than the advised 20s
crm(live)configure# delete vip
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.13 op monitor interval=30s timeout=20s
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node hadoop1.abc.com \
        attributes standby=off
node hadoop2.abc.com \
        attributes standby=off
primitive vip IPaddr \
        params ip=192.168.1.13 \
        op monitor interval=30s timeout=20s
primitive webip IPaddr \
        params ip=192.168.1.12 \
        meta target-role=Started
primitive webserver lsb:httpd \
        meta target-role=Started \
        op monitor interval=30s timeout=15s
location webip_on_hadoop2 webip 200: hadoop2.abc.com
location webserver_on_hadoop2 webserver 200: hadoop2.abc.com
order webip_befor_webserver Mandatory: webip:start webserver
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        last-lrm-refresh=1437057604

Reposted from: https://blog.51cto.com/zouqingyun/1674207
