Enterprise Operations: Layer 7 Load Balancing with HAProxy

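1. Introduction
2. Types of load balancing
3. Deploying HAProxy
- 3.1 Load balancing and monitoring with HAProxy
- 3.2 Log collection
- 3.3 Scheduling algorithms
- 3.5 Configuring a backup server
4. Access control in HAProxy
- 4.1 Routing to different backends through the scheduler
- 4.2 Client blacklists and whitelists
- 4.3 Read/write splitting
5. High availability for HAProxy
6. The fence mechanism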

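1. Introduction

https://www.haproxy.org/ (official website)

https://www.haproxy.org/download/1.8/src/haproxy-1.8.14.tar.gz (download)

HAProxy provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, with support for virtual hosts. It is a free, fast, and reliable solution. HAProxy is particularly well suited to very high-traffic websites that also need session persistence or Layer 7 processing. On current hardware it can readily handle tens of thousands of concurrent connections. Its operating model makes it simple and safe to integrate into an existing architecture, and it keeps the web servers from being exposed directly to the network.

HAProxy implements an event-driven, single-process model, which allows it to handle a very large number of concurrent connections.

Layer 7 is web content switching: traffic is steered based on application-layer content. The load balancer combines that content, which is the part of the traffic that actually carries meaning, with its own configured rules to choose a server. Layer 7 load balancing builds on Layer 4: it uses Layer 7 information to decide how to forward traffic and which server to select.
Layer 7 switching also has the advantage when it comes to avoiding stalled traffic. When a packet arrives, the load balancer inspects the HTTP headers, uses them to decide which server should receive the request, and determines how content of different kinds, such as personal data, images, or video, should be served. An HTTP request targets a URL, but with web content switching the same URL can be served by several different servers. The client does not hold a single session with a real server; instead, the load balancer opens its own sessions to the real servers.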

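2. Types of load balancing

No load balancing: a simple web application environment without load balancing works as follows.

The user connects directly to the web server, and nothing balances the load. If that single web server fails, the user can no longer reach the site. In addition, if many users access the server at the same time and it cannot handle the load, they may find it slow or may be unable to connect at all.

Layer 4 load balancing

The simplest way to balance network traffic across several servers is Layer 4 (transport layer) load balancing. It forwards user traffic based on IP range and port. For example, a request for the web service on port 80 is forwarded to the backend that handles web requests.

Layer 4 switching is implemented with a virtual IP address (VIP). This address is not tied to a particular machine or to a network interface card in one. When a packet is sent to the VIP, the Layer 4 switching function assigns it to a real network interface according to the configured scheduling algorithm. Each TCP request can be dynamically assigned to one of the real IP addresses, which is how the load is balanced.
In short, Layer 4 load balancing picks the backend server from the destination address and port in the packet, together with the server selection method configured on the load balancer.

The user connects to the load balancer, which forwards the request to the web backend group of backend servers. Whichever backend server is chosen answers the user's request directly. Normally all servers in the web backend should serve identical content; otherwise users may receive inconsistent content.

Layer 7 load balancing

A more sophisticated way to balance network traffic is Layer 7 (application layer) load balancing. Working at Layer 7 lets the load balancer forward requests to different backend servers depending on the content of the request. This mode makes it possible to run several web application servers under the same domain and port.

Layer 7 load balancing, also called "content switching", picks the backend server from the meaningful application-layer content of the request, together with the server selection method configured on the load balancer.

Note: Layer 4 LVS generally performs somewhat better, while Layer 7 HAProxy can apply fine-grained policies. Enterprises therefore usually combine the two: lvs -> haproxy -> rs.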

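3. Deploying HAProxy

Lab environment: server1 is the scheduler; server2 and server3 are the backend real servers.

3.1 Load balancing and monitoring with HAProxy

A Layer 7 scheduler supports port forwarding and comes with built-in health checks. First install the haproxy service on the scheduler.
Then make sure the system's open-file limit can accommodate the maximum connection count set in the HAProxy configuration file.
The configuration file sets the maximum number of connections to 4000, but the system default limit on open files is 1024, so the operating system limit must be raised as well. Edit /etc/security/limits.conf to change the maximum number of open files.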

[root@server1 ~]# yum install -y haproxy
[root@server1 ~]# cd /etc/haproxy/
[root@server1 haproxy]# id haproxy
uid=188(haproxy) gid=188(haproxy) groups=188(haproxy)
[root@server1 haproxy]# sysctl -a | grep file
fs.file-max = 200070
fs.file-nr = 928   0   200070
fs.xfs.filestream_centisecs = 3000
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
[root@server1 haproxy]# vim /etc/security/limits.conf
haproxy         -       nofile          4096
# Append the line above to the end of the file. It raises the open-file limit for the haproxy user from the default of 1024 to 4096; "-" applies the value to both the soft and the hard limit.
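The comparison described above (the connection count from the configuration file against the open-file limit) can be written as a small check. This is only a sketch: check_nofile is a hypothetical helper, not part of HAProxy, and it simply compares two numbers that you supply.

```shell
# check_nofile REQUIRED [CURRENT]
# Succeeds when CURRENT (default: the soft limit of this shell, `ulimit -n`)
# is at least REQUIRED; otherwise reports the shortfall and fails.
check_nofile() {
    required=$1
    current=${2:-$(ulimit -n)}
    # "unlimited" always satisfies the requirement
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]; then
        echo "ok: nofile limit $current >= $required"
        return 0
    fi
    echo "too low: nofile limit $current < $required"
    return 1
}
```

For example, check_nofile 4000 1024 fails with the default limit, while check_nofile 4000 4096 succeeds with the value written to limits.conf above.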

In /etc/haproxy/, edit haproxy.cfg as shown below.

In the configuration file, global is the section of global settings and defaults is the section of default settings.

[root@server1 haproxy]# vim haproxy.cfg
    stats uri /status                   # added in the defaults section to enable the statistics page
frontend  main *:80                     # the load balancer listens on port 80
#    acl url_static       path_beg       -i /static /images /javascript /stylesheets
#    acl url_static       path_end       -i .jpg .gif .png .css .js
#
#    use_backend static          if url_static
    default_backend             app     # requests go to the app backend by default
#backend static
#    balance     roundrobin
#    server      static 127.0.0.1:4331 check
backend app
    balance     roundrobin              # weighted round robin
    server  app1 172.25.105.2:80 check
    server  app2 172.25.105.3:80 check
[root@server1 haproxy]# systemctl start haproxy
[root@server1 haproxy]# netstat -antlp | grep :80
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      10223/haproxy

The statistics page is now available in a browser at http://172.25.105.1/status.

Requests to http://172.25.105.1/ show the load being balanced; curl 172.25.105.1 works as a test too:

[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server3
[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server3
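Counting the answers makes the distribution easier to read than scanning the output by eye. A minimal sketch, assuming each backend returns a one-line page such as server2; tally_backends is a hypothetical helper name.

```shell
# tally_backends: read one response body per line on stdin and print
# "<count> <body>" for each distinct body, most frequent first.
tally_backends() {
    sort | uniq -c | sort -rn | awk '{print $1, $2}'
}
```

Typical use against the scheduler in this lab: for i in 1 2 3 4 5 6 7 8 9 10; do curl -s 172.25.105.1; done | tally_backends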

If one backend is shut down now, HAProxy's health checks detect it automatically.

Exposing the statistics page without protection is not very safe, so add authentication to it:

[root@server1 haproxy]# vim haproxy.cfg
    stats uri /status
    stats auth admin:zxk
[root@server1 haproxy]# systemctl restart haproxy

http://172.25.105.1/status now asks for credentials.

3.2 Log collection

By default the logs go to /var/log/messages. To make the access logs easier to inspect, edit /etc/rsyslog.conf as follows:

[root@server1 haproxy]# vim /etc/rsyslog.conf
# Provides UDP syslog reception
# (enable log reception over UDP)
$ModLoad imudp
$UDPServerRun 514
local7.*                                                /var/log/boot.log
# write everything from the local2 facility to the file below
local2.*                       /var/log/haproxy.log
[root@server1 haproxy]# systemctl restart rsyslog

Right after the restart, /var/log/haproxy.log does not exist yet; it is created once a few requests have been made.

[root@server1 haproxy]# ll /var/log/haproxy.log
-rw------- 1 root root 12222 Jul 11 22:11 /var/log/haproxy.log
[root@server1 haproxy]# tail -n3 /var/log/haproxy.log
Jul 11 22:10:49 localhost haproxy[11761]: Proxy app started.
Jul 11 22:10:54 localhost haproxy[11762]: 172.25.105.250:45398 [11/Jul/2021:22:10:54.750] main main/<STATS> -1/-1/-1/-1/0 401 262 - - PR-- 0/0/0/0/0 0/0 "GET /status HTTP/1.1"
Jul 11 22:11:11 localhost haproxy[11762]: 172.25.105.250:45400 [11/Jul/2021:22:11:11.300] main main/<STATS> 0/0/0/0/0 200 14104 - - LR-- 1/1/0/0/0 0/0 "GET /status HTTP/1.1"
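The request records above can be reduced to the fields that matter when checking where traffic went. This is a sketch under one assumption: the field positions are taken from the sample lines in this article (syslog timestamp, host, and haproxy[pid]: prefix followed by the HTTP log fields), so a different log format would need different positions. haproxy_requests is a hypothetical helper name.

```shell
# haproxy_requests: read HAProxy HTTP log lines on stdin and print
# "client frontend backend/server status" for each request record.
# Lines that are not request records, such as "Proxy app started.", are skipped.
haproxy_requests() {
    awk '$11 ~ /^[0-9][0-9][0-9]$/ {
        split($6, client, ":")      # drop the source port from ip:port
        print client[1], $8, $9, $11
    }'
}
```

Usage: haproxy_requests < /var/log/haproxy.log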

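3.3 Scheduling algorithms

source algorithm
The load balancing so far used the default round-robin (rr) algorithm.
Change the algorithm to source. It looks up the source IP in a static hash table to assign requests to real servers. As long as the source IP stays the same and the chosen backend server remains available, requests keep going to that server; they are sent to another server only when that backend becomes unavailable.

[root@server1 haproxy]# vim haproxy.cfg
backend app
    #balance     roundrobin
    balance     source
    server  app1 172.25.105.2:80 check
    server  app2 172.25.105.3:80 check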
[root@server1 haproxy]# systemctl reload haproxy
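The idea behind source hashing can be illustrated with a toy mapping from client address to server. This is not HAProxy's actual hash function: the sketch below, with the hypothetical name pick_by_source, simply treats the IPv4 address as a 32-bit number and takes it modulo the number of servers, which is enough to show why one client keeps landing on the same backend.

```shell
# pick_by_source IP SERVER...
# Print the server that a simple "address modulo server count" hash
# assigns to IP. Illustration only; HAProxy's real hash differs.
pick_by_source() {
    ip=$1
    shift
    [ "$#" -gt 0 ] || return 1
    # 1-based index of the chosen server
    idx=$(echo "$ip" | awk -F. -v n="$#" \
        '{ print (($1 * 16777216 + $2 * 65536 + $3 * 256 + $4) % n) + 1 }')
    i=1
    for server in "$@"; do
        if [ "$i" -eq "$idx" ]; then
            echo "$server"
            return 0
        fi
        i=$((i + 1))
    done
}
```

Calling pick_by_source 172.25.105.250 app1 app2 repeatedly always prints the same server name, which is the behaviour the source algorithm provides for a fixed client address.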

Setting weights

A backend server with better performance can be given a larger share of the requests:

[root@server1 haproxy]# vim haproxy.cfg
backend app
    balance     roundrobin
    #balance     source
    server  app1 172.25.105.2:80 check weight 2
    server  app2 172.25.105.3:80 check
[root@server1 haproxy]# systemctl reload haproxy

The test result looks like this:

[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server3
[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server2
[root@westos ~]# curl 172.25.105.1
server3
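The 2:1 pattern in this output can be reproduced with a very simple model of weighted round robin, in which each server occupies as many consecutive slots as its weight. This is a sketch for intuition only: wrr_sequence is a hypothetical helper, and HAProxy's own scheduler is more elaborate than a plain slot list.

```shell
# wrr_sequence COUNT NAME:WEIGHT...
# Print COUNT picks by cycling through a slot list in which every server
# appears WEIGHT times in a row.
wrr_sequence() {
    count=$1
    shift
    # expand "server2:2 server3:1" into "server2 server2 server3"
    slots=""
    for spec in "$@"; do
        name=${spec%%:*}
        weight=${spec##*:}
        w=0
        while [ "$w" -lt "$weight" ]; do
            slots="$slots $name"
            w=$((w + 1))
        done
    done
    [ -n "$slots" ] || return 1
    # walk the slot list repeatedly until COUNT picks have been printed
    printed=0
    while [ "$printed" -lt "$count" ]; do
        for name in $slots; do
            [ "$printed" -lt "$count" ] || break
            echo "$name"
            printed=$((printed + 1))
        done
    done
}
```

wrr_sequence 6 server2:2 server3:1 prints server2, server2, server3 twice over, matching the curl results above.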

3.5 Configuring a backup server

When both backend servers are down, a backup server can return a friendly notice instead of passing an error back to the user:

[root@server1 haproxy]# vim haproxy.cfg
backend app
    balance     roundrobin
    #balance     source
    server  app1 172.25.105.2:80 check weight 2
    server  app2 172.25.105.3:80 check
    server  backup 172.25.105.1:8080 backup  # port 8080 on this host
[root@server1 haproxy]# vim /etc/httpd/conf/httpd.conf
# change the local httpd listening port to 8080
#Listen 12.34.56.78:80
Listen 8080
[root@server1 haproxy]# cat /var/www/html/index.html
Sorry ! Please try again later!!
[root@server1 haproxy]# systemctl start httpd
[root@server1 haproxy]# systemctl reload haproxy
[root@server1 haproxy]# netstat -antulp| grep :80
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      14612/haproxy
tcp6       0      0 :::8080                 :::*                    LISTEN      14619/httpd

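The backup server now appears on http://172.25.105.1/status.

The backup server is used only after all the other servers have stopped. In production it should only display an error notice that tells users not to retry for now.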

4. Access control in HAProxy

4.1 Routing to different backends through the scheduler

Edit haproxy.cfg and enable the access rules.
Requests whose path begins with /static, /images, /javascript, or /stylesheets, or ends in .jpg, .gif, .png, .css, or .js, are forwarded to the static backend; all other requests go to the default app backend.

[root@server1 haproxy]# vim haproxy.cfg
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
    use_backend static          if url_static
    default_backend             app
backend static
    balance     roundrobin
    server      static 172.25.105.3:80 check
backend app
    balance     roundrobin
    server  app1 172.25.105.2:80 check
[root@server1 haproxy]# systemctl reload haproxy

Add content on server3. After that, http://172.25.105.1/images/vim.jpg returns the image:

[root@server3 ~]# mkdir /var/www/html/images
[root@server3 ~]# ls
vim.jpg
[root@server3 ~]# mv vim.jpg /var/www/html/images/
[root@server3 ~]# cd /var/www/html/images/
[root@server3 images]# ls
vim.jpg

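http://172.25.105.1/status now lists both the static and the app backends.

Test:
Requests go to server2 by default. Only requests whose path begins with /images (or another static prefix) or ends in a static extension such as .jpg or .png are routed to server3.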
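The routing decision made by the url_static rules in this section can be mirrored in a few lines of shell, which is handy for predicting where a given path will land. A minimal sketch: route_path is a hypothetical helper that takes the URL path only (no query string), and the backend names static and app are the ones used in this article.

```shell
# route_path PATH
# Print "static" when PATH matches the url_static rules (path_beg or
# path_end, case-insensitive because of -i), otherwise "app".
route_path() {
    path=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $path in
        /static*|/images*|/javascript*|/stylesheets*) echo static ;;   # path_beg
        *.jpg|*.gif|*.png|*.css|*.js)                 echo static ;;   # path_end
        *)                                            echo app ;;      # default_backend
    esac
}
```

route_path /images/vim.jpg prints static (server3 in this lab), while route_path /index.php prints app (server2).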

4.2 Client blacklists and whitelists

Blacklist

Requests from source address 172.25.105.250 are denied:

[root@server1 haproxy]# vim haproxy.cfg
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
    acl blacklist src 172.25.105.250    # blacklist, matched on source address
    block if blacklist
    use_backend static          if url_static
    default_backend             app
[root@server1 haproxy]# systemctl restart haproxy

Test:
Requests normally go to the app backend. A client on the blacklist is denied; all other clients can access the service as usual.

[root@westos ~]# curl 172.25.105.1
<html><body><h1>403 Forbidden</h1>
Request forbidden by administrative rules.
</body></html>
[root@server3 ~]# curl 172.25.105.1
server2
[root@server2 ~]# curl 172.25.105.1
server2

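Besides a blacklist there is also a whitelist. It is written much like the blacklist: name the ACL whitelist, list the allowed IP addresses or subnets after it, and negate the condition (block if !whitelist). Whitelisted clients reach the app backend, and everyone else is denied.

Redirect

Returning a bare error once the blacklist or whitelist is in place is rather harsh and unfriendly. The error can instead be redirected to an error page:

[root@server1 haproxy]# vim haproxy.cfg
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
    acl blacklist src 172.25.105.250
    block if blacklist
    errorloc 403 http://172.25.105.1:8080
    use_backend static          if url_static
    default_backend             app
[root@server1 haproxy]# systemctl restart haproxy

Test: pass -I to curl so that the response headers are shown, because curl does not display the redirect by default: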

[root@westos ~]# curl 172.25.105.1 -I
HTTP/1.1 302 Found
Cache-Control: no-cache
Content-length: 0
Location: http://172.25.0.1:8080

The client can also be redirected to another website:

[root@server1 haproxy]# vim haproxy.cfg
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
    acl blacklist src 172.25.105.250
    #block if blacklist
    #errorloc 403 http://172.25.105.1:8080
    redirect location http://www.baidu.com if blacklist
    use_backend static          if url_static
    default_backend             app
[root@server1 haproxy]# systemctl restart haproxy

Now client 172.25.105.250 is redirected to Baidu when it accesses the server.

4.3 Read/write splitting

Send reads and writes to different real servers (RS). Edit haproxy.cfg and rewrite the matching rules.

Requests that match the write rule go to the static backend (172.25.105.3:80); requests that do not match go to the default app backend.

[root@server1 haproxy]# vim haproxy.cfg
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js
    acl write method PUT
    acl write method POST
    use_backend static          if write
    default_backend             app
backend static
    balance     roundrobin
    server      static 172.25.105.3:80 check
backend app
    balance     roundrobin
    #balance     source
    server  app1 172.25.105.2:80 check weight 2
    #server  app2 172.25.105.3:80 check
    server  backup 172.25.105.1:8080 backup
[root@server1 haproxy]# systemctl reload haproxy

Install PHP on server3 and check that it works:

[root@server3 ~]# yum install -y php
[root@server3 images]# cd /var/www/html/
[root@server3 html]# vim index.php
[root@server3 html]# cat index.php     # PHP test page, used to check that PHP is installed correctly
<?php
phpinfo();
?>
[root@server3 html]# systemctl restart httpd

Install PHP on the RS. In the default Apache document root on the RS, write two PHP files to handle dynamic writes, create an upload directory, and set its permissions to 777:

[root@server2 ~]# yum install -y php
[root@server2 ~]# systemctl restart httpd  # restarting Apache loads the PHP module
[root@server2 html]# vim upload_file.php
[root@server2 html]# cat upload_file.php   # handles the upload
<?php
// accept only GIF or JPEG files smaller than 2 MB
if ((($_FILES["file"]["type"] == "image/gif")
|| ($_FILES["file"]["type"] == "image/jpeg")
|| ($_FILES["file"]["type"] == "image/pjpeg"))
&& ($_FILES["file"]["size"] < 2000000))
{
    if ($_FILES["file"]["error"] > 0) {
        echo "Return Code: " . $_FILES["file"]["error"] . "<br />";
    } else {
        echo "Upload: " . $_FILES["file"]["name"] . "<br />";
        echo "Type: " . $_FILES["file"]["type"] . "<br />";
        echo "Size: " . ($_FILES["file"]["size"] / 1024) . " Kb<br />";
        echo "Temp file: " . $_FILES["file"]["tmp_name"] . "<br />";
        if (file_exists("upload/" . $_FILES["file"]["name"])) {
            echo $_FILES["file"]["name"] . " already exists. ";
        } else {
            // move the temporary file into the upload directory
            move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . $_FILES["file"]["name"]);
            echo "Stored in: " . "upload/" . $_FILES["file"]["name"];
        }
    }
} else {
    echo "Invalid file";
}
?>
[root@server2 html]# cat index.php   # PHP page with the upload form
<html>
<body><form action="upload_file.php" method="post"
enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file" />
<br />
<input type="submit" name="submit" value="Submit" />
</form></body>
</html>
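[root@server2 html]# mkdir upload
[root@server2 html]# chmod 777 upload
[root@server2 html]# scp -r index.php upload_file.php upload server3:/var/www/html/

Restart httpd. The read/write split can now be tested in a browser: open http://172.25.105.1/index.php and upload a file. Plain requests to http://172.25.105.1 only ever reach server2, while the uploaded file ends up on server3: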

[root@server3 html]# cd upload
[root@server3 upload]# ls
vim.jpg
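The write rule used in this section depends only on the HTTP method, so its outcome is easy to model. A minimal sketch, with the hypothetical helper route_method and the backend names from this article:

```shell
# route_method METHOD
# Print the backend chosen by the "write" ACL: PUT and POST go to
# "static" (server3 in this lab), every other method goes to "app".
route_method() {
    case $1 in
        PUT|POST) echo static ;;
        *)        echo app ;;
    esac
}
```

This explains the result above: the form page is fetched with GET from the app backend, and the form submission is a POST, so the file is stored on server3. Assuming a local file named vim.jpg, the same upload could be sent from the command line with curl -F "file=@vim.jpg;type=image/jpeg" http://172.25.105.1/upload_file.php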

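5. High availability for HAProxy

High availability here requires two hosts, so that one can take over when the other fails.

Passwordless SSH
The operations on the two hosts are almost identical. To avoid mistakes, set up passwordless SSH so that everything on both hosts can be done from one of them: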

[root@server1 ~]# ssh-keygen
[root@server1 ~]# ssh-copy-id server4

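Install the high-availability packages on both hosts

The default software repository does not include the high-availability add-on, so add the HighAvailability repository from the installation image: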

[root@westos ~]# cd /var/www/html/rhel7
[root@westos rhel7]# ls
addons  EULA              GPL     isolinux  media.repo  repodata                 RPM-GPG-KEY-redhat-release
EFI     extra_files.json  images  LiveOS    Packages    RPM-GPG-KEY-redhat-beta  TRANS.TBL
[root@westos rhel7]# cd addons/
[root@westos addons]# ls
HighAvailability  ResilientStorage
[root@server1 yum.repos.d]# vim dvd.repo
[root@server1 yum.repos.d]# cat dvd.repo
[dvd]
name=rhel7.6
baseurl=http://172.25.105.250/rhel7
gpgcheck=0
[HighAvailability]
name=rhel7.6 HighAvailability
baseurl=http://172.25.105.250/rhel7/addons/HighAvailability
gpgcheck=0
[root@server1 yum.repos.d]# yum repolist
[root@server1 yum.repos.d]# scp dvd.repo server4:/etc/yum.repos.d/
[root@server1 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
[root@server1 ~]# ssh server4  yum install -y pacemaker pcs psmisc policycoreutils-python

Start the services

The cluster packages create a user named hacluster automatically. After enabling pcsd, set a password for that user.

[root@server1 ~]# systemctl enable --now pcsd.service
[root@server1 ~]# ssh server4 systemctl  enable --now pcsd.service
[root@server1 ~]# echo westos | passwd --stdin hacluster
[root@server1 ~]# ssh server4 'echo westos | passwd --stdin hacluster'

Authentication
Authenticate the cluster nodes as the hacluster user:

[root@server1 ~]# pcs cluster auth server1 server4
Username: hacluster
Password:
server4: Authorized
server1: Authorized

Create the cluster
Create a two-node cluster named mycluster and start it:

[root@server1 ~]# pcs cluster setup --name mycluster server1 server4
[root@server1 ~]# pcs cluster start --all
[root@server1 ~]# pcs cluster enable --all   # start at boot

Check the cluster state with pcs status.
A warning appears at this point. Disable the STONITH feature for now; the next pcs status then shows no warning.

[root@server1 haproxy]# pcs status
Cluster name: mycluster
WARNINGS:
No stonith devices and stonith-enabled is not false
[root@server1 haproxy]# pcs property set stonith-enabled=false
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 14:45:04 2021
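Last change: Sun Jul 11 14:45:02 2021 by root via cibadmin on server1
2 nodes configured
0 resources configured
Online: [ server1 server4 ]
No resources
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Configure the VIP
The cluster itself is now complete, but it does not manage any service yet. The next step is to add resources to it.

Before adding them, install haproxy on server4 as well and make sure both nodes have identical configuration: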

[root@server1 ~]# ssh server4 yum install haproxy -y
[root@server1 haproxy]# scp haproxy.cfg server4:/etc/haproxy/
[root@server1 haproxy]# ssh server4 systemctl start haproxy

Add the VIP:

[root@server1 ~]# pcs resource standards    # list the available resource standards
lsb
ocf
service
systemd
[root@server1 ~]# pcs resource providers   # list the available providers
heartbeat
openstack
pacemaker
[root@server1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.25.105.100 op monitor interval=30s
# add the VIP and monitor it every 30s
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 14:53:28 2021
Last change: Sun Jul 11 14:53:17 2021 by root via cibadmin on server1
2 nodes configured
1 resource configured
Online: [ server1 server4 ]
Full list of resources:
 vip  (ocf::heartbeat:IPaddr2):   Started server1
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Simulate a failover and watch the VIP move: when server1 is put in standby, server4 takes over the VIP.

[root@server1 ~]# pcs node standby
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 14:54:11 2021
Last change: Sun Jul 11 14:54:03 2021 by root via cibadmin on server1
2 nodes configured
1 resource configured
Node server1: standby  # server1 is offline
Online: [ server4 ]
Full list of resources:
 vip   (ocf::heartbeat:IPaddr2):   Started server4
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server1 ~]# pcs node unstandby

When server1 comes back online it rejoins the cluster automatically, but the VIP stays where it is.

Add the service

Stop haproxy and let the cluster manage it.

[root@server1 ~]# systemctl disable --now haproxy
[root@server1 ~]# ssh server4 systemctl disable --now haproxy
[root@server1 ~]# netstat -antlp | grep :80
tcp6       0      0 :::8080                 :::*                    LISTEN      9708/httpd
[root@server1 ~]# pcs resource create haproxy systemd:haproxy op monitor interval=60s
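# add haproxy as a cluster resource and monitor its state every 60s

haproxy is now a cluster resource, but it may end up on a different node from the VIP. Put both resources in a group so that they always run on the same node.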

[root@server1 ~]# pcs resource group add hagroup vip haproxy
[root@server4 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 14:59:04 2021
Last change: Sun Jul 11 14:58:25 2021 by root via cibadmin on server1
2 nodes configured
2 resources configured
Online: [ server1 server4 ]
Full list of resources:
 Resource Group: hagroup
     vip  (ocf::heartbeat:IPaddr2):   Started server4
     haproxy  (systemd:haproxy):  Started server4
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
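Whether the two resources really share a node can be read off the status output mechanically. This is a sketch that assumes the resource lines look like the ones printed by the pcs version used in this article (resource name first, then Started and the node name at the end of the line); resource_node and colocated are hypothetical helper names.

```shell
# resource_node RESOURCE
# Read `pcs status` output on stdin and print the node on which RESOURCE
# is reported as Started.
resource_node() {
    awk -v res="$1" '$1 == res && $(NF - 1) == "Started" { print $NF }'
}

# colocated RES1 RES2
# Read `pcs status` output on stdin; succeed only if both resources are
# started on the same node.
colocated() {
    pcs_out=$(cat)
    a=$(printf '%s\n' "$pcs_out" | resource_node "$1")
    b=$(printf '%s\n' "$pcs_out" | resource_node "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage on a cluster node: pcs status | colocated vip haproxy && echo "vip and haproxy share a node"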

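Simulate a node going offline:
When server4 goes offline, the online node server1 takes over the VIP and the service. If a node goes offline and no other node is online, nothing can balance the load and the service is unreachable: accessing the VIP returns an error, or the backup server's page if a backup is configured.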

[root@server4 ~]# pcs node standby
[root@server4 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 14:59:56 2021
Last change: Sun Jul 11 14:59:29 2021 by root via cibadmin on server4
2 nodes configured
2 resources configured
Node server4: standby
Online: [ server1 ]
Full list of resources:
 Resource Group: hagroup
     vip    (ocf::heartbeat:IPaddr2):   Started server1
     haproxy  (systemd:haproxy):  Started server1
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Simulate a kernel crash:
When the node holding the VIP crashes, the resources switch over automatically. After a kernel failure the node has to be power-cycled; once it is back up, it rejoins the cluster automatically.

[root@server1 ~]# echo c > /proc/sysrq-trigger
[root@server4 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 15:02:08 2021
Last change: Sun Jul 11 15:00:16 2021 by root via cibadmin on server4
2 nodes configured
2 resources configured
Online: [ server4 ]
OFFLINE: [ server1 ]
Full list of resources:
 Resource Group: hagroup
     vip   (ocf::heartbeat:IPaddr2):   Started server4
     haproxy  (systemd:haproxy):  Started server4
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

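6. The fence mechanism

In the load-balancing cluster built above, the nodes exchange heartbeats to judge each other's state. If Server1's heartbeat fails, Server4 assumes Server1 is down, and the resources are scheduled onto Server4.

But Server1 may only be hung, for example under excessive load: it looks dead, yet it has not released its resources. Alternatively, a network problem may stop the nodes from seeing each other's heartbeats, so each wrongly concludes that the other has failed.
When the network recovers, both hosts are running the resources. This is split brain.

To prevent this, a cluster normally includes a fence device. A fence that uses the server's own hardware management interface is called an internal fence; one that controls an external power device is called an external fence. In production this job is often done by a high-end switch; here a plugin provides the internal fence function.

When a server stops responding within the timeout, the fence device sends it a hardware management command that cuts its power directly, to keep the failure from causing uncontrolled damage, and signals the other nodes to take over the service.

Host machine
Make sure the following packages are installed on the host machine, then complete the configuration: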

[westos@westos Desktop]$ rpm -qa| grep fence
fence-virtd-0.4.0-9.el8.x86_64
fence-virtd-multicast-0.4.0-9.el8.x86_64
libxshmfence-1.3-2.el8.x86_64
fence-virtd-libvirt-0.4.0-9.el8.x86_64
[root@westos ~]# fence_virtd -c
# configure fence_virtd interactively
[root@westos ~]# cd /etc/cluster/  # the key is generated in this directory; create the directory first if it does not exist
[root@westos cluster]# ls
fence_xvm.key
[root@westos cluster]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1
# generate the key
[root@westos cluster]# systemctl start fence_virtd.service
[root@westos cluster]# netstat -anulp| grep :1229
udp        0      0 0.0.0.0:1229            0.0.0.0:*                           20131/fence_virtd

On the scheduler hosts

Install the fence client on both scheduler hosts and complete the configuration:

[root@server1 ~]# yum install -y fence-virt
[root@server1 ~]# ssh server4 yum install -y fence-virt
[root@server1 ~]# stonith_admin -I    # list the available fence agents
 fence_xvm
 fence_virt
2 devices found
[root@server1 ~]# stonith_admin -M -a fence_xvm    # show the metadata of the fence_xvm agent
[root@server1 ~]# mkdir /etc/cluster
[root@server1 ~]# ssh server4 mkdir /etc/cluster

Copy the key to both scheduler hosts:

[root@westos cluster]# scp fence_xvm.key root@172.25.105.1:/etc/cluster
[root@westos cluster]# scp fence_xvm.key root@172.25.105.4:/etc/cluster
[root@server1 ~]# ll /etc/cluster/fence_xvm.key

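Add the fence resource to pacemaker; vmfence is a name of our choosing. pcmk_host_map maps each cluster host name to its virtual machine domain name, host name first and domain name second. The monitor operation checks the device every 60s.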

[root@server1 ~]# pcs stonith create vmfence fence_xvm pcmk_host_map="server1:lvs1;server4:lvs4" op monitor interval=60s
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 16:30:29 2021
Last change: Sun Jul 11 16:30:19 2021 by root via cibadmin on server1
2 nodes configured
3 resources configured
Online: [ server1 server4 ]
Full list of resources:
 Resource Group: hagroup
     vip  (ocf::heartbeat:IPaddr2):   Started server4
     haproxy  (systemd:haproxy):  Started server4
 vmfence  (stonith:fence_xvm):    Started server1
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
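A wrong entry in pcmk_host_map means the fence request targets the wrong virtual machine, so it is worth checking the mapping before relying on it. The sketch below only parses the map string in the host:domain;host:domain form used in the command above; domain_for is a hypothetical helper, and comparing its answer with the domain names known to the hypervisor is left to the reader.

```shell
# domain_for NODE MAP
# Look up NODE in a pcmk_host_map-style string ("node:domain;node:domain")
# and print the matching domain name; fail if NODE is not listed.
domain_for() {
    printf '%s\n' "$2" | tr ';' '\n' | awk -F: -v node="$1" '
        $1 == node { print $2; found = 1 }
        END { exit found ? 0 : 1 }'
}
```

With the map from this article, domain_for server1 "server1:lvs1;server4:lvs4" prints lvs1, the virtual machine that is power-cycled when server1 has to be fenced.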

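The fence device has been added, but STONITH was disabled earlier, so it must be enabled again or fencing will not work.
With a fence device now configured, enabling STONITH no longer produces a warning.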

[root@server1 ~]# pcs property set stonith-enabled=true
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 16:34:12 2021
Last change: Sun Jul 11 16:34:10 2021 by root via cibadmin on server1
2 nodes configured
3 resources configured
Online: [ server1 server4 ]
Full list of resources:
 Resource Group: hagroup
     vip  (ocf::heartbeat:IPaddr2):   Started server4
     haproxy  (systemd:haproxy):  Started server4
 vmfence  (stonith:fence_xvm):    Started server1
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server1 ~]# crm_verify -LV   # validate the cluster configuration; no output means it is fine

Simulate a kernel failure:
The service switches to the peer node automatically. Once the failed node can no longer be detected, it is fenced and rebooted automatically. After it comes back up it rejoins the cluster, and the fence resource then runs on the freshly rebooted node.

[root@server4 ~]# echo c > /proc/sysrq-trigger
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 16:36:39 2021
Last change: Sun Jul 11 16:34:10 2021 by root via cibadmin on server1
2 nodes configured
3 resources configured
Online: [ server1 server4 ]
Full list of resources:
 Resource Group: hagroup
     vip  (ocf::heartbeat:IPaddr2):   Started server1
     haproxy  (systemd:haproxy):  Started server1
 vmfence  (stonith:fence_xvm):    Started server4
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Simulate a network failure:
When the network fails, the service switches to the peer node automatically. Once the failed node can no longer be detected, it is fenced and rebooted automatically. After it comes back up it rejoins the cluster, and the fence resource then runs on the freshly rebooted node.

[root@server1 ~]# ip link set down eth0
[root@server4 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Sun Jul 11 16:44:47 2021
Last change: Sun Jul 11 16:34:10 2021 by root via cibadmin on server1
2 nodes configured
3 resources configured
Online: [ server4 ]
OFFLINE: [ server1 ]
Full list of resources:
 Resource Group: hagroup
     vip   (ocf::heartbeat:IPaddr2):   Started server4
     haproxy  (systemd:haproxy):  Started server4
 vmfence  (stonith:fence_xvm):    Started server4
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled