Ansible one-click deployment of a high-availability cluster: a hands-on, step-by-step tutorial
Contents
- Service architecture diagram
- Environment setup
- IP planning and configuration
- Passwordless SSH login
- Building the cluster
- Management node
- Preparation
- Building the database
- Building the NAS storage node
- Building the backup node
- Building the web nodes
- Building the load-balancer nodes
- Configuring keepalived high availability
- Configuring Prometheus monitoring
- Preparing the binary packages
- Configuring the Prometheus server node
- Configuring the node-exporter probes
- Configuring the alertmanager component
- The master playbook
- Deploying the blog
- Wrap-up
Service architecture diagram
Environment setup
IP planning and configuration
Load-balancer nodes

nginx1: 192.168.146.100

- First, check the NIC connections on the machine:

```shell
nmcli connection show
```

- On my VM the connection name is ens33.
- Set the IP on that connection:

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.100/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

- After modifying the connection, remember to bounce it so the settings take effect:

```shell
nmcli connection down ens33
nmcli connection up ens33
```
nginx2: 192.168.146.101

- Change the IP. In a VM environment the connection name should be the same on every clone; if in doubt, repeat the check above before modifying.

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.101/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

- Restart the connection exactly as above; that step is omitted from here on.
Web servers

apache1: 192.168.146.102

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.102/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

apache2: 192.168.146.103

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.103/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

NAS storage node:

NFS server: 192.168.146.104

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.104/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```
MySQL master-slave cluster

master: 192.168.146.105

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.105/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

slave: 192.168.146.106

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.106/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

Prometheus server node

- prometheus: 192.168.146.107

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.107/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```

rsync server node

- rsyncd: 192.168.146.108

```shell
nmcli connection modify ens33 ipv4.address 192.168.146.108/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual
```
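The nine nmcli invocations above differ only in the last octet of the address. As a hedged convenience (an illustration, not part of the original workflow), a tiny helper can print the command for each node so you can eyeball them or pipe the output to a shell on each machine:

```shell
# Hypothetical helper: print the nmcli command for a given last octet.
# The interface name, gateway and DNS match the plan above.
gen_nmcli() {
    echo "nmcli connection modify ens33 ipv4.address 192.168.146.$1/24 ipv4.gateway 192.168.146.2 ipv4.dns 114.114.114.114 ipv4.method manual"
}

# Preview the command for every node in the IP plan
for octet in 100 101 102 103 104 105 106 107 108; do
    gen_nmcli "$octet"
done
```

Note that each node still runs only its own command; this only removes the copy-paste-edit cycle.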
Passwordless SSH login

Set up one management node with passwordless SSH access to every server configured above, so that later the whole architecture can be deployed from it with Ansible in one shot.

On the management node, generate a key pair:

```shell
ssh-keygen
```
- Just press Enter at every prompt.
Check that the RSA key pair was generated successfully.
Send the public key `id_rsa.pub` to every server. Run the following on the management node, once per server (the management node itself, 192.168.146.134, included), substituting the matching IP:

```shell
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.134
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.100
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.101
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.102
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.103
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.104
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.105
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.106
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.107
ssh-copy-id -i /root/.ssh/id_rsa.pub 192.168.146.108
```
Output like the following means the transfer succeeded.
On the other servers you can check for the `authorized_keys` file; if it exists, the public key was received from the management node.
From the management node, verify that you can SSH to the other nodes without a password:

```shell
[root@server1 ~]# ssh root@192.168.146.101
Last login: Sat Apr 9 10:39:32 2022 from 192.168.146.1
# Check the current IP -- we have landed on the nginx2 node
[root@server1 ~]# ip a | grep -v 'LOOPBACK' | awk '/^[0-9]/{print $2;getline;getline;if ($0~/inet/){print $2}}'
ens33:
192.168.146.101/24
```
Building the cluster

Management node

Preparation

All the commands below are run on the management node using Ansible.

Install Ansible with yum:

```shell
# Install the EPEL repository
yum install epel-release.noarch -y
# Install ansible
yum install -y ansible
```
Edit the Ansible inventory `/etc/ansible/hosts` and append the following at the end of the file:

```ini
[all_ip]
192.168.146.134 hostname=manager
192.168.146.100 hostname=nginx1
192.168.146.101 hostname=nginx2
192.168.146.102 hostname=apache1
192.168.146.103 hostname=apache2
192.168.146.104 hostname=nas rsync_server=192.168.146.108
192.168.146.105 hostname=master
192.168.146.106 hostname=slave
192.168.146.107 hostname=prometheus
192.168.146.108 hostname=rsyncd

[balancers]
nginx1 mb=MASTER priority=100
nginx2 mb=BACKUP priority=98

[web]
apache1
apache2

[mysql]
master master=true
slave slave=true

[mysql:vars]
master_ip=192.168.146.105
slave_ip=192.168.146.106

[nfs]
nas

[nfs:vars]
rsync_server=192.168.146.108

[rsync]
rsyncd

[prometheus]
prometheus

[alertmanagers]
prometheus

[node-exporter]
192.168.146.100
192.168.146.101
192.168.146.102
192.168.146.103
192.168.146.104
192.168.146.105
192.168.146.106
192.168.146.108
```
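A quick way to confirm the inventory variables are wired up as intended is a throwaway debug playbook (a sketch for illustration only, not part of the deployment; the file name `check_vars.yml` is hypothetical) that prints each balancer's keepalived role and priority straight from the inventory:

```yaml
# check_vars.yml -- hypothetical throwaway playbook
- name: show inventory vars
  hosts: balancers
  gather_facts: no
  tasks:
    - debug:
        msg: "{{ inventory_hostname }} will be {{ mb }} with priority {{ priority }}"
```

Run it with `ansible-playbook check_vars.yml`; each host should report the MASTER/BACKUP value and priority you assigned above.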
- Check connectivity (e.g. `ansible all_ip -m ping`); only when every server answers pong is the management node confirmed to reach all of them.
Set the hostname on every server, and while we are at it turn off the firewall and SELinux. Disabling them outright is the lazy route taken here; in production you would instead write per-role firewall rules and set the matching SELinux contexts.

```yaml
[root@server1 ansible]# vim prepare_work.yml
- name: prepare work
  hosts: all_ip
  tasks:
    - name: set hostname
      shell: hostnamectl set-hostname {{ hostname }}
    - name: stop firewalld
      service:
        name: firewalld
        state: stopped
        enabled: no
    - name: disable selinux
      selinux:
        state: disabled
```

Run the playbook:

```shell
[root@server1 ansible]# ansible-playbook prepare_work.yml
```
Generate an `/etc/hosts` resolution file for every host:

```shell
[root@server1 ~]# mkdir ansible
[root@server1 ~]# cd ansible/
```

Write the hosts template:

```
[root@server1 ansible]# vim hosts.j2
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
{% for host in groups.all_ip %}
{{ hostvars[host].ansible_ens33.ipv4.address }} {{ hostvars[host].ansible_hostname }}
{% endfor %}
```

Write the playbook:

```yaml
[root@server1 ansible]# vim hosts.yml
- name: update hosts file
  hosts: all_ip
  tasks:
    - name: copy hosts.j2 to the other servers
      template:
        src: hosts.j2
        dest: /etc/hosts
```

Run it:

```shell
[root@server1 ansible]# ansible-playbook hosts.yml
```
Create a directory to hold the roles:

```shell
[root@server1 ansible]# mkdir roles
```
- This roles directory is not Ansible's default roles path, so we have to point Ansible at it in `/etc/ansible/ansible.cfg`; otherwise Ansible will not find the roles we create.
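For example, assuming the project lives in `/root/ansible` (adjust the path to wherever you actually created the directory), the relevant line in `/etc/ansible/ansible.cfg` would look like:

```ini
# /etc/ansible/ansible.cfg -- point Ansible at our roles directory
[defaults]
roles_path = /root/ansible/roles
```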
That completes the environment preparation; on to configuring the individual services!
Building the database

Create the mysql role:

```shell
[root@server1 ansible]# ansible-galaxy init roles/mysql
- Role roles/mysql was created successfully
```
The database will be a master-slave pair, which involves quite a few parameters. Some are repeated, and some are what the Ansible tasks use to decide whether a node is the master or the slave; those per-host markers were already defined in the inventory (just as the keepalived setup later defines its MASTER/BACKUP markers there). The remaining shared values can be defined once in the role's `vars/main.yml` to save some repetition:

```yaml
[root@server1 ansible]# vim roles/mysql/vars/main.yml
---
# vars file for roles/mysql
mysql_sock: /var/lib/mysql/mysql.sock
mysql_port: 3306
repl_user: repl
repl_passwd: "123456"
```
Write a template for the MySQL configuration file `my.cnf`. The predefined variables come into play here: the master and the slave need different `server-id` values.

```ini
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
{% if master is defined %}
server-id=1
innodb_file_per_table=on
{% else %}
server-id=2
{% endif %}
log-bin=master-bin
binlog-format=ROW
# GTID-related settings
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
binlog-rows-query-log_events=1

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
```
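For clarity, this is roughly what the conditional block renders to on each side: only the master host has `master=true` in the inventory, so `master is defined` holds only there.

```ini
# Rendered on the master (master=true in the inventory):
server-id=1
innodb_file_per_table=on

# Rendered on the slave (no `master` variable defined):
server-id=2
```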
Write `roles/mysql/tasks/main.yml` to complete the one-shot master-slave setup. Notes:

- The tasks set up GTID-based replication, which is convenient: there is no need to look up the master's binlog position.
- MySQL is installed with yum rather than MariaDB, so the MySQL yum repository has to be installed first; otherwise yum would substitute MariaDB, since the default repository list carries no MySQL package.

```yaml
[root@server1 ansible]# vim roles/mysql/tasks/main.yml
---
# tasks file for roles/mysql
- name: yum install wget
  yum:
    name: wget
    state: present
- name: wget mysql repo
  shell: wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
- name: rpm mysql repo
  shell: rpm -ivh mysql-community-release-el7-5.noarch.rpm
  ignore_errors: True
- name: yum install mysql
  yum:
    name: mysql-server
    state: present
- name: config my.cnf
  template:
    src: my.cnf.j2
    dest: /etc/my.cnf
- name: start mysql
  service:
    name: mysqld
    state: restarted
    enabled: yes
- name: install MySQL-python
  yum:
    name: MySQL-python
    state: present
- name: update mysql root password
  shell: mysql -e "update mysql.user set password=password('123456') where user='root' and host='localhost';flush privileges;"
  ignore_errors: True
- name: create repl user
  mysql_user:
    name: "{{ repl_user }}"
    host: '192.168.146.%'
    password: "{{ repl_passwd }}"
    priv: "*.*:REPLICATION SLAVE"
    state: present
    login_user: 'root'
    login_password: '123456'
    login_host: localhost
    login_unix_socket: "{{ mysql_sock }}"
    login_port: "{{ mysql_port }}"
  when: master is defined
- name: change master to
  mysql_replication:
    mode: changemaster
    master_host: "{{ master_ip }}"
    master_user: "{{ repl_user }}"
    master_password: "{{ repl_passwd }}"
    login_password: '123456'
    login_host: localhost
    login_unix_socket: "{{ mysql_sock }}"
    login_port: "{{ mysql_port }}"
  when: slave is defined
- name: start slave
  mysql_replication:
    mode: startslave
    login_user: 'root'
    login_password: '123456'
    login_host: localhost
    login_unix_socket: "{{ mysql_sock }}"
    login_port: "{{ mysql_port }}"
  when: slave is defined
- name: get slave info
  mysql_replication:
    login_host: localhost
    login_user: root
    login_port: "{{ mysql_port }}"
    login_password: '123456'
    login_unix_socket: "{{ mysql_sock }}"
    mode: getslave
  when: slave is defined
  register: info
- name: display slave status
  debug:
    msg: "Slave_IO_Running={{ info.Slave_IO_Running }} Slave_SQL_Running={{ info.Slave_SQL_Running }}"
  when: slave is defined
```
Building the NAS storage node

This architecture shares local storage with the web nodes over NFS. Create the nfs role first:

```shell
[root@server1 ansible]# ansible-galaxy init roles/nfs
- Role roles/nfs was created successfully
```
With the role created, write `roles/nfs/tasks/main.yml`:

```yaml
[root@server1 ansible]# vim roles/nfs/tasks/main.yml
---
# tasks file for roles/nfs
- name: install nfs, rpcbind, expect
  yum:
    name: "{{ item }}"
    state: present
  loop:
    - nfs-utils
    - rpcbind
    - expect*
- name: create and export the shared directory
  shell: mkdir /data && chmod -Rf 777 /data && echo "/data 192.168.146.0/24(rw,sync,no_root_squash)" >> /etc/exports
- name: make dir for exp and sh
  file:
    path: /sh
    state: directory
- name: make dir for backup
  file:
    path: /backup
    state: directory
- name: config expect script
  template:
    src: rsync.exp.j2
    dest: /sh/rsync.exp
- name: config beifen.sh
  template:
    src: beifen.sh.j2
    dest: /sh/beifen.sh
- name: chmod beifen.sh
  file:
    path: /sh/beifen.sh
    mode: '0755'
- name: cron tab
  shell: echo "0 1 * * * root /sh/beifen.sh" >> /etc/crontab
- name: start nfs
  service:
    name: "{{ item }}"
    state: restarted
    enabled: yes
  loop:
    - rpcbind
    - nfs-server
```
Prepare the template for the expect script, `rsync.exp.j2`:

```shell
[root@manager ansible]# vim roles/nfs/templates/rsync.exp.j2
#!/usr/bin/expect
set mulu [lindex $argv 0]
set timeout 10
spawn rsync -avzr /backup/$mulu root@{{ rsync_server }}::backup_server
expect Password
send "123456\r"
expect eof
```
Prepare the template for the backup script, `beifen.sh.j2`:

```shell
[root@manager ansible]# vim roles/nfs/templates/beifen.sh.j2
#!/bin/bash
# Directory to archive into: <local IP>_<date>
mulu=`ip a | grep global | awk -F'[ /]+' '{print $3}'`_`date +%F`
echo $mulu
mkdir -pv /backup/$mulu &> /dev/null
# Pack the data to be sent
tar zcf /backup/$mulu/conf.tar.gz /data/* &> /dev/null
touch /backup/$mulu
# Push the data -- this line runs the expect script
expect /sh/rsync.exp $mulu
# Keep only the last seven days of backups
find /backup -mtime +7 -delete
```
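As a sanity check of the naming scheme: the backup directory name is simply `<local IP>_<YYYY-MM-DD>`. A minimal sketch with the NAS node's IP hard-coded for illustration (the real script extracts it from `ip a`):

```shell
# Illustrative only: node_ip is hard-coded here, not detected.
node_ip=192.168.146.104
mulu="${node_ip}_$(date +%F)"
echo "$mulu"
```

So a backup taken on 2022-04-09 would land in `/backup/192.168.146.104_2022-04-09/`.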
Building the backup node

Backups use an rsync daemon that receives the files the NAS node pushes on schedule. Create the rsync role first:

```shell
[root@manager ansible]# ansible-galaxy init roles/rsync
- Role roles/rsync was created successfully
```
Write the `rsyncd.conf` file:

```ini
[root@manager ~]# vim ansible/roles/rsync/files/rsyncd.conf
[backup_server]
path = /backup
uid = root
gid = root
max connections = 2
timeout = 300
read only = false
auth users = root
secrets file = /etc/rsync.passwd
strict modes = yes
use chroot = yes
```
Write `roles/rsync/tasks/main.yml`:

```yaml
[root@manager ansible]# vim roles/rsync/tasks/main.yml
---
# tasks file for roles/rsync
- name: yum install rsync
  yum:
    name: rsync
    state: present
- name: config rsyncd.conf
  copy:
    src: rsyncd.conf
    dest: /etc/rsyncd.conf
- name: make dir for backup
  file:
    path: /backup
    state: directory
- name: prepare rsync.passwd
  shell: echo "root:123456" >> /etc/rsync.passwd && chmod 600 /etc/rsync.passwd
- name: start rsync
  service:
    name: rsyncd
    state: started
    enabled: yes
```
Building the web nodes

This is a LAMP stack, so the web servers run Apache. Create the apache role first:

```shell
[root@server1 ansible]# ansible-galaxy init roles/apache
- Role roles/apache was created successfully
[root@server1 ansible]# ll roles/
total 0
drwxr-xr-x. 10 root root 154 Apr  9 14:30 apache
drwxr-xr-x. 10 root root 154 Apr  9 14:16 nginx
```
Write `roles/apache/tasks/main.yml`:

```yaml
[root@server1 ansible]# vim roles/apache/tasks/main.yml
---
# tasks file for roles/apache
- name: yum install lamp environment
  yum:
    name: httpd,php-fpm,php-mysql,mod_php
    state: present
- name: start httpd
  service:
    name: httpd
    state: restarted
    enabled: yes
- name: start php-fpm
  service:
    name: php-fpm
    state: restarted
    enabled: yes
- name: client install nfs utilities
  yum:
    name: nfs-utils
    state: present
- name: mount nfs resources
  mount:
    src: nas:/data
    path: /var/www/html
    fstype: nfs
    opts: defaults
    state: mounted
```
Building the load-balancer nodes

Load balancing in this architecture is done with nginx, so create the nginx role:

```shell
# Create the nginx role
[root@server1 ansible]# ansible-galaxy init roles/nginx
- Role roles/nginx was created successfully
```
Write a template that gives both front-end nginx nodes a load-balancing configuration, `lb.conf`. For session persistence we take the simplest route here: `ip_hash`-based balancing.

```nginx
[root@server1 ansible]# vim roles/nginx/templates/lb.conf.j2
upstream webservers {
    server apache1;
    server apache2;
    ip_hash;
}
server {
    location / {
        proxy_pass http://webservers;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_504;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
With the nginx role and template in place, write `roles/nginx/tasks/main.yml`:

```yaml
[root@server1 ansible]# vim roles/nginx/tasks/main.yml
---
# tasks file for roles/nginx
- name: yum install epel
  yum:
    name: epel-release.noarch
    state: present
- name: yum install nginx
  yum:
    name: nginx
    state: present
- name: config lb.conf
  template:
    src: lb.conf.j2
    dest: /etc/nginx/conf.d/lb.conf
- name: start nginx
  service:
    name: nginx
    state: restarted
    enabled: yes
```
Configuring keepalived high availability

The front-end load-balancer nodes need high availability, so create a keepalived role. VIP: 192.168.146.200.

```shell
[root@server1 ansible]# ansible-galaxy init roles/keepalived
- Role roles/keepalived was created successfully
```
Next, write the template that renders the keepalived configuration on the load-balancer nodes:

```
[root@server1 ansible]# vim roles/keepalived/templates/keepalived.conf.j2
! Configuration File for keepalived

global_defs {
    router_id {{ ansible_hostname }}
}

vrrp_script check_nginx {
    script "/usr/local/src/check_nginx_pid.sh"
    interval 1
    weight -10
}

vrrp_instance VI_1 {
    state {{ mb }}
    interface ens33
    virtual_router_id 10
    priority {{ priority }}
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        check_nginx
    }
    virtual_ipaddress {
        192.168.146.200
    }
}
```
Write the script that checks whether nginx is alive, so keepalived can fail the VIP over:

```shell
[root@manager ansible]# vim roles/keepalived/files/check_nginx_pid.sh
#!/bin/bash
nginx_process_num=`ps -C nginx --no-header | wc -l`
if [ $nginx_process_num -eq 0 ];then
    exit 1
else
    exit 0
fi
```
With the keepalived role and templates ready, write `roles/keepalived/tasks/main.yml`:

```yaml
[root@server1 ansible]# vim roles/keepalived/tasks/main.yml
---
# tasks file for roles/keepalived
- name: yum install keepalived
  yum:
    name: keepalived
    state: present
- name: copy check_nginx_pid.sh
  copy:
    src: check_nginx_pid.sh
    dest: /usr/local/src/check_nginx_pid.sh
- name: chmod sh
  file:
    path: /usr/local/src/check_nginx_pid.sh
    mode: '0755'
- name: config keepalived.conf
  template:
    src: keepalived.conf.j2
    dest: /etc/keepalived/keepalived.conf
- name: start keepalived
  service:
    name: keepalived
    state: restarted
    enabled: yes
```
Configuring Prometheus monitoring

To monitor the cluster we install a Prometheus server, which pulls and displays the data; node-exporter on each node, which reports CPU, memory, and machine up/down state; and alertmanager, which handles alerting.

The three roles created below cover those three jobs and complete the monitoring setup.
Preparing the binary packages

All the Prometheus components here are installed from binary tarballs. Downloading them from the official site is extremely slow, and extracting them with tar then hits archive errors; writing an Ansible task that has the managed nodes download and unpack the tarballs themselves is both very slow and fails on the same extraction errors (tested personally; if anyone knows a better way, please share!).

- Not having found a better solution, my workaround is to upload locally prepared tarballs of the Prometheus components to the Ansible management node, pre-extract each one into its role, e.g. into `roles/prometheus/files`, and then have the role's tasks push the extracted tree to the managed nodes with the copy module.
```shell
# Extract the tarball
[root@manager ansible]# tar -zxvf prometheus-2.25.0.linux-amd64.tar.gz -C roles/prometheus/files/
prometheus-2.25.0.linux-amd64/
prometheus-2.25.0.linux-amd64/consoles/
prometheus-2.25.0.linux-amd64/consoles/index.html.example
prometheus-2.25.0.linux-amd64/consoles/node-cpu.html
prometheus-2.25.0.linux-amd64/consoles/node-disk.html
prometheus-2.25.0.linux-amd64/consoles/node-overview.html
prometheus-2.25.0.linux-amd64/consoles/node.html
prometheus-2.25.0.linux-amd64/consoles/prometheus-overview.html
prometheus-2.25.0.linux-amd64/consoles/prometheus.html
prometheus-2.25.0.linux-amd64/console_libraries/
prometheus-2.25.0.linux-amd64/console_libraries/menu.lib
prometheus-2.25.0.linux-amd64/console_libraries/prom.lib
prometheus-2.25.0.linux-amd64/prometheus.yml
prometheus-2.25.0.linux-amd64/LICENSE
prometheus-2.25.0.linux-amd64/NOTICE
prometheus-2.25.0.linux-amd64/prometheus
prometheus-2.25.0.linux-amd64/promtool
# These errors can be ignored!!!
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
# Extraction done!
[root@manager ansible]# ll roles/prometheus/files/
total 55608
-rw-r--r--. 1 root root 1606 Apr 11 20:20 node.yml
drwxr-xr-x 4 3434 3434 132 Feb 18 2021 prometheus-2.25.0.linux-amd64
```
- The other components are extracted the same way:

```shell
tar -zxvf alertmanager-0.21.0.linux-amd64.tar.gz -C roles/alertmanager/files/
tar -zxvf node_exporter-1.3.1.linux-amd64.tar.gz -C roles/node-exporter/files/
```
Configuring the Prometheus server node

As usual, step one is creating the role:

```shell
ansible-galaxy init roles/prometheus
```
Write the template `prometheus.yml.j2` for the Prometheus configuration file:

```yaml
[root@manager ansible]# vim roles/prometheus/templates/prometheus.yml.j2
global:
  scrape_interval: 30s
  evaluation_interval: 30s
  query_log_file: ./promql.log

alerting:
  alertmanagers:
    - static_configs:
        - targets:
{% for alertmanager in groups['alertmanagers'] %}
          - {{ alertmanager }}:9093
{% endfor %}

rule_files:
  - "rules/*.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets:
{% for prometheu in groups['prometheus'] %}
        - "{{ prometheu }}:9090"
{% endfor %}
  - job_name: "node"
    static_configs:
      - targets:
{% for node in groups['node-exporter'] %}
        - "{{ node }}:9100"
{% endfor %}
```
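With the inventory from earlier, the template above renders to roughly the following `prometheus.yml` (shown here to illustrate what the Jinja2 loops produce; the names and IPs come straight from the inventory groups):

```yaml
global:
  scrape_interval: 30s
  evaluation_interval: 30s
  query_log_file: ./promql.log

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - prometheus:9093

rule_files:
  - "rules/*.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets:
        - "prometheus:9090"
  - job_name: "node"
    static_configs:
      - targets:
        - "192.168.146.100:9100"
        - "192.168.146.101:9100"
        # ... one entry per host in [node-exporter] ...
        - "192.168.146.108:9100"
```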
Write the template `prometheus.service.j2` that turns Prometheus into a systemd service:

```ini
[root@manager ansible]# vim roles/prometheus/templates/prometheus.service.j2
[Unit]
Description=Prometheus
After=network.target

[Service]
WorkingDirectory=/usr/local/prometheus
ExecStart=/usr/local/prometheus/prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStop=/bin/kill -KILL $MAINPID
Type=simple
KillMode=control-group
Restart=on-failure
RestartSec=3s

[Install]
WantedBy=multi-user.target
```
Write the alert rules file `node.yml`.

- Because node.yml itself uses `{{ ... }}` expressions that would collide with Jinja2 templating, we don't render it on the managed node with the template module; instead we write the file locally and push it as-is with the copy module.
```yaml
[root@manager ansible]# vim roles/prometheus/files/node.yml
groups:
- name: node.rules        # name of the alert rule group
  rules:
  # NOTE: the first rule's alert name and the disk rule's expr were lost
  # in the original post; typical values are shown here as stand-ins.
  - alert: node status
    expr: up == 0
    for: 30s              # fire after the target is unreachable for 30s
    annotations:
      summary: "Instance {{ $labels.instance }} down"   # custom summary
  - alert: node filesystem
    expr: (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: {{ $labels.mountpoint }} partition usage too high"
      description: "{{ $labels.instance }}: {{ $labels.mountpoint }} partition usage above 80% (current value: {{ $value }})"
  - alert: node Memory
    expr: 100 - (node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: memory usage too high"
      description: "{{ $labels.instance }}: memory usage above 80% (current value: {{ $value }})"
  - alert: node CPU
    expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100) > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: CPU usage too high"
      description: "{{ $labels.instance }}: CPU usage above 80% (current value: {{ $value }})"
```
Write `roles/prometheus/tasks/main.yml`:

```yaml
[root@manager ansible]# vim roles/prometheus/tasks/main.yml
---
# tasks file for roles/prometheus
- name: copy prometheus files
  copy:
    src: prometheus-2.25.0.linux-amd64
    dest: /usr/local/
- name: create soft link
  file:
    src: /usr/local/prometheus-2.25.0.linux-amd64
    dest: /usr/local/prometheus
    state: link
- name: chmod file
  file:
    path: /usr/local/prometheus/prometheus
    mode: '0755'
- name: copy service file
  template:
    src: prometheus.service.j2
    dest: /etc/systemd/system/prometheus.service
- name: copy prometheus config yaml
  template:
    src: prometheus.yml.j2
    dest: /usr/local/prometheus/prometheus.yml
- name: create rules dir
  file:
    path: /usr/local/prometheus/rules
    state: directory
- name: copy rules yaml
  copy:
    src: node.yml
    dest: /usr/local/prometheus/rules/node.yml
- name: start prometheus
  service:
    name: prometheus
    state: started
    enabled: yes
```
Configuring the node-exporter probes

Create the role:

```shell
ansible-galaxy init roles/node-exporter
```
Write the template `node_exporter.service.j2`:

```ini
[root@manager ansible]# vim roles/node-exporter/templates/node_exporter.service.j2
[Unit]
Description=Node Exporter
After=network.target

[Service]
WorkingDirectory=/prometheus_exporter/node_exporter/
ExecStart=/prometheus_exporter/node_exporter/node_exporter
ExecStop=/bin/kill -KILL $MAINPID
Type=simple
KillMode=control-group
Restart=on-failure
RestartSec=3s

[Install]
WantedBy=multi-user.target
```
Write `roles/node-exporter/tasks/main.yml`:

```yaml
[root@manager ansible]# vim roles/node-exporter/tasks/main.yml
---
# tasks file for roles/node-exporter
- name: create dir
  file:
    path: /prometheus_exporter
    state: directory
- name: copy file
  copy:
    src: node_exporter-1.3.1.linux-amd64
    dest: /prometheus_exporter
- name: create link
  file:
    src: /prometheus_exporter/node_exporter-1.3.1.linux-amd64
    dest: /prometheus_exporter/node_exporter
    state: link
- name: chmod file
  file:
    path: /prometheus_exporter/node_exporter/node_exporter
    mode: '0755'
- name: copy service file
  template:
    src: node_exporter.service.j2
    dest: /etc/systemd/system/node_exporter.service
- name: start node_exporter
  service:
    name: node_exporter
    state: restarted
    enabled: yes
```
Configuring the alertmanager component

Create the role:

```shell
ansible-galaxy init roles/alertmanager
```
Write the template `alertmanager.service.j2`:

```ini
[root@manager ansible]# vim roles/alertmanager/templates/alertmanager.service.j2
[Unit]
Description=AlertManager
After=network.target

[Service]
WorkingDirectory=/usr/local/alertmanager/
ExecStart=/usr/local/alertmanager/alertmanager
ExecReload=/bin/kill -HUP $MAINPID
ExecStop=/bin/kill -KILL $MAINPID
Type=simple
KillMode=control-group
Restart=on-failure
RestartSec=3s

[Install]
WantedBy=multi-user.target
```
Write the template `alertmanager.yml.j2`.

- The `smtp_auth_password` field is not your QQ-mail login password: in QQ mail, go to Settings -> Account, generate an authorization code there, and put that code in this field.

```yaml
[root@manager ansible]# vim roles/alertmanager/templates/alertmanager.yml.j2
global:
  resolve_timeout: 5m
  smtp_from: "2249807270@qq.com"
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: "2249807270@qq.com"
  smtp_auth_password: "vwsorrnckxwjdhgf"
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 24h
  receiver: 'default-receiver'

receivers:
- name: 'default-receiver'
  email_configs:
  - to: '2249807270@qq.com'

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
```
Write `roles/alertmanager/tasks/main.yml`:

```yaml
[root@manager ansible]# vim roles/alertmanager/tasks/main.yml
---
# tasks file for roles/alertmanager
- name: copy file
  copy:
    src: alertmanager-0.21.0.linux-amd64
    dest: /usr/local/
- name: create link
  file:
    src: /usr/local/alertmanager-0.21.0.linux-amd64
    dest: /usr/local/alertmanager
    state: link
- name: chmod file
  file:
    path: /usr/local/alertmanager/alertmanager
    mode: '0755'
- name: copy service file
  template:
    src: alertmanager.service.j2
    dest: /etc/systemd/system/alertmanager.service
- name: copy config yaml
  template:
    src: alertmanager.yml.j2
    dest: /usr/local/alertmanager/alertmanager.yml
- name: start server
  service:
    name: alertmanager
    state: restarted
    enabled: yes
```
The master playbook

At this point every service in the architecture has its role. The last step, and the easiest, is the master playbook, which simply calls each role in turn:

```yaml
[root@server1 ansible]# vim all.yml
- name: config mysql replication
  hosts: mysql
  roles:
    - mysql
- name: config nfs
  hosts: nfs
  roles:
    - nfs
- name: config rsync
  hosts: rsync
  roles:
    - rsync
- name: config lamp
  hosts: web
  roles:
    - apache
- name: config lb
  hosts: balancers
  roles:
    - nginx
    - keepalived
- name: install prometheus
  hosts: prometheus
  roles:
    - prometheus
    - alertmanager
- name: install node-exporter
  hosts: node-exporter
  roles:
    - node-exporter
```
One-shot install (running `ansible-playbook all.yml --syntax-check` first doesn't hurt):

```shell
ansible-playbook all.yml
```
Deploying the blog

The blog is Typecho. On the NFS server we just built, extract the Typecho tarball into the shared directory `/data/`:

```shell
[root@nas ~]# yum install -y wget
[root@nas ~]# wget http://typecho.org/downloads/1.1-17.10.30-release.tar.gz
[root@nas ~]# ll
total 11156
-rw-r--r--. 1 root root 487445 Oct 30 2017 1.1-17.10.30-release.tar.gz
[root@nas ~]# tar -zxvf 1.1-17.10.30-release.tar.gz
[root@nas ~]# ll
total 11156
-rw-r--r--. 1 root root 487445 Oct 30 2017 1.1-17.10.30-release.tar.gz
drwxr-xr-x. 6 501 games 111 Oct 30 2017 build
[root@nas ~]# mv build/ typecho
[root@nas ~]# cp -r typecho/ /data/
[root@nas ~]# ll /data/
total 0
drwxr-xr-x. 6 root root 111 Apr 9 20:42 typecho
```
On the MySQL master, create the database and account the blog will log in with, and grant the privileges:

```shell
[root@master ~]# mysql -uroot -p123456
mysql> create database bbs;
mysql> grant all on bbs.* to 'bbs'@'192.168.146.%' identified by '123456';
mysql> flush privileges;
```
Now open a browser at `192.168.146.200/typecho` (the VIP) and you land on the Typecho setup page.

Click Next and enter the database connection details.

Fill in the administrator account, start the installation, and follow the prompts step by step to finish.
Wrap-up

The architecture in this project is not small, and writing all the playbook and role YAML for a one-click Ansible deployment is a fair amount of work that is easy to get lost in; type carefully!

This is my first post and it took real effort to put together. Likes are appreciated, and corrections from the experts are very welcome!