[This article is an entry in the 炫“库”行动 prize essay contest held by Kingbase (人大金仓) on CSDN: https://marketing.csdn.net/p/98bd30353e7cb998b6070a89e8b91edb]

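Case description:
    KingbaseES V8R6 clusters are usually deployed quickly with the graphical deployment tool. On production servers, however, a graphical environment is often not installed or cannot be enabled, so the cluster has to be deployed manually from a text console. This document records the console deployment steps used in a production environment, together with points to watch during deployment and a failure case.

    1) This case was completed on general-purpose servers (通用机); for dedicated (专用机) environments it can serve as a reference.
    2) The KingbaseES V8R6 Cluster edition software package must be installed first.
    3) This case is mainly intended for systems that cannot provide graphical deployment, or for when graphical deployment fails.
    4) On general-purpose servers, almost all operations are performed as the kingbase user.
    5) Before deploying an R6 cluster with the one-click script, prepare the system environment: SSH trust relationships, firewall, SELinux configuration, process resource limits, user creation, IP allocation, and so on.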

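I.    System environment
1.1    Cluster architecture

Figure 1-1 KingbaseES V8R6 cluster architecture

As shown in Figure 1-1, KingbaseES V8R6 is built on the open-source repmgr architecture. The repmgr streaming replication management system provides two commands, repmgr and repmgrd. The repmgr command manages cluster nodes: registering primary and standby nodes (register), cloning a standby (clone), promoting a standby to primary (promote), pointing a standby at a new primary (follow), and switching the primary and standby roles online (switchover). The repmgrd command starts the repmgr daemon, which monitors the cluster nodes.

repmgr manages the cluster nodes as a distributed system. Each node has its own repmgr.conf configuration file, which records the node ID, node name, connection information, database data directory, and other parameters for that node. Once these parameters are configured, the repmgr command can deploy the cluster nodes in a "one-click" fashion.

To manage the replication cluster effectively, repmgr keeps the information about the cluster in a dedicated database (metadata tables).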
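
As a rough illustration of the per-node repmgr.conf settings described above, the sketch below prints a minimal file for one node. The key names (node_id, node_name, conninfo, data_directory) are those of upstream open-source repmgr; the file that the KingbaseES deployment script actually writes contains more parameters and may differ. The function name print_repmgr_conf and the address 192.0.2.11 are hypothetical.

```shell
#!/bin/sh
# Sketch only: print a minimal repmgr.conf for one node.
# Key names follow upstream repmgr; the KingbaseES-generated file may differ.
print_repmgr_conf() {
    node_id=$1; node_name=$2; host=$3; data_dir=$4
    printf "node_id=%s\n" "$node_id"
    printf "node_name='%s'\n" "$node_name"
    printf "conninfo='host=%s user=esrep dbname=esrep port=54321'\n" "$host"
    printf "data_directory='%s'\n" "$data_dir"
}

# Example: node 1 of the cluster (192.0.2.11 is a placeholder address).
print_repmgr_conf 1 node1 192.0.2.11 /home/kingbase/cluster/kingbase/data
```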

1.2    Database version

KingbaseES_V008R006C003B0062_Aarch64

1.3    CPU architecture (Kunpeng 920)

[root@ECOLABAPP37 ~]# lscpu
Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          192
On-line CPU(s) list:             0-191
Thread(s) per core:              1
Core(s) per socket:              48
Socket(s):                       4
NUMA node(s):                    8
Vendor ID:                       HiSilicon
Model:                           0
Model name:                      Kunpeng-920
Stepping:                        0x1
CPU max MHz:                     3000.0000
CPU min MHz:                     200.0000
......

1.4    System memory

[root@ECOLABAPP37 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:         522103       18400      501575          63        2127     501458
Swap:         65535           0       65535

1.5    Network interface

nm-bond: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 10.248.52.*  netmask 255.255.240.0  broadcast 10.248.63.255
        inet6 fe80::1728:3b0b:9694:6c2c  prefixlen 64  scopeid 0x20<link>
        ether 00:07:45:c2:d1:20  txqueuelen 1000  (Ethernet)
        RX packets 83667032  bytes 5305257118 (4.9 GiB)
        RX errors 0  dropped 16629  overruns 0  frame 0
        TX packets 513509  bytes 44561399 (42.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

1.6    Kernel information

[root@ECOLABAPP37 ~]# uname -a
Linux ECOLABAPP37 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

II.    Configure the system environment (all nodes)
2.1 Create the kingbase user

[root@ECOLABAPP37 ~]# id kingbase
uid=1002(kingbase) gid=1002(kingbase) groups=1002(kingbase)

2.2 Disable the host firewall

[root@ECOLABAPP37 Scripts]# systemctl stop firewalld
[root@ECOLABAPP37 Scripts]# systemctl disable firewalld
[root@ECOLABAPP38 ~]# systemctl status firewalld
 firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
……

2.3 Configure SELinux

[kingbase@node3 ~]$ cat /etc/sysconfig/selinux |grep -v  ^#|grep -v ^$
SELINUXTYPE=targeted 
SELINUX=disabled
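
The manual checks in this section can be collected into a small pre-flight script that is run on every node. The sketch below is one possible form, not part of the KingbaseES tooling; the function names selinux_mode and firewalld_stopped are hypothetical, and the demo works on a temporary file so that it runs on any machine.

```shell
#!/bin/sh
# Sketch only: report the SELINUX= mode configured in a given file.
selinux_mode() {
    # Commented lines and SELINUXTYPE= do not match; print the last SELINUX= value.
    sed -n 's/^SELINUX=\([A-Za-z]*\).*$/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: succeeds only when firewalld is not active.
# If systemctl is absent, firewalld is assumed not to be running.
firewalld_stopped() {
    command -v systemctl >/dev/null 2>&1 || return 0
    ! systemctl is-active --quiet firewalld
}

# Demo against a temporary file with the same content as shown above.
tmp=$(mktemp)
printf 'SELINUXTYPE=targeted\nSELINUX=disabled\n' > "$tmp"
echo "selinux mode: $(selinux_mode "$tmp")"
rm -f "$tmp"
```

On a real node the file to inspect would be /etc/sysconfig/selinux, as in the command above.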

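III.    Building the cluster with the script
3.1  Prepare the deployment environment

=== The cluster deployment scripts can be found under the cluster software installation directory once the cluster package has been installed ===

As the kingbase user, create a folder in the home directory:
[kingbase@ECOLABAPP37 ~]$ mkdir R6_install
Place the deployment scripts, the configuration file, and the database license.dat file in this directory.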
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase  32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase  31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx

1) View and edit the cluster configuration file (adjust it to your environment)

[kingbase@node3 ~]$ cat install.conf |grep -v ^#|grep -v ^$
on_bmj=0
all_ip=(10.248.52.165 10.248.52.166)
install_dir="/home/kingbase/cluster"
zip_package="/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip"
license_file=(license.dat)
db_user="system"                 # the user name of database
db_password="123456"             # the password of database
db_port="54321"                  # the port of database, defaults is 54321
db_mode="oracle"                 # database mode: pg, oracle
db_auth="scram-sha-256"          # database authority: scram-sha-256, md5, default is scram-sha-256
trusted_servers="10.248.48.1"
virtual_ip="10.248.52.174/20"
net_device=(nm-bond nm-bond)
ipaddr_path="/sbin"
arping_path="/usr/sbin"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
reconnect_attempts="6"           # the number of retries in the event of an error
reconnect_interval="10"          # retry interval
recovery="manual"                # the way of cluster recovery: automatic/manual
ssh_port="22"                    # the port of ssh, default is 22
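
The deployment log later reports a check that "the number of net_device matches the length of all_ip or the number of net_device is 1". A minimal sketch of the same rule, which can be run against install.conf before deployment, is shown here. It is not taken from V8R6_cluster_install.sh; the function names count_items and check_net_device are hypothetical, and the demo uses placeholder addresses.

```shell
#!/bin/sh
# Sketch only: count the entries of an array assignment such as
#   all_ip=(10.0.0.1 10.0.0.2)
# in a config file, without sourcing the file.
count_items() {
    key=$1; file=$2
    items=$(sed -n "s/^${key}=(\([^)]*\)).*\$/\1/p" "$file" | tail -n 1)
    # Unquoted on purpose: word splitting yields one argument per entry.
    set -- $items
    echo $#
}

# Succeeds when net_device has one entry or one entry per node.
check_net_device() {
    n_ip=$(count_items all_ip "$1")
    n_dev=$(count_items net_device "$1")
    [ "$n_dev" -eq 1 ] || [ "$n_dev" -eq "$n_ip" ]
}

# Demo with placeholder addresses (192.0.2.0/24 is a documentation range).
conf=$(mktemp)
printf 'all_ip=(192.0.2.11 192.0.2.12)\nnet_device=(nm-bond nm-bond)\n' > "$conf"
if check_net_device "$conf"; then
    echo "net_device count OK"
else
    echo "net_device count mismatch"
fi
rm -f "$conf"
```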

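2) Configure SSH trust between the hosts (either manually or with the script below; manual configuration is recommended)
Note:
Trust must be configured between the kingbase users, between the root users, and between kingbase and root across the nodes. After configuration, verify the trust relationships.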
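
One way to verify the trust relationships is to attempt a non-interactive login for every user and host combination. The sketch below assumes the standard OpenSSH client options BatchMode and ConnectTimeout; the function name check_trust and the SSH_CMD override (useful for a dry run) are hypothetical and not part of trust_cluster.sh.

```shell
#!/bin/sh
# Sketch only: verify password-free SSH for every user/host combination.
# SSH_CMD can be overridden (for example with a stub) for a dry run.
SSH_CMD=${SSH_CMD:-ssh}

check_trust() {
    # $1: space-separated users, $2: space-separated hosts, $3: ssh port
    failed=0
    for user in $1; do
        for host in $2; do
            # BatchMode=yes makes ssh fail instead of prompting for a password.
            if $SSH_CMD -p "$3" -o BatchMode=yes -o ConnectTimeout=5 \
                    "$user@$host" true </dev/null >/dev/null 2>&1; then
                echo "OK   $user@$host"
            else
                echo "FAIL $user@$host"
                failed=1
            fi
        done
    done
    return "$failed"
}

# Run as kingbase and again as root on every node, for example:
#   check_trust "kingbase root" "10.248.52.165 10.248.52.166" 22
```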

View the script (excerpt):
[kingbase@ECOLABAPP37 R6_install]$ cat trust_cluster.sh
#!/bin/bash
# you should change two parameters: general_user and all_ip
# general_user is the general user which you want to config SSH password free
# all_ip is the devices that you want to config SSH password free
shell_folder=$(dirname $(readlink -f "$0"))
install_conf="${shell_folder}/install.conf"
primary_host=""
curren_user=`whoami`
......
for ips in ${all_ip[@]}
do
    ssh -p ${ssh_port} root@$ips "cp -r /root/.ssh /home/$general_user/"
    ssh -p ${ssh_port} root@$ips "chmod 700 /home/$general_user/.ssh/"
    ssh -p ${ssh_port} root@$ips "chown -R $general_user:$general_user /home/$general_user/.ssh/"
done

3) View the cluster deployment script (excerpt)

[kingbase@ECOLABAPP37 R6_install]$ cat V8R6_cluster_install.sh
#!/bin/bash
shell_folder=$(dirname $(readlink -f "$0"))
install_conf=""
#all_ip=(192.168.28.10 192.168.28.11)
all_ip=()
#install_dir="/home/kingbase/tmp_kingbase"
install_dir=""
#zip_package="${shell_folder}/db.zip"
zip_package=""
#license_path="${shell_folder}"
license_path="${shell_folder}"
#BMJ Kingbase install path
soft_dir="/opt/Kingbase/ES/V8/Server"
......
    # start up the cluster
    echo "[INSTALL] start up the whole cluster ..."
    execute_command ${execute_user} ${all_ip[-2]} "${sys_bindir}/sys_monitor.sh start"
    [ $? -ne 0 ] && exit 1
    echo "[INSTALL] start up the whole cluster ... OK"
}
main
exit 0

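3.2 Run the deployment script
Note:
The license.dat file must also be placed in the current directory; the deployment fails without it.
Current contents of the manual deployment directory: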

[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase  32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase  31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx

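Run the deployment script:
Use the log output to diagnose any failure during deployment. Reading the complete log, alongside the graphical deployment tool, also gives a deeper understanding of how a repmgr cluster deployment works.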

[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh

=== Read install.conf, obtain the cluster node configuration, and create the cluster data directory environment ===
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/home/kingbase/cluster/kingbase/bin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*

=== Initialize the database on the primary node and start the database service ===
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok

Success. You can now start the database server using:
    /home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start

[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
[INSTALL] start up the database on "10.248.52.*" ... OK

=== Create the repmgr metadata database ===
[INSTALL] create the database "esrep" and user "esrep" for repmgr ...
CREATE DATABASE
CREATE ROLE
[INSTALL] create the database "esrep" and user "esrep" for repmgr ... OK
[INSTALL] register the primary on "10.248.52.*" ...
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed

=== Register the primary, then clone and register the standby ===
NOTICE: PING 10.248.52.* (10.248.52.*) 56(84) bytes of data.

--- 10.248.52.* ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1005ms

WARNING: ping host"10.248.52.*" failed
DETAIL: average RTT value is not greater than zero
INFO: loadvip result: 1, arping result: 1
NOTICE: node (ID: 1) acquire the virtual ip 10.248.52.* success
NOTICE: primary node record (ID: 1) registered
[INSTALL] register the primary on "10.248.52.*" ... OK
[INSTALL] clone and start up the standby ...
clone the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/repmgr -h 10.248.52.* -U esrep -d esrep -p 54321 standby clone
NOTICE: destination directory "/home/kingbase/cluster/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=10.248.52.* user=esrep port=54321 dbname=esrep
DETAIL: current installation size is 64 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/home/kingbase/cluster/kingbase/data"...
NOTICE: starting backup (using sys_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /home/kingbase/cluster/kingbase/bin/sys_basebackup -l "repmgr base backup"  -D /home/kingbase/cluster/kingbase/data -h 10.248.52.* -p 54321 -U esrep -X stream -S repmgr_slot_2 
NOTICE: standby clone (using sys_basebackup) complete
NOTICE: you can now start your Kingbase server
HINT: for example: sys_ctl -D /home/kingbase/cluster/kingbase/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
clone the standby on "10.248.52.*" ... OK
start up the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
start up the standby on "10.248.52.*" ... OK
register the standby on "10.248.52.*" ...
INFO: connecting to local node "node2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
INFO: standby registration complete
NOTICE: standby node "node2" (ID: 2) successfully registered
[INSTALL] register the standby on "10.248.52.*" ... OK

=== Start the cluster ===
[INSTALL] start up the whole cluster ...
2021-04-19 17:31:58 Ready to start all DB ...
2021-04-19 17:31:58 begin to start DB on "[10.248.52.*]".
2021-04-19 17:31:59 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:00 DB on "[10.248.52.*]" start success.
2021-04-19 17:32:00 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:02 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:05 begin to start DB on "[10.248.52.*]".
2021-04-19 17:32:05 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:06 DB on "[10.248.52.*]" start success.
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2021-04-19 17:32:06 The primary DB is started.
2021-04-19 17:32:12 Success to load virtual ip [10.248.52.*] on primary host [10.248.52.*].
2021-04-19 17:32:12 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:14 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 17:32:17] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 17:32:17] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"

2021-04-19 17:32:17 repmgrd on "[10.248.52.*]" start success.
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 15:15:45] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 15:15:45] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"

=== Check the cluster node status; cluster deployment is complete ===

2021-04-19 17:32:18 repmgrd on "[10.248.52.*]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 62956 | no      | n/a                
 2  | node2 | standby |   running | node1    | running | 25769 | no      | 0 second(s) ago    
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
2021-04-19 17:32:22 Done.
[INSTALL] start up the whole cluster ... OK
=== The output above shows that the manual cluster deployment succeeded! ===

IV. Checking the cluster status after deployment
4.1 Database service status (primary)

[kingbase@ECOLABAPP37 ~]$ ps -ef |grep kingbase
kingbase   62335       1  0 17:31 ?        00:00:00 /home/kingbase/cluster/kingbase/bin/kingbase -D /home/kingbase/cluster/kingbase/data
kingbase   62336   62335  0 17:31 ?        00:00:00 kingbase: logger   
kingbase   62338   62335  0 17:31 ?        00:00:00 kingbase: checkpointer   
kingbase   62339   62335  0 17:31 ?        00:00:00 kingbase: background writer   
kingbase   62340   62335  0 17:31 ?        00:00:00 kingbase: walwriter   
kingbase   62341   62335  0 17:31 ?        00:00:00 kingbase: autovacuum launcher   
kingbase   62342   62335  0 17:31 ?        00:00:00 kingbase: archiver   last was 000000010000000000000002.00000028.backup
kingbase   62343   62335  0 17:31 ?        00:00:00 kingbase: stats collector   
kingbase   62344   62335  0 17:31 ?        00:00:00 kingbase: ksh writer   
kingbase   62345   62335  0 17:31 ?        00:00:00 kingbase: ksh collector   
kingbase   62346   62335  0 17:31 ?        00:00:00 kingbase: sys_kwr collector   
kingbase   62347   62335  0 17:31 ?        00:00:00 kingbase: logical replication launcher   
kingbase   62426   62335  0 17:31 ?        00:00:00 kingbase: walsender esrep 10.248.52.*(52926) streaming 0/300B810
kingbase   62954   62335  0 17:32 ?        00:00:00 kingbase: esrep esrep 10.248.52.*(47290) idle
kingbase   62956       1  0 17:32 ?        00:00:00 /home/kingbase/cluster/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase   62966   62335  0 17:32 ?        00:00:00 kingbase: esrep esrep 10.248.52.*(52934) idle
kingbase   63178       1  0 17:32 ?        00:00:00 /home/kingbase/cluster/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase   63822   63178  0 17:35 ?        00:00:00 ping -q -c3 -w2 10.248.48.*

4.2 Primary/standby streaming replication status

[kingbase@ECOLABAPP37 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time           
-------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 62426 |    16385 | esrep   | node2            | 10.248.52.* |                 |       52926 | 2021-04-19 17:31:57.986053+08 |              | streaming | 0/300B810 | 0/300B810 | 0/300B810 | 0/300B810  |           |           |            |             1 | quorum     | 2021-04-19 15:19:35.941223+08
(1 row)
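
To turn the LSN columns into a replication lag in bytes, the two hexadecimal halves of each LSN can be combined into one number and subtracted. The sketch below assumes that KingbaseES uses the PostgreSQL LSN text format (high 32 bits, a slash, low 32 bits, both in hex); the function names lsn_to_bytes and lsn_lag are hypothetical.

```shell
#!/bin/sh
# Sketch only: convert an LSN such as 0/300B810 to a byte position and
# compute the difference between two LSNs.
lsn_to_bytes() {
    hi=${1%%/*}   # part before the slash (high 32 bits, hex)
    lo=${1##*/}   # part after the slash (low 32 bits, hex)
    echo $(( (0x$hi << 32) + 0x$lo ))
}

lsn_lag() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# sent_lsn and replay_lsn in the query result above are equal, so the lag is 0.
lsn_lag 0/300B810 0/300B810
```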

4.3 Cluster node status

[kingbase@ECOLABAPP37 ~]$ repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+-------+---------+-----------+----------+----------+----------+----------+--------
 1  | node1 | primary | * running |          | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
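
For routine monitoring, the same table can be checked by a script instead of by eye. The sketch below reads the "repmgr cluster show" table from standard input and succeeds only when there is exactly one primary and every node reports "running". It relies on the column layout shown above (ID, Name, Role, Status); the function name cluster_healthy is hypothetical.

```shell
#!/bin/sh
# Sketch only: read "repmgr cluster show" output on stdin and succeed when
# there is exactly one primary and every node reports "running".
cluster_healthy() {
    awk -F'|' '
        # Node rows start with a numeric ID in the first column.
        $1 ~ /^ *[0-9]+ *$/ {
            nodes++
            if ($3 ~ /primary/) primaries++
            if ($4 !~ /running/) down++
        }
        END { exit !(nodes > 0 && primaries == 1 && down == 0) }
    '
}

# On a cluster node, for example:
#   repmgr cluster show | cluster_healthy && echo "cluster healthy"
```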

4.4 Test primary/standby replication

DDL and DML on the primary:
test=# create database prod;
CREATE DATABASE
test=# \c prod;
You are now connected to database "prod" as user "system".

prod=# create table t1 (id int);
CREATE TABLE
prod=# insert into t1 values (10),(20),(30);
INSERT 0 3
prod=# select * from t1;
 id 
----
 10
 20
 30
(3 rows)

Check the replicated data on the standby:
[kingbase@ECOLABAPP38 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.

test=# \c prod
You are now connected to database "prod" as user "system".
prod=# select * from t1;
 id 
----
 10
 20
 30
(3 rows)

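V. Deployment failure case

Symptom:
       The license.dat file had not been placed in the directory containing the deployment script. Running the script failed because license.dat could not be read. After license.dat was copied into that directory, the deployment succeeded.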

[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase  32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase  31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh 
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/usr/sbin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
    /home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start

[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... stopped waiting
sys_ctl: could not start server
Examine the log output.

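=== Note: the failure above occurred when starting the database service. Running the start command by hand (shown below) and checking the log showed that the license file could not be read, which prevented the database from starting. license.dat must therefore be placed in the same directory as the deployment script; here it was missing, so the database could not read a license at startup. ===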

When troubleshooting, run the following command manually and then check the log:

/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
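
After a failed start, the log file named by the -l option can be searched for license-related messages. The sketch below is only a convenience wrapper around grep; the exact wording of the KingbaseES error is not shown in this article, so the case-insensitive pattern "license" is an assumption, and the function name license_errors is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the last lines of a server log that mention the license.
# The pattern "license" is an assumption about the error wording.
license_errors() {
    # $1: path to the log file written by sys_ctl -l
    grep -i 'license' "$1" | tail -n 20
}

# After a failed start, for example:
#   license_errors /home/kingbase/cluster/kingbase/logfile
```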
