Server description: this setup is for testing, so all three Redis instances (one master, two slaves) run on a single server.

| Role | Port | Redis config file | Sentinel config file | Sentinel port | Redis log path | Sentinel log path |
|---|---|---|---|---|---|---|
| master | 6379 | redis.conf | sentinel.conf | 26379 | /home/zhangxs/data/redislog/redis_server/master.log | /home/zhangxs/data/redislog/sentinel/sentinel6379.log |
| slave | 6380 | redis_slave6380.conf | sentinel6380.conf | 26380 | /home/zhangxs/data/redislog/redis_server/slave6380.log | /home/zhangxs/data/redislog/sentinel/sentinel6380.log |
| slave | 6381 | redis_slave6381.conf | sentinel6381.conf | 26381 | /home/zhangxs/data/redislog/redis_server/slave6381.log | /home/zhangxs/data/redislog/sentinel/sentinel6381.log |

Modify the configuration files

  • 1: redis.conf

Set the Redis log path: logfile "/home/zhangxs/data/redislog/redis_server/master.log"

Nothing else was changed; the defaults are used.

  • 2: redis_slave6380.conf and redis_slave6381.conf (copied from redis_slave.conf)
  1. Point both at the master's IP and port: slaveof 127.0.0.1 6379 (set in both files)
  2. Change the port in each file: 6380 in redis_slave6380.conf and 6381 in redis_slave6381.conf (26380 and 26381 are the sentinel ports, not the Redis ports)
  3. Set the slave log paths: logfile /home/zhangxs/data/redislog/redis_server/slave6380.log and logfile /home/zhangxs/data/redislog/redis_server/slave6381.log
  • 3: sentinel.conf
  1. Set the log path: logfile "/home/zhangxs/data/redislog/sentinel/sentinel6379.log"
  • 4: sentinel6380.conf and sentinel6381.conf (copied from sentinel.conf)
  1. Change the port: 26380 in sentinel6380.conf and 26381 in sentinel6381.conf
  2. Set the log paths: logfile "/home/zhangxs/data/redislog/sentinel/sentinel6380.log" and logfile "/home/zhangxs/data/redislog/sentinel/sentinel6381.log"
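The per-instance changes to the Redis config files come down to three directives: port, logfile and, for slaves, slaveof. As a rough sketch (Python, standard library only; the helper name `render_overrides` is made up for illustration and is not part of Redis), the overrides could be generated like this:

```python
# Log directory used in this post.
LOG_DIR = "/home/zhangxs/data/redislog/redis_server"


def render_overrides(port, logname, master_port=None):
    """Return the redis.conf lines that differ from the defaults for one instance.

    render_overrides is a hypothetical helper, not a Redis tool.
    """
    lines = [f"port {port}", f'logfile "{LOG_DIR}/{logname}"']
    if master_port is not None:
        # Slaves point at the master; Redis 4.x uses the slaveof directive.
        lines.append(f"slaveof 127.0.0.1 {master_port}")
    return "\n".join(lines)
```

Calling `render_overrides(6380, "slave6380.log", master_port=6379)` yields the three lines for the first slave; leaving out `master_port` yields the master's two lines.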

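With the configuration in place, start the Redis services.

  • 1: Start the master
src/redis-server redis.conf &

  • 2: Start the first slave (slave6380)
src/redis-server redis_slave6380.conf &

Check the master log; a new section has appeared: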

2755:M 29 Jul 00:13:25.469 * Starting BGSAVE for SYNC with target: disk
2755:M 29 Jul 00:13:25.502 * Background saving started by pid 2784
2784:C 29 Jul 00:13:25.608 * DB saved on disk
2784:C 29 Jul 00:13:25.608 * RDB: 6 MB of memory used by copy-on-write
2755:M 29 Jul 00:13:25.677 * Background saving terminated with success
2755:M 29 Jul 00:13:25.677 * Synchronization with slave 127.0.0.1:6380 succeeded   // sync to slave 127.0.0.1:6380 succeeded

Check the slave log, slave6380.log:

2780:S 29 Jul 00:13:25.436 * DB loaded from disk: 0.007 seconds
2780:S 29 Jul 00:13:25.436 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
2780:S 29 Jul 00:13:25.436 * Ready to accept connections
2780:S 29 Jul 00:13:25.436 * Connecting to MASTER 127.0.0.1:6379 // connecting to the master
2780:S 29 Jul 00:13:25.436 * MASTER <-> SLAVE sync started // master-to-slave sync started
2780:S 29 Jul 00:13:25.436 * Non blocking connect for SYNC fired the event.
2780:S 29 Jul 00:13:25.436 * Master replied to PING, replication can continue...
2780:S 29 Jul 00:13:25.436 * Trying a partial resynchronization (request a26696cff5d35f38896c4eb068f71adbb7cfc421:474104).
2780:S 29 Jul 00:13:25.577 * Full resync from master: 541cd938f43b4f144e647881af409fa1884ea5a4:0 // full resynchronization from the master
2780:S 29 Jul 00:13:25.577 * Discarding previously cached master state. // the previously cached master state is discarded
2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: receiving 250 bytes from master // the slave receives 250 bytes from the master
2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: Flushing old data
2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: Loading DB in memory
2780:S 29 Jul 00:13:25.678 * MASTER <-> SLAVE sync: Finished with success
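Every log line in this post follows the same layout: pid, a role letter (M master, S slave, C child process, X sentinel), a timestamp, a level marker and the message. A minimal sketch of splitting such a line, assuming the Redis 4.x format shown here (the function name `parse_log_line` is my own):

```python
import re

# pid:role  day month  HH:MM:SS.mmm  level  message
LOG_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[XCSM]) "
    r"(?P<time>\d{1,2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<msg>.*)$"
)
ROLES = {"X": "sentinel", "C": "child", "S": "slave", "M": "master"}


def parse_log_line(line):
    """Split one Redis log line into its fields; return None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["role"] = ROLES[record["role"]]
    return record
```

Any trailing // annotation added in this post ends up inside the `msg` field, so strip it first if you need the raw message.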

  • 3: Start the second slave (slave6381)
src/redis-server redis_slave6381.conf &

Check the master log; a new section has appeared:

// slave 127.0.0.1:6381 sends a synchronization request
2755:M 29 Jul 00:26:46.531 * Slave 127.0.0.1:6381 asks for synchronization
// the partial resynchronization request from 127.0.0.1:6381 is accepted; 1106 bytes of backlog are sent starting from offset 1
2755:M 29 Jul 00:26:46.531 * Partial resynchronization request from 127.0.0.1:6381 accepted. Sending 1106 bytes of backlog starting from offset 1.

Check the slave log, slave6381.log:

2809:S 29 Jul 00:26:46.531 * DB loaded from disk: 0.000 seconds
2809:S 29 Jul 00:26:46.531 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
2809:S 29 Jul 00:26:46.531 * Ready to accept connections
2809:S 29 Jul 00:26:46.531 * Connecting to MASTER 127.0.0.1:6379
2809:S 29 Jul 00:26:46.531 * MASTER <-> SLAVE sync started
2809:S 29 Jul 00:26:46.531 * Non blocking connect for SYNC fired the event.
2809:S 29 Jul 00:26:46.531 * Master replied to PING, replication can continue...
2809:S 29 Jul 00:26:46.531 * Trying a partial resynchronization (request 541cd938f43b4f144e647881af409fa1884ea5a4:1).
2809:S 29 Jul 00:26:46.531 * Successful partial resynchronization with master.
2809:S 29 Jul 00:26:46.532 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization

  • 4: Test data synchronization (connect to the servers with redis-cli)

1: Connect to the master

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6379
127.0.0.1:6379> set name fj

2: Connect to slave6380

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6380
127.0.0.1:6380> get name
"fj"
127.0.0.1:6380>

3: Connect to slave6381

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6381
127.0.0.1:6381> get name
"fj"
127.0.0.1:6381>

OK, master-slave synchronization works.

By default a slave does not accept writes. Let's test that:

127.0.0.1:6380> set name hello
(error) READONLY You can't write against a read only slave.
127.0.0.1:6380>

127.0.0.1:6381> set name hello
(error) READONLY You can't write against a read only slave.
127.0.0.1:6381>

Start a sentinel for each service

  • Start sentinel6379
src/redis-sentinel sentinel.conf &

  • Check sentinel6379.log
2908:X 29 Jul 01:01:32.838 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
2908:X 29 Jul 01:01:32.839 # Redis version=4.0.10, bits=64, commit=00000000, modified=0, pid=2908, just started
2908:X 29 Jul 01:01:32.839 # Configuration loaded
2908:X 29 Jul 01:01:32.839 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2908:X 29 Jul 01:01:32.840 * Running mode=sentinel, port=26379.
2908:X 29 Jul 01:01:32.840 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2908:X 29 Jul 01:01:32.855 # Sentinel ID is 1a77392638e41bb0ea0a865ffc93b8de6335227f
2908:X 29 Jul 01:01:32.855 # +monitor master mymaster 127.0.0.1 6379 quorum 2
// a new slave has been detected by the sentinel and attached
2908:X 29 Jul 01:01:32.856 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
2908:X 29 Jul 01:01:32.858 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

  • Start sentinel6380
src/redis-sentinel sentinel6380.conf &

  • Check sentinel6380.log
2937:X 29 Jul 01:08:14.325 # Redis version=4.0.10, bits=64, commit=00000000, modified=0, pid=2937, just started
2937:X 29 Jul 01:08:14.325 # Configuration loaded
2937:X 29 Jul 01:08:14.327 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2937:X 29 Jul 01:08:14.377 * Running mode=sentinel, port=26380.
2937:X 29 Jul 01:08:14.377 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2937:X 29 Jul 01:08:14.379 # Sentinel ID is 4a6aebffdd1301bf054e722c34e8a6611418ba8a
2937:X 29 Jul 01:08:14.379 # +monitor master mymaster 127.0.0.1 6379 quorum 2
2937:X 29 Jul 01:08:14.380 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
2937:X 29 Jul 01:08:14.381 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
// a new sentinel has been detected and attached
2937:X 29 Jul 01:08:14.919 * +sentinel sentinel 1a77392638e41bb0ea0a865ffc93b8de6335227f 127.0.0.1 26379 @ mymaster 127.0.0.1 6379

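  • After sentinel6380 starts, a new entry appears in sentinel6379.log
// a new sentinel has been detected and attached
2908:X 29 Jul 01:08:16.367 * +sentinel sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

A newly started sentinel automatically discovers the other sentinels that monitor the same master through Redis publish/subscribe. This works by publishing messages to the channel __sentinel__:hello.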
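To make the discovery mechanism more concrete, here is a small sketch that parses one hello payload. The field layout (eight comma-separated values: the sentinel's ip, port, run ID and current epoch, then the master's name, ip, port and config epoch) is my reading of the Redis Sentinel source and should be treated as an assumption; the sample payload in the usage note is constructed from the IDs in this post, not captured from the channel.

```python
from collections import namedtuple

# Assumed field order of a message on the __sentinel__:hello channel.
Hello = namedtuple(
    "Hello",
    "sentinel_ip sentinel_port sentinel_runid current_epoch "
    "master_name master_ip master_port master_config_epoch",
)


def parse_hello(payload):
    """Parse one hello payload into a Hello tuple (hypothetical helper)."""
    fields = payload.split(",")
    if len(fields) != 8:
        raise ValueError("expected 8 comma-separated fields")
    ip, port, runid, epoch, name, master_ip, master_port, master_epoch = fields
    return Hello(ip, int(port), runid, int(epoch),
                 name, master_ip, int(master_port), int(master_epoch))
```

For example, `parse_hello("127.0.0.1,26380,4a6aebffdd1301bf054e722c34e8a6611418ba8a,0,mymaster,127.0.0.1,6379,0")` describes sentinel6380 announcing itself for mymaster, which is the information the other sentinels need to log a +sentinel event.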

  • Start sentinel6381
[root@vm1 redis-4.0.10]# src/redis-sentinel sentinel6381.conf &

Check sentinel6381.log:

2961:X 29 Jul 01:11:09.823 # Configuration loaded
2961:X 29 Jul 01:11:09.823 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2961:X 29 Jul 01:11:09.852 * Running mode=sentinel, port=26381.
2961:X 29 Jul 01:11:09.852 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
2961:X 29 Jul 01:11:09.853 # Sentinel ID is 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030
2961:X 29 Jul 01:11:09.853 # +monitor master mymaster 127.0.0.1 6379 quorum 2
2961:X 29 Jul 01:11:09.853 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
2961:X 29 Jul 01:11:09.855 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
2961:X 29 Jul 01:11:10.334 * +sentinel sentinel 1a77392638e41bb0ea0a865ffc93b8de6335227f 127.0.0.1 26379 @ mymaster 127.0.0.1 6379
2961:X 29 Jul 01:11:11.446 * +sentinel sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

  • When sentinel6381 joins, sentinel6379 and sentinel6380 are both notified (a new sentinel has been detected and attached)

2908:X 29 Jul 01:11:11.880 * +sentinel sentinel 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 127.0.0.1 26381 @ mymaster 127.0.0.1 6379

2937:X 29 Jul 01:11:11.878 * +sentinel sentinel 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 127.0.0.1 26381 @ mymaster 127.0.0.1 6379

Each sentinel records its state in its own sentinel.conf so that the state can be restored after a restart. Here are the parts of the files that changed.

  • sentinel.conf

Before startup: none     After startup: sentinel myid 1a77392638e41bb0ea0a865ffc93b8de6335227f // this sentinel's own ID

Before startup: none     After startup:

# Generated by CONFIG REWRITE
# the two slaves of the master
sentinel known-slave mymaster 127.0.0.1 6380
sentinel known-slave mymaster 127.0.0.1 6381
# the other two sentinels monitoring the master
sentinel known-sentinel mymaster 127.0.0.1 26380 4a6aebffdd1301bf054e722c34e8a6611418ba8a
sentinel known-sentinel mymaster 127.0.0.1 26381 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030
sentinel current-epoch 0

The changes in sentinel6380.conf and sentinel6381.conf are essentially the same as in sentinel.conf. Only the sentinel's own myid and the list of the other two sentinels differ.
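If you want to inspect this persisted state programmatically, the rewritten lines are easy to read back. The sketch below is only an illustration: `parse_sentinel_state` is a made-up helper, and it handles just the myid, known-slave and known-sentinel lines in the Redis 4.x form shown in this post.

```python
def parse_sentinel_state(text):
    """Collect myid, known slaves and known sentinels from sentinel.conf text."""
    state = {"myid": None, "known_slaves": [], "known_sentinels": []}
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) < 3 or parts[0] != "sentinel":
            continue  # skips blank lines, comments and unrelated directives
        if parts[1] == "myid":
            state["myid"] = parts[2]
        elif parts[1] == "known-slave" and len(parts) == 5:
            # sentinel known-slave <master-name> <ip> <port>
            state["known_slaves"].append((parts[3], int(parts[4])))
        elif parts[1] == "known-sentinel" and len(parts) == 6:
            # sentinel known-sentinel <master-name> <ip> <port> <run-id>
            state["known_sentinels"].append((parts[3], int(parts[4]), parts[5]))
    return state
```

Run against the rewritten sentinel.conf, this should list the slaves on 6380 and 6381 and the sentinels on 26380 and 26381.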

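Testing failover

For Sentinel failover I used the default configuration (no further configuration is needed, though the values can be customized):

// at least 2 sentinels must agree that the master has failed before it is marked objectively down and a failover can start
sentinel monitor mymaster 127.0.0.1 6379 2
// if a sentinel receives no valid reply from the master within this time (60000 ms = 60 seconds as written here), it considers the master subjectively down
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
// during a failover, only one slave at a time may resynchronize with the new master
sentinel parallel-syncs mymaster 1
sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5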
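The quorum in sentinel monitor only decides when the master is flagged as objectively down. As far as I understand the Sentinel documentation, the failover itself additionally needs a leader elected by a majority of all known sentinels. A minimal sketch of both checks (the function names are hypothetical):

```python
def failure_detected(agreeing_sentinels, quorum):
    """The master becomes objectively down once `quorum` sentinels agree."""
    return agreeing_sentinels >= quorum


def votes_needed_for_leader(total_sentinels, quorum):
    """Votes a sentinel needs to lead the failover.

    Assumption: a majority of all sentinels is required, or the quorum
    if the quorum is configured higher than the majority.
    """
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)
```

With the three sentinels and quorum 2 used in this post, two agreeing sentinels are enough both to flag the master as down and to elect a leader.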

  • 1: List the Redis-related processes (for example with ps -ef | grep redis)
root       2755   2551  0 00:11 pts/2    00:00:06 src/redis-server 127.0.0.1:6379
root       2780   2551  0 00:13 pts/2    00:00:06 src/redis-server 127.0.0.1:6380
root       2809   2551  0 00:26 pts/2    00:00:05 src/redis-server 127.0.0.1:6381
root       2816   2529  0 00:30 pts/1    00:00:00 redis-cli -h 127.0.0.1 -p 6379
root       2822   2530  0 00:33 pts/0    00:00:00 redis-cli -h 127.0.0.1 -p 6380
root       2841   2823  0 00:34 pts/6    00:00:00 redis-cli -h 127.0.0.1 -p 6381
root       2908   2551  0 01:01 pts/2    00:00:07 src/redis-sentinel *:26379 [sentinel]
root       2937   2551  0 01:08 pts/2    00:00:06 src/redis-sentinel *:26380 [sentinel]
root       2961   2551  0 01:11 pts/2    00:00:06 src/redis-sentinel *:26381 [sentinel]
root       3000   2551  0 01:50 pts/2    00:00:00 grep --color=auto redis

  • 2: Check the overall replication status
127.0.0.1:6379> info
# Server
# Clients
# Memory
# Persistence
# Stats
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=544591,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=544591,lag=0
master_replid:541cd938f43b4f144e647881af409fa1884ea5a4
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:544857
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:544857
# CPU
# Cluster
# Keyspace

I removed everything else and kept only the [Replication] section; the rest can be viewed with the info command in redis-cli.

As shown, 6379 has the master role, with two slaves under it: port=6380 and port=6381.
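The Replication section is plain key:value text, so it is easy to check from a script. A minimal sketch (the helper name `parse_replication` is my own; all values are kept as strings):

```python
def parse_replication(info_text):
    """Turn the text of INFO's Replication section into a dict.

    slaveN entries are collected into a list of dicts under "slaves".
    """
    info = {}
    slaves = []
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines and section headers
        key, value = line.split(":", 1)
        if key.startswith("slave") and key[5:].isdigit():
            # e.g. slave0:ip=127.0.0.1,port=6380,state=online,offset=...,lag=0
            slaves.append(dict(item.split("=", 1) for item in value.split(",")))
        else:
            info[key] = value
    info["slaves"] = slaves
    return info
```

Fed the output above, `parse_replication(...)["role"]` is "master" and the "slaves" list has two entries, one per connected slave.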

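  • 3: Kill the master and watch the logs
kill -9 2755

The master was killed, so master.log has no new entries. Look at the logs of the two slaves (excerpt).

slave6380.log

// while the master is unreachable (about 30 seconds by the timestamps)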
2780:S 29 Jul 01:58:53.414 # Connection with master lost.
2780:S 29 Jul 01:58:53.414 * Caching the disconnected master state.
2780:S 29 Jul 01:58:54.163 * Connecting to MASTER 127.0.0.1:6379
2780:S 29 Jul 01:58:54.163 * MASTER <-> SLAVE sync started
2780:S 29 Jul 01:58:54.164 # Error condition on socket for SYNC: Connection refused
2780:S 29 Jul 01:58:55.168 * Connecting to MASTER 127.0.0.1:6379
2780:S 29 Jul 01:58:55.169 * MASTER <-> SLAVE sync started
....
...
...
2780:S 29 Jul 01:59:22.381 # Error condition on socket for SYNC: Connection refused
2780:S 29 Jul 01:59:23.389 * Connecting to MASTER 127.0.0.1:6379
2780:S 29 Jul 01:59:23.389 * MASTER <-> SLAVE sync started
2780:S 29 Jul 01:59:23.389 # Error condition on socket for SYNC: Connection refused
// after the sentinels start the failover
2780:S 29 Jul 01:59:24.321 * SLAVE OF 127.0.0.1:6381 enabled (user request from 'id=8 addr=127.0.0.1:52556 fd=11 name=sentinel-4a6aebff-cmd age=3070 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=133 qbuf-free=32635 obl=36 oll=0 omem=0 events=r cmd=exec')
2780:S 29 Jul 01:59:24.323 # CONFIG REWRITE executed with success.
2780:S 29 Jul 01:59:24.399 * Connecting to MASTER 127.0.0.1:6381
2780:S 29 Jul 01:59:24.399 * MASTER <-> SLAVE sync started
2780:S 29 Jul 01:59:24.399 * Non blocking connect for SYNC fired the event.
2780:S 29 Jul 01:59:24.399 * Master replied to PING, replication can continue...
2780:S 29 Jul 01:59:24.399 * Trying a partial resynchronization (request 541cd938f43b4f144e647881af409fa1884ea5a4:617714).
2780:S 29 Jul 01:59:24.400 * Successful partial resynchronization with master.
2780:S 29 Jul 01:59:24.400 # Master replication ID changed to 514edab0972b4b6e5388edc4f14fbdb4d223d39e
2780:S 29 Jul 01:59:24.400 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.

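From 01:58:53.414 to 01:59:23.389 the slave keeps trying to reconnect to the master. By the timestamps this window is about 30 seconds, not a full minute, which matches a down-after-milliseconds of 30000 rather than the 60000 listed above. Once the master has stayed unreachable for that long, the sentinels mark it as subjectively down; see the logs.

sentinel6379.log

// the master is judged subjectively down
2908:X 29 Jul 01:59:23.492 # +sdown master mymaster 127.0.0.1 6379
// the current epoch has been updated
2908:X 29 Jul 01:59:23.546 # +new-epoch 1
// this sentinel votes for sentinel6380 to lead the failover
2908:X 29 Jul 01:59:23.549 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1
// the master is judged objectively down; the quorum is reached (3 sentinels agree, 2 required)
2908:X 29 Jul 01:59:23.569 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2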
2908:X 29 Jul 01:59:23.569 # Next failover delay: I will not start a failover before Sun Jul 29 02:05:23 2018
2908:X 29 Jul 01:59:24.328 # +config-update-from sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
2908:X 29 Jul 01:59:24.328 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
2908:X 29 Jul 01:59:24.328 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
2908:X 29 Jul 01:59:24.328 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
2908:X 29 Jul 01:59:54.333 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
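The delay between losing the master and the +sdown event can be read straight off the log timestamps. A small sketch, assuming both timestamps fall on the same day and use the "day month time" layout shown in these logs (the function name is made up):

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%d %b %H:%M:%S.%f"):
    """Seconds between two Redis log timestamps such as '29 Jul 01:58:53.414'."""
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    return (t1 - t0).total_seconds()
```

Passing the "Connection with master lost" timestamp from slave6380.log (29 Jul 01:58:53.414) and the +sdown timestamp from sentinel6379.log (29 Jul 01:59:23.492) gives roughly 30 seconds.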

sentinel6380.log

// the master is judged subjectively down
2937:X 29 Jul 01:59:23.459 # +sdown master mymaster 127.0.0.1 6379
// the master is judged objectively down; the quorum of 2 sentinels is reached
2937:X 29 Jul 01:59:23.536 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
2937:X 29 Jul 01:59:23.537 # +new-epoch 1
// attempt to fail over the master
2937:X 29 Jul 01:59:23.537 # +try-failover master mymaster 127.0.0.1 6379
// sentinel6380 votes for itself to lead this failover
2937:X 29 Jul 01:59:23.540 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1
// the other two sentinels both vote for 4a6aebffdd1301bf054e722c34e8a6611418ba8a [sentinel6380]
2937:X 29 Jul 01:59:23.549 # 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 voted for 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1
2937:X 29 Jul 01:59:23.549 # 1a77392638e41bb0ea0a865ffc93b8de6335227f voted for 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1
// sentinel6380 wins the election and may carry out the failover of master 6379
2937:X 29 Jul 01:59:23.619 # +elected-leader master mymaster 127.0.0.1 6379
// the failover enters the select-slave state: the sentinel looks for a slave that can be promoted to master
2937:X 29 Jul 01:59:23.710 # +failover-state-select-slave master mymaster 127.0.0.1 6379
// slave 6381 of mymaster 127.0.0.1 6379 is selected for promotion
2937:X 29 Jul 01:59:23.710 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
// the sentinel sends SLAVEOF NO ONE to 6381 to promote it to master
2937:X 29 Jul 01:59:23.710 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
// waiting for slave 6381 to complete the promotion
2937:X 29 Jul 01:59:23.769 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
// slave 6381 has been promoted
2937:X 29 Jul 01:59:24.251 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
// the failover moves to the reconf-slaves state: the remaining slaves are reconfigured to replicate from the new master
2937:X 29 Jul 01:59:24.251 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
// the leading sentinel sends a SLAVEOF command to slave 6380 so that it replicates from the new master
2937:X 29 Jul 01:59:24.321 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
// 6379 leaves the objectively-down state; odown applies only to masters and 6379 is no longer the master
2937:X 29 Jul 01:59:24.653 # -odown master mymaster 127.0.0.1 6379
// 6380 is configuring itself as a slave of the new master 6381; not finished yet
2937:X 29 Jul 01:59:25.270 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
// slave 6380 has finished synchronizing with the new master
2937:X 29 Jul 01:59:25.271 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
// the failover of master 6379 has ended; all slaves have been reconfigured to replicate from the new master
2937:X 29 Jul 01:59:25.347 # +failover-end master mymaster 127.0.0.1 6379
// configuration change: the master address has changed and the master is now 6381
2937:X 29 Jul 01:59:25.347 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
// the two slaves under 6381 (new slaves detected and attached)
2937:X 29 Jul 01:59:25.347 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
2937:X 29 Jul 01:59:25.347 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
// slave 6379 of the new master is marked subjectively down
2937:X 29 Jul 01:59:55.401 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6381.log

// the master is judged subjectively down
2961:X 29 Jul 01:59:23.459 # +sdown master mymaster 127.0.0.1 6379
2961:X 29 Jul 01:59:23.545 # +new-epoch 1
// this sentinel votes for sentinel6380
2961:X 29 Jul 01:59:23.548 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1
2961:X 29 Jul 01:59:24.325 # +config-update-from sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379
2961:X 29 Jul 01:59:24.325 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381
2961:X 29 Jul 01:59:24.325 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
2961:X 29 Jul 01:59:24.325 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
2961:X 29 Jul 01:59:54.348 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

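As the logs show, at 01:59:23, about 30 seconds after the master was killed, all three sentinels monitoring the master marked it subjectively down (sdown). The configuration requires at least 2 sentinels to agree before the master is switched to objectively down (odown) [+odown master mymaster 127.0.0.1 6379 #quorum 2/2]. Once the master is objectively down, the sentinels elect a leader, which then picks the new master. sentinel6380.log is longer than the other sentinel logs because sentinel6380 led the whole failover.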
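To follow the failover without reading every line, it can help to reduce a sentinel log to its sequence of event names (+sdown, +odown, +try-failover and so on). A rough sketch, assuming the log layout shown in this post (`sentinel_events` is a made-up helper):

```python
import re

# An event name follows the level marker (# or *) and starts with + or -.
EVENT_RE = re.compile(r" [#*] ([+-][a-z-]+)")


def sentinel_events(lines):
    """Return the sentinel event names found in the given log lines, in order."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            events.append(match.group(1))
    return events
```

Lines that carry no event, such as the "voted for" entries or the "Next failover delay" notice, are skipped. Applied to sentinel6380.log, the result begins with +sdown, +odown, +new-epoch, +try-failover.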

4: On 6381, check the overall replication status

# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1125775,lag=0
master_replid:514edab0972b4b6e5388edc4f14fbdb4d223d39e
master_replid2:541cd938f43b4f144e647881af409fa1884ea5a4
master_repl_offset:1125775
second_repl_offset:617714
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:77200
repl_backlog_histlen:1048576

6381 now has the master role and only one slave, because the other instance is down.

5: Start the 6379 service again

Check the log of the 6379 service:

3037:S 29 Jul 02:46:47.753 # CONFIG REWRITE executed with success.
3037:S 29 Jul 02:46:48.359 * Connecting to MASTER 127.0.0.1:6381
3037:S 29 Jul 02:46:48.360 * MASTER <-> SLAVE sync started
3037:S 29 Jul 02:46:48.360 * Non blocking connect for SYNC fired the event.
3037:S 29 Jul 02:46:48.361 * Master replied to PING, replication can continue...
3037:S 29 Jul 02:46:48.362 * Trying a partial resynchronization (request 7b0dc6ac9c2188e3c92eb29eea200ea6c572619c:1).
3037:S 29 Jul 02:46:48.608 * Full resync from master: 514edab0972b4b6e5388edc4f14fbdb4d223d39e:1178142
3037:S 29 Jul 02:46:48.608 * Discarding previously cached master state.
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: receiving 253 bytes from master
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Flushing old data
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Loading DB in memory
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Finished with success
3037:S 29 Jul 02:46:48.709 * Background append only file rewriting started by pid 3042
3037:S 29 Jul 02:46:48.750 * AOF rewrite child asks to stop sending diffs.
3042:C 29 Jul 02:46:48.750 * Parent agreed to stop sending diffs. Finalizing AOF...
3042:C 29 Jul 02:46:48.750 * Concatenating 0.00 MB of AOF diff received from parent.
3042:C 29 Jul 02:46:48.750 * SYNC append only file rewrite performed
3042:C 29 Jul 02:46:48.750 * AOF rewrite: 6 MB of memory used by copy-on-write
3037:S 29 Jul 02:46:48.781 * Background AOF rewrite terminated with success
3037:S 29 Jul 02:46:48.782 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
3037:S 29 Jul 02:46:48.782 * Background AOF rewrite finished successfully

After startup the configuration file is rewritten, then the instance automatically connects to the new master 6381 and performs a full resynchronization from it.

Check the log of the new master, 6381:

// 6379 asks for synchronization
2809:M 29 Jul 02:46:48.362 * Slave 127.0.0.1:6379 asks for synchronization
// the partial resynchronization request is refused
2809:M 29 Jul 02:46:48.362 * Partial resynchronization not accepted: Replication ID mismatch (Slave asked for '7b0dc6ac9c2188e3c92eb29eea200ea6c572619c', my replication IDs are '514edab0972b4b6e5388edc4f14fbdb4d223d39e' and '541cd938f43b4f144e647881af409fa1884ea5a4')
// the full synchronization starts
2809:M 29 Jul 02:46:48.362 * Starting BGSAVE for SYNC with target: disk
2809:M 29 Jul 02:46:48.607 * Background saving started by pid 3041
3041:C 29 Jul 02:46:48.607 * DB saved on disk
// 6 MB of memory used by copy-on-write
3041:C 29 Jul 02:46:48.608 * RDB: 6 MB of memory used by copy-on-write
// the background save succeeded
2809:M 29 Jul 02:46:48.708 * Background saving terminated with success
2809:M 29 Jul 02:46:48.708 * Synchronization with slave 127.0.0.1:6379 succeeded
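The two outcomes seen in this post (6380 got a partial resync from the new master, while the restarted 6379 needed a full one) follow from how the master compares replication IDs. The sketch below is a simplified reading of that decision in Redis 4.x, not the actual implementation; the function name and parameters are my own.

```python
def can_partial_resync(req_id, req_offset, replid, replid2,
                       second_repl_offset, backlog_first, backlog_len):
    """Decide whether a PSYNC request can be served from the backlog.

    Assumption: the request must name the master's current replication ID,
    or its previous one (replid2) with an offset not beyond the point where
    the histories diverged, and the offset must still be in the backlog.
    """
    same_history = req_id == replid or (
        req_id == replid2 and req_offset <= second_repl_offset
    )
    if not same_history:
        return False
    return backlog_first <= req_offset <= backlog_first + backlog_len
```

With the IDs from the logs, a request for 541cd9...:617714 matches the new master's replid2 and second_repl_offset and is accepted, whereas a request for the unknown ID 7b0dc6...:1 is refused and falls back to a full resync. The backlog bounds used when trying this out come from the INFO snapshots in this post and are only illustrative.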

Check the sentinel logs

sentinel6379.log

// 6379 is no longer subjectively down
2908:X 29 Jul 02:46:37.610 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381
// 6379 is converted to a slave of the master 6381
2908:X 29 Jul 02:46:47.628 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6380.log

2937:X 29 Jul 02:46:37.767 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6381.log

2961:X 29 Jul 02:46:38.023 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

Now look at the replication info of the new master 6381 again:

# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=1348132,lag=0
slave1:ip=127.0.0.1,port=6379,state=online,offset=1348132,lag=0
master_replid:514edab0972b4b6e5388edc4f14fbdb4d223d39e
master_replid2:541cd938f43b4f144e647881af409fa1884ea5a4
master_repl_offset:1348132
second_repl_offset:617714
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:299557
repl_backlog_histlen:1048576

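The new master has gained 6379 as a slave.

After a successful failover, the Redis configuration we prepared before the setup is rewritten. Most of the rewritten content sits at the end of each configuration file.

redis.conf gained: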

# Generated by CONFIG REWRITE
slaveof 127.0.0.1 6381

redis_slave6380.conf keeps its slaveof line where it was, now pointing at the new master 6381:

# Master-Slave replication. Use slaveof to make a Redis instance a copy of
# slaveof <masterip> <masterport>
slaveof 127.0.0.1 6381

The new master's configuration, redis_slave6381.conf, no longer contains a slaveof directive:

# Master-Slave replication. Use slaveof to make a Redis instance a copy of
# slaveof <masterip> <masterport>

sentinel.conf changes as well; take a look for yourself.
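A quick way to confirm which master each rewritten file points to is to look for an active slaveof line and ignore the commented-out template. A minimal sketch (`effective_slaveof` is a made-up helper; it assumes that if several slaveof lines exist, the last one takes effect):

```python
def effective_slaveof(conf_text):
    """Return (host, port) of the active slaveof directive, or None."""
    target = None
    for raw in conf_text.splitlines():
        parts = raw.split()
        # Commented lines start with "#" and therefore never match here.
        if len(parts) == 3 and parts[0].lower() == "slaveof":
            target = (parts[1], int(parts[2]))  # assumption: last one wins
    return target
```

For the rewritten redis.conf this returns ("127.0.0.1", 6381); for redis_slave6381.conf, which now holds only the commented template line, it returns None.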

6: Test synchronization again after the failover

The former master no longer accepts set:

127.0.0.1:6379> set name zhangxs
(error) READONLY You can't write against a read only slave.

set on the new master succeeds:

127.0.0.1:6381> set name zhangxs
OK

127.0.0.1:6379> get name
"zhangxs"

127.0.0.1:6380> get name
"zhangxs"

Synchronization works.

After the failover the servers look like this:

| Role | Port | Redis config file | Sentinel config file | Sentinel port | Redis log path | Sentinel log path |
|---|---|---|---|---|---|---|
| slave | 6379 | redis.conf | sentinel.conf | 26379 | /home/zhangxs/data/redislog/redis_server/master.log | /home/zhangxs/data/redislog/sentinel/sentinel6379.log |
| slave | 6380 | redis_slave6380.conf | sentinel6380.conf | 26380 | /home/zhangxs/data/redislog/redis_server/slave6380.log | /home/zhangxs/data/redislog/sentinel/sentinel6380.log |
| master | 6381 | redis_slave6381.conf | sentinel6381.conf | 26381 | /home/zhangxs/data/redislog/redis_server/slave6381.log | /home/zhangxs/data/redislog/sentinel/sentinel6381.log |

Reference: http://www.redis.cn/topics

Reposted from: https://www.cnblogs.com/zhangXingSheng/p/9385885.html
