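1. Background

Our business needs Redis, so our company chose Redis Operator to deploy a Redis cluster on Kubernetes.

Our operations colleagues reported that after the virtual machines were rebooted, the Redis cluster could not be rebuilt. I wrote down the troubleshooting process, which became this article.

After the reboot, Redis Operator failed to rebuild the cluster. The error log is shown below.

The log shows that the Leader and Follower pods of the Redis cluster started successfully, but the redis-cli --cluster add-node command failed with "[ERR] Node 10.233.75.71:6379 is not empty" when the operator tried to add a follower to the cluster.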

# kubectl logs -f --tail 100 -n redis-system redis-operator-75f946fd68-stzjx
I1213 03:17:50.751820       1 request.go:665] Waited for 1.042141823s due to client-side throttling, not priority and fairness, request: GET:https://10.233.0.1:443/apis/events.k8s.io/v1?timeout=32s
{"level":"info","ts":1670901471.1052506,"logger":"controller-runtime.metrics","msg":"Metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1670901471.1055748,"logger":"setup","msg":"starting manager"}
I1213 03:17:51.105988       1 leaderelection.go:248] attempting to acquire leader lease redis-system/6cab913b.redis.opstreelabs.in...
{"level":"info","ts":1670901471.1060054,"msg":"Starting server","path":"/metrics","kind":"metrics","addr":"[::]:8080"}
{"level":"info","ts":1670901471.1060386,"msg":"Starting server","kind":"health probe","addr":"[::]:8081"}
I1213 03:18:08.969275       1 leaderelection.go:258] successfully acquired lease redis-system/6cab913b.redis.opstreelabs.in
{"level":"info","ts":1670901488.9695313,"logger":"controller.redis","msg":"Starting EventSource","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"Redis","source":"kind source: *v1beta1.Redis"}
{"level":"info","ts":1670901488.9696147,"logger":"controller.redis","msg":"Starting Controller","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"Redis"}
{"level":"info","ts":1670901488.9695785,"logger":"controller.rediscluster","msg":"Starting EventSource","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"RedisCluster","source":"kind source: *v1beta1.RedisCluster"}
{"level":"info","ts":1670901488.9696522,"logger":"controller.rediscluster","msg":"Starting Controller","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"RedisCluster"}
{"level":"info","ts":1670901489.0710678,"logger":"controller.rediscluster","msg":"Starting workers","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"RedisCluster","worker count":1}
{"level":"info","ts":1670901489.0711362,"logger":"controller.redis","msg":"Starting workers","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"Redis","worker count":1}
{"level":"info","ts":1670901489.071197,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"redis-system","Request.Name":"redis"}
{"level":"info","ts":1670901489.0770147,"logger":"controller_redis","msg":"Redis statefulset get action was successful","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-leader"}
{"level":"info","ts":1670901489.0919821,"logger":"controller_redis","msg":"Reconciliation Complete, no Changes required.","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-leader"}
{"level":"info","ts":1670901489.0957606,"logger":"controller_redis","msg":"Redis service get action is successful","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-leader-headless"}
{"level":"info","ts":1670901489.0983348,"logger":"controller_redis","msg":"Redis service is already in-sync","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-leader-headless"}
{"level":"info","ts":1670901489.10161,"logger":"controller_redis","msg":"Redis service get action is successful","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-leader"}
{"level":"info","ts":1670901489.1034107,"logger":"controller_redis","msg":"Redis service is already in-sync","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-leader"}
{"level":"info","ts":1670901489.2027256,"logger":"controller_redis","msg":"Redis PodDisruptionBudget get action failed","Request.PodDisruptionBudget.Namespace":"redis-system","Request.PodDisruptionBudget.Name":"redis-leader"}
{"level":"info","ts":1670901489.202785,"logger":"controller_redis","msg":"Reconciliation Successful, no PodDisruptionBudget Found.","Request.PodDisruptionBudget.Namespace":"redis-system","Request.PodDisruptionBudget.Name":"redis-leader"}
{"level":"info","ts":1670901489.2072728,"logger":"controller_redis","msg":"Redis statefulset get action was successful","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-follower"}
{"level":"info","ts":1670901489.217569,"logger":"controller_redis","msg":"Reconciliation Complete, no Changes required.","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-follower"}
{"level":"info","ts":1670901489.2213707,"logger":"controller_redis","msg":"Redis service get action is successful","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-follower-headless"}
{"level":"info","ts":1670901489.2232463,"logger":"controller_redis","msg":"Redis service is already in-sync","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-follower-headless"}
{"level":"info","ts":1670901489.2272012,"logger":"controller_redis","msg":"Redis service get action is successful","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-follower"}
{"level":"info","ts":1670901489.2288983,"logger":"controller_redis","msg":"Redis service is already in-sync","Request.Service.Namespace":"redis-system","Request.Service.Name":"redis-follower"}
{"level":"info","ts":1670901489.2301779,"logger":"controller_redis","msg":"Redis PodDisruptionBudget get action failed","Request.PodDisruptionBudget.Namespace":"redis-system","Request.PodDisruptionBudget.Name":"redis-follower"}
{"level":"info","ts":1670901489.2302952,"logger":"controller_redis","msg":"Reconciliation Successful, no PodDisruptionBudget Found.","Request.PodDisruptionBudget.Namespace":"redis-system","Request.PodDisruptionBudget.Name":"redis-follower"}
{"level":"info","ts":1670901489.2347434,"logger":"controller_redis","msg":"Redis statefulset get action was successful","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-leader"}
{"level":"info","ts":1670901489.239003,"logger":"controller_redis","msg":"Redis statefulset get action was successful","Request.StatefulSet.Namespace":"redis-system","Request.StatefulSet.Name":"redis-follower"}
{"level":"info","ts":1670901489.2390306,"logger":"controllers.RedisCluster","msg":"Creating redis cluster by executing cluster creation commands","Request.Namespace":"redis-system","Request.Name":"redis","Leaders.Ready":"3","Followers.Ready":"3"}
{"level":"info","ts":1670901489.246679,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-leader-0","ip":"10.233.75.78"}
{"level":"info","ts":1670901489.247344,"logger":"controller_redis","msg":"Redis cluster nodes are listed","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Output":"a23173bd200452117228a4d83388d8cf4c0bd977 10.233.74.87:6379@16379 master - 0 1670901487928 3 connected 10923-16383\n2fac4565c44c0dc15d700bbea68965bcc708cfea 10.233.97.134:6379@16379 master - 0 1670901488931 2 connected 5461-10922\n39fdd5b0cd3493fe86e6af5bb0753c310e655fa6 10.233.75.78:6379@16379 myself,master - 0 1670901484000 1 connected 0-5460\n"}
{"level":"info","ts":1670901489.247449,"logger":"controller_redis","msg":"Total number of redis nodes are","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Nodes":"3"}
{"level":"info","ts":1670901489.254622,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-leader-0","ip":"10.233.75.78"}
{"level":"info","ts":1670901489.2553933,"logger":"controller_redis","msg":"Redis cluster nodes are listed","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Output":"a23173bd200452117228a4d83388d8cf4c0bd977 10.233.74.87:6379@16379 master - 0 1670901487928 3 connected 10923-16383\n2fac4565c44c0dc15d700bbea68965bcc708cfea 10.233.97.134:6379@16379 master - 0 1670901488931 2 connected 5461-10922\n39fdd5b0cd3493fe86e6af5bb0753c310e655fa6 10.233.75.78:6379@16379 myself,master - 0 1670901484000 1 connected 0-5460\n"}
{"level":"info","ts":1670901489.2554975,"logger":"controller_redis","msg":"Number of redis nodes are","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Nodes":"3","Type":"leader"}
{"level":"info","ts":1670901489.2555683,"logger":"controllers.RedisCluster","msg":"All leader are part of the cluster, adding follower/replicas","Request.Namespace":"redis-system","Request.Name":"redis","Leaders.Count":3,"Instance.Size":3,"Follower.Replicas":3}
{"level":"info","ts":1670901489.2618587,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-leader-0","ip":"10.233.75.78"}
{"level":"info","ts":1670901489.2623925,"logger":"controller_redis","msg":"Redis cluster nodes are listed","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Output":"a23173bd200452117228a4d83388d8cf4c0bd977 10.233.74.87:6379@16379 master - 0 1670901487928 3 connected 10923-16383\n2fac4565c44c0dc15d700bbea68965bcc708cfea 10.233.97.134:6379@16379 master - 0 1670901488931 2 connected 5461-10922\n39fdd5b0cd3493fe86e6af5bb0753c310e655fa6 10.233.75.78:6379@16379 myself,master - 0 1670901484000 1 connected 0-5460\n"}
{"level":"info","ts":1670901489.2665036,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-follower-0","ip":"10.233.75.71"}
{"level":"info","ts":1670901489.2665575,"logger":"controller_redis","msg":"Checking if Node is in cluster","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Node":"10.233.75.71"}
{"level":"info","ts":1670901489.2665722,"logger":"controller_redis","msg":"Adding node to cluster.","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Node.IP":"10.233.75.71","Follower.Pod":{"PodName":"redis-follower-0","Namespace":"redis-system"}}
{"level":"info","ts":1670901489.2955825,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-follower-0","ip":"10.233.75.71"}
{"level":"info","ts":1670901489.2998388,"logger":"controller_redis","msg":"Successfully got the ip for redis","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis-leader-0","ip":"10.233.75.78"}
{"level":"info","ts":1670901489.3070674,"logger":"controller_redis","msg":"Pod Counted successfully","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Count":0,"Container Name":"redis-leader"}
{"level":"error","ts":1670901489.4937437,"logger":"controller_redis","msg":"Could not execute command","Request.RedisManager.Namespace":"redis-system","Request.RedisManager.Name":"redis","Command":["redis-cli","--cluster","add-node","10.233.75.71:6379","10.233.75.78:6379","--cluster-slave","-a","xxxxxxxx"],"Output":">>> Adding node 10.233.75.71:6379 to cluster 10.233.75.78:6379\n>>> Performing Cluster Check (using node 10.233.75.78:6379)\nM: 39fdd5b0cd3493fe86e6af5bb0753c310e655fa6 10.233.75.78:6379\n   slots:[0-5460] (5461 slots) master\nM: a23173bd200452117228a4d83388d8cf4c0bd977 10.233.74.87:6379\n   slots:[10923-16383] (5461 slots) master\nM: 2fac4565c44c0dc15d700bbea68965bcc708cfea 10.233.97.134:6379\n   slots:[5461-10922] (5462 slots) master\n[OK] All nodes agree about slots configuration.\n>>> Check for open slots...\n>>> Check slots coverage...\n[OK] All 16384 slots covered.\nAutomatically selected master 10.233.75.78:6379\n[ERR] Node 10.233.75.71:6379 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.\n","Error":"Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.\n","error":"command terminated with exit code 1","stacktrace":"redis-operator/k8sutils.ExecuteRedisReplicationCommand\n\t/workspace/k8sutils/redis.go:180\nredis-operator/controllers.(*RedisClusterReconciler).Reconcile\n\t/workspace/controllers/rediscluster_controller.go:127\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227"}
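
The [ERR] line in the log names two conditions under which redis-cli treats a node as non-empty: the node already knows other nodes, or database 0 contains keys. As a rough sketch, the shell function below applies those two checks to the text returned by CLUSTER INFO and INFO keyspace. The function name node_is_empty, the container name redis-follower, and the REDIS_PASSWORD variable are assumptions made for illustration; none of them come from the operator.

```shell
# Sketch: decide whether a Redis node looks "empty" in the sense of the [ERR] message.
# $1: output of `redis-cli cluster info`, $2: output of `redis-cli info keyspace`.
# Returns 0 if the node knows only itself and db0 holds no keys, 1 otherwise.
node_is_empty() {
  cluster_info=$1
  keyspace=$2
  # cluster_known_nodes counts the node itself, so 1 means "knows no other node".
  known=$(printf '%s\n' "$cluster_info" | tr -d '\r' |
    awk -F: '$1 == "cluster_known_nodes" { print $2 }')
  # If the field is missing, be conservative and report "not empty".
  [ "${known:-2}" -le 1 ] || return 1
  # A db0 line in the keyspace section means database 0 still holds keys.
  if printf '%s\n' "$keyspace" | grep -q '^db0:'; then
    return 1
  fi
  return 0
}

# Example use against the pod from the log (container name is an assumption):
#   run() { kubectl exec -n redis-system redis-follower-0 -c redis-follower -- \
#             redis-cli -a "$REDIS_PASSWORD" "$@"; }
#   node_is_empty "$(run cluster info)" "$(run info keyspace)" && echo empty || echo "not empty"
```

If the function reports "not empty" for a follower right after a reboot, the node has most likely kept state from before the reboot, which is what add-node refuses to overwrite.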

The cluster state is as follows:

# kubectl get pods -n redis-system -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP              NODE    NOMINATED NODE   READINESS GATES
redis-follower-0                  2/2     Running   0          15m     10.233.75.71    node6   <none>           <none>
redis-follower-1                  2/2     Running   0          7m9s    10.233.97.163   node5   <none>           <none>
redis-follower-2                  2/2     Running   0          5m28s   10.233.74.124   node4   <none>           <none>
redis-leader-0                    2/2     Running   0          13m     10.233.75.78    node6   <none>           <none>
redis-leader-1                    2/2     Running   0          6m33s   10.233.97.134   node5   <none>           <none>
redis-leader-2                    2/2     Running   0          5m17s   10.233.74.87    node4   <none>           <none>
redis-operator-75f946fd68-bsn6k   1/1     Running   3          23m     10.233.75.94    node6   <none>           <none>

# kubectl get svc -n redis-system
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
redis-follower            ClusterIP   10.233.24.70   <none>        6379/TCP,9121/TCP   83m
redis-follower-headless   ClusterIP   None           <none>        6379/TCP            83m
redis-leader              ClusterIP   10.233.26.65   <none>        6379/TCP,9121/TCP   83m
redis-leader-headless     ClusterIP   None           <none>        6379/TCP            83m

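2. Solution

The error message says the node is rejected because it already knows other nodes or holds keys in database 0, so the fix is to modify the Operator to clear that state. It needs to:

  • 1. Delete the AOF/RDB files
  • 2. Delete the nodes.conf file
  • 3. Run the flushdb command
  • 4. Run the cluster reset command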
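
Until the Operator does this itself, the four steps can be tried by hand. The sketch below only prints the corresponding kubectl exec commands for one pod, so they can be reviewed before anything is run. It is not the Operator's code. The data directory /data, the file names dump.rdb and appendonly.aof, the container name, the REDIS_PASSWORD variable, and the function name print_reset_commands are all assumptions; check them against the actual Redis configuration first.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would reset one stale Redis node.
# Assumed, not taken from the article: data directory, persistence file names,
# container name, and a password supplied through $REDIS_PASSWORD.
NAMESPACE=${NAMESPACE:-redis-system}
DATA_DIR=${DATA_DIR:-/data}

print_reset_commands() {
  pod=$1        # e.g. redis-follower-0
  container=$2  # e.g. redis-follower (assumed container name)
  exec_prefix="kubectl exec -n $NAMESPACE $pod -c $container --"
  # Step 1: delete the AOF/RDB files.
  printf '%s\n' "$exec_prefix rm -f $DATA_DIR/dump.rdb $DATA_DIR/appendonly.aof"
  # Step 2: delete the nodes.conf file.
  printf '%s\n' "$exec_prefix rm -f $DATA_DIR/nodes.conf"
  # Step 3: run flushdb ($REDIS_PASSWORD is printed literally, not expanded here).
  printf '%s\n' "$exec_prefix redis-cli -a \"\$REDIS_PASSWORD\" flushdb"
  # Step 4: run cluster reset.
  printf '%s\n' "$exec_prefix redis-cli -a \"\$REDIS_PASSWORD\" cluster reset"
}

# Example: print_reset_commands redis-follower-0 redis-follower
```

Printing instead of executing is a deliberate choice here, because steps 1 to 3 destroy data on the target node and should only be applied to a node whose data can be re-synced from its master.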

