1 partitions have leader brokers without a matching listener, including [baidd-0] (org.apache.kafka.
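1 Fault Description
On September 22, one of the brokers in the nationwide Kafka cluster ran out of disk space and went down. The business was affected: messages could be neither produced nor consumed. The application reported: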
WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] 1 partitions have leader brokers without a matching listener, including [baidd-0] (org.apache.kafka.clients.NetworkClient)
2 Fault Reproduction
2.1 Topic partitions with a replication factor of 1
# Produce messages
[root@Centos7-Mode-V8 kafka]# bin/kafka-console-producer.sh --broker-list 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic baidd
>aa
>bb
# Consume messages:
[root@Centos7-Mode-V8 kafka]# bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic baidd
aa
bb
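2.1.1 Stopping the topic's leader node
# Use Kafka Tool to check which node hosts the leader of this topic's partition
/*
The Kafka CLI shows the same information; the Leader field of each partition gives the broker id:
bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe
*/
After the leader node was stopped, the producer and every consumer process kept printing the following message: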
[2021-09-23 17:09:53,495] WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] 1 partitions have leader brokers without a matching listener, including [baidd-0] (org.apache.kafka.clients.NetworkClient)
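Messages could be neither sent nor consumed.
2.1.2 Stopping a non-leader node
The consumer process sometimes reported: [2021-09-23 17:21:22,480] WARN [Consumer clientId=consumer-1, groupId=console-consumer-55928] Connection to node 2147483645 (/192.168.144.253:9193) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
While this warning was being logged, messages could still be produced, but the data produced during that window could not be consumed.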
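2.1.3 Summary
With a partition replication factor of 1, stopping any single node affects the business.
When the node hosting a partition's leader goes down, both producing and consuming are affected.
When a non-leader node goes down, consuming is affected.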
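The leader case follows from how leader election works: a new leader can only be chosen from the partition's surviving replicas. The following is a minimal model of that rule, not Kafka's actual controller logic. It assumes every assigned replica is in sync, and the broker ids are illustrative.

```python
from typing import Iterable, Optional, Set


def elect_leader(replicas: Iterable[int], down: Set[int]) -> Optional[int]:
    """Return the first assigned replica that is still alive, or None.

    Simplification: every replica is assumed to be in the ISR, so any
    surviving replica is eligible to become the leader.
    """
    for broker in replicas:
        if broker not in down:
            return broker
    return None


# Replication factor 1: the only copy lives on broker 1 (illustrative id).
print(elect_leader([1], down={1}))        # None: the partition is offline
# Replication factor 3: a follower can take over when broker 0 stops.
print(elect_leader([0, 2, 1], down={0}))  # 2
```

With a single replica there is no candidate left once that broker stops, which matches the "leader brokers without a matching listener" warning seen by both producer and consumers.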
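2.2 Partitions with multiple replicas
It is understandable that an outage hurts when a partition has no other replica, so I tried giving the topic multiple replicas, and found that the business was still affected:
# Create a topic with three replicas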
bin/kafka-topics.sh --create --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --replication-factor 3 --partitions 1 --topic song
# View the replica information
[root@Centos7-Mode-V8 kafka]# bin/kafka-topics.sh --zookeeper 192.168.144.247:3292,192.168.144.251:3292,192.168.144.253:3292 --describe --topic song
Topic:song PartitionCount:1 ReplicationFactor:3 Configs:
Topic: song Partition: 0 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
# Send messages
bin/kafka-console-producer.sh --broker-list 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song
# Consumer process 1
bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song --group g1
# Consumer process 2
bin/kafka-console-consumer.sh --bootstrap-server 192.168.144.247:9193,192.168.144.251:9193,192.168.144.253:9193 --topic song --group g2
# Stop the leader node of this topic
Messages could still be produced, and the "1 partitions have leader brokers without a matching listener" warning no longer appeared. However, after losing the connection to the topic leader, the consumers sometimes reported:
[2021-09-24 19:01:06,316] WARN [Consumer clientId=consumer-1, groupId=console-consumer-27609] Connection to node 2147483647 (/192.168.144.247:9193) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
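Data produced during this period sometimes did not arrive: messages produced while the node was down could not be consumed.
So why do messages still go missing when a node goes down, even with multiple replicas?
Answer: __consumer_offsets had only 1 replica, which prevents even a topic with multiple replicas from being highly available.
# Later, after adding replicas to Kafka's built-in topic (__consumer_offsets), ordinary topics became highly available. After a node is stopped, "Broker may not be available" is still logged, but the business is no longer affected.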
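The very large node ids in the two "Connection to node ..." warnings point the same way. As far as I know, the Java consumer registers its group-coordinator connection under the id Integer.MAX_VALUE minus the broker id. The broker in turn maps a group to a __consumer_offsets partition by hashing the group id modulo the partition count. The sketch below rests on that understanding of Kafka internals, which the original post does not state. The function names are hypothetical, and 50 partitions (the default count) is assumed.

```python
JAVA_INT_MAX = 2**31 - 1   # Integer.MAX_VALUE
JAVA_INT_MIN = -2**31      # Integer.MIN_VALUE


def coordinator_broker_id(node_id: int) -> int:
    """Map the node id from a coordinator connection warning to a broker id."""
    return JAVA_INT_MAX - node_id


def java_string_hash(s: str) -> int:
    """Reproduce Java's String.hashCode() as a signed 32-bit int (ASCII input assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 2**32 if h >= 2**31 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Assumed broker rule: abs(hash(group id)) modulo the partition count."""
    h = java_string_hash(group_id)
    non_negative = 0 if h == JAVA_INT_MIN else abs(h)
    return non_negative % num_partitions


print(coordinator_broker_id(2147483645))  # 2
print(coordinator_broker_id(2147483647))  # 0
print(offsets_partition_for("g1"))        # 42
print(offsets_partition_for("g2"))        # 43
```

Read this way, node 2147483647 is the coordinator connection to broker 0, the same broker that led partition 0 of song. If the group's single-replica __consumer_offsets partition also lived on the stopped broker, the consumers had no coordinator to fall back to, even though the data partition had already failed over.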
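3 Root Cause
The Kafka configuration file did not set default.replication.factor=3. The parameter defaults to 1, meaning no additional replicas, so each partition was effectively a single point of failure. The same was true of the internal __consumer_offsets topic, which also had only one replica (see section 2.2).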
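To see which partitions are exposed in this way, the output of `kafka-topics.sh --describe` can be scanned for replica lists with a single entry. The following is a rough sketch that assumes the output format shown in section 2.2. The `baidd` lines in the sample are made up for illustration, since the original output for that topic is not included in this post.

```python
import re
from typing import List, Tuple

# Matches per-partition lines such as:
#   Topic: song Partition: 0 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)"
)


def single_replica_partitions(describe_output: str) -> List[Tuple[str, int, int]]:
    """Return (topic, partition, broker) for every partition with exactly one replica."""
    at_risk = []
    for line in describe_output.splitlines():
        m = PARTITION_LINE.search(line)
        if not m:
            continue  # topic summary lines ("PartitionCount:...") do not match
        replicas = [int(r) for r in m.group("replicas").split(",")]
        if len(replicas) == 1:
            at_risk.append((m.group("topic"), int(m.group("partition")), replicas[0]))
    return at_risk


sample = """\
Topic:song PartitionCount:1 ReplicationFactor:3 Configs:
Topic: song Partition: 0 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
Topic:baidd PartitionCount:1 ReplicationFactor:1 Configs:
Topic: baidd Partition: 0 Leader: 1 Replicas: 1 Isr: 1
"""
print(single_replica_partitions(sample))  # [('baidd', 0, 1)]
```

Running the same check against `--describe --topic __consumer_offsets` should reveal whether the offsets topic is affected as well.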
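4 Solution
4.1 Set the default.replication.factor parameter
Edit the configuration file on every Kafka node and raise the default replication factor for topics (the parameter defaults to 1):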
default.replication.factor=3
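Note that offsets.topic.replication.factor is a separate broker setting. It controls the replication factor that __consumer_offsets gets when that topic is first created. Its built-in default is 3, but the sample server.properties shipped with Kafka sets it to 1.
Do not leave offsets.topic.replication.factor=1 in the file next to default.replication.factor=3: for __consumer_offsets, the value of offsets.topic.replication.factor is the one that applies. Neither setting changes topics that already exist, which is why sections 4.2 and 4.3 are still needed.
# Restart Kafka for the configuration to take effect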
systemctl restart kafka
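A small check like the following can be run against each broker's configuration file to confirm the two settings. It is only a sketch: it assumes Kafka's built-in defaults are 1 for default.replication.factor and 3 for offsets.topic.replication.factor, and the sample text is a hypothetical excerpt rather than the cluster's real file.

```python
from typing import Dict, List

# Built-in defaults assumed for the two settings discussed in this section.
BUILT_IN_DEFAULTS = {
    "default.replication.factor": 1,
    "offsets.topic.replication.factor": 3,
}


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def replication_warnings(text: str, wanted: int = 3) -> List[str]:
    """List every replication setting whose effective value is below `wanted`."""
    props = parse_properties(text)
    warnings = []
    for key, default in BUILT_IN_DEFAULTS.items():
        effective = int(props.get(key, default))
        if effective < wanted:
            warnings.append(f"{key}={effective} (want >= {wanted})")
    return warnings


sample = """\
# hypothetical excerpt of server.properties
broker.id=0
offsets.topic.replication.factor=1
"""
for warning in replication_warnings(sample):
    print(warning)
```

For the sample above both settings are reported, because default.replication.factor is absent and falls back to 1, and offsets.topic.replication.factor is explicitly 1.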
4.2 Add replicas to existing ordinary topics
See https://blog.csdn.net/yabingshi_tech/article/details/120443647
4.3 Add replicas to __consumer_offsets
The method is the same as above; the JSON file is as follows:
{"version": 1, "partitions": [
{"topic": "__consumer_offsets", "partition": 0, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 1, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 2, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 3, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 4, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 5, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 6, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 7, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 8, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 9, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 10, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 11, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 12, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 13, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 14, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 15, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 16, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 17, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 18, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 19, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 20, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 21, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 22, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 23, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 24, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 25, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 26, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 27, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 28, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 29, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 30, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 31, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 32, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 33, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 34, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 35, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 36, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 37, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 38, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 39, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 40, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 41, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 42, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 43, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 44, "replicas": [2, 0, 1]},
{"topic": "__consumer_offsets", "partition": 45, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 46, "replicas": [0, 1, 2]},
{"topic": "__consumer_offsets", "partition": 47, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 48, "replicas": [1, 2, 0]},
{"topic": "__consumer_offsets", "partition": 49, "replicas": [2, 0, 1]}
]}
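Writing fifty entries by hand is error-prone, so a file of this shape can also be generated. The sketch below is one possible generator, not the script used for the file above: it rotates the preferred leader on every partition instead of in groups of three, and the function name is hypothetical.

```python
import json
from typing import List


def build_reassignment(topic: str, num_partitions: int, brokers: List[int]) -> dict:
    """Place a replica on every broker, rotating the first (preferred) replica."""
    n = len(brokers)
    partitions = []
    for p in range(num_partitions):
        replicas = [brokers[(p + i) % n] for i in range(n)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


plan = build_reassignment("__consumer_offsets", 50, [0, 1, 2])
print(len(plan["partitions"]))           # 50
print(json.dumps(plan["partitions"][0]))
```

The result of `json.dumps(plan)` can be saved to a file and passed to `kafka-reassign-partitions.sh` through its `--reassignment-json-file` option, in the same way as the hand-written file.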
-- This article draws on: "Kafka突然宕机了?稳住,莫慌!" (Kafka suddenly went down? Stay calm, don't panic!)