Message fetching in the Kafka Java consumer

Version 2.4.0

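As it starts up, the Kafka consumer client calls ensureActiveGroup() to make sure it is an active member of its consumer group. Inside this method the client sends a JoinGroup request to the group coordinator in the broker cluster, followed by a SyncGroup request; the SyncGroup response carries the partitions assigned to this consumer for the topics it subscribes to. Subsequent fetches request only the partitions assigned here. At this point the consumed offsets of those partitions are still unknown, so before fetching any messages the client builds an OffsetFetch request to learn the committed offset of each partition and know where to start consuming.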

private RequestFuture<Map<TopicPartition, OffsetAndMetadata>> sendOffsetFetchRequest(Set<TopicPartition> partitions) {
    Node coordinator = checkAndGetCoordinator();
    if (coordinator == null)
        return RequestFuture.coordinatorNotAvailable();
    log.debug("Fetching committed offsets for partitions: {}", partitions);
    // construct the request
    OffsetFetchRequest.Builder requestBuilder = new OffsetFetchRequest.Builder(this.groupId,
            new ArrayList<>(partitions));
    // send the request with a callback
    return client.send(coordinator, requestBuilder)
            .compose(new OffsetFetchResponseHandler());
}

Each time the consumer calls poll() to fetch messages, it uses sendFetches() to try to fetch them.

Before anything is sent, prepareFetchRequests() is called to build the fetch requests.

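A partition whose data has already been fetched but not yet processed does not need to be fetched again for now. The partitions assigned to this consumer, minus those, are the ones for which fetch requests are prepared.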

private List<TopicPartition> fetchablePartitions() {
    Set<TopicPartition> exclude = new HashSet<>();
    if (nextInLineRecords != null && !nextInLineRecords.isFetched) {
        exclude.add(nextInLineRecords.partition);
    }
    for (CompletedFetch completedFetch : completedFetches) {
        exclude.add(completedFetch.partition);
    }
    return subscriptions.fetchablePartitions(tp -> !exclude.contains(tp));
}
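The exclusion logic can be pictured with a small stand-alone sketch. It is not the Kafka implementation: partitions are modelled as plain strings, the class and method names are hypothetical, and the subscription-state checks that the real subscriptions.fetchablePartitions() also applies are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filtering step; partitions are strings such as "orders-0".
class FetchablePartitionsSketch {

    // Returns the assigned partitions that have no buffered, not yet consumed data.
    static List<String> fetchable(List<String> assigned,
                                  String nextInLinePartition,          // null if nothing is being drained
                                  List<String> completedFetchPartitions) {
        Set<String> exclude = new HashSet<>();
        if (nextInLinePartition != null) {
            exclude.add(nextInLinePartition);      // still being drained by the application
        }
        exclude.addAll(completedFetchPartitions);  // response buffered, not parsed yet

        List<String> result = new ArrayList<>();
        for (String tp : assigned) {
            if (!exclude.contains(tp)) {
                result.add(tp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> assigned = Arrays.asList("orders-0", "orders-1", "orders-2", "orders-3");
        List<String> result = fetchable(assigned, "orders-1", Arrays.asList("orders-3"));
        System.out.println(result); // prints [orders-0, orders-2]
    }
}
```

Only orders-0 and orders-2 remain, because the other two partitions still have data waiting on the client side.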

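Each of these partitions is then added to a fetch request addressed to the broker node that hosts its leader replica (or its preferred read replica, if one is set).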

for (TopicPartition partition : fetchablePartitions()) {
    // Use the preferred read replica if set, or the position's leader
    SubscriptionState.FetchPosition position = this.subscriptions.position(partition);
    Node node = selectReadReplica(partition, position.currentLeader.leader, currentTimeMs);
    if (node == null || node.isEmpty()) {
        metadata.requestUpdate();
    } else if (client.isUnavailable(node)) {
        client.maybeThrowAuthFailure(node);
        // If we try to send during the reconnect blackout window, then the request is just
        // going to be failed anyway before being sent, so skip the send for now
        log.trace("Skipping fetch for partition {} because node {} is awaiting reconnect backoff", partition, node);
    } else if (this.nodesWithPendingFetchRequests.contains(node.id())) {
        log.trace("Skipping fetch for partition {} because previous request to {} has not been processed", partition, node);
    } else {
        // if there is a leader and no in-flight requests, issue a new fetch
        FetchSessionHandler.Builder builder = fetchable.get(node);
        if (builder == null) {
            int id = node.id();
            FetchSessionHandler handler = sessionHandler(id);
            if (handler == null) {
                handler = new FetchSessionHandler(logContext, id);
                sessionHandlers.put(id, handler);
            }
            builder = handler.newBuilder();
            fetchable.put(node, builder);
        }
        builder.add(partition, new FetchRequest.PartitionData(position.offset,
                FetchRequest.INVALID_LOG_START_OFFSET, this.fetchSize, position.currentLeader.epoch));
        log.debug("Added {} fetch request for partition {} at position {} to node {}", isolationLevel,
                partition, position, node);
    }
}

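As the code shows, fetches for partitions that live on the same broker are combined into one request. The consumer client sends these fetch requests asynchronously and handles each one when its response comes back.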
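The per-broker grouping can be illustrated with a simplified sketch. It assumes partitions are plain strings and nodes are integer ids, uses hypothetical names throughout, and reduces the real checks (reconnect backoff, fetch sessions, leader epochs) to two rules: skip a partition whose leader is unknown, and skip a node that already has a fetch in flight.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of grouping partitions into one fetch request per broker node.
class FetchGroupingSketch {

    // leaderByPartition maps a partition name to its leader node id (null = leader unknown).
    static Map<Integer, List<String>> groupByNode(Map<String, Integer> leaderByPartition,
                                                  Set<Integer> nodesWithPendingFetch) {
        Map<Integer, List<String>> requests = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : leaderByPartition.entrySet()) {
            Integer node = entry.getValue();
            if (node == null) {
                continue; // the real client would request a metadata update here
            }
            if (nodesWithPendingFetch.contains(node)) {
                continue; // previous fetch to this node has not been processed yet
            }
            // one request (here: one list) per node, created on first use
            requests.computeIfAbsent(node, id -> new ArrayList<>()).add(entry.getKey());
        }
        return requests;
    }

    public static void main(String[] args) {
        Map<String, Integer> leaders = new LinkedHashMap<>();
        leaders.put("orders-0", 1);
        leaders.put("orders-1", 2);
        leaders.put("orders-2", 1);
        leaders.put("orders-3", 3);
        leaders.put("orders-4", null);
        Set<Integer> pending = new HashSet<>(Arrays.asList(3));
        System.out.println(groupByNode(leaders, pending));
        // prints {1=[orders-0, orders-2], 2=[orders-1]}
    }
}
```

Node 1 receives a single request covering two partitions, while node 3 is skipped until its outstanding fetch has been handled.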

long fetchOffset = requestData.fetchOffset;
FetchResponse.PartitionData<Records> fetchData = entry.getValue();
log.debug("Fetch {} at offset {} for partition {} returned fetch data {}",
        isolationLevel, fetchOffset, partition, fetchData);
completedFetches.add(new CompletedFetch(partition, fetchOffset, fetchData, metricAggregator,
        resp.requestHeader().apiVersion()));

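Each asynchronously received fetch response is wrapped, partition by partition, in a CompletedFetch and buffered in the completedFetches collection until it is parsed.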

Afterwards, fetchedRecords() takes the buffered fetch results out of completedFetches and parses them into the messages the consumer needs.

while (recordsRemaining > 0) {
    if (nextInLineRecords == null || nextInLineRecords.isFetched) {
        CompletedFetch completedFetch = completedFetches.peek();
        if (completedFetch == null) break;
        try {
            nextInLineRecords = parseCompletedFetch(completedFetch);
        } catch (Exception e) {
            // Remove a completedFetch upon a parse with exception if (1) it contains no records, and
            // (2) there are no fetched records with actual content preceding this exception.
            // The first condition ensures that the completedFetches is not stuck with the same completedFetch
            // in cases such as the TopicAuthorizationException, and the second condition ensures that no
            // potential data loss due to an exception in a following record.
            FetchResponse.PartitionData partition = completedFetch.partitionData;
            if (fetched.isEmpty() && (partition.records == null || partition.records.sizeInBytes() == 0)) {
                completedFetches.poll();
            }
            throw e;
        }
        completedFetches.poll();
    } else {
        List<ConsumerRecord<K, V>> records = fetchRecords(nextInLineRecords, recordsRemaining);
        TopicPartition partition = nextInLineRecords.partition;
        if (!records.isEmpty()) {
            List<ConsumerRecord<K, V>> currentRecords = fetched.get(partition);
            if (currentRecords == null) {
                fetched.put(partition, records);
            } else {
                // this case shouldn't usually happen because we only send one fetch at a time per partition,
                // but it might conceivably happen in some rare cases (such as partition leader changes).
                // we have to copy to a new list because the old one may be immutable
                List<ConsumerRecord<K, V>> newRecords = new ArrayList<>(records.size() + currentRecords.size());
                newRecords.addAll(currentRecords);
                newRecords.addAll(records);
                fetched.put(partition, newRecords);
            }
            recordsRemaining -= records.size();
        }
    }
}

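The loop ends when recordsRemaining reaches 0, that is, when the maximum number of records for one poll (max.poll.records) has been collected, or when completedFetches holds no more buffered fetch responses.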

Here nextInLineRecords holds the batch of fetched records that is to be drained next.

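First, parseCompletedFetch() parses the fetch response at the head of completedFetches. Its main job is to verify that the response's fetchOffset matches the position the consumer expects, and to update cached values such as the high watermark (HW). Once that is done, the fetch result is removed from completedFetches and stored in nextInLineRecords; the message payloads are then read from it and the offset to consume next is advanced. The records obtained here are exactly the messages the consumer hands back to the application.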
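The interplay between the buffered responses, the in-progress batch and the per-poll limit can be sketched as follows. This is a simplified model, not Kafka code: records are strings, Batch and DrainLoopSketch are hypothetical names, and offset validation and error handling are omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of draining buffered fetch responses with a per-poll record limit.
class DrainLoopSketch {

    // One buffered response for one partition; 'next' marks how far it has been consumed.
    static final class Batch {
        final String partition;
        final List<String> records;
        int next = 0;

        Batch(String partition, List<String> records) {
            this.partition = partition;
            this.records = records;
        }

        boolean drained() {
            return next >= records.size();
        }

        // Hands out at most 'max' records and advances the consume position.
        List<String> take(int max) {
            int end = Math.min(records.size(), next + max);
            List<String> out = new ArrayList<>(records.subList(next, end));
            next = end;
            return out;
        }
    }

    private final Queue<Batch> completedFetches = new ArrayDeque<>();
    private Batch nextInLine; // kept across polls so a partly drained batch is resumed

    void buffer(Batch batch) {
        completedFetches.add(batch);
    }

    Map<String, List<String>> poll(int maxPollRecords) {
        Map<String, List<String>> fetched = new LinkedHashMap<>();
        int remaining = maxPollRecords;
        while (remaining > 0) {
            if (nextInLine == null || nextInLine.drained()) {
                Batch head = completedFetches.poll();
                if (head == null) {
                    break; // nothing buffered any more
                }
                nextInLine = head;
            } else {
                List<String> records = nextInLine.take(remaining);
                fetched.computeIfAbsent(nextInLine.partition, p -> new ArrayList<>()).addAll(records);
                remaining -= records.size();
            }
        }
        return fetched;
    }

    public static void main(String[] args) {
        DrainLoopSketch sketch = new DrainLoopSketch();
        sketch.buffer(new Batch("orders-0", Arrays.asList("a", "b", "c")));
        sketch.buffer(new Batch("orders-1", Arrays.asList("d", "e")));
        System.out.println(sketch.poll(4)); // prints {orders-0=[a, b, c], orders-1=[d]}
        System.out.println(sketch.poll(4)); // prints {orders-1=[e]}
        System.out.println(sketch.poll(4)); // prints {}
    }
}
```

The second call resumes the partly drained orders-1 batch, which is the role the in-progress batch plays between two polls.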
