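1. Overview

Reference: Handling the Elasticsearch "Data too large" exception

The following exception appeared in the logs of our production ES cluster, which runs Elasticsearch 7.3.2: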

[2021-03-16T21:05:10,338][DEBUG][o.e.a.a.c.n.i.TransportNodesInfoAction] [java-d-service-es-200-56-client-1] failed to execute on node [hsF4JzeAQ6mflJRGnJIKzQ]
org.elasticsearch.transport.RemoteTransportException: [data-es-group-online-200-67-2][10.110.200.67:9301][cluster:monitor/nodes/info[n]]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [33093117638/30.8gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33093114144/30.8gb], new bytes reserved: [3494/3.4kb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=3494/3.4kb, accounting=104564949/99.7mb]
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:342) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:173) [elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:121) [elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:105) [elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:660) [elasticsearch-7.3.2.jar:7.3.2]
    at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) [transport-netty4-client-7.3.2.jar:7.3.2]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [netty-codec-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [netty-codec-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:682) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [netty-common-4.1.36.Final.jar:4.1.36.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.36.Final.jar:4.1.36.Final]
    at java.lang.Thread.run(Thread.java:835) [?:?]
[2021-03-16T21:05:11,203][INFO ][o.e.x.s.a.AuthenticationServi

After pulling the ES source, the error turns out to be raised in the class org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService. The relevant code is:

public void checkParentLimit(long newBytesReserved, String label) throws CircuitBreakingException {
    final MemoryUsage memoryUsed = memoryUsed(newBytesReserved);
    long parentLimit = this.parentSettings.getLimit();
    if (memoryUsed.totalUsage > parentLimit) {
        this.parentTripCount.incrementAndGet();
        final StringBuilder message = new StringBuilder("[parent] Data too large, data for [" + label + "]" +
            " would be [" + memoryUsed.totalUsage + "/" + new ByteSizeValue(memoryUsed.totalUsage) + "]" +
            ", which is larger than the limit of [" +
            parentLimit + "/" + new ByteSizeValue(parentLimit) + "]");
        if (this.trackRealMemoryUsage) {
            final long realUsage = memoryUsed.baseUsage;
            message.append(", real usage: [");
            message.append(realUsage);
            message.append("/");
            message.append(new ByteSizeValue(realUsage));
            message.append("], new bytes reserved: [");
            message.append(newBytesReserved);
            message.append("/");
            message.append(new ByteSizeValue(newBytesReserved));
            message.append("]");
        } else {
            message.append(", usages [");
            message.append(String.join(", ",
                this.breakers.entrySet().stream().map(e -> {
                    final CircuitBreaker breaker = e.getValue();
                    final long breakerUsed = (long)(breaker.getUsed() * breaker.getOverhead());
                    return e.getKey() + "=" + breakerUsed + "/" + new ByteSizeValue(breakerUsed);
                }).collect(Collectors.toList())));
            message.append("]");
        }
        // derive durability of a tripped parent breaker depending on whether the majority of memory tracked by
        // child circuit breakers is categorized as transient or permanent.
        CircuitBreaker.Durability durability = memoryUsed.transientChildUsage >= memoryUsed.permanentChildUsage ?
            CircuitBreaker.Durability.TRANSIENT : CircuitBreaker.Durability.PERMANENT;
        throw new CircuitBreakingException(message.toString(), memoryUsed.totalUsage, parentLimit, durability);
    }
}

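As the code shows, the breaker trips only when memoryUsed.totalUsage > parentLimit. The value of parentLimit comes from the setting indices.breaker.total.limit, whose default is either 95% or 70% of the JVM heap. Which default applies depends on the setting indices.breaker.total.use_real_memory (default true), as the following code shows: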

public static final Setting<Boolean> USE_REAL_MEMORY_USAGE_SETTING =
    Setting.boolSetting("indices.breaker.total.use_real_memory", true, Property.NodeScope);

public static final Setting<ByteSizeValue> TOTAL_CIRCUIT_BREAKER_LIMIT_SETTING =
    Setting.memorySizeSetting("indices.breaker.total.limit", settings -> {
        if (USE_REAL_MEMORY_USAGE_SETTING.get(settings)) {
            return "95%";
        } else {
            return "70%";
        }
    }, Property.Dynamic, Property.NodeScope);
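The limit of 31621696716 bytes reported in the log is consistent with the 95% default applied to a 31 GiB heap. The following is a minimal sketch of that arithmetic, not Elasticsearch code: the class and method names are hypothetical, and the 31 GiB heap size is an assumption inferred from the log rather than something stated in the cluster configuration.

```java
public class ParentBreakerLimit {

    /**
     * Default parent limit in bytes: 95% of the heap when real memory
     * tracking is enabled, 70% otherwise. Integer arithmetic truncates
     * the fractional byte, like a cast to long would.
     */
    static long defaultLimitBytes(long heapBytes, boolean useRealMemory) {
        int percent = useRealMemory ? 95 : 70;
        return heapBytes * percent / 100;
    }

    public static void main(String[] args) {
        // ASSUMPTION: the data node runs with a 31 GiB heap (inferred from the log).
        long heapBytes = 31L * 1024 * 1024 * 1024;

        // 31621696716, the limit printed in the exception message
        System.out.println(defaultLimitBytes(heapBytes, true));

        // 23300197580, the default limit once use_real_memory is false
        System.out.println(defaultLimitBytes(heapBytes, false));
    }
}
```

Because indices.breaker.total.limit is declared with Property.Dynamic while indices.breaker.total.use_real_memory only has Property.NodeScope, the limit can be changed at runtime but the tracking mode cannot.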

Next, look at memoryUsed.totalUsage. It is computed by a method of the same class:

private MemoryUsage memoryUsed(long newBytesReserved) {
    long transientUsage = 0;
    long permanentUsage = 0;
    for (CircuitBreaker breaker : this.breakers.values()) {
        long breakerUsed = (long)(breaker.getUsed() * breaker.getOverhead());
        if (breaker.getDurability() == CircuitBreaker.Durability.TRANSIENT) {
            transientUsage += breakerUsed;
        } else if (breaker.getDurability() == CircuitBreaker.Durability.PERMANENT) {
            permanentUsage += breakerUsed;
        }
    }
    if (this.trackRealMemoryUsage) {
        final long current = currentMemoryUsage();
        return new MemoryUsage(current, current + newBytesReserved, transientUsage, permanentUsage);
    } else {
        long parentEstimated = transientUsage + permanentUsage;
        return new MemoryUsage(parentEstimated, parentEstimated, transientUsage, permanentUsage);
    }
}
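To make the real-memory branch concrete, the sketch below plugs the numbers from the exception message into the same arithmetic: total usage is the current usage plus the newly reserved bytes, and the breaker trips when that total exceeds the limit. The class and method names are hypothetical; only the three constants come from the log.

```java
public class ParentBreakerCheck {

    /** Real-memory branch: total = current usage + newly reserved bytes. */
    static long totalUsage(long realUsage, long newBytesReserved) {
        return realUsage + newBytesReserved;
    }

    /** Same comparison as in checkParentLimit: strictly greater than the limit. */
    static boolean wouldTrip(long totalUsage, long parentLimit) {
        return totalUsage > parentLimit;
    }

    public static void main(String[] args) {
        long realUsage = 33093114144L;        // "real usage" in the log
        long newBytesReserved = 3494L;        // "new bytes reserved" in the log
        long parentLimit = 31621696716L;      // "limit of" in the log

        long total = totalUsage(realUsage, newBytesReserved);
        System.out.println(total);                         // 33093117638, the "would be" value
        System.out.println(wouldTrip(total, parentLimit)); // true
    }
}
```

Note that the 3.4 KB request is not what pushed the node over the limit: the real usage alone was already above it, so any incoming transport request would have been rejected at that moment.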

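The value of trackRealMemoryUsage (taken from the setting indices.breaker.total.use_real_memory) determines whether the trip decision is based on the actual memory usage or on the memory reserved by the child circuit breakers. The official documentation explains it as follows: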

Static setting determining whether the parent breaker should take real memory usage into account (true) or only consider the amount that is reserved by child circuit breakers (false). Defaults to true
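The difference between the two modes can be illustrated with the child breaker usages printed in the same log line. The sketch below sums them the way the non-real-memory branch does and compares the result against the 70% default. It is an illustration only: the class and method names are hypothetical, the 70% limit assumes a 31 GiB heap (inferred from the logged 95% limit), and the usages are a snapshot from the log, not from the later load test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BreakerModeComparison {

    /** Child breaker usages in bytes, copied from the "usages [...]" part of the log. */
    static Map<String, Long> logUsages() {
        Map<String, Long> usages = new LinkedHashMap<>();
        usages.put("request", 0L);
        usages.put("fielddata", 0L);
        usages.put("in_flight_requests", 3494L);
        usages.put("accounting", 104564949L);
        return usages;
    }

    /** Non-real-memory branch: the parent estimate is the sum of all child breakers. */
    static long childUsage(Map<String, Long> usages) {
        long sum = 0;
        for (long used : usages.values()) {
            sum += used;
        }
        return sum;
    }

    public static void main(String[] args) {
        long realTotal = 33093117638L;      // "would be" value in the log
        long limitReal = 31621696716L;      // 95% limit in the log
        // ASSUMPTION: 70% of a 31 GiB heap, the default when use_real_memory is false.
        long limitChild = 23300197580L;

        long childTotal = childUsage(logUsages());
        System.out.println("real mode trips:  " + (realTotal > limitReal));   // true
        System.out.println("child mode trips: " + (childTotal > limitChild)); // false
    }
}
```

The child breakers account for roughly 100 MB while the heap itself holds about 30.8 GB, which is why the same request passes in one mode and is rejected in the other. The trade-off is that with use_real_memory set to false, heap that no child breaker tracks is no longer guarded by the parent breaker.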

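Summary: starting at 11:50 on March 17, 2021, we changed the configuration of the production data nodes to indices.breaker.total.use_real_memory: false and performed a rolling restart of the production cluster.

Today is March 18, 2021. The configuration change was completed around noon yesterday, and a business load test was run against the cluster at 18:30 yesterday evening; the exception did not reappear. (Before the change, load tests caused nodes to drop out of the cluster, and the resulting shard relocation turned the cluster yellow.)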
