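Background

The off-heap (direct) memory of the market data service grows during every trading session, between market open and close, and the service has to be restarted manually at regular intervals, which hurts its stability.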

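Investigation

1. How the off-heap memory metric is computed

The metric (used_direct_memory) is read from the counter that netty maintains for all of its own direct-memory allocations. The growth can therefore tentatively be attributed to direct memory allocated by netty, not to business code calling Unsafe directly.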

import io.netty.util.internal.PlatformDependent;

public DirectMemoryMonitor() {
    // Used direct memory
    Field usedMemory = ReflectionUtils.findField(PlatformDependent.class, "DIRECT_MEMORY_COUNTER");
    usedMemory.setAccessible(true);
    // Direct memory limit
    Field limitMemory = ReflectionUtils.findField(PlatformDependent.class, "DIRECT_MEMORY_LIMIT");
    limitMemory.setAccessible(true);
    try {
        // Both fields are static, so the receiver argument is ignored
        DIRECT_MEMORY_COUNTER = (AtomicLong) usedMemory.get(PlatformDependent.class);
        DIRECT_MEMORY_LIMIT = (Long) limitMemory.get(PlatformDependent.class);
    } catch (IllegalAccessException e) {
        // The original swallowed this exception; fail loudly instead of registering a broken gauge
        throw new IllegalStateException("cannot read netty direct memory counters", e);
    }
    XueqiuMetrics.getInstance().register("used_direct_memory", (Gauge<Long>) () -> DIRECT_MEMORY_COUNTER.get());
}
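The monitor above reads a private static field by reflection. A minimal sketch of that technique using only the JDK follows. Holder and readCounter are hypothetical names introduced for illustration, with Holder standing in for PlatformDependent; this is not netty code.

```java
import java.lang.reflect.Field;
import java.util.concurrent.atomic.AtomicLong;

final class ReflectiveCounterSketch {

    /** Hypothetical stand-in for a library class that keeps a private static counter. */
    static final class Holder {
        private static final AtomicLong COUNTER = new AtomicLong();

        static void allocate(long bytes) {
            COUNTER.addAndGet(bytes);
        }
    }

    /** Reads a private static AtomicLong field, failing loudly if it cannot be accessed. */
    static AtomicLong readCounter(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);
            // A static field needs no receiver object, so null is passed.
            return (AtomicLong) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        AtomicLong counter = readCounter(Holder.class, "COUNTER");
        Holder.allocate(1024);
        // The gauge sees every later change because it holds the same AtomicLong instance.
        System.out.println("used bytes: " + counter.get());
    }
}
```

Because the gauge holds the live AtomicLong rather than a copy of its value, each metric scrape reports the current total without another reflective lookup.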

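2. Allocation logic of this off-heap memory and who references it

Whenever netty allocates direct memory it has to increase this counter by the allocated size, so we look at the method that does this and at its callers.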

public final class PlatformDependent {
    private static final AtomicLong DIRECT_MEMORY_COUNTER;

    // Add the capacity being allocated to the direct-memory counter
    private static void incrementMemoryCounter(int capacity) {
        if (DIRECT_MEMORY_COUNTER != null) {
            for (;;) {
                long usedMemory = DIRECT_MEMORY_COUNTER.get();
                long newUsedMemory = usedMemory + capacity;
                if (newUsedMemory > DIRECT_MEMORY_LIMIT) {
                    throw new OutOfDirectMemoryError("failed to allocate " + capacity
                            + " byte(s) of direct memory (used: " + usedMemory + ", max: " + DIRECT_MEMORY_LIMIT + ')');
                }
                if (DIRECT_MEMORY_COUNTER.compareAndSet(usedMemory, newUsedMemory)) {
                    break;
                }
            }
        }
    }
}
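The accounting pattern above is a compare-and-set retry loop with a hard limit. The sketch below models it with the JDK only. DirectMemoryCounterSketch and its method names are hypothetical, and IllegalStateException stands in for netty's OutOfDirectMemoryError.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of limit-checked direct-memory accounting; not netty code. */
final class DirectMemoryCounterSketch {
    private final AtomicLong used = new AtomicLong();
    private final long limit;

    DirectMemoryCounterSketch(long limit) {
        this.limit = limit;
    }

    /** Reserves capacity bytes, or throws if the limit would be exceeded. */
    void increment(int capacity) {
        for (;;) {
            long current = used.get();
            long next = current + capacity;
            if (next > limit) {
                throw new IllegalStateException("failed to reserve " + capacity
                        + " byte(s) (used: " + current + ", max: " + limit + ')');
            }
            // Retry if another thread changed the counter between get() and here.
            if (used.compareAndSet(current, next)) {
                return;
            }
        }
    }

    /** Gives bytes back; if this is never called, the counter only ever grows. */
    void decrement(int capacity) {
        used.addAndGet(-capacity);
    }

    long used() {
        return used.get();
    }

    public static void main(String[] args) {
        int chunk = 16 * 1024 * 1024;
        DirectMemoryCounterSketch counter = new DirectMemoryCounterSketch(64L * 1024 * 1024);
        counter.increment(chunk);
        counter.increment(chunk);
        counter.decrement(chunk);
        System.out.println("used bytes: " + counter.used());
    }
}
```

The point relevant to this investigation is the asymmetry: the metric rises on every allocation and falls only when the matching decrement runs, so memory that is never freed shows up as a counter that keeps climbing.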

Next, the method that actually requests the ByteBuffer:

public static ByteBuffer allocateDirectNoCleaner(int capacity) {
    assert USE_DIRECT_BUFFER_NO_CLEANER;

    incrementMemoryCounter(capacity);
    try {
        return PlatformDependent0.allocateDirectNoCleaner(capacity);
    } catch (Throwable e) {
        decrementMemoryCounter(capacity);
        throwException(e);
        return null;
    }
}

It invokes, through reflection, the DirectByteBuffer(long addr, int cap) constructor. A buffer created this way has no cleaner, so its memory must be released manually.

// io.netty.util.internal.PlatformDependent0
static ByteBuffer allocateDirectNoCleaner(int capacity) {
    return newDirectBuffer(UNSAFE.allocateMemory(capacity), capacity);
}

// java.nio.DirectByteBuffer
// Invoked only by JNI: NewDirectByteBuffer(void*, long)
//
private DirectByteBuffer(long addr, int cap) {
    super(-1, 0, cap, cap);
    address = addr;
    cleaner = null;
    att = null;
}

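Looking at who references the DirectByteBuffer, we find PoolChunk, which is created by DirectArena. Readers familiar with netty's memory model will know that netty's underlying byte storage is managed in chunks, which are ultimately added to a DirectArena.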

static final class DirectArena extends PoolArena<ByteBuffer> {

    private static ByteBuffer allocateDirect(int capacity) {
        return PlatformDependent.useDirectBufferNoCleaner() ?
                PlatformDependent.allocateDirectNoCleaner(capacity) : ByteBuffer.allocateDirect(capacity);
    }

    protected PoolChunk<ByteBuffer> newChunk(int pageSize, int maxOrder,
            int pageShifts, int chunkSize) {
        if (directMemoryCacheAlignment == 0) {
            return new PoolChunk<ByteBuffer>(this,
                    allocateDirect(chunkSize), pageSize, maxOrder,
                    pageShifts, chunkSize, 0);
        }
        final ByteBuffer memory = allocateDirect(chunkSize
                + directMemoryCacheAlignment);
        return new PoolChunk<ByteBuffer>(this, memory, pageSize,
                maxOrder, pageShifts, chunkSize,
                offsetCacheLine(memory));
    }
}

// Declared in the parent class PoolArena<T>: try the existing chunk lists first,
// and create a new chunk only when none of them can serve the request.
private void allocateNormal(PooledByteBuf<T> buf, int reqCapacity, int normCapacity) {
    if (q050.allocate(buf, reqCapacity, normCapacity) || q025.allocate(buf, reqCapacity, normCapacity) ||
        q000.allocate(buf, reqCapacity, normCapacity) || qInit.allocate(buf, reqCapacity, normCapacity) ||
        q075.allocate(buf, reqCapacity, normCapacity)) {
        return;
    }

    // Add a new chunk.
    PoolChunk<T> c = newChunk(pageSize, maxOrder, pageShifts, chunkSize);
    long handle = c.allocate(normCapacity);
    assert handle > 0;
    c.initBuf(buf, handle, reqCapacity);
    qInit.add(c);
}
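The chunkSize passed to newChunk follows from pageSize and maxOrder. As a small worked example, and assuming the allocator defaults of netty 4.1 releases from that period (pageSize 8192 and maxOrder 11, an assumption not stated in this article), each new chunk reserves 16 MiB of direct memory in one step. ChunkSizeSketch is a hypothetical helper, not netty code.

```java
final class ChunkSizeSketch {

    /** A chunk holds 2^maxOrder pages, so its size is pageSize shifted left by maxOrder. */
    static int chunkSize(int pageSize, int maxOrder) {
        return pageSize << maxOrder;
    }

    public static void main(String[] args) {
        // Assumed defaults: pageSize = 8192, maxOrder = 11.
        int size = chunkSize(8192, 11);
        System.out.println(size + " bytes = " + (size >> 20) + " MiB");
    }
}
```

This is why the direct-memory counter moves in 16 MiB steps: allocateNormal only calls newChunk when no existing chunk can serve the request, and each call adds one whole chunk.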

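3. Analyzing DirectByteBuffer in the service's heap dump

Because the problem is complex, we use OQL to analyze the heap dump. For an introduction to the syntax, see the article "JVM 对象查询语言(OQL)" (JVM Object Query Language) on 潘建南's CSDN blog.

3.1. Verify the off-heap memory value shown in monitoring (because of the time that had elapsed, the values are somewhat distorted)

3.2. Check whether the netty chunks match the off-heap memory allocation

OQL description: list the details of the netty chunk objects that hold a reference to a java.nio.DirectByteBuffer whose cleaner is null.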

select map(filter(referrers(s), "/io.netty.buffer.PoolC/.test(classof(it).name)"),
"toHtml(it) + ' mem:' + toHtml(it.memory) + ' chunksize:' + toHtml(it.chunkSize) + ' unusable:' + toHtml(it.unusable)  + ' free:' + toHtml(it.freeBytes)")
from java.nio.DirectByteBuffer s
where s.cleaner == null & count(referrers(s)) > 0

2021-07-29: 12 chunks allocated, each 16 MB, for a total of 12 * 16 = 192 MB

io.netty.buffer.PoolChunk#2 mem:java.nio.DirectByteBuffer#10 chunksize:16777216 unusable:12 free:2424832
io.netty.buffer.PoolChunk#3 mem:java.nio.DirectByteBuffer#11 chunksize:16777216 unusable:12 free:16769024
io.netty.buffer.PoolChunk#1 mem:java.nio.DirectByteBuffer#12 chunksize:16777216 unusable:12 free:16769024
io.netty.buffer.PoolChunk#4 mem:java.nio.DirectByteBuffer#17 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#5 mem:java.nio.DirectByteBuffer#18 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#6 mem:java.nio.DirectByteBuffer#19 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#7 mem:java.nio.DirectByteBuffer#20 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#8 mem:java.nio.DirectByteBuffer#21 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#9 mem:java.nio.DirectByteBuffer#22 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#10 mem:java.nio.DirectByteBuffer#23 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#12 mem:java.nio.DirectByteBuffer#24 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#13 mem:java.nio.DirectByteBuffer#25 chunksize:16777216 unusable:12 free:11436032

2021-08-10: 45 chunks allocated, each 16 MB, for a total of 45 * 16 = 720 MB

io.netty.buffer.PoolChunk#1 mem:java.nio.DirectByteBuffer#11 chunksize:16777216 unusable:12 free:1900544
io.netty.buffer.PoolChunk#2 mem:java.nio.DirectByteBuffer#13 chunksize:16777216 unusable:12 free:466944
io.netty.buffer.PoolChunk#3 mem:java.nio.DirectByteBuffer#14 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#4 mem:java.nio.DirectByteBuffer#16 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#5 mem:java.nio.DirectByteBuffer#17 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#6 mem:java.nio.DirectByteBuffer#18 chunksize:16777216 unusable:12 free:16506880
io.netty.buffer.PoolChunk#7 mem:java.nio.DirectByteBuffer#19 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#8 mem:java.nio.DirectByteBuffer#20 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#10 mem:java.nio.DirectByteBuffer#21 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#11 mem:java.nio.DirectByteBuffer#22 chunksize:16777216 unusable:12 free:16515072
io.netty.buffer.PoolChunk#12 mem:java.nio.DirectByteBuffer#23 chunksize:16777216 unusable:12 free:16769024
io.netty.buffer.PoolChunk#14 mem:java.nio.DirectByteBuffer#25 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#15 mem:java.nio.DirectByteBuffer#26 chunksize:16777216 unusable:12 free:458752
io.netty.buffer.PoolChunk#16 mem:java.nio.DirectByteBuffer#27 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#17 mem:java.nio.DirectByteBuffer#28 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#18 mem:java.nio.DirectByteBuffer#29 chunksize:16777216 unusable:12 free:1032192
io.netty.buffer.PoolChunk#20 mem:java.nio.DirectByteBuffer#30 chunksize:16777216 unusable:12 free:262144
io.netty.buffer.PoolChunk#21 mem:java.nio.DirectByteBuffer#31 chunksize:16777216 unusable:12 free:122880
io.netty.buffer.PoolChunk#22 mem:java.nio.DirectByteBuffer#32 chunksize:16777216 unusable:12 free:1024000
io.netty.buffer.PoolChunk#23 mem:java.nio.DirectByteBuffer#33 chunksize:16777216 unusable:12 free:851968
io.netty.buffer.PoolChunk#25 mem:java.nio.DirectByteBuffer#34 chunksize:16777216 unusable:12 free:65536
io.netty.buffer.PoolChunk#26 mem:java.nio.DirectByteBuffer#35 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#27 mem:java.nio.DirectByteBuffer#36 chunksize:16777216 unusable:12 free:327680
io.netty.buffer.PoolChunk#28 mem:java.nio.DirectByteBuffer#37 chunksize:16777216 unusable:12 free:131072
io.netty.buffer.PoolChunk#30 mem:java.nio.DirectByteBuffer#38 chunksize:16777216 unusable:12 free:663552
io.netty.buffer.PoolChunk#31 mem:java.nio.DirectByteBuffer#39 chunksize:16777216 unusable:12 free:65536
io.netty.buffer.PoolChunk#32 mem:java.nio.DirectByteBuffer#40 chunksize:16777216 unusable:12 free:294912
io.netty.buffer.PoolChunk#33 mem:java.nio.DirectByteBuffer#41 chunksize:16777216 unusable:12 free:196608
io.netty.buffer.PoolChunk#34 mem:java.nio.DirectByteBuffer#42 chunksize:16777216 unusable:12 free:3588096
io.netty.buffer.PoolChunk#36 mem:java.nio.DirectByteBuffer#43 chunksize:16777216 unusable:12 free:196608
io.netty.buffer.PoolChunk#37 mem:java.nio.DirectByteBuffer#44 chunksize:16777216 unusable:12 free:65536
io.netty.buffer.PoolChunk#38 mem:java.nio.DirectByteBuffer#45 chunksize:16777216 unusable:12 free:327680
io.netty.buffer.PoolChunk#39 mem:java.nio.DirectByteBuffer#46 chunksize:16777216 unusable:12 free:450560
io.netty.buffer.PoolChunk#40 mem:java.nio.DirectByteBuffer#47 chunksize:16777216 unusable:12 free:1941504
io.netty.buffer.PoolChunk#43 mem:java.nio.DirectByteBuffer#49 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#44 mem:java.nio.DirectByteBuffer#50 chunksize:16777216 unusable:12 free:1376256
io.netty.buffer.PoolChunk#45 mem:java.nio.DirectByteBuffer#51 chunksize:16777216 unusable:12 free:917504
io.netty.buffer.PoolChunk#46 mem:java.nio.DirectByteBuffer#52 chunksize:16777216 unusable:12 free:8650752
io.netty.buffer.PoolChunk#48 mem:java.nio.DirectByteBuffer#53 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#49 mem:java.nio.DirectByteBuffer#54 chunksize:16777216 unusable:12 free:0
io.netty.buffer.PoolChunk#50 mem:java.nio.DirectByteBuffer#55 chunksize:16777216 unusable:12 free:589824
io.netty.buffer.PoolChunk#51 mem:java.nio.DirectByteBuffer#56 chunksize:16777216 unusable:12 free:1736704
io.netty.buffer.PoolChunk#52 mem:java.nio.DirectByteBuffer#57 chunksize:16777216 unusable:12 free:892928
io.netty.buffer.PoolChunk#53 mem:java.nio.DirectByteBuffer#58 chunksize:16777216 unusable:12 free:2490368
io.netty.buffer.PoolChunk#54 mem:java.nio.DirectByteBuffer#59 chunksize:16777216 unusable:12 free:6160384

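Conclusion: the memory allocated as netty chunks roughly matches the off-heap memory shown in monitoring. The next question to answer is what the memory allocated inside those chunks actually holds.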
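The totals above can be recomputed from the query output instead of being counted by hand. The sketch below parses lines in the format printed by the OQL query and sums the chunk sizes, also estimating used bytes as chunksize minus free. ChunkUsageSketch is a hypothetical helper, and reading "free" as the chunk's remaining free bytes is an assumption based on the field name freeBytes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChunkUsageSketch {
    // Matches the "chunksize:<n> ... free:<n>" part of one output line.
    private static final Pattern LINE = Pattern.compile("chunksize:(\\d+).*?free:(\\d+)");

    /** Returns {chunkCount, totalBytes, usedBytes}; lines that do not match are skipped. */
    static long[] summarize(List<String> lines) {
        long count = 0;
        long total = 0;
        long used = 0;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue;
            }
            long chunkSize = Long.parseLong(m.group(1));
            long free = Long.parseLong(m.group(2));
            count++;
            total += chunkSize;
            used += chunkSize - free;
        }
        return new long[] {count, total, used};
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "io.netty.buffer.PoolChunk#2 mem:java.nio.DirectByteBuffer#10 chunksize:16777216 unusable:12 free:2424832",
            "io.netty.buffer.PoolChunk#3 mem:java.nio.DirectByteBuffer#11 chunksize:16777216 unusable:12 free:16769024");
        long[] s = summarize(sample);
        System.out.printf("chunks=%d total=%d MiB used=%d bytes%n", s[0], s[1] >> 20, s[2]);
    }
}
```

Comparing the used figure between two dumps also shows whether the extra chunks are nearly full or mostly empty, which the chunk count alone does not reveal.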

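3.3. Examine what references the DirectByteBuffer and the data sliced out of the chunks

OQL description: list the netty objects that hold a reference to a java.nio.DirectByteBuffer whose cleaner is null, showing their memory, arena and chunk fields.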

select map(filter(referrers(s), "/io.netty.buffer.Pool/.test(classof(it).name)"),
"toHtml(it) + ' mem:' + toHtml(it.memory) + ' arena:' + toHtml(it.arena) + ' chunk:' + toHtml(it.chunk)")
from java.nio.DirectByteBuffer s
where s.cleaner == null & count(referrers(s)) > 0

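Summary: the number of unreferenced chunks keeps growing, and these chunks are very likely where the memory leak originates.

First we look at the referrers of PooledUnsafeDirectByteBuf. These buffers are mainly allocated for static objects referenced by business code and for protocol delimiters. Both use little memory, occupy only 2 to 3 chunks, and show no growth.

Next we need to analyze the unreferenced chunks. Because a chunk is the lowest-level allocation unit, we have to look one level up, at the PoolArena that references it.

The referrers of DirectArena fall into three groups: PoolChunk, PoolChunkList and PoolThreadCache. PoolChunk and PoolChunkList stand in a parent-child relationship with PoolArena, so they can be set aside for now; only the PooledUnsafeDirectByteBuf objects associated with PoolThreadCache need attention.

Following the referrers of these PooledUnsafeDirectByteBuf objects leads ultimately to stock business objects.

SzL2FrameMap is clearly growing continuously, in proportion to both the JVM old generation and the off-heap memory.

Summary: the unreferenced chunks are tied directly to the business object SzL2FrameMap, whose growth correlates positively with both heap and off-heap memory, and SzL2FrameMap also indirectly references DirectByteBuf objects. The preliminary conclusion is that a consumption problem in the ring buffer prevents the objects it holds (SzL2FrameMap) from being released, so the off-heap memory cannot be released either and keeps growing.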
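The suspected mechanism can be illustrated with a toy model: buffers give their memory back only when the consumer releases them, so any frame that stays queued keeps its bytes accounted for. This is a sketch of the general pattern, not the service's code. Buf, USED and run are hypothetical names, and an ArrayDeque stands in for the ring buffer.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

final class UnreleasedBufferSketch {
    /** Stand-in for the direct-memory counter. */
    static final AtomicLong USED = new AtomicLong();

    /** Hypothetical stand-in for a pooled buffer that must be released explicitly. */
    static final class Buf {
        private final int size;
        private boolean released;

        Buf(int size) {
            this.size = size;
            USED.addAndGet(size);
        }

        void release() {
            if (!released) {
                released = true;
                USED.addAndGet(-size);
            }
        }
    }

    /** Produces frames, consumes only some of them, and returns the bytes still accounted for. */
    static long run(int produced, int consumed, int frameSize) {
        long before = USED.get();
        Queue<Buf> queue = new ArrayDeque<>();
        for (int i = 0; i < produced; i++) {
            queue.add(new Buf(frameSize));
        }
        for (int i = 0; i < consumed; i++) {
            Buf buf = queue.poll();
            if (buf != null) {
                buf.release();
            }
        }
        // Whatever is still queued was never released and remains counted.
        return USED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("all consumed:  " + run(100, 100, 1024) + " bytes retained");
        System.out.println("60% consumed:  " + run(100, 60, 1024) + " bytes retained");
    }
}
```

In this model the retained bytes grow with the backlog, which is consistent with the observed correlation between SzL2FrameMap growth and off-heap memory; whether the real service behaves this way still has to be confirmed against its consumer code.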
