Dubbo thread pool is exhausted

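This is a real production incident. During peak business hours, one service interface repeatedly reported that the Dubbo thread pool was exhausted. After repeated investigation, the root cause turned out to be a database table without an index: the resulting slow SQL dragged down query speed, and with peak-hour concurrency on top of that, the thread pool was quickly exhausted, the system crashed, and it could no longer serve requests.

This post reproduces that situation as a case study. Along the way it covers the following:

Dubbo thread pool strategy (official docs: https://dubbo.apache.org/zh/docsv2.7/dev/impls/threadpool/ )

Dubbo message dispatch mechanism (official docs: https://dubbo.apache.org/zh/docsv2.7/dev/impls/dispatcher/ )

Thread pool strategy

Core interface definition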

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.dubbo.common.threadpool;

import org.apache.dubbo.common.URL;
import org.apache.dubbo.common.extension.Adaptive;
import org.apache.dubbo.common.extension.SPI;
import org.apache.dubbo.common.threadpool.support.fixed.FixedThreadPool;

import java.util.concurrent.Executor;

import static org.apache.dubbo.common.constants.CommonConstants.THREADPOOL_KEY;

/**
 * ThreadPool
 * FixedThreadPool.NAME = "fixed" (so the SPI mechanism uses the fixed thread pool by default)
 */
@SPI(FixedThreadPool.NAME)
public interface ThreadPool {

    /**
     * Thread pool
     *
     * @param url URL contains thread parameter
     * @return thread pool
     */
    @Adaptive({THREADPOOL_KEY})
    Executor getExecutor(URL url);

}

Default framework implementation

package org.apache.dubbo.common.threadpool.support.fixed;

import org.apache.dubbo.common.URL;
import org.apache.dubbo.common.threadlocal.NamedInternalThreadFactory;
import org.apache.dubbo.common.threadpool.ThreadPool;
import org.apache.dubbo.common.threadpool.support.AbortPolicyWithReport;

import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import static org.apache.dubbo.common.constants.CommonConstants.DEFAULT_QUEUES;
import static org.apache.dubbo.common.constants.CommonConstants.DEFAULT_THREADS;
import static org.apache.dubbo.common.constants.CommonConstants.DEFAULT_THREAD_NAME;
import static org.apache.dubbo.common.constants.CommonConstants.QUEUES_KEY;
import static org.apache.dubbo.common.constants.CommonConstants.THREADS_KEY;
import static org.apache.dubbo.common.constants.CommonConstants.THREAD_NAME_KEY;

/**
 * Creates a thread pool that reuses a fixed number of threads
 *
 * @see java.util.concurrent.Executors#newFixedThreadPool(int)
 */
public class FixedThreadPool implements ThreadPool {

    public static final String NAME = "fixed";

    @Override
    public Executor getExecutor(URL url) {
        String name = url.getParameter(THREAD_NAME_KEY, DEFAULT_THREAD_NAME);
        // The pool size defaults to 200 (DEFAULT_THREADS = 200)
        int threads = url.getParameter(THREADS_KEY, DEFAULT_THREADS);
        int queues = url.getParameter(QUEUES_KEY, DEFAULT_QUEUES);
        return new ThreadPoolExecutor(threads, threads, 0, TimeUnit.MILLISECONDS,
                queues == 0 ? new SynchronousQueue<Runnable>() :
                        (queues < 0 ? new LinkedBlockingQueue<Runnable>()
                                : new LinkedBlockingQueue<Runnable>(queues)),
                new NamedInternalThreadFactory(name, true), new AbortPolicyWithReport(name, url));
    }

}

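Default framework extensions

Dubbo ships with four built-in thread pool implementations:

  1. fixed: a fixed-size thread pool, 200 threads by default.
  2. cached: a cacheable thread pool with 0 core threads, a maximum of Integer.MAX_VALUE threads, and a blocking queue size of 0 by default (an interesting design, to be studied later).
  3. limited: a grow-only thread pool with 0 core threads, a maximum of 200 threads by default, and a blocking queue size of 0 by default. Threads are never reclaimed once created.
  4. eager: when all core threads are busy, a new task is not put in the queue; a new thread is started to handle it instead, until the maximum pool size is reached.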
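To make the differences between the first three pools concrete, here is a rough JDK-only sketch of what each one builds with its default values. The defaults are taken from the Dubbo 2.7 sources as I read them, so treat the exact numbers as an assumption to check against your version. The real classes also install a named thread factory and the AbortPolicyWithReport rejection handler, which are omitted here. The class and method names (PoolShapes, fixed, cached, limited, describe) are made up for illustration. The eager pool is left out because it needs a custom queue.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolShapes {

    // Approximates "fixed": core == max == 200, no queue, threads are kept.
    static ThreadPoolExecutor fixed() {
        return new ThreadPoolExecutor(200, 200, 0, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Approximates "cached": grows without bound, idle threads die after one minute.
    static ThreadPoolExecutor cached() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60_000, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Approximates "limited": grows up to 200 threads and effectively never shrinks.
    static ThreadPoolExecutor limited() {
        return new ThreadPoolExecutor(0, 200, Long.MAX_VALUE, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    static String describe(String name, ThreadPoolExecutor pool) {
        return name + ": core=" + pool.getCorePoolSize() + ", max=" + pool.getMaximumPoolSize();
    }

    public static void main(String[] args) {
        System.out.println(describe("fixed", fixed()));
        System.out.println(describe("cached", cached()));
        System.out.println(describe("limited", limited()));
    }
}
```

With a queue size of 0, all three hand tasks directly to a thread, so what differs is only how many threads may exist and whether idle ones are reclaimed.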

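Message dispatch mechanism

The message dispatch mechanism lives on the Dubbo server side. After Netty receives a client request, the request message is handed off to a custom thread pool, and the rules for that hand-off are the dispatch mechanism. The Dubbo server builds on Netty's own Reactor model (the master-worker Reactor server model), in which accepting connections and processing requests are handled by boss threads and worker threads (also called IO threads) respectively. Dubbo's message dispatch sits downstream of the worker threads; I call it the business thread handling mechanism.

Based on the type of request, the Dubbo framework roughly divides incoming traffic into heartbeat requests, connection requests, message requests, and so on.

Core interface definition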

package org.apache.dubbo.remoting;

import org.apache.dubbo.common.URL;
import org.apache.dubbo.common.extension.Adaptive;
import org.apache.dubbo.common.extension.SPI;
import org.apache.dubbo.remoting.transport.dispatcher.all.AllDispatcher;

/**
 * ChannelHandlerWrapper (SPI, Singleton, ThreadSafe): a thread-safe singleton.
 * The ALL dispatch mechanism is the default, i.e. every request is handed to the business threads.
 */
@SPI(AllDispatcher.NAME)
public interface Dispatcher {

    /**
     * dispatch the message to threadpool.
     *
     * @param handler
     * @param url
     * @return channel handler
     */
    @Adaptive({Constants.DISPATCHER_KEY, "dispather", "channel.handler"})
    // The last two parameters are reserved for compatibility with the old configuration
    ChannelHandler dispatch(ChannelHandler handler, URL url);

}

Default framework implementation

package org.apache.dubbo.remoting.transport.dispatcher.all;

import org.apache.dubbo.common.URL;
import org.apache.dubbo.remoting.ChannelHandler;
import org.apache.dubbo.remoting.Dispatcher;

/**
 * default thread pool configure
 */
public class AllDispatcher implements Dispatcher {

    public static final String NAME = "all";

    @Override
    public ChannelHandler dispatch(ChannelHandler handler, URL url) {
        // Messages are handled by AllChannelHandler
        return new AllChannelHandler(handler, url);
    }

}

Default framework extensions

all=org.apache.dubbo.remoting.transport.dispatcher.all.AllDispatcher
all2=org.apache.dubbo.remoting.transport.dispatcher.all2.AllDispatcher2
direct=org.apache.dubbo.remoting.transport.dispatcher.direct.DirectDispatcher
message=org.apache.dubbo.remoting.transport.dispatcher.message.MessageOnlyDispatcher
execution=org.apache.dubbo.remoting.transport.dispatcher.execution.ExecutionDispatcher
connection=org.apache.dubbo.remoting.transport.dispatcher.connection.ConnectionOrderedDispatcher

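I won't go through them one by one. The recommended choice is the message dispatch model: heartbeat and connection IO events are handled on the IO threads, while business requests are handed to the business threads.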
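As a quick reference, the sketch below encodes which event types each dispatcher hands to the business thread pool, following the descriptions on the official dispatcher page as I understand them. It is an illustration only, not Dubbo code: the Event enum, the class name DispatchSketch and the method pooledEvents are made up, heartbeats are not modeled, and all2 is left out because I am not sure of its behavior.

```java
import java.util.EnumSet;
import java.util.Set;

public class DispatchSketch {

    // Simplified event types; heartbeats are deliberately not modeled.
    enum Event { CONNECTED, DISCONNECTED, REQUEST, RESPONSE }

    // Returns the events that the named dispatcher hands to the business thread pool.
    // Everything else stays on the IO thread.
    static Set<Event> pooledEvents(String dispatcher) {
        switch (dispatcher) {
            case "all":
                return EnumSet.allOf(Event.class);
            case "direct":
                return EnumSet.noneOf(Event.class);
            case "message":
                return EnumSet.of(Event.REQUEST, Event.RESPONSE);
            case "execution":
                return EnumSet.of(Event.REQUEST);
            case "connection":
                // Connect/disconnect events are queued and run one by one in order,
                // not on the business pool.
                return EnumSet.of(Event.REQUEST, Event.RESPONSE);
            default:
                throw new IllegalArgumentException("unknown dispatcher: " + dispatcher);
        }
    }

    public static void main(String[] args) {
        for (String d : new String[] {"all", "direct", "message", "execution", "connection"}) {
            System.out.println(d + " -> business pool handles " + pooledEvents(d));
        }
    }
}
```

Read this way, the appeal of message is that connection churn cannot consume business threads, which matters most when the pool is already under pressure.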

Demo

Interface definition

package org.example.api.day02;

import org.example.api.RpcResult;

import javax.validation.Valid;
import javax.validation.constraints.NotNull;

public interface IQuery {

    RpcResult<String> query(@Valid @NotNull(message = "args must not be null") String args);
}

Interface implementation

package org.example.provider.day02;

import org.apache.dubbo.config.annotation.DubboService;
import org.example.api.RpcResult;
import org.example.api.day02.IQuery;

import java.util.concurrent.TimeUnit;

@DubboService(version = "1.0.0", validation = "true", timeout = 2000)
public class QueryProvider implements IQuery {

    @Override
    public RpcResult<String> query(String args) {
        // Simulate exhausted database resources blocking the request. This blocking logic
        // is not rigorous; it only illustrates the original problem.
        // Under heavy load the "thread pool is exhausted" error also appears without it
        // (the problem can still be reproduced with the sleep commented out).
        try {
            TimeUnit.MILLISECONDS.sleep(1500);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure
            Thread.currentThread().interrupt();
            return RpcResult.error(e.getMessage());
        }
        System.out.println("query arguments : " + args);
        return RpcResult.success("return DB data " + args);
    }
}

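Application configuration

A local environment cannot realistically generate production-scale concurrency, so instead the default thread pool size is reduced and a load-testing tool is used. The application.yml file looks like this: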

server:
  port: 8085
dubbo:
  application:
    name: dubbo-provider
    qos-enable: false
  registry:
    address: zookeeper://127.0.0.1:2181
  protocol:
    name: dubbo
    port: 20889
    # Thread pool size of 5
    threads: 5
    # Override the default message dispatch mechanism (all)
    dispatcher: message
  consumer:
    check: false
  provider:
    filter: -validation

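Load testing the interface

Load test results

After starting the provider, use JMeter to run a performance test against the Dubbo interface. For installing the tool, see my other blog post, "Dubbo 测试" ("Dubbo Testing").

The test uses 100 threads, each looping twice, to simulate 100 concurrent callers. The provider is configured with only 5 worker threads, so errors are guaranteed.

Click run. The results:

95% of the requests failed.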
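The 95% figure is consistent with what the configuration predicts: 5 threads and no queue means that only 5 of 100 simultaneous slow calls can be accepted. The JDK-only sketch below reproduces that arithmetic. It is a simplification of the JMeter run, assuming a single burst of 100 requests and a latch standing in for the slow query. The names ExhaustionDemo and countRejected are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExhaustionDemo {

    // Submits `requests` blocking tasks to a pool of `threads` workers with no queue
    // and returns how many submissions were rejected.
    static int countRejected(int threads, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(threads, threads, 0, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1); // stands in for the slow query
        Runnable slowCall = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(slowCall);
            } catch (RejectedExecutionException e) {
                rejected++; // every worker is busy and there is no queue to wait in
            }
        }
        release.countDown(); // let the accepted calls finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        int rejected = countRejected(5, 100);
        System.out.println("rejected " + rejected + " of 100");
    }
}
```

The JDK default rejection policy is used here for brevity; Dubbo installs its own policy, which produces the longer message in the log.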

 "code": 1,"detailMessage": "Failfast invoke providers dubbo://192.168.0.6:20889/org.example.api.day02.IQuery?anyhost=true&application=DubboSample&async=false&check=false&cluster=failfast&connections=100&deprecated=false&dispatcher=message&dubbo=2.0.2&dynamic=true&generic=true&interface=org.example.api.day02.IQuery&loadbalance=random&metadata-type=remote&methods=query&pid=6373&protocol=dubbo&register.ip=192.168.0.6&release=2.7.15&remote.application=dubbo-provider&retries=0&revision=1.0.0&service.filter=-validation&service.name=ServiceBean:/org.example.api.day02.IQuery:1.0.0&side=consumer&sticky=false&timeout=2000&timestamp=1660513638233&version=1.0.0 RandomLoadBalance select from all providers [org.apache.dubbo.registry.integration.RegistryDirectory$InvokerDelegate@48dfd63f] for service org.apache.dubbo.rpc.service.GenericService method $invoke on consumer 192.168.0.6 use dubbo version 2.7.8-jar-with-dependencies, but no luck to perform the invocation. Last error is: Failed to invoke remote method: $invoke, provider: dubbo://192.168.0.6:20889/org.example.api.day02.IQuery?anyhost=true&application=DubboSample&async=false&check=false&cluster=failfast&connections=100&deprecated=false&dispatcher=message&dubbo=2.0.2&dynamic=true&generic=true&interface=org.example.api.day02.IQuery&loadbalance=random&metadata-type=remote&methods=query&pid=6373&protocol=dubbo&register.ip=192.168.0.6&release=2.7.15&remote.application=dubbo-provider&retries=0&revision=1.0.0&service.filter=-validation&service.name=ServiceBean:/org.example.api.day02.IQuery:1.0.0&side=consumer&sticky=false&timeout=2000&timestamp=1660513638233&version=1.0.0, cause: org.apache.dubbo.remoting.RemotingException: Server side(192.168.0.6,20889) thread pool is exhausted, detail msg:Thread pool is EXHAUSTED! 
Thread Name: DubboServerHandler-192.168.0.6:20889, Pool Size: 5 (active: 5, core: 5, max: 5, largest: 5), Task: 5 (completed: 0), Executor status:(isShutdown:false, isTerminated:false, isTerminating:false), in dubbo://192.168.0.6:20889!","cause": {"detailMessage": "org.apache.dubbo.remoting.RemotingException: Server side(192.168.0.6,20889) thread pool is exhausted, detail msg:Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-192.168.0.6:20889, Pool Size: 5 (active: 5, core: 5, max: 5, largest: 5), Task: 5 (completed: 0), Executor status:(isShutdown:false, isTerminated:false, isTerminating:false),
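The pool statistics in this error message (Pool Size, active, core, max, largest, Task, completed) correspond to getters on java.util.concurrent.ThreadPoolExecutor. The sketch below is a JDK-only rejection handler that reports the same fields, which may help when reading such logs. It only imitates the shape of the message and is not Dubbo's AbortPolicyWithReport; the class name ReportingAbortPolicy and the demo method are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ReportingAbortPolicy implements RejectedExecutionHandler {

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor e) {
        String msg = String.format(
                "Thread pool is EXHAUSTED! Pool Size: %d (active: %d, core: %d, max: %d, largest: %d), "
                        + "Task: %d (completed: %d)",
                e.getPoolSize(), e.getActiveCount(), e.getCorePoolSize(),
                e.getMaximumPoolSize(), e.getLargestPoolSize(),
                e.getTaskCount(), e.getCompletedTaskCount());
        throw new RejectedExecutionException(msg);
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    // Occupies a one-thread pool, submits one more task, and returns the rejection message.
    static String demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(), new ReportingAbortPolicy());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            pool.execute(() -> {
                started.countDown();
                awaitQuietly(release);
            });
            awaitQuietly(started); // make sure the only worker is busy
            try {
                pool.execute(() -> { });
                return "accepted";
            } catch (RejectedExecutionException ex) {
                return ex.getMessage();
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the log, active equal to max together with completed: 0 indicates that every worker was still stuck in its first call when the next request arrived.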

Source code analysis

org.apache.dubbo.remoting.transport.dispatcher.all.AllChannelHandler#received

@Override
public void received(Channel channel, Object message) throws RemotingException {
    ExecutorService executor = getPreferredExecutorService(message);
    try {
        executor.execute(new ChannelEventRunnable(channel, handler, ChannelState.RECEIVED, message));
    } catch (Throwable t) {
        // If the pool is exhausted when the task is submitted to the business threads,
        // the rejection policy fires and an error is returned directly
        if (message instanceof Request && t instanceof RejectedExecutionException) {
            // This is the code that actually produces the error
            sendFeedback(channel, (Request) message, t);
            return;
        }
        throw new ExecutionException(message, channel, getClass() + " error when process received event .", t);
    }
}

org.apache.dubbo.remoting.transport.dispatcher.WrappedChannelHandler#sendFeedback

protected void sendFeedback(Channel channel, Request request, Throwable t) throws RemotingException {
    if (request.isTwoWay()) {
        String msg = "Server side(" + url.getIp() + "," + url.getPort()
                + ") thread pool is exhausted, detail msg:" + t.getMessage();
        Response response = new Response(request.getId(), request.getVersion());
        response.setStatus(Response.SERVER_THREADPOOL_EXHAUSTED_ERROR);
        response.setErrorMessage(msg);
        channel.send(response);
        return;
    }
}

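At this point, working backwards from the symptom, the problem has been reproduced and traced through the source code. The possible fixes boil down to the following:

  1. Increase the thread pool size.
  2. Change the message dispatch type.
  3. Fix whatever makes the business code slow (slow SQL, time-consuming operations).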
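To weigh option 1 against option 3, a back-of-the-envelope estimate from Little's law helps: average concurrency equals arrival rate times average time per request. The sketch below applies it with the 1500 ms latency from the demo. The load of 100 requests per second and the 50 ms latency after fixing the query are assumed numbers, chosen only for illustration, and the names PoolSizing and threadsNeeded are made up.

```java
public class PoolSizing {

    // Little's law: threads busy on average = requests per second * seconds per request.
    // Integer arithmetic in milliseconds, rounded up.
    static long threadsNeeded(long requestsPerSecond, long avgLatencyMillis) {
        return (requestsPerSecond * avgLatencyMillis + 999) / 1000;
    }

    public static void main(String[] args) {
        // Assumed load: 100 requests per second.
        System.out.println("slow query (1500 ms): " + threadsNeeded(100, 1500) + " threads");
        System.out.println("indexed query (assumed 50 ms): " + threadsNeeded(100, 50) + " threads");
    }
}
```

Under these assumptions the slow query needs a pool thirty times larger for the same traffic, which is why fixing the slow SQL is usually the real cure and a bigger pool only buys headroom.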
