Analyzing an Impala-shell Hang: A Winding Debugging Journey

Discovering the Problem

I've recently been preparing to add UTF-8 support to Impala, to fix its inconsistent behavior on UTF-8 strings compared with Java-based systems like Hive and Spark (e.g. IMPALA-2019, IMPALA-5675, IMPALA-9662). Then a simple substring query in impala-shell just hung:

[localhost:21050] default> select substr("引擎", 1, 3);
Query: select substr("引擎", 1, 3)
Query submitted at: 2020-10-28 11:54:26 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=464f44d405bb16a1:5b41e00e00000000
+----------------------+
| substr('引擎', 1, 3) |
+----------------------+
| 引                   |
+----------------------+
Fetched 1 row(s) in 0.13s
[localhost:21050] default> select substr("引擎", 1, 2);
Query: select substr("引擎", 1, 2)
Query submitted at: 2020-10-28 11:54:32 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=d048231be08f29b4:860c959500000000
Caught exception 'utf8' codec can't decode bytes in position 0-1: unexpected end of data, type=<type 'exceptions.UnicodeDecodeError'> in FetchResults.

The first query, substr("引擎", 1, 3), works fine: each of these Chinese characters takes 3 bytes in UTF-8, and Impala treats strings as plain byte arrays, so the first 3 bytes are exactly the first character "引".

The second query returns only the first 2 bytes of the UTF-8 encoding of "引". Decoding them as a UTF-8 character fails with an "unexpected end of data" error, and impala-shell then hangs; the only way out is a forced Ctrl+C.
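
The byte-level behavior is easy to reproduce outside Impala. A minimal Python sketch (nothing Impala-specific, just the standard codec):

# -*- coding: utf-8 -*-
s = u"引擎".encode("utf-8")      # 6 bytes in total, 3 per character
print(repr(s))

print(s[:3].decode("utf-8"))     # first 3 bytes = the whole character "引", decodes fine

try:
    s[:2].decode("utf-8")        # first 2 bytes cut "引" in half
except UnicodeDecodeError as e:
    print(e)                     # "unexpected end of data", the same error impala-shell hit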

This used to be fine in older versions: a UTF-8 character that failed to decode was displayed as garbled text, but impala-shell never hung. So I first filed a JIRA to report the issue: IMPALA-10299. I was on the master branch, at commit 1bd27a3.

Initial Investigation

Locating Where Impala-Shell Hangs

When a process hangs, the usual first step is to dump its stack and see where it is stuck. impala-shell is a Python process, so let's check with pystack first:

$ ps aux | grep shell
quanlong 30957  0.5  0.0 296100 24716 pts/22   S+   15:06   0:00 bin/../infra/python/env-gcc7.5.0/bin/python /home/quanlong/workspace/Impala/shell/impala_shell.py
quanlong 31003  0.0  0.0  15960  1088 pts/20   S+   15:06   0:00 grep --color=auto shell

$ sudo /home/quanlong/.local/bin/pystack 30957
Dumping Threads....
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2058, in <module>
    impala_shell_main()
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 2023, in impala_shell_main
    shell.cmdloop(intro)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 687, in onecmd
    return func(arg)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1113, in do_select
    return self._execute_stmt(query_str, print_web_link=True)
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1303, in _execute_stmt
    self.imp_client.close_query(self.last_query_handle)
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 847, in close_query
    resp = self._do_hs2_rpc(CloseImpalaOperation, retry_on_error=True)
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 959, in _do_hs2_rpc
    return rpc()
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 845, in CloseImpalaOperation
    return self.imp_service.CloseImpalaOperation(req)
  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 155, in CloseImpalaOperation
    return self.recv_CloseImpalaOperation()
  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 167, in recv_CloseImpalaOperation
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBegin
    name = self.trans.readAll(sz)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 162, in read
    self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
  File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 117, in read
    buff = self.handle.recv(sz)
  File "<string>", line 1, in <module>
  File "<string>", line 1, in <module>

Here we can see impala-shell is stuck in a close_query operation, specifically sending the hs2 CloseImpalaOperation RPC and waiting for impalad's reply. The thrift-layer frames are all blocked in various read calls. So the problem is most likely on the server side; we need to find out why the CloseImpalaOperation RPC never returns.

Locating Where the Server Is Stuck

The impalad logs show nothing unusual; the query appears to finish normally. For example, I ran another query, select substr("引擎", 1, 4); impala-shell hung again, and the impalad-side log looks like this:

I1106 09:49:19.698134 15454 Frontend.java:1532] fc450a0fa19a6fbc:9315824900000000] Analyzing query: select substr("引擎", 1, 4) db: default
I1106 09:49:19.702113 15454 BaseAuthorizationChecker.java:110] fc450a0fa19a6fbc:9315824900000000] Authorization check took 3 ms
I1106 09:49:19.702214 15454 Frontend.java:1574] fc450a0fa19a6fbc:9315824900000000] Analysis and authorization finished.
W1106 09:49:19.706602 15459 executor-group.cc:164] fc450a0fa19a6fbc:9315824900000000] Executor group empty group (using coordinator only) is unhealthy: 0 out of 1 are available.
I1106 09:49:19.706990 15459 admission-controller.cc:1646] fc450a0fa19a6fbc:9315824900000000] Trying to admit id=fc450a0fa19a6fbc:9315824900000000 in pool_name=default-pool executor_group_name=empty group (using coordinator only) per_host_mem_estimate=10.00 MB dedicated_coord_mem_estimate=100.00 MB max_requests=-1 max_queued=200 max_mem=-1.00 B
I1106 09:49:19.707115 15459 admission-controller.cc:1654] fc450a0fa19a6fbc:9315824900000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0, topN_query_stats: queries=[], total_mem_consumed=0; pool_level_stats: num_running=0, min=0, max=0, pool_total_mem=0)
I1106 09:49:19.707186 15459 admission-controller.cc:1200] fc450a0fa19a6fbc:9315824900000000] Admitting query id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707155 15454 impala-hs2-server.cc:523] ExecuteStatement(): return_val=TExecuteStatementResp {01: status (struct) = TStatus {01: statusCode (i32) = 0, },02: operationHandle (struct) = TOperationHandle {01: operationId (struct) = THandleIdentifier {01: guid (string) = "\xbco\x9a\xa1\x0f\nE\xfc\x00\x00\x00\x00I\x82\x15\x93",02: secret (string) = "\xa5|\xf9A\xbc\x8aG\xd4\x8a\r\xb8\xae\xb4\xcf\x9f\xfc",},   02: operationType (i32) = 0, 03: hasResultSet (bool) = true,},
}
I1106 09:49:19.707319 15459 impala-server.cc:2062] fc450a0fa19a6fbc:9315824900000000] Registering query locations
I1106 09:49:19.707382 15459 coordinator.cc:149] fc450a0fa19a6fbc:9315824900000000] Exec() query_id=fc450a0fa19a6fbc:9315824900000000 stmt=select substr("引擎", 1, 4)
I1106 09:49:19.707787 15454 impala-hs2-server.cc:832] GetResultSetMetadata(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707967 15459 coordinator.cc:473] fc450a0fa19a6fbc:9315824900000000] starting execution on 1 backends for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.707902 15454 impala-hs2-server.cc:864] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {01: status (struct) = TStatus {01: statusCode (i32) = 0, },02: schema (struct) = TTableSchema {01: columns (list) = list<struct>[1] {[0] = TColumnDesc {01: columnName (string) = "substr('\xe5\xbc\x95\xe6\x93\x8e', 1, 4)", 02: typeDesc (struct) = TTypeDesc {01: types (list) = list<struct>[1] {[0] = TTypeEntry {01: primitiveEntry (struct) = TPrimitiveTypeEntry {01: type (i32) = 7, },},},},03: position (i32) = 0, },   },},
}
I1106 09:49:19.709297  7513 control-service.cc:142] fc450a0fa19a6fbc:9315824900000000] ExecQueryFInstances(): query_id=fc450a0fa19a6fbc:9315824900000000 coord=quanlong-OptiPlex-BJ:27000 #instances=1
I1106 09:49:19.711102 15462 query-state.cc:897] fc450a0fa19a6fbc:9315824900000000] Executing instance. instance_id=fc450a0fa19a6fbc:9315824900000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
I1106 09:49:19.711118 15459 coordinator.cc:532] fc450a0fa19a6fbc:9315824900000000] started execution on 1 backends for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810067 15454 coordinator.cc:672] ExecState: query id=fc450a0fa19a6fbc:9315824900000000 execution completed
I1106 09:49:19.810063 15462 krpc-data-stream-mgr.cc:337] fc450a0fa19a6fbc:9315824900000000] cancelling active streams for fragment_instance_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810087 15454 coordinator.cc:821] Coordinator waiting for backends to finish, 1 remaining. query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810119 15462 query-state.cc:906] fc450a0fa19a6fbc:9315824900000000] Instance completed. instance_id=fc450a0fa19a6fbc:9315824900000000 #in-flight=0 status=OK
I1106 09:49:19.810176 15461 query-state.cc:464] fc450a0fa19a6fbc:9315824900000000] UpdateBackendExecState(): last report for fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.810976  7513 coordinator.cc:965] Backend completed: host=quanlong-OptiPlex-BJ:27000 remaining=1 query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811048 15454 coordinator.cc:1284] Release admission control resources for query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811172 15461 query-state.cc:686] fc450a0fa19a6fbc:9315824900000000] Cancelling fragment instances as directed by the coordinator. Returned status: Cancelled
I1106 09:49:19.811204 15461 query-state.cc:923] fc450a0fa19a6fbc:9315824900000000] Cancel: query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811213 15461 krpc-data-stream-mgr.cc:337] fc450a0fa19a6fbc:9315824900000000] cancelling active streams for fragment_instance_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811383 15454 impala-hs2-server.cc:800] CloseOperation(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811399 15454 impala-server.cc:1320] UnregisterQuery(): query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.811409 15454 coordinator-backend-state.cc:921] query_id=fc450a0fa19a6fbc:9315824900000000 target backend=127.0.0.1:27000: Not cancelling because the backend is already done:
I1106 09:49:19.811415 15454 coordinator.cc:932] CancelBackends() query_id=fc450a0fa19a6fbc:9315824900000000, tried to cancel 0 backends
I1106 09:49:19.812331  8319 impala-server.cc:1352] Query successfully unregistered: query_id=fc450a0fa19a6fbc:9315824900000000
I1106 09:49:19.812489  8319 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=fc450a0fa19a6fbc:9315824900000000

One possibility is that it's stuck somewhere that logs nothing. Fortunately this is a development environment and the problem reproduces reliably, so I can add some logging and rebuild to debug. First, find the code that handles the hs2 CloseImpalaOperation RPC; it lives mainly in be/src/service/impala-hs2-server.cc: https://github.com/apache/impala/blob/1bd27a3/be/src/service/impala-hs2-server.cc#L793-L822

void ImpalaServer::CloseOperation(TCloseOperationResp& return_val,
    const TCloseOperationReq& request) {
  TCloseImpalaOperationReq request2;
  request2.operationHandle = request.operationHandle;
  TCloseImpalaOperationResp tmp_resp;
  CloseImpalaOperation(tmp_resp, request2);
  return_val.status = tmp_resp.status;
}

void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
    const TCloseImpalaOperationReq& request) {
  TUniqueId query_id;
  TUniqueId op_secret;
  HS2_RETURN_IF_ERROR(return_val, THandleIdentifierToTUniqueId(
      request.operationHandle.operationId, &query_id, &op_secret),
      SQLSTATE_GENERAL_ERROR);
  VLOG_QUERY << "CloseOperation(): query_id=" << PrintId(query_id);

  QueryHandle query_handle;
  HS2_RETURN_IF_ERROR(
      return_val, GetActiveQueryHandle(query_id, &query_handle), SQLSTATE_GENERAL_ERROR);

  ScopedSessionState session_handle(this);
  const TUniqueId session_id = query_handle->session_id();
  HS2_RETURN_IF_ERROR(return_val,
      session_handle.WithSession(session_id, SecretArg::Operation(op_secret, query_id)),
      SQLSTATE_GENERAL_ERROR);
  if (query_handle->stmt_type() == TStmtType::DML) {
    Status query_status;
    if (query_handle->GetDmlStats(&return_val.dml_result, &query_status)) {
      return_val.__isset.dml_result = true;
    }
  }

  // TODO: use timeout to get rid of unwanted query_handle.
  HS2_RETURN_IF_ERROR(return_val, UnregisterQuery(query_id, true),
      SQLSTATE_GENERAL_ERROR);
  return_val.status.__set_statusCode(thrift::TStatusCode::SUCCESS_STATUS);
}

There are some potentially slow operations here, such as the final UnregisterQuery, so the hang could be there. The simple, brute-force approach is to add log statements and see where the function stops. CloseImpalaOperation is also called by the CloseOperation above it, so add logging there too. I made the following change and recompiled:

diff --git a/be/src/service/impala-hs2-server.cc b/be/src/service/impala-hs2-server.cc
index 5c618fb..82a2820 100644
--- a/be/src/service/impala-hs2-server.cc
+++ b/be/src/service/impala-hs2-server.cc
@@ -786,14 +786,17 @@ void ImpalaServer::CloseOperation(TCloseOperationResp& return_val,
   TCloseImpalaOperationReq request2;
   request2.operationHandle = request.operationHandle;
   TCloseImpalaOperationResp tmp_resp;
+  VLOG_QUERY << "0";
   CloseImpalaOperation(tmp_resp, request2);
   return_val.status = tmp_resp.status;
+  VLOG_QUERY << "5";
 }
 
 void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
     const TCloseImpalaOperationReq& request) {
   TUniqueId query_id;
   TUniqueId op_secret;
+  VLOG_QUERY << "1";
   HS2_RETURN_IF_ERROR(return_val, THandleIdentifierToTUniqueId(
       request.operationHandle.operationId, &query_id, &op_secret),
       SQLSTATE_GENERAL_ERROR);
@@ -802,12 +805,14 @@ void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
   QueryHandle query_handle;
   HS2_RETURN_IF_ERROR(
       return_val, GetActiveQueryHandle(query_id, &query_handle), SQLSTATE_GENERAL_ERROR);
+  VLOG_QUERY << "2";
   ScopedSessionState session_handle(this);
   const TUniqueId session_id = query_handle->session_id();
   HS2_RETURN_IF_ERROR(return_val,
       session_handle.WithSession(session_id, SecretArg::Operation(op_secret, query_id)),
       SQLSTATE_GENERAL_ERROR);
+  VLOG_QUERY << "3";
   if (query_handle->stmt_type() == TStmtType::DML) {
     Status query_status;
     if (query_handle->GetDmlStats(&return_val.dml_result, &query_status)) {
@@ -819,6 +824,7 @@ void ImpalaServer::CloseImpalaOperation(TCloseImpalaOperationResp& return_val,
   HS2_RETURN_IF_ERROR(return_val, UnregisterQuery(query_id, true),
       SQLSTATE_GENERAL_ERROR);
   return_val.status.__set_statusCode(thrift::TStatusCode::SUCCESS_STATUS);
+  VLOG_QUERY << "4";
 }
 
 void ImpalaServer::GetResultSetMetadata(TGetResultSetMetadataResp& return_val,

Since only BE code changed, rebuilding impalad with ninja impalad is enough (or make impalad if the original build wasn't set up with ninja). After restarting the cluster and reproducing the problem, all the new log lines in ImpalaServer::CloseImpalaOperation were printed, but none from ImpalaServer::CloseOperation. So CloseOperation was never called, and CloseImpalaOperation finished normally. The log:

I1106 10:12:25.581028 29674 query-state.cc:464] bc46ff8a74592f6b:9146663900000000] UpdateBackendExecState(): last report for bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581405 29667 impala-hs2-server.cc:799] 1
I1106 10:12:25.581435 29667 impala-hs2-server.cc:803] CloseOperation(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581471 29667 impala-hs2-server.cc:808] 2
I1106 10:12:25.581503 29667 impala-hs2-server.cc:815] 3
I1106 10:12:25.581519 29667 impala-server.cc:1320] UnregisterQuery(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.581562 29667 coordinator.cc:672] ExecState: query id=bc46ff8a74592f6b:9146663900000000 execution cancelled
I1106 10:12:25.581620 29667 coordinator-backend-state.cc:921] query_id=bc46ff8a74592f6b:9146663900000000 target backend=127.0.0.1:27000: Sending CancelQueryFInstances rpc
I1106 10:12:25.582509 28909 control-service.cc:213] CancelQueryFInstances(): query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582554 28909 query-exec-mgr.cc:126] QueryState: query_id=bc46ff8a74592f6b:9146663900000000 refcnt=3
I1106 10:12:25.582571 28909 query-state.cc:923] Cancel: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582594 28909 krpc-data-stream-mgr.cc:337] cancelling active streams for fragment_instance_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.582669 28908 impala-server.cc:1483] Invalid or unknown query handle: bc46ff8a74592f6b:9146663900000000.
I1106 10:12:25.582744 28908 control-service.cc:173] ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): bc46ff8a74592f6b:9146663900000000 remote host=127.0.0.1:43294
I1106 10:12:25.583061 29667 coordinator.cc:932] CancelBackends() query_id=bc46ff8a74592f6b:9146663900000000, tried to cancel 1 backends
I1106 10:12:25.583123 29674 query-state.cc:686] bc46ff8a74592f6b:9146663900000000] Cancelling fragment instances as directed by the coordinator. Returned status: ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): bc46ff8a74592f6b:9146663900000000 remote host=127.0.0.1:43294
I1106 10:12:25.583168 29674 query-state.cc:923] bc46ff8a74592f6b:9146663900000000] Cancel: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.583214 29667 coordinator.cc:1284] Release admission control resources for query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.583902 29667 impala-hs2-server.cc:827] 4
I1106 10:12:25.586117 29513 impala-server.cc:1352] Query successfully unregistered: query_id=bc46ff8a74592f6b:9146663900000000
I1106 10:12:25.586696 29513 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=bc46ff8a74592f6b:9146663900000000

So where could it be stuck? Let's see who calls CloseImpalaOperation inside impalad; the hang could be in the caller. Impala has a GetStackTrace() function, defined in debug-util.h, that prints the call stack at runtime, so I changed log statement "4" to:

+  VLOG_QUERY << "4" << "\n" << GetStackTrace();

Reproducing the problem once more shows the stack:

I1106 10:26:58.427156   565 impala-hs2-server.cc:827] 4
    @          0x2580dac  impala::ImpalaServer::CloseImpalaOperation()
    @          0x2b5ccba  impala::ImpalaHiveServer2ServiceProcessor::process_CloseImpalaOperation()
    @          0x2b5b418  impala::ImpalaHiveServer2ServiceProcessor::dispatchCall()
    @          0x1cd7721  apache::thrift::TDispatchProcessor::process()
    @          0x21ef3fc  apache::thrift::server::TAcceptQueueServer::Task::run()
    @          0x21e4810  impala::ThriftThread::RunRunnable()
    @          0x21e5e4c  boost::_mfi::mf2<>::operator()()
    @          0x21e5ce0  boost::_bi::list3<>::operator()<>()
    @          0x21e5a26  boost::_bi::bind_t<>::operator()()
    @          0x21e5938  boost::detail::function::void_function_obj_invoker0<>::invoke()
    @          0x215e37b  boost::function0<>::operator()()
    @          0x274cef9  impala::Thread::SuperviseThread()
    @          0x2754e96  boost::_bi::list5<>::operator()<>()
    @          0x2754dba  boost::_bi::bind_t<>::operator()()
    @          0x2754d7b  boost::detail::thread_data<>::run()
    @          0x3f41e21  thread_proxy
    @     0x7f2e95b706b9  start_thread
    @     0x7f2e9263e4dc  clone
I1106 10:26:58.577329   565 query-exec-mgr.cc:213] ReleaseQueryState(): deleted query_id=ea4b5873a9a4bc14:240fca4800000000

Functions like ImpalaHiveServer2ServiceProcessor::process_CloseImpalaOperation are generated by the Thrift compiler (they live in be/generated-sources/gen-cpp/ImpalaHiveServer2Service.cpp), so they are unlikely to be the culprit. I also dumped all impalad thread stacks via a minidump (see the "Using Minidumps instead of Core Dumps" section of my earlier post on analyzing a stuck Impala query, or the official Debugging Impala Minidumps docs), but couldn't find any related thread. This debugging path hit a dead end.

Locating the Commit That Introduced the Bug

Change of approach: find the commit that introduced the bug, then treat the root cause. The code where impala-shell hangs was newly added by a recent commit, fe6e625, at line 1303 of shell/impala_shell.py.

This is the UnicodeDecodeError handler in shell/impala_shell.py, which calls close_query:
1297     except UnicodeDecodeError as e:
1298       # An error occoured possibly during the fetching.
1299       # Depending of which protocol is at use it can come from different places.
1300       # Possibly occours because we try to display binary data which contains
1301       # undecodable elements.
1302       if self.last_query_handle is not None:
1303         self.imp_client.close_query(self.last_query_handle)
1304       print('UnicodeDecodeError : %s \nPlease check for columns containing binary data '
1305           'to find the possible source of the error.' % (e,), file=sys.stderr)

So the shell first hits a UnicodeDecodeError and then hangs while trying to close the query. That leaves two questions to answer: (1) why is the UnicodeDecodeError raised; (2) why does the close hang?

This commit (fe6e625) did add a piece of code where a UnicodeDecodeError can be thrown, but only in the beeswax-protocol logic, in shell/impala_client.py:

1016 class ImpalaBeeswaxClient(ImpalaClient):
...
1108   def fetch(self, query_handle):
1109     while True:
1110       result, rpc_status = self._do_beeswax_rpc(
1111          lambda: self.imp_service.fetch(query_handle, False,
1112                                         self.fetch_size))
1113       if rpc_status != RpcStatus.OK:
1114         raise RPCException()
1115
1116       def split_row_and_decode_if_needed(row):
1117         try:
1118           return row.split('\t')
1119         except UnicodeDecodeError:
1120           return row.decode('utf-8', 'replace').split('\t')
1121
1122       yield [split_row_and_decode_if_needed(row) for row in result.data]

Then I realized I hadn't tried the beeswax protocol at all. In Impala 4.0, impala-shell talks to impalad over the hs2 protocol by default; switching back to beeswax requires explicitly passing --protocol=beeswax:

$ bin/impala-shell.sh --protocol=beeswax -q 'select substr("引擎", 1, 4)'
Unknown Exception : 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/home/quanlong/workspace/Impala/shell/impala_shell.py", line 1255, in _execute_stmt
    for rows in rows_fetched:
  File "/home/quanlong/workspace/Impala/shell/impala_client.py", line 1111, in fetch
    lambda: self.imp_service.fetch(query_handle, False,
TypeError: 'NoneType' object is not iterable
Could not execute command: select substr("引擎", 1, 4)

With the beeswax protocol the shell just errors out and exits; the behavior is different, and there is no UTF-8 decode error at all.

However, simply reverting this commit does not fix the problem; some thrift-related change must be responsible, since the UTF-8 decode error is thrown from the thrift layer. Impala 3.4, released in April, does not have this bug, so we can list all thrift-related commits since then and bisect them to find the offending one.

$ git log --oneline -i --grep=thrift --since='2020-04-01'
981ef10 IMPALA-10215: Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
ee4043e IMPALA-10168: Expose JSON catalog objects in catalogd's debug page
6542b60 IMPALA-10243: ConcurrentModificationException during parallel INSERTs
63c435c IMPALA-9232 Potential overflow in serializeThriftMsg
6bb3b88 IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
28181cb IMPALA-9930 (part 1): Initial refactor for admission control service
79b17db IMPALA-9382: part 1: transposed profile prototype
13f50ea IMPALA-9229: impala-shell 'profile' to show original and retried queries
fe6e625 IMPALA-10012: ds_hll_sketch() results ascii codec decoding error fix
75146c9 IMPALA-10012: ds_hll_sketch() results ascii codec decoding error fix
074731e IMPALA-3127: Support incremental metadata updates in partition level
ee18dc3 IMPALA-9987: Improve logging around HTTP connections
8aeb282 IMPALA-9540 Test that Impala Shell no longer sends duplicate "Host" headers in http mode.
b67c090 IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
8b09343 IMPALA-9719: Upgrade sasl-0.1.1 -> 0.2.1
e46bbf2 IMPALA-9692 (part 1): Refactor TBackendDescriptor to protobuf
bc9d7e0 IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
bd4458b IMPALA-9422: Re-visit and improve join node and builder's counters
2451846 IMPALA-9543 Reduce duplicate code in thrift CMakeLists.txt

Bisecting eventually pointed to this commit as the first bad one:

bc9d7e0 IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

Details on GitHub: https://github.com/apache/impala/commit/bc9d7e0
or on Gerrit: http://gerrit.cloudera.org:8080/15524

But this commit is huge and does several things at once: it adds an impala-shell option to point at an arbitrary virtualenv, sets PYTHONIOENCODING=utf8, upgrades the thrift version impala-shell depends on to 0.11.0, and more. So much time has passed that it no longer reverts cleanly. I suspected setting PYTHONIOENCODING to utf8, but changing it to ascii didn't fix anything. Fully understanding this big commit would take considerable effort, so this path isn't very promising either.

Light at the End of the Tunnel: tcpdump

Back to the original line of thought: find out where impalad gets stuck so that the RPC result never comes back. A colleague suggested using tcpdump to compare a good query with a bad one, which turned out to be a great idea! I had already compared the impalad logs of good and bad queries and they looked identical, but comparing the TCP packets digs one level deeper: if the application layer reveals nothing, go down a layer and see what TCP is doing.

In my development environment impala-shell connects to impalad via localhost, so the interface to capture on is lo (local loopback). The current master-branch impala-shell uses the hs2 protocol by default, so it connects to impalad's port 21050. Since my impala-shell is the only client connected to this impalad, simply capturing TCP packets on port 21050 gives exactly this shell's traffic. In a production environment you would need to capture only the port used by your impala-shell connection, to separate it from other applications' traffic to impalad.

The command I used is:

sudo tcpdump -i lo 'port 21050'

Start impala-shell, run a good statement, select substr("引擎", 1, 3), then a bad one, select substr("引擎", 1, 4). The bad statement clearly produces fewer packets.
The good statement generates 26 TCP packets:

11:04:52.169568 IP localhost.40938 > localhost.21050: Flags [P.], seq 2344715580:2344715671, ack 246892403, win 512, options [nop,nop,TS val 3249330901 ecr 3249311189], length 91
11:04:52.169913 IP localhost.21050 > localhost.40938: Flags [P.], seq 1:181, ack 91, win 512, options [nop,nop,TS val 3249330902 ecr 3249330901], length 180
11:04:52.169944 IP localhost.40938 > localhost.21050: Flags [.], ack 181, win 511, options [nop,nop,TS val 3249330902 ecr 3249330902], length 0
11:04:52.170831 IP localhost.40938 > localhost.21050: Flags [P.], seq 91:292, ack 181, win 512, options [nop,nop,TS val 3249330902 ecr 3249330902], length 201
11:04:52.187515 IP localhost.21050 > localhost.40938: Flags [P.], seq 181:290, ack 292, win 512, options [nop,nop,TS val 3249330919 ecr 3249330902], length 109
11:04:52.187852 IP localhost.40938 > localhost.21050: Flags [P.], seq 292:394, ack 290, win 512, options [nop,nop,TS val 3249330920 ecr 3249330919], length 102
11:04:52.188237 IP localhost.21050 > localhost.40938: Flags [P.], seq 290:411, ack 394, win 512, options [nop,nop,TS val 3249330920 ecr 3249330920], length 121
11:04:52.189283 IP localhost.40938 > localhost.21050: Flags [P.], seq 394:494, ack 411, win 512, options [nop,nop,TS val 3249330921 ecr 3249330920], length 100
11:04:52.189459 IP localhost.21050 > localhost.40938: Flags [P.], seq 411:464, ack 494, win 512, options [nop,nop,TS val 3249330921 ecr 3249330921], length 53
11:04:52.231963 IP localhost.40938 > localhost.21050: Flags [.], ack 464, win 512, options [nop,nop,TS val 3249330964 ecr 3249330921], length 0
11:04:52.290038 IP localhost.40938 > localhost.21050: Flags [P.], seq 494:594, ack 464, win 512, options [nop,nop,TS val 3249331022 ecr 3249330921], length 100
11:04:52.290276 IP localhost.21050 > localhost.40938: Flags [P.], seq 464:517, ack 594, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 53
11:04:52.290348 IP localhost.40938 > localhost.21050: Flags [.], ack 517, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 0
11:04:52.290815 IP localhost.40938 > localhost.21050: Flags [P.], seq 594:706, ack 517, win 512, options [nop,nop,TS val 3249331022 ecr 3249331022], length 112
11:04:52.291191 IP localhost.21050 > localhost.40938: Flags [P.], seq 517:620, ack 706, win 512, options [nop,nop,TS val 3249331023 ecr 3249331022], length 103
11:04:52.292094 IP localhost.40938 > localhost.21050: Flags [P.], seq 706:818, ack 620, win 512, options [nop,nop,TS val 3249331024 ecr 3249331023], length 112
11:04:52.294791 IP localhost.21050 > localhost.40938: Flags [P.], seq 620:715, ack 818, win 512, options [nop,nop,TS val 3249331026 ecr 3249331024], length 95
11:04:52.295077 IP localhost.40938 > localhost.21050: Flags [P.], seq 818:906, ack 715, win 512, options [nop,nop,TS val 3249331027 ecr 3249331026], length 88
11:04:52.295349 IP localhost.21050 > localhost.40938: Flags [P.], seq 715:823, ack 906, win 512, options [nop,nop,TS val 3249331027 ecr 3249331027], length 108
11:04:52.295624 IP localhost.40938 > localhost.21050: Flags [P.], seq 906:1008, ack 823, win 512, options [nop,nop,TS val 3249331027 ecr 3249331027], length 102
11:04:52.335924 IP localhost.21050 > localhost.40938: Flags [.], ack 1008, win 512, options [nop,nop,TS val 3249331068 ecr 3249331027], length 0
11:04:52.458929 IP localhost.21050 > localhost.40938: Flags [P.], seq 823:871, ack 1008, win 512, options [nop,nop,TS val 3249331191 ecr 3249331027], length 48
11:04:52.459074 IP localhost.40938 > localhost.21050: Flags [P.], seq 1008:1172, ack 871, win 512, options [nop,nop,TS val 3249331191 ecr 3249331191], length 164
11:04:52.459080 IP localhost.21050 > localhost.40938: Flags [.], ack 1172, win 511, options [nop,nop,TS val 3249331191 ecr 3249331191], length 0
11:04:52.460225 IP localhost.21050 > localhost.40938: Flags [P.], seq 871:12312, ack 1172, win 512, options [nop,nop,TS val 3249331192 ecr 3249331191], length 11441
11:04:52.503980 IP localhost.40938 > localhost.21050: Flags [.], ack 12312, win 512, options [nop,nop,TS val 3249331236 ecr 3249331192], length 0

The bad statement generates only 19:

11:05:00.805456 IP localhost.40938 > localhost.21050: Flags [P.], seq 1172:1263, ack 12312, win 512, options [nop,nop,TS val 3249339537 ecr 3249331192], length 91
11:05:00.805554 IP localhost.21050 > localhost.40938: Flags [P.], seq 12312:12492, ack 1263, win 512, options [nop,nop,TS val 3249339537 ecr 3249339537], length 180
11:05:00.805561 IP localhost.40938 > localhost.21050: Flags [.], ack 12492, win 511, options [nop,nop,TS val 3249339537 ecr 3249339537], length 0
11:05:00.805751 IP localhost.40938 > localhost.21050: Flags [P.], seq 1263:1464, ack 12492, win 512, options [nop,nop,TS val 3249339537 ecr 3249339537], length 201
11:05:00.810144 IP localhost.21050 > localhost.40938: Flags [P.], seq 12492:12601, ack 1464, win 512, options [nop,nop,TS val 3249339542 ecr 3249339537], length 109
11:05:00.810290 IP localhost.40938 > localhost.21050: Flags [P.], seq 1464:1566, ack 12601, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 102
11:05:00.810415 IP localhost.21050 > localhost.40938: Flags [P.], seq 12601:12722, ack 1566, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 121
11:05:00.810576 IP localhost.40938 > localhost.21050: Flags [P.], seq 1566:1666, ack 12722, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 100
11:05:00.810650 IP localhost.21050 > localhost.40938: Flags [P.], seq 12722:12775, ack 1666, win 512, options [nop,nop,TS val 3249339542 ecr 3249339542], length 53
11:05:00.851921 IP localhost.40938 > localhost.21050: Flags [.], ack 12775, win 512, options [nop,nop,TS val 3249339583 ecr 3249339542], length 0
11:05:00.911128 IP localhost.40938 > localhost.21050: Flags [P.], seq 1666:1766, ack 12775, win 512, options [nop,nop,TS val 3249339643 ecr 3249339542], length 100
11:05:00.911375 IP localhost.21050 > localhost.40938: Flags [P.], seq 12775:12828, ack 1766, win 512, options [nop,nop,TS val 3249339643 ecr 3249339643], length 53
11:05:00.911468 IP localhost.40938 > localhost.21050: Flags [.], ack 12828, win 512, options [nop,nop,TS val 3249339643 ecr 3249339643], length 0
11:05:00.911948 IP localhost.40938 > localhost.21050: Flags [P.], seq 1766:1878, ack 12828, win 512, options [nop,nop,TS val 3249339644 ecr 3249339643], length 112
11:05:00.912416 IP localhost.21050 > localhost.40938: Flags [P.], seq 12828:12932, ack 1878, win 512, options [nop,nop,TS val 3249339644 ecr 3249339644], length 104
11:05:00.912848 IP localhost.40938 > localhost.21050: Flags [P.], seq 1878:1980, ack 12932, win 512, options [nop,nop,TS val 3249339644 ecr 3249339644], length 102
11:05:00.959907 IP localhost.21050 > localhost.40938: Flags [.], ack 1980, win 512, options [nop,nop,TS val 3249339691 ecr 3249339644], length 0
11:05:01.097326 IP localhost.21050 > localhost.40938: Flags [P.], seq 12932:12980, ack 1980, win 512, options [nop,nop,TS val 3249339829 ecr 3249339644], length 48
11:05:01.139967 IP localhost.40938 > localhost.21050: Flags [.], ack 12980, win 512, options [nop,nop,TS val 3249339871 ecr 3249339829], length 0

Finally, a difference! Next, reproduce once more and dump the packets of both queries to files. Open impala-shell in one terminal and run this in another:

sudo tcpdump -i lo 'port 21050' -w good_query_tcp_dump.txt

This writes the packet contents to the file good_query_tcp_dump.txt. Run the good query select substr("引擎", 1, 3) in impala-shell, then Ctrl+C the tcpdump process. Next run

sudo tcpdump -i lo 'port 21050' -w bad_query_tcp_dump.txt

and this time run the bad query select substr("引擎", 1, 4) in impala-shell, then Ctrl+C tcpdump again. Now we have captures for both queries. Opening the files directly in vim shows readable text: FetchResults, GetLog and CloseImpalaOperation all appear in pairs, one for the request and one for the response. The good query also fetches the profile after CloseImpalaOperation.

The bad query, however, is followed by only a single small, unidentified packet after CloseImpalaOperation.

This already tells us that impalad did return the result of CloseImpalaOperation! It looks like impala-shell received the CloseImpalaOperation response, then sent out a very short packet, and is now waiting for a reply. Time to look at what the thrift-layer code on the impala-shell side is doing.
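
As a cross-check that doesn't require eyeballing the dumps in vim, a quick scan of the raw capture files for RPC names works too. A minimal sketch, assuming the two dump files produced above; it ignores the pcap framing and simply counts the method-name strings that TBinaryProtocol writes in clear text (each completed RPC should contribute roughly two hits, one for the request and one for the response):

rpc_names = [b"ExecuteStatement", b"GetResultSetMetadata", b"FetchResults",
             b"GetLog", b"CloseImpalaOperation", b"GetRuntimeProfile"]

def count_rpc_names(path):
    # Read the capture as raw bytes and count occurrences of each method name.
    with open(path, "rb") as f:
        data = f.read()
    return dict((name.decode("ascii"), data.count(name)) for name in rpc_names)

print(count_rpc_names("good_query_tcp_dump.txt"))
print(count_rpc_names("bad_query_tcp_dump.txt"))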

Digging into the Thrift Code

We already dumped the impala-shell stack with pystack; the bottom frames are thrift code:

  File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 155, in CloseImpalaOperationreturn self.recv_CloseImpalaOperation()File "/home/quanlong/workspace/Impala/shell/build/thrift-11-gen/gen-py/ImpalaService/ImpalaHiveServer2Service.py", line 167, in recv_CloseImpalaOperation(fname, mtype, rseqid) = iprot.readMessageBegin()File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 148, in readMessageBeginname = self.trans.readAll(sz)File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 60, in readAllchunk = self.read(sz - have)File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 162, in readself.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))File "/home/quanlong/workspace/Impala/toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 117, in readbuff = self.handle.recv(sz)

Fortunately every frame gives a file name and line number, so the code is easy to find. Going from top to bottom, recv_CloseImpalaOperation() is stuck at the iprot.readMessageBegin() call near the top of its body:

def recv_CloseImpalaOperation(self):
    iprot = self._iprot
    (fname, mtype, rseqid) = iprot.readMessageBegin()
    if mtype == TMessageType.EXCEPTION:
        x = TApplicationException()
        x.read(iprot)
        iprot.readMessageEnd()
        raise x
    result = CloseImpalaOperation_result()
    result.read(iprot)
    iprot.readMessageEnd()
    if result.success is not None:
        return result.success
    raise TApplicationException(TApplicationException.MISSING_RESULT, "CloseImpalaOperation failed: unknown result")

readMessageBegin() is defined in thrift/protocol/TBinaryProtocol.py, which is thrift's own code, whereas the recv_CloseImpalaOperation above lives in ImpalaService/ImpalaHiveServer2Service.py, Python code generated by the thrift compiler.

133 def readMessageBegin(self):
134     sz = self.readI32()
135     if sz < 0:
136         version = sz & TBinaryProtocol.VERSION_MASK
137         if version != TBinaryProtocol.VERSION_1:
138             raise TProtocolException(
139                 type=TProtocolException.BAD_VERSION,
140                 message='Bad version in readMessageBegin: %d' % (sz))
141         type = sz & TBinaryProtocol.TYPE_MASK
142         name = self.readString()
143         seqid = self.readI32()
144     else:
145         if self.strictRead:
146             raise TProtocolException(type=TProtocolException.BAD_VERSION,
147                                      message='No protocol version header')
148         name = self.trans.readAll(sz)  # <-------- stuck on this line
149         type = self.readByte()
150         seqid = self.readI32()
151     return (name, type, seqid)

As the code above shows, the stack is stuck at the readAll() call on line 148. The body of this function is an if statement, and the branch taken depends on the sign of the int32 read on the first line. I modified this code to print the value of sz. Comparing a good query with a bad one: the good query only ever sees negative sz values, whereas the bad query reads a positive sz, enters the else branch, and then hangs. So the thrift code in impala-shell is very likely misreading sz and waiting for data that will never arrive, and the misread sz is most likely made up of leftover bytes from the UTF-8 character that failed to decode earlier.
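
To illustrate why a positive sz leaves the client stuck, here is a standalone sketch of what readMessageBegin() effectively does with the first 4 bytes of the stream (plain Python with struct, not the real thrift internals; the stray bytes are made up purely for illustration):

import struct

# A well-formed strict TBinaryProtocol reply starts with an i32 whose high bit
# is set: VERSION_1 (0x80010000) OR'ed with the message type (REPLY == 2).
# Read as a signed i32 it is negative, so readMessageBegin() takes the version branch.
header = struct.pack("!I", 0x80010000 | 2)
sz = struct.unpack("!i", header)[0]
print(sz)                            # negative: the normal path

# Now suppose stray bytes (e.g. the remainder of a half-consumed earlier
# response) sit at the front of the buffer and get read as that i32 instead.
stray = b"\x00\x01\xe5\xbc"          # hypothetical leftover bytes
sz = struct.unpack("!i", stray)[0]
print(sz)                            # 124348: interpreted as a positive "name length"

# readMessageBegin() then falls into the else branch and calls
# self.trans.readAll(sz), blocking until another 124348 bytes arrive,
# which they never will, so impala-shell waits forever.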

With this hypothesis, I went back to investigate why the UTF-8 decode fails in the first place. After studying some thrift code, the differences between Python 2 and Python 3 around str and unicode, and the recommended way to handle UTF-8 (the "UTF-8 sandwich": byte arrays outside the system, UTF-8 strings inside, with encoding/decoding only at the system boundary), I found a bug in thrift's fast_decode path: THRIFT-5303. In short: when deserializing (decoding) a byte array into a thrift object, thrift has a fast_decode path that calls into a C++ library to speed up decoding. That C++ code calls Python's decode function to decode UTF-8 strings but does not specify an error handler, so a UnicodeDecodeError is raised straight out of it.

Fixing that also fixed the impala-shell problem. In fact thrift 0.11.0 already follows the "UTF-8 sandwich" convention, so a client deserializing thrift objects should never see a UnicodeDecodeError: inside the thrift layer, strings are already supposed to be proper UTF-8. The catch is that impalad still uses thrift 0.9.3, where strings inside thrift are still raw byte arrays with no UTF-8 handling at all, which trips up the newer thrift client. That said, this is fundamentally a client-side bug, and fixing it is enough.
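
For reference, the "UTF-8 sandwich" mentioned above boils down to the following pattern; a minimal sketch of the idea, not actual thrift or Impala code:

# -*- coding: utf-8 -*-
# "UTF-8 sandwich": raw bytes at the system boundary, text inside.

def read_from_wire(raw):
    # Decode exactly once, at the input boundary, and decide explicitly what
    # to do with undecodable bytes instead of letting the error escape
    # (THRIFT-5303 was essentially a missing decision here).
    return raw.decode("utf-8", "replace")

def write_to_wire(text):
    # Encode exactly once, at the output boundary.
    return text.encode("utf-8")

# A truncated "引擎" (4 of its 6 UTF-8 bytes), as a thrift-0.9.3 impalad can send it:
raw = u"引擎".encode("utf-8")[:4]
text = read_from_wire(raw)   # "引" plus replacement character(s), no exception raised
print(text)
print(write_to_wire(text))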

Summary

This debugging session was so winding because I started off in the wrong direction, fixated on finding where impalad was stuck: I couldn't tell whether it had actually returned the RPC, and it was hard to suspect the Thrift layer at first. After hitting a dead end, tcpdump broke the deadlock by proving that impalad had in fact returned the RPC and that impala-shell was simply waiting for more input than would ever arrive. Only then did the investigation get on track, looking into why the Thrift layer throws a UTF-8 decode error, which led to the fix. The biggest lesson: systems are layered; when a problem is opaque at one layer, try going one layer down and it may become clear.

Finally, a few useful tools and concepts:

  • pystack: dumps the stack of a running Python process
  • minidump: dumps the stacks of a C++ process; frequently used when debugging the Impala BE
  • tcpdump: captures traffic at the TCP layer
  • UTF-8 sandwich: the approach for handling UTF-8 in Python programs compatibly across Python 2 and Python 3; see this talk: https://nedbatchelder.com/text/unipain/unipain.html
