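1. Introduction

As I currently understand it, data stall detection is Android's ongoing monitoring of a network after it has passed validation. Once the network is found to have lost connectivity, ConnectivityService is notified so that it can handle the situation.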

2. Flow analysis

2.1 Obtaining TCP health

NetworkMonitor.java

    private class ValidatedState extends State {
        @Override
        public void enter() {
            maybeLogEvaluationResult(
                    networkEventType(validationStage(), EvaluationResult.VALIDATED));
            // If the user has accepted partial connectivity and HTTPS probing is disabled, then
            // mark the network as validated and partial so that settings can keep informing the
            // user that the connection is limited.
            int result = NETWORK_VALIDATION_RESULT_VALID;
            if (!mUseHttps && mAcceptPartialConnectivity) {
                result |= NETWORK_VALIDATION_RESULT_PARTIAL;
            }
            mEvaluationState.reportEvaluationResult(result, null /* redirectUrl */);
            mValidations++;
            initSocketTrackingIfRequired();
            // start periodical polling.
            sendTcpPollingEvent();
            maybeStopCollectionAndSendMetrics();
        }

Here is what happens once the network has passed validation:

    private void initSocketTrackingIfRequired() {
        if (!isValidationRequired()) return;

        final TcpSocketTracker tst = getTcpSocketTracker();
        if (tst != null) {
            tst.pollSocketsInfo();
        }
    }

    /**
     * Request to send a SockDiag Netlink request. Receive and parse the returned message. This
     * function is not thread-safe and should only be called from only one thread.
     *
     * @Return if this polling request executes successfully or not.
     */
    public boolean pollSocketsInfo() {
        if (!mDependencies.isTcpInfoParsingSupported()) return false;
        FileDescriptor fd = null;
        try {
            final long time = SystemClock.elapsedRealtime();
            fd = mDependencies.connectToKernel();

            final TcpStat stat = new TcpStat();
            for (final int family : ADDRESS_FAMILIES) {
                mDependencies.sendPollingRequest(fd, mSockDiagMsg.get(family));
                // Messages are composed with the following format. Stop parsing when receiving
                // message with nlmsg_type NLMSG_DONE.
                // +------------------+---------------+--------------+--------+
                // | Netlink Header   | Family Header | Attributes   | rtattr |
                // | struct nlmsghdr  | struct rtmsg  | struct rtattr|  data  |
                // +------------------+---------------+--------------+--------+
                //               :           :               :
                // +------------------+---------------+--------------+--------+
                // | Netlink Header   | Family Header | Attributes   | rtattr |
                // | struct nlmsghdr  | struct rtmsg  | struct rtattr|  data  |
                // +------------------+---------------+--------------+--------+
                final ByteBuffer bytes = mDependencies.recvMessage(fd);
                try {
                    while (enoughBytesRemainForValidNlMsg(bytes)) {
                        final StructNlMsgHdr nlmsghdr = StructNlMsgHdr.parse(bytes);
                        if (nlmsghdr == null) {
                            Log.e(TAG, "Badly formatted data.");
                            break;
                        }
                        final int nlmsgLen = nlmsghdr.nlmsg_len;
                        log("pollSocketsInfo: nlmsghdr=" + nlmsghdr + ", limit=" + bytes.limit());
                        // End of the message. Stop parsing.
                        if (nlmsghdr.nlmsg_type == NLMSG_DONE) break;

                        if (nlmsghdr.nlmsg_type != SOCK_DIAG_BY_FAMILY) {
                            Log.e(TAG, "Expect to get family " + family
                                    + " SOCK_DIAG_BY_FAMILY message but get "
                                    + nlmsghdr.nlmsg_type);
                            break;
                        }

                        if (isValidInetDiagMsgSize(nlmsgLen)) {
                            // Get the socket cookie value. Composed by two Integers value.
                            // Corresponds to inet_diag_sockid in
                            // <linux_src>/include/uapi/linux/inet_diag.h
                            bytes.position(bytes.position() + IDIAG_COOKIE_OFFSET);
                            // It's stored in native with 2 int. Parse it as long for convenience.
                            final long cookie = bytes.getLong();
                            // Skip the rest part of StructInetDiagMsg.
                            bytes.position(bytes.position()
                                    + StructInetDiagMsg.STRUCT_SIZE - IDIAG_COOKIE_OFFSET
                                    - Long.BYTES);
                            final SocketInfo info = parseSockInfo(bytes, family, nlmsgLen, time);
                            // Update TcpStats based on previous and current socket info.
                            stat.accumulate(
                                    calculateLatestPacketsStat(info, mSocketInfos.get(cookie)));
                            mSocketInfos.put(cookie, info);
                        }
                    }
                } catch (IllegalArgumentException | BufferUnderflowException e) {
                    Log.wtf(TAG, "Unexpected socket info parsing, family " + family
                            + " buffer:" + bytes + " "
                            + Base64.getEncoder().encodeToString(bytes.array()), e);
                }
            }
            // Calculate mLatestReceiveCount, mSentSinceLastRecv and mLatestPacketFailPercentage.
            mSentSinceLastRecv = (stat.receivedCount == 0)
                    ? (mSentSinceLastRecv + stat.sentCount) : 0;
            mLatestReceivedCount = stat.receivedCount;
            mLatestPacketFailPercentage = ((stat.sentCount != 0)
                    ? ((stat.retransmitCount + stat.lostCount) * 100 / stat.sentCount) : 0);

            // Remove out-of-date socket info.
            cleanupSocketInfo(time);
            return true;
        } catch (ErrnoException | SocketException | InterruptedIOException e) {
            Log.e(TAG, "Fail to get TCP info via netlink.", e);
        } finally {
            NetworkStackUtils.closeSocketQuietly(fd);
        }

        return false;
    }

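Each poll refreshes three counters: the number of packets sent since the last receive, the failure rate, and the number of packets received.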

    // Number of packets sent since the last received packet
    private int mSentSinceLastRecv;
    // The latest fail rate calculated by the latest tcp info.
    private int mLatestPacketFailPercentage;
    // Number of packets received in the latest polling cycle.
    private int mLatestReceivedCount;
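To make the arithmetic at the end of pollSocketsInfo() concrete, here is a minimal sketch of how the three counters evolve across polling cycles. The class TcpCounterSketch and its update() method are hypothetical stand-ins, not AOSP code; only the three formulas are taken from the excerpt above.

```java
// Hypothetical stand-in for the counter bookkeeping in pollSocketsInfo(); not AOSP code.
final class TcpCounterSketch {
    int sentSinceLastRecv;          // packets sent since the last received packet
    int latestReceivedCount;        // packets received in the latest polling cycle
    int latestPacketFailPercentage; // (retransmitted + lost) * 100 / sent

    /** Folds one polling cycle's aggregated totals into the three counters. */
    void update(int sent, int received, int retransmitted, int lost) {
        // Keep accumulating sent packets while nothing comes back; reset on any receive.
        sentSinceLastRecv = (received == 0) ? sentSinceLastRecv + sent : 0;
        latestReceivedCount = received;
        // Guard against division by zero when nothing was sent in this cycle.
        latestPacketFailPercentage = (sent != 0) ? (retransmitted + lost) * 100 / sent : 0;
    }

    public static void main(String[] args) {
        TcpCounterSketch c = new TcpCounterSketch();
        c.update(10, 0, 6, 3); // nothing received, 9 of 10 packets failed
        System.out.println(c.sentSinceLastRecv + " sent, fail rate "
                + c.latestPacketFailPercentage + "%");
        c.update(8, 4, 0, 0);  // traffic flows again, the sent counter resets
        System.out.println(c.sentSinceLastRecv + " sent, received "
                + c.latestReceivedCount);
    }
}
```

Note that the failure rate is integer division over a single cycle, so a cycle with no sent packets reports 0% rather than carrying over the previous value.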

The polling logic above is repeated through the following message handling, at an interval of 20s:

    @VisibleForTesting
    void sendTcpPollingEvent() {
        if (isValidationRequired()) {
            sendMessageDelayed(EVENT_POLL_TCPINFO, getTcpPollingInterval());
        }
    }

    case EVENT_POLL_TCPINFO:
        final TcpSocketTracker tst = getTcpSocketTracker();
        if (tst == null) break;
        // Transit if retrieve socket info is succeeded and suspected as a stall.
        if (tst.pollSocketsInfo() && evaluateDataStall()) {
            transitionTo(mEvaluatingState);
        } else {
            sendTcpPollingEvent();
        }
        break;

2.2 Evaluating TCP health

    boolean evaluateDataStall() {
        if (isDataStall()) {
            validationLog("Suspecting data stall, reevaluate");
            return true;
        }
        return false;
    }

    @VisibleForTesting
    protected boolean isDataStall() {
        if (!isValidationRequired()) {
            return false;
        }

        Boolean result = null;
        final StringJoiner msg = (DBG || VDBG_STALL) ? new StringJoiner(", ") : null;
        // Reevaluation will generate traffic. Thus, set a minimal reevaluation timer to limit the
        // possible traffic cost in metered network.
        if (!mNetworkCapabilities.hasCapability(NET_CAPABILITY_NOT_METERED)
                && (SystemClock.elapsedRealtime() - getLastProbeTime()
                < mDataStallMinEvaluateTime)) {
            return false;
        }
        // Check TCP signal. Suspect it may be a data stall if :
        // 1. TCP connection fail rate(lost+retrans) is higher than threshold.
        // 2. Accumulate enough packets count.
        final TcpSocketTracker tst = getTcpSocketTracker();
        if (dataStallEvaluateTypeEnabled(DATA_STALL_EVALUATION_TYPE_TCP) && tst != null) {
            if (tst.getLatestReceivedCount() > 0) {
                result = false;
            } else if (tst.isDataStallSuspected()) {
                result = true;
                mDataStallTypeToCollect = DATA_STALL_EVALUATION_TYPE_TCP;

                final DataStallReportParcelable p = new DataStallReportParcelable();
                p.detectionMethod = DETECTION_METHOD_TCP_METRICS;
                p.timestampMillis = SystemClock.elapsedRealtime();
                p.tcpPacketFailRate = tst.getLatestPacketFailPercentage();
                p.tcpMetricsCollectionPeriodMillis = getTcpPollingInterval();
                notifyDataStallSuspected(p);
            }
            if (DBG || VDBG_STALL) {
                msg.add("tcp packets received=" + tst.getLatestReceivedCount())
                        .add("latest tcp fail rate=" + tst.getLatestPacketFailPercentage());
            }
        }

        // Check dns signal. Suspect it may be a data stall if both :
        // 1. The number of consecutive DNS query timeouts >= mConsecutiveDnsTimeoutThreshold.
        // 2. Those consecutive DNS queries happened in the last mValidDataStallDnsTimeThreshold ms.
        final DnsStallDetector dsd = getDnsStallDetector();
        if ((result == null) && (dsd != null)
                && dataStallEvaluateTypeEnabled(DATA_STALL_EVALUATION_TYPE_DNS)) {
            if (dsd.isDataStallSuspected(mConsecutiveDnsTimeoutThreshold,
                    mDataStallValidDnsTimeThreshold)) {
                result = true;
                mDataStallTypeToCollect = DATA_STALL_EVALUATION_TYPE_DNS;
                logNetworkEvent(NetworkEvent.NETWORK_CONSECUTIVE_DNS_TIMEOUT_FOUND);

                final DataStallReportParcelable p = new DataStallReportParcelable();
                p.detectionMethod = DETECTION_METHOD_DNS_EVENTS;
                p.timestampMillis = SystemClock.elapsedRealtime();
                p.dnsConsecutiveTimeouts = mDnsStallDetector.getConsecutiveTimeoutCount();
                notifyDataStallSuspected(p);
            }
            if (DBG || VDBG_STALL) {
                msg.add("consecutive dns timeout count=" + dsd.getConsecutiveTimeoutCount());
            }
        }
        // log only data stall suspected.
        if ((DBG && Boolean.TRUE.equals(result)) || VDBG_STALL) {
            log("isDataStall: result=" + result + ", " + msg);
        }

        return (result == null) ? false : result;
    }

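The check first looks at whether TCP is currently receiving any packets. If it is, the network is considered healthy. Otherwise the packet failure rate is compared against the threshold, which is 80% by default.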

    /**
     * Default tcp packets fail rate to suspect as a data stall.
     *
     * Calculated by ((# of packets lost)+(# of packets retrans))/(# of packets sent)*100. Ideally,
     * the percentage should be 100%. However, the ongoing packets may not be considered as neither
     * lost or retrans yet. It will cause the percentage lower.
     */
    public static final int DEFAULT_TCP_PACKETS_FAIL_PERCENTAGE = 80;

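Next comes the DNS check, which only runs when the TCP check reached no verdict (result == null in the code above). If 5 consecutive DNS queries time out within 30 minutes, the network is considered stalled.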

    // Default configuration values for data stall detection.
    public static final int DEFAULT_CONSECUTIVE_DNS_TIMEOUT_THRESHOLD = 5;
    public static final int DEFAULT_DATA_STALL_VALID_DNS_TIME_THRESHOLD_MS = 30 * 60 * 1000;
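The DNS rule can be illustrated with a small sketch. This is a simplified, hypothetical detector written for this article, not the AOSP DnsStallDetector: the class name, method names, and the use of a plain deque are all assumptions. It only demonstrates the rule stated above: the latest N timeouts must be consecutive, and all of them must fall inside the valid time window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical simplification of a consecutive-DNS-timeout detector; not AOSP code.
final class DnsTimeoutSketch {
    // Timestamps (ms) of the current streak of timeouts, oldest first.
    private final Deque<Long> timeoutTimestamps = new ArrayDeque<>();

    /** Records one DNS query result. Any non-timeout result breaks the streak. */
    void onQueryResult(boolean timedOut, long nowMs) {
        if (timedOut) {
            timeoutTimestamps.addLast(nowMs);
        } else {
            timeoutTimestamps.clear();
        }
    }

    /** True when the latest `threshold` consecutive timeouts all happened within `validMs`. */
    boolean isStallSuspected(int threshold, long validMs, long nowMs) {
        if (threshold <= 0 || timeoutTimestamps.size() < threshold) return false;
        // Walk back from the newest entry to the threshold-th most recent timeout.
        Iterator<Long> newestFirst = timeoutTimestamps.descendingIterator();
        long oldestRelevant = 0;
        for (int i = 0; i < threshold; i++) {
            oldestRelevant = newestFirst.next();
        }
        return nowMs - oldestRelevant < validMs;
    }
}
```

With the defaults from the article, a caller would pass a threshold of 5 and a window of 30 * 60 * 1000 ms. A real implementation would also bound the size of the buffer, which this sketch skips for brevity.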

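3. Follow-up handling

After NetworkMonitor finds that the current network has lost connectivity, it re-runs network validation and then notifies CS (ConnectivityService) of the result.

In addition, as soon as the stall is detected, a callback notifies CS, which in turn tells the apps bound to this network that it has stalled.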

    private void handleDataStallSuspected(
            @NonNull NetworkAgentInfo nai, long timestampMillis, int detectionMethod,
            @NonNull PersistableBundle extras) {
        final NetworkCapabilities networkCapabilities =
                getNetworkCapabilitiesWithoutUids(nai.networkCapabilities);
        final DataStallReport report =
                new DataStallReport(
                        nai.network,
                        timestampMillis,
                        detectionMethod,
                        nai.linkProperties,
                        networkCapabilities,
                        extras);
        final List<IConnectivityDiagnosticsCallback> results =
                getMatchingPermissionedCallbacks(nai);
        for (final IConnectivityDiagnosticsCallback cb : results) {
            try {
                cb.onDataStallSuspected(report);
            } catch (RemoteException ex) {
                loge("Error invoking onDataStallSuspected", ex);
            }
        }
    }

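4. Summary

Once a network has connected and validated, the data stall detection mechanism keeps checking its reachability. A stall is suspected when no TCP packets are being received and the packet failure rate reaches the 80% threshold, or when 5 consecutive DNS queries time out within 30 minutes. The stall is then reported to ConnectivityService.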
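The order in which the two signals are consulted can be condensed into a short sketch. The class and method below are hypothetical and written only for illustration. The sketch leaves out the metered-network back-off and the per-type enable flags that isDataStall() also checks, and it assumes the threshold comparison is "greater than or equal", which the excerpts in this article do not show.

```java
// Hypothetical condensation of the decision order in isDataStall(); not AOSP code.
final class DataStallDecisionSketch {
    // Default threshold quoted in the article (DEFAULT_TCP_PACKETS_FAIL_PERCENTAGE).
    static final int TCP_FAIL_PERCENTAGE_THRESHOLD = 80;

    static boolean isDataStall(int tcpReceivedCount, int tcpFailPercentage,
            boolean dnsStallSuspected) {
        Boolean result = null; // null means the TCP signal reached no verdict
        if (tcpReceivedCount > 0) {
            // Packets are arriving, so the network is treated as healthy.
            result = false;
        } else if (tcpFailPercentage >= TCP_FAIL_PERCENTAGE_THRESHOLD) {
            // Assumption: the comparison is >=; the excerpt only shows the constant.
            result = true;
        }
        // DNS is consulted only when TCP gave no verdict.
        if (result == null && dnsStallSuspected) {
            result = true;
        }
        return result != null && result;
    }

    public static void main(String[] args) {
        System.out.println(isDataStall(3, 95, true));  // packets received, DNS is ignored
        System.out.println(isDataStall(0, 90, false)); // TCP signal alone triggers
        System.out.println(isDataStall(0, 10, true));  // DNS fallback triggers
    }
}
```

The first case is the one that is easy to miss: as long as any packet was received in the latest polling cycle, the DNS signal is not consulted at all.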
