<div style="font: 14px/21px 微软雅黑; text-align: left; color: rgb(0, 0, 0); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;"><strong>1. NameNode failover: root cause analysis</strong></div><div style="font: 14px/21px 微软雅黑; text-align: left; color: rgb(0, 0, 0); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;">    This Hadoop cluster uses ZKFC for NameNode HA. After the carrier upgraded its network, ZooKeeper session timeouts like the one below started to occur frequently in the early morning hours.</div><div style="font: 14px/21px 微软雅黑; text-align: left; color: rgb(0, 0, 0); text-transform: none; text-indent: 0px; letter-spacing: normal; word-spacing: 0px; white-space: normal; font-size-adjust: none; font-stretch: normal; background-color: rgb(255, 255, 255); -webkit-text-stroke-width: 0px;">The ZKFC log is shown below. My initial assessment is that a network problem caused the failover; the session timeout at the time was 5000 ms.</div>
2015-06-23 11:34:53,393 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Caught an exception, leaving main loop due to Socket closed
2015-06-23 11:34:53,403 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at M-172-16-189-5/172.16.189.5:8020 active...
2015-06-23 11:34:55,672 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at M-172-16-189-5/172.16.189.5:8020 to active state
2015-06-24 02:00:10,088 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14e1e6b5b2d0001, likely server has closed socket, closing socket connection and attempting reconnect
2015-06-24 02:00:10,202 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
2015-06-24 02:00:10,642 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 172.16.189.46/172.16.189.46:2181. Will not attempt to authenticate using SASL (unknown error)
2015-06-24 02:00:10,643 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 172.16.189.46/172.16.189.46:2181, initiating session
2015-06-24 02:00:10,647 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x14e1e6b5b2d0001 has expired, closing socket connection
2015-06-24 02:00:10,650 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session expired. Entering neutral mode and rejoining...
2015-06-24 02:00:10,650 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2015-06-24 02:00:10,656 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=172.16.189.120:2181,172.16.189.46:2181,172.16.189.134:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@2540d625
2015-06-24 02:00:10,694 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 172.16.189.134/172.16.189.134:2181. Will not attempt to authenticate using SASL (unknown error)
2015-06-24 02:00:10,696 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to 172.16.189.134/172.16.189.134:2181, initiating session
2015-06-24 02:00:10,848 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 172.16.189.134/172.16.189.134:2181, sessionid = 0x34e1e6b62c90004, negotiated timeout = 5000
2015-06-24 02:00:10,856 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2015-06-24 02:00:10,857 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x14e1e6b5b2d0001
2015-06-24 02:00:10,857 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-06-24 02:00:10,864 INFO org.apache.hadoop.ha.ZKFailoverController: ZK Election indicated that NameNode at M-172-16-189-5/172.16.189.5:8020 should become standby
2015-06-24 02:00:10,874 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at M-172-16-189-5/172.16.189.5:8020 to standby state
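To check whether the session expirations really cluster in the early morning, the ZKFC log can be grouped by hour of day. The following is only a minimal sketch: the function name `expirations_by_hour` and the sample list are hypothetical, and it assumes the log keeps the timestamp layout and the "Session expired" message shown above.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the ZKFC log lines above,
# e.g. "2015-06-24 02:00:10,650 INFO ...". Group 1 captures the hour.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} (\d{2}):\d{2}:\d{2},\d{3} ")


def expirations_by_hour(lines):
    """Count ActiveStandbyElector 'Session expired' events per hour of day."""
    counts = Counter()
    for line in lines:
        if "ActiveStandbyElector: Session expired" not in line:
            continue
        match = TIMESTAMP_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two lines copied from the log above; only the first is an expiry event.
sample = [
    "2015-06-24 02:00:10,650 INFO org.apache.hadoop.ha.ActiveStandbyElector: "
    "Session expired. Entering neutral mode and rejoining...",
    "2015-06-24 02:00:10,856 INFO org.apache.hadoop.ha.ActiveStandbyElector: "
    "Session connected.",
]

if __name__ == "__main__":
    for hour, count in sorted(expirations_by_hour(sample).items()):
        print(f"{hour:02d}:00  {count} expiration(s)")
```

For a real log, pass an open file object instead of `sample`; the path of the ZKFC log depends on the installation and is not given here.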
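2. Solution
The servers are rented, so the network itself is outside our control. Instead, I tried increasing the ZKFC session timeout.
ha.zookeeper.session-timeout.ms was not set in core-site.xml, so the default of 5000 ms was in effect. I changed it as follows: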
        <property><name>ha.zookeeper.session-timeout.ms</name><value>10000</value><description>ms</description></property>
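Then restart ZKFC.
When changing this value, also check the ZooKeeper server configuration: the timeout requested by the client must fall within the range below, otherwise the server adjusts the negotiated timeout to the nearest bound.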
# The minimum session timeout in milliseconds that the server will allow the client to negotiate
minSessionTimeout=2000
# The maximum session timeout in milliseconds that the server will allow the client to negotiate
maxSessionTimeout=60000
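The range check can be sketched in a few lines. This is an illustration, not ZooKeeper's own code: the helper names `parse_zoo_cfg`, `session_bounds`, and `negotiated_timeout` are hypothetical. When the two bounds are not set explicitly, ZooKeeper derives them as 2 and 20 times `tickTime`; the fallback `tickTime` of 2000 ms used here is an assumption taken from the commonly shipped sample zoo.cfg.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg text, skipping blanks and comments."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


def session_bounds(cfg):
    """Return (min_ms, max_ms) for session timeouts.

    Unset bounds fall back to 2 * tickTime and 20 * tickTime.
    The tickTime fallback of 2000 ms is an assumption, not read from the server.
    """
    tick = int(cfg.get("tickTime", 2000))
    lower = int(cfg.get("minSessionTimeout", 2 * tick))
    upper = int(cfg.get("maxSessionTimeout", 20 * tick))
    return lower, upper


def negotiated_timeout(requested_ms, lower, upper):
    """Clamp the client's requested timeout into the server's allowed range."""
    return max(lower, min(requested_ms, upper))


ZOO_CFG = """
# values from the configuration shown above
minSessionTimeout=2000
maxSessionTimeout=60000
"""

if __name__ == "__main__":
    lower, upper = session_bounds(parse_zoo_cfg(ZOO_CFG))
    requested = 10000  # ha.zookeeper.session-timeout.ms from core-site.xml
    effective = negotiated_timeout(requested, lower, upper)
    print(f"requested={requested} ms, allowed=[{lower}, {upper}] ms, effective={effective} ms")
```

With the values above, the requested 10000 ms lies inside the allowed range, so it should be granted unchanged. The "negotiated timeout" field in the ZKFC log after the restart shows the value the server actually granted.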

   
