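The master node fails intermittently while everything else is fine: the NameNode on the standby node (SNN) has no problems and all the JournalNodes are up, but the NameNode on the master node keeps stopping.

Check the NameNode log on the Hadoop master node.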

2016-11-21 22:36:40,908 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19822 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
2016-11-21 22:36:41,088 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.58.183:8485, 192.168.58.181:8485, 192.168.58.182:8485], stream=QuorumOutputStream starting at txid 24533))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
    at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
    at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
    at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2645)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2520)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:579)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
2016-11-21 22:36:41,089 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 24533
2016-11-21 22:36:41,113 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-11-21 22:36:41,122 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave2/192.168.58.182:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave1/192.168.58.181:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: StandByNameNode/192.168.58.183:8485. Already tried 0 time(s); maxRetries=45
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20050ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.182:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20052ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.181:8485
2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20065ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.183:8485
2016-11-21 22:36:41,145 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CentOSMaster/192.168.58.180
************************************************************/
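
The QuorumJournalManager WARN lines report how long the NameNode had already waited against the configured timeout, so they show the problem building up before the FATAL error. As a rough sketch, assuming the log format shown above, the following pulls those numbers out of a NameNode log. The function name `find_slow_quorum_waits` and the 0.8 threshold are illustrative choices, not part of Hadoop.

```python
import re

# Matches QuorumJournalManager warnings such as:
# "... WARN ...QuorumJournalManager: Waited 19822 ms (timeout=20000 ms) for a response for sendEdits."
WAIT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN .*"
    r"Waited (?P<waited>\d+) ms \(timeout=(?P<timeout>\d+) ms\) "
    r"for a response for (?P<op>\w+)"
)


def find_slow_quorum_waits(lines, ratio=0.8):
    """Return (timestamp, operation, waited_ms, timeout_ms) tuples for
    every wait that reached at least `ratio` of its timeout."""
    hits = []
    for line in lines:
        m = WAIT_RE.search(line)
        if not m:
            continue  # not a "Waited N ms" warning
        waited = int(m.group("waited"))
        timeout = int(m.group("timeout"))
        if waited >= ratio * timeout:
            hits.append((m.group("ts"), m.group("op"), waited, timeout))
    return hits
```

To use it on a real log, pass an open file object, for example `find_slow_quorum_waits(open(path, encoding="utf-8"))`, where `path` is wherever your NameNode log lives.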

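  The 20000 ms limit in the log is the default value of dfs.qjournal.write-txns.timeout.ms. First make sure that dfs.namenode.edits.dir and dfs.journalnode.edits.dir are set, then raise the QJM timeouts in hdfs-site.xml as follows: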

<property><name>dfs.qjournal.start-segment.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.prepare-recovery.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.accept-recovery.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.finalize-segment.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.select-input-streams.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.get-journal-state.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.new-epoch.timeout.ms</name><value>600000000</value></property>
<property><name>dfs.qjournal.write-txns.timeout.ms</name><value>600000000</value></property>
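
With this many properties in one file it is easy to mistype a name or list one twice. The following is a minimal sketch, using only Python's standard `xml.etree.ElementTree`, that lists the `dfs.qjournal.*.timeout.ms` values an hdfs-site.xml actually contains and reports any name that appears more than once. It assumes the usual `<configuration><property><name>…</name><value>…</value></property></configuration>` layout; the function name `qjournal_timeouts` is hypothetical.

```python
import xml.etree.ElementTree as ET


def qjournal_timeouts(xml_text):
    """Return ({property name: timeout in ms}, [names listed more than once])
    for every dfs.qjournal.*.timeout.ms property in the given XML text."""
    root = ET.fromstring(xml_text)
    timeouts = {}
    duplicates = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if not (name.startswith("dfs.qjournal.") and name.endswith(".timeout.ms")):
            continue  # unrelated property
        if name in timeouts and name not in duplicates:
            duplicates.append(name)
        # This sketch simply keeps the last value it sees for a name.
        timeouts[name] = int(value)
    return timeouts, duplicates
```

Reading the file with `open(path, encoding="utf-8").read()` and passing the text in is enough to check a real configuration; `path` here is whatever location your installation uses.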

  This appears to have fixed it: no failures have occurred as of this morning.
