Reproducing the error:

flink run -m yarn-cluster -p 2 -yjm 700m -ytm 1024m -c WordCount target/bbb-1.0-SNAPSHOT.jar

The full error output:

 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Could not deploy Yarn job cluster.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:662)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:893)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:966)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:966)
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster.
    at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:398)
    at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:70)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1733)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:94)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:63)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1620)
    at WordCount.main(WordCount.java:47)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:321)
    ... 11 more
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1591614969089_0002 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1591614969089_0002_000001 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2020-06-08 19:18:12.457]Exception from container-launch.
Container id: container_1591614969089_0002_01_000001
Exit code: 1
[2020-06-08 19:18:12.466]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
[2020-06-08 19:18:12.467]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
For more detailed output, check the application tracking page: http://Desktop:8188/applicationhistory/app/application_1591614969089_0002 Then click on links to logs of each attempt.
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1591614969089_0002
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:391)
    ... 22 more
2020-06-08 19:18:12,659 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cancelling deployment from Deployment Failure Hook
2020-06-08 19:18:12,660 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at Desktop/192.168.0.103:8032
2020-06-08 19:18:12,661 INFO  org.apache.hadoop.yarn.client.AHSProxy                        - Connecting to Application History server at Desktop/192.168.0.103:10201
2020-06-08 19:18:12,661 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Killing YARN application
2020-06-08 19:18:12,668 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Killed application application_1591614969089_0002
2020-06-08 19:18:12,769 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deleting files in hdfs://Desktop:9000/user/appleyuchi/.flink/application_1591614969089_0002.
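
The client output above already names the command for digging further (yarn logs -applicationId ...). As a minimal sketch, the application ID can be pulled out of saved client output instead of being copied by hand. The file name client.log is an assumption for illustration, and yarn logs only returns something when log aggregation is enabled, as the message itself says.

```shell
#!/bin/sh
# Print the first YARN application ID found in the text read from stdin.
extract_app_id() {
    grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Hypothetical usage (client.log is an assumed file holding the saved client output):
#   app_id=$(extract_app_id < client.log)
#   yarn logs -applicationId "$app_id" | grep -A 10 'Caused by'
```

Filtering on 'Caused by' is only a convenience; the root cause is usually the last 'Caused by' entry in the container log.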

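This error is fairly hard to track down. First make sure the Hadoop log servers are running, i.e. that the jps output includes:

JobHistoryServer, which is started with:

$HADOOP_HOME/bin/mapred --daemon start historyserver

Then start the timeline server:

yarn timelineserver

After these steps, every port of the YARN web UI should be reachable.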
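
A quick way to confirm that both servers are up is to scan the jps output for their process names. The sketch below assumes the timeline server shows up in jps as ApplicationHistoryServer (its main class name); verify the name on your own cluster before relying on it.

```shell
#!/bin/sh
# Print each required daemon name that is missing from jps-style output on stdin.
# jps prints one "<pid> <MainClass>" pair per line, so the name is matched in column 2.
missing_daemons() {
    listing=$(cat)
    for name in "$@"; do
        printf '%s\n' "$listing" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || printf '%s\n' "$name"
    done
}

# Hypothetical usage (ApplicationHistoryServer is an assumed process name):
#   jps | missing_daemons JobHistoryServer ApplicationHistoryServer
```

An empty result means every listed daemon was found.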
#######################################################################################

The logs in the YARN web UI then show the following error:

2020-06-08 19:21:02,071 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Shutting YarnJobClusterEntrypoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
    at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:261)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:215)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
    at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
Caused by: java.net.BindException: Could not start rest endpoint on any port in port range 8082
    at org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:228)
    at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:165)
    ... 9 more
.
2020-06-08 19:21:02,076 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:37633
2020-06-08 19:21:02,077 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2020-06-08 19:21:02,082 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2020-06-08 19:21:02,087 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2020-06-08 19:21:02,088 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2020-06-08 19:21:02,095 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2020-06-08 19:21:02,095 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2020-06-08 19:21:02,110 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2020-06-08 19:21:02,110 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2020-06-08 19:21:02,130 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2020-06-08 19:21:02,131 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2020-06-08 19:21:02,132 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Could not start cluster entrypoint YarnJobClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint.
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
    at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:119)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
    at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:261)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:215)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
    ... 2 more
Caused by: java.net.BindException: Could not start rest endpoint on any port in port range 8082
    at org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:228)
    at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:165)
    ... 9 more

##############################################################

So it is a port problem. Yet port 8082 was not actually in use, which left me baffled for a while.
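
For reference, one way to check whether a port is really taken is to look for it among the listening TCP sockets. This is a minimal sketch that assumes a Linux host where ss -ltn (from iproute2) is available, with the local address in the fourth column. The check has to run on the node where the ApplicationMaster container was launched, which is not necessarily the machine the job was submitted from.

```shell
#!/bin/sh
# Return 0 if the "ss -ltn"-style listing on stdin has a LISTEN socket on the given port.
# The fourth column is "Local Address:Port"; the match is anchored so 18082 does not count as 8082.
port_listening() {
    awk -v p="$1" '$1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Hypothetical usage:
#   if ss -ltn | port_listening 8082; then echo "8082 is in use"; else echo "8082 is free"; fi
```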

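Cause of the mistake:

The port must be the same in two files, flink-conf.yaml and masters. I forgot to update the masters file, and that is what produced the convoluted error above.

The default port 8081 had to be changed to 8082 because 8081 was already taken by Spark. After editing flink-conf.yaml I got carried away and forgot about the other file.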

Final solution:

flink-conf.yaml: rest.port: 8082

masters: Desktop:8082
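
To keep this mismatch from creeping back in, the two files can be compared mechanically. The sketch below is an illustration only: the function names are made up here, and it covers just the simple case shown above, a single numeric rest.port and host:port lines in masters. Masters entries without an explicit port are ignored.

```shell
#!/bin/sh
# Print the port from the first uncommented "rest.port: N" line of a flink-conf.yaml-style file.
rest_port() {
    sed -n 's/^rest\.port:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Print the port of every "host:port" line in a masters-style file.
masters_ports() {
    sed -n 's/^[^#:]*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1"
}

# Return 0 only if rest.port is set and every masters entry with a port uses the same value.
ports_consistent() {
    want=$(rest_port "$1")
    [ -n "$want" ] || return 1
    for got in $(masters_ports "$2"); do
        [ "$got" = "$want" ] || return 1
    done
    return 0
}

# Hypothetical usage (the conf directory path is an assumption):
#   ports_consistent "$FLINK_HOME/conf/flink-conf.yaml" "$FLINK_HOME/conf/masters" \
#       || echo "rest.port and masters disagree"
```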

Then don't forget to sync both files to the other nodes in the cluster.
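
The sync step can be scripted roughly as follows. This is a sketch under assumptions: the conf directory path and host names are placeholders, the Flink installation is assumed to live at the same path on every node, and passwordless scp between nodes is assumed to work. By default it only prints the commands; set DRY_RUN=0 to actually copy.

```shell
#!/bin/sh
# Copy flink-conf.yaml and masters from the given conf directory to the same path on each host.
# With DRY_RUN unset or set to 1, the scp commands are printed instead of executed.
sync_flink_conf() {
    conf_dir=$1
    shift
    for host in "$@"; do
        for f in flink-conf.yaml masters; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                printf 'scp %s/%s %s:%s/%s\n' "$conf_dir" "$f" "$host" "$conf_dir" "$f"
            else
                scp "$conf_dir/$f" "$host:$conf_dir/$f"
            fi
        done
    done
}

# Hypothetical usage (path and host names are placeholders):
#   sync_flink_conf /opt/flink/conf node1 node2
```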

Finally, close every open terminal and start a fresh one. In my experience the new configuration only took effect in a new terminal session.
