On a 4-core / 16 GB virtual machine I installed the full Hadoop ecosystem stack, using community Spark 2.3 for the Spark2 service.
After installation I ran the bundled sample job org.apache.spark.examples.SparkPi as a smoke test, and it failed with the following error:

Spark context stopped while waiting for backend

The full error log:

[root@cdh-001 conf]# spark-submit     --class org.apache.spark.examples.SparkPi     --master yarn     --deploy-mode client     --driver-memory 600m     --executor-memory 600m     --executor-cores 2     $SPARK_HOME/examples/jars/spark-examples*.jar     10
2021-03-12 15:05:46 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:46 INFO  SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:46 INFO  SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:46 INFO  SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:46 INFO  SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:46 INFO  SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:46 INFO  SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:46 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:46 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 40299.
2021-03-12 15:05:46 INFO  SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:46 INFO  SparkEnv:54 - Registering BlockManagerMaster
2021-03-12 15:05:46 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2021-03-12 15:05:46 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2021-03-12 15:05:46 INFO  DiskBlockManager:54 - Created local directory at /tmp/blockmgr-b49e6294-3bf9-42f3-956e-897aa442cf58
2021-03-12 15:05:46 INFO  MemoryStore:54 - MemoryStore started with capacity 140.1 MB
2021-03-12 15:05:47 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2021-03-12 15:05:47 INFO  log:192 - Logging initialized @1691ms
2021-03-12 15:05:47 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2021-03-12 15:05:47 INFO  Server:414 - Started @1762ms
2021-03-12 15:05:47 INFO  AbstractConnector:278 - Started ServerConnector@2a551a63{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2021-03-12 15:05:47 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7cd1ac19{/jobs,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2a2bb0eb{/jobs/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c291aad{/jobs/job,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@733037{/jobs/job/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7728643a{/stages,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@320e400{/stages/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5167268{/stages/stage,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2c444798{/stages/stage/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1af7f54a{/stages/pool,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6ebd78d1{/stages/pool/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@436390f4{/storage,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d157787{/storage/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@68ed96ca{/storage/rdd,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6d1310f6{/storage/rdd/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3228d990{/environment,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54e7391d{/environment/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@50b8ae8d{/executors,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@255990cc{/executors/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@51c929ae{/executors/threadDump,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c8bdd5b{/executors/threadDump/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@29d2d081{/static,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@60afd40d{/,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28a2a3e7{/api,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@33a2499c{/jobs/job/kill,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e72dba7{/stages/stage/kill,null,AVAILABLE,@Spark}
2021-03-12 15:05:47 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://cdh-001:4040
2021-03-12 15:05:47 INFO  SparkContext:54 - Added JAR file:/data/my_bdc_apps/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://cdh-001:40299/jars/spark-examples_2.11-2.3.0.jar with timestamp 1615532747291
2021-03-12 15:05:47 INFO  RMProxy:98 - Connecting to ResourceManager at cdh-002/10.6.2.245:8032
2021-03-12 15:05:48 INFO  Client:54 - Requesting a new application from cluster with 4 NodeManagers
2021-03-12 15:05:48 INFO  Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
2021-03-12 15:05:48 INFO  Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead
2021-03-12 15:05:48 INFO  Client:54 - Setting up container launch context for our AM
2021-03-12 15:05:48 INFO  Client:54 - Setting up the launch environment for our AM container
2021-03-12 15:05:48 INFO  Client:54 - Preparing resources for our AM container
2021-03-12 15:05:49 WARN  Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2021-03-12 15:05:50 INFO  Client:54 - Uploading resource file:/tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac/__spark_libs__6875100117796151482.zip -> hdfs://nameservice1/user/root/.sparkStaging/application_1615455919840_0004/__spark_libs__6875100117796151482.zip
2021-03-12 15:05:51 INFO  Client:54 - Uploading resource file:/tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac/__spark_conf__38977155164719923.zip -> hdfs://nameservice1/user/root/.sparkStaging/application_1615455919840_0004/__spark_conf__.zip
2021-03-12 15:05:51 INFO  SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:51 INFO  SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:51 INFO  SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:51 INFO  SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:51 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:51 INFO  Client:54 - Submitting application application_1615455919840_0004 to ResourceManager
2021-03-12 15:05:51 INFO  YarnClientImpl:273 - Submitted application application_1615455919840_0004
2021-03-12 15:05:51 INFO  SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1615455919840_0004 and attemptId None
2021-03-12 15:05:52 INFO  Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:52 INFO  Client:54 -
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1615532751522
	 final status: UNDEFINED
	 tracking URL: http://cdh-002:8088/proxy/application_1615455919840_0004/
	 user: root
2021-03-12 15:05:53 INFO  Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:54 INFO  Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:55 INFO  Client:54 - Application report for application_1615455919840_0004 (state: ACCEPTED)
2021-03-12 15:05:55 INFO  YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> cdh-002, PROXY_URI_BASES -> http://cdh-002:8088/proxy/application_1615455919840_0004), /proxy/application_1615455919840_0004
2021-03-12 15:05:55 INFO  JettyUtils:54 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-03-12 15:05:56 INFO  YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2021-03-12 15:05:56 INFO  Client:54 - Application report for application_1615455919840_0004 (state: RUNNING)
2021-03-12 15:05:56 INFO  Client:54 -
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: 10.6.2.248
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1615532751522
	 final status: UNDEFINED
	 tracking URL: http://cdh-002:8088/proxy/application_1615455919840_0004/
	 user: root
2021-03-12 15:05:56 INFO  YarnClientSchedulerBackend:54 - Application application_1615455919840_0004 has started running.
2021-03-12 15:05:56 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46294.
2021-03-12 15:05:56 INFO  NettyBlockTransferService:54 - Server created on cdh-001:46294
2021-03-12 15:05:56 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2021-03-12 15:05:56 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO  BlockManagerMasterEndpoint:54 - Registering block manager cdh-001:46294 with 140.1 MB RAM, BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, cdh-001, 46294, None)
2021-03-12 15:05:56 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@626b639e{/metrics/json,null,AVAILABLE,@Spark}
2021-03-12 15:05:59 INFO  YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> cdh-002, PROXY_URI_BASES -> http://cdh-002:8088/proxy/application_1615455919840_0004), /proxy/application_1615455919840_0004
2021-03-12 15:05:59 INFO  JettyUtils:54 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2021-03-12 15:05:59 INFO  YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2021-03-12 15:06:03 ERROR YarnClientSchedulerBackend:70 - Yarn application has already exited with state FINISHED!
2021-03-12 15:06:03 INFO  AbstractConnector:318 - Stopped Spark@2a551a63{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2021-03-12 15:06:03 INFO  SparkUI:54 - Stopped Spark web UI at http://cdh-001:4040
2021-03-12 15:06:03 ERROR TransportClient:233 - Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 ERROR YarnSchedulerBackend$YarnSchedulerEndpoint:91 - Sending RequestExecutors(0,0,Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
	at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
	at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
	at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 INFO  SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices
(serviceOption=None,services=List(),started=false)
2021-03-12 15:06:03 ERROR Utils:91 - Uncaught exception in thread Yarn application state monitor
org.apache.spark.SparkException: Exception thrown in awaitResult:
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:566)
	at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:95)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:155)
	at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:508)
	at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1752)
	at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1924)
	at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
	at org.apache.spark.SparkContext.stop(SparkContext.scala:1923)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:112)
Caused by: java.io.IOException: Failed to send RPC 8719934078843937066 to /10.6.2.248:50270: java.nio.channels.ClosedChannelException
	at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
	at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
	at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
	at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
2021-03-12 15:06:03 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2021-03-12 15:06:03 INFO  MemoryStore:54 - MemoryStore cleared
2021-03-12 15:06:03 INFO  BlockManager:54 - BlockManager stopped
2021-03-12 15:06:03 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2021-03-12 15:06:03 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2021-03-12 15:06:03 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
	at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:669)
	at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:177)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:558)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-03-12 15:06:03 INFO  SparkContext:54 - SparkContext already stopped.
2021-03-12 15:06:03 INFO  SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" java.lang.IllegalStateException: Spark context stopped while waiting for backend
	at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:669)
	at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:177)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:558)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2021-03-12 15:06:03 INFO  ShutdownHookManager:54 - Shutdown hook called
2021-03-12 15:06:03 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-ad5a6ab4-8aa4-4c06-a783-02195cd8c569
2021-03-12 15:06:03 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-bf6b7bbf-3512-4522-8b31-47a6a65187ac

My guess: this server already runs a large number of Hadoop-ecosystem components, and Spark was installed on top of them later. During spark-submit the available memory is checked, and when it is found to be insufficient, the job fails with the error above.

With the full Hadoop stack installed, the machine has little free memory left, so I had configured the following Spark runtime parameters:

  • --driver-memory 600m
  • --executor-memory 600m
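For context on what these flags translate to on the YARN side, the container request can be sketched as below. This is an illustrative approximation of Spark 2.3's sizing rule (overhead = max(384 MB, 10% of the heap), then rounding up to YARN's allocation granularity), not Spark's actual code; the 1024 MB default for yarn.scheduler.minimum-allocation-mb is an assumption about this cluster.

```python
import math

# Spark 2.3 defaults for spark.{driver,executor}.memoryOverhead
MEMORY_OVERHEAD_MIN_MB = 384
MEMORY_OVERHEAD_FACTOR = 0.10

def container_request_mb(heap_mb, yarn_min_allocation_mb=1024):
    """Approximate the YARN container size for a given Spark heap (MB)."""
    overhead = max(MEMORY_OVERHEAD_MIN_MB, int(heap_mb * MEMORY_OVERHEAD_FACTOR))
    requested = heap_mb + overhead
    # YARN rounds the request up to a multiple of its minimum allocation.
    return math.ceil(requested / yarn_min_allocation_mb) * yarn_min_allocation_mb

print(container_request_mb(600))  # 600 + 384 = 984 MB requested, rounded up to 1024
```

This matches the shape of the log line above ("Will allocate AM container, with 896 MB memory including 384 MB overhead"): the overhead is added on top of the heap before YARN rounds the request.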

By pushing these two parameters to different extreme values, we can infer from the logs how much memory the program needs versus what the server currently allows. Test results below.
① First test

[root@cdh-001 conf]# spark-submit     --class org.apache.spark.examples.SparkPi     --master yarn     --deploy-mode client     --driver-memory 500m     --executor-memory 100m     --executor-cores 2     $SPARK_HOME/examples/jars/spark-examples*.jar     10
2021-03-12 15:05:17 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:18 INFO  SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:18 INFO  SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:18 INFO  SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:18 INFO  SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:18 INFO  SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:18 INFO  SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:18 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:18 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 43552.
2021-03-12 15:05:18 INFO  SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:18 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
	at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:217)
	at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)

② Second test

[root@cdh-001 conf]# spark-submit     --class org.apache.spark.examples.SparkPi     --master yarn     --deploy-mode client     --driver-memory 600m     --executor-memory 100m     --executor-cores 2     $SPARK_HOME/examples/jars/spark-examples*.jar     10
2021-03-12 15:05:31 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-12 15:05:31 INFO  SparkContext:54 - Running Spark version 2.3.0
2021-03-12 15:05:31 INFO  SparkContext:54 - Submitted application: Spark Pi
2021-03-12 15:05:31 INFO  SecurityManager:54 - Changing view acls to: root
2021-03-12 15:05:31 INFO  SecurityManager:54 - Changing modify acls to: root
2021-03-12 15:05:31 INFO  SecurityManager:54 - Changing view acls groups to:
2021-03-12 15:05:31 INFO  SecurityManager:54 - Changing modify acls groups to:
2021-03-12 15:05:31 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2021-03-12 15:05:31 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 39530.
2021-03-12 15:05:31 INFO  SparkEnv:54 - Registering MapOutputTracker
2021-03-12 15:05:31 ERROR SparkContext:91 - Error initializing SparkContext.
java.lang.IllegalArgumentException: Executor memory 104857600 must be at least 471859200. Please increase executor memory using the --executor-memory option or spark.executor.memory in Spark configuration.
	at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:225)
	at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:199)
	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:330)
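Both error messages quote the same floor, 471859200 bytes. That number comes from Spark 2.x's UnifiedMemoryManager, which reserves 300 MiB for internal use and requires each JVM to have at least 1.5× the reservation. The sketch below reproduces the arithmetic (an illustration of the check, not Spark's actual source):

```python
# Spark 2.x reserves 300 MiB and requires at least 1.5x that reservation.
RESERVED_SYSTEM_MEMORY = 300 * 1024 * 1024            # 314572800 bytes
MIN_SYSTEM_MEMORY = int(RESERVED_SYSTEM_MEMORY * 1.5)  # 471859200 bytes = 450 MiB

def check_memory(available_bytes, role):
    """Mimic the minimum-memory check that produced the errors above."""
    if available_bytes < MIN_SYSTEM_MEMORY:
        raise ValueError(
            f"{role} memory {available_bytes} must be at least {MIN_SYSTEM_MEMORY}.")

print(MIN_SYSTEM_MEMORY)                      # 471859200
check_memory(600 * 1024 * 1024, "Executor")   # 600m clears the 450 MiB floor
```

This explains the two failures: --executor-memory 100m (104857600 bytes) is below the floor outright, and --driver-memory 500m leaves a usable heap of 466092032 bytes, just under it.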

On the surface, it looks as if the server lacks the resources to allocate the requested memory when Spark runs.
The machine has 16 GB of RAM, and even with the full stack installed, a free -g check shows more available memory than the limits quoted in the errors.
But since the job runs as Spark on YARN, could the failure instead stem from a YARN configuration limit or misconfiguration?

As it turns out, the YARN NodeManager checks whether the memory requested by a submitted task (here, Spark on YARN) can actually be granted, validating both physical and virtual memory usage. On a machine without much memory headroom, containers can fail these checks and be killed.
One workaround is to disable these checks, so the job starts without the up-front memory validation (it may still fail later if physical memory genuinely runs out).

Solution:

After adding the following properties to yarn-site.xml and restarting YARN, the job runs successfully with "--driver-memory 600m --executor-memory 600m":

<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
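Note that disabling both checks removes YARN's protection against runaway containers. A gentler alternative, not tested on this cluster, is to keep the checks on but loosen the virtual-memory allowance: yarn.nodemanager.vmem-pmem-ratio (default 2.1) controls how much virtual memory a container may use per MB of physical memory. The value 4 below is a hypothetical example:

```xml
<!-- Alternative: keep the checks enabled but allow more virtual memory per
     MB of physical memory (default ratio is 2.1). The value 4 is illustrative. -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
```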
