Problems running jupyter-notebook (PySpark) in YARN mode, and how I worked through them
Original post by 小白programmer, published 2018-11-21 10:53:01

Previously I had only run standalone programs on a single PySpark virtual machine; now I want to try distributed computation.
Before starting I consulted books and blog posts, but I kept hitting one problem after another and could not get it working, so I am recording the whole process here.
There are two virtual machines in total: one acts as master and the other as slave1.

Installing Spark on the slave1 virtual machine
Hadoop was already installed on slave1 and Hadoop cluster jobs ran successfully there, so I will not repeat those steps.
Copy the Spark installation directory from master over to slave1, then:
(1) In the spark/conf directory, copy slaves.template to slaves and add slave1 to it.

(2) Add the Spark paths to /etc/profile.

Steps (1) and (2) must be done on both master and slave1.
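Concretely, the two steps might look like the sketch below. The /hadoop/spark install path is taken from the logs later in this post; the exact file locations are otherwise an assumption, so adjust to your layout:

```shell
# (1) In $SPARK_HOME/conf: register the worker in the slaves file.
#     (Shown against a scratch copy in the current directory so the
#      commands are safe to try; in practice start from slaves.template.)
SPARK_CONF=${SPARK_CONF:-.}
touch "$SPARK_CONF/slaves"
grep -qx slave1 "$SPARK_CONF/slaves" || echo slave1 >> "$SPARK_CONF/slaves"

# (2) Lines to append to /etc/profile on BOTH master and slave1
#     (path assumed from the logs in this post):
export SPARK_HOME=/hadoop/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
```

Remember to `source /etc/profile` (or log in again) after editing it so the new variables take effect.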

Installing Anaconda on slave1
You can simply scp the Anaconda directory from master over to slave1 and then edit /etc/profile accordingly; the screenshot above showed the exact lines added.
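For reference, the copy and the /etc/profile addition might look like the following. The Anaconda install path (/root/anaconda3) is an assumption, not taken from the post; substitute your actual directory:

```shell
# Assumed install location -- replace with your real Anaconda directory.
ANACONDA_HOME=/root/anaconda3

# One-time copy from master to slave1 (commented out here; run it once):
# scp -r /root/anaconda3 root@slave1:/root/

# Then append to /etc/profile on slave1 so jupyter/python resolve first:
export PATH=$ANACONDA_HOME/bin:$PATH
```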

Starting everything up — this is where the problems began
Run start-all.sh in a terminal on master; jps shows that the daemons on both master and slave1 come up normally.
Then, on master, run:

HADOOP_CONF_DIR=/hadoop/hadoop/etc/hadoop PYSPARK_DRIVER_PYTHON="jupyter" PYSPARK_DRIVER_PYTHON_OPTS="notebook" MASTER=yarn-client pyspark

According to the references, if HADOOP_CONF_DIR is not configured in spark-env.sh, it has to be supplied on the command line like this. jupyter-notebook did start successfully, but when I evaluated sc.master in a notebook to see which mode it was running in, I got a pile of errors:

[root@master home]# HADOOP_CONF_IR=/hadoop/hadoop/etc/hadoop PYSPARK_DRIVER_PYTHON="jupyter" PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark

[I 18:58:24.475 NotebookApp] [nb_conda_kernels] enabled, 2 kernels found
[I 18:58:25.101 NotebookApp] ✓ nbpresent HTML export ENABLED
[W 18:58:25.101 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 18:58:25.163 NotebookApp] [nb_anacondacloud] enabled
[I 18:58:25.167 NotebookApp] [nb_conda] enabled
[I 18:58:25.167 NotebookApp] Serving notebooks from local directory: /home
[I 18:58:25.167 NotebookApp] 0 active kernels
[I 18:58:25.168 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 18:58:25.168 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 18:58:33.844 NotebookApp] Kernel started: c15aabde-b441-45f2-b78d-9933e6534c27
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
        at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:263)
        at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:240)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:116)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /hadoop/spark/python/pyspark/shell.py:
[I 19:00:33.829 NotebookApp] Saving file at /Untitled2.ipynb
[I 19:00:57.754 NotebookApp] Creating new notebook in
[I 19:00:59.174 NotebookApp] Kernel started: ebfbdfd5-2343-4149-9fef-28877967d6c6
Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
        at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:263)
        at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:240)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:116)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /hadoop/spark/python/pyspark/shell.py:
[I 19:01:12.315 NotebookApp] Saving file at /Untitled3.ipynb
^C[I 19:01:15.971 NotebookApp] interrupted
Serving notebooks from local directory: /home
2 active kernels
The Jupyter Notebook is running at: http://localhost:8888/
Shutdown this notebook server (y/[n])? y
[C 19:01:17.674 NotebookApp] Shutdown confirmed
[I 19:01:17.675 NotebookApp] Shutting down kernels
[I 19:01:18.189 NotebookApp] Kernel shutdown: ebfbdfd5-2343-4149-9fef-28877967d6c6
[I 19:01:18.190 NotebookApp] Kernel shutdown: c15aabde-b441-45f2-b78d-9933e6534c27
The log shows:

Exception in thread "main" java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.

(Note, incidentally, that in the session above the variable was typed as HADOOP_CONF_IR — missing the D — which by itself would trigger exactly this error.) So I configured spark-env.sh instead.
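The relevant line in $SPARK_HOME/conf/spark-env.sh would look like the following sketch; the /hadoop/hadoop/etc/hadoop path matches the one used on the command line earlier in this post:

```shell
# $SPARK_HOME/conf/spark-env.sh
# Point Spark at the Hadoop client configuration so that
# `pyspark --master yarn` can find the ResourceManager and HDFS.
export HADOOP_CONF_DIR=/hadoop/hadoop/etc/hadoop
export YARN_CONF_DIR=/hadoop/hadoop/etc/hadoop
```

With this in place, HADOOP_CONF_DIR no longer has to be repeated on every pyspark invocation.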

Running again:

[root@master conf]# HADOOP_CONF_DIR=/hadoop/hadoop/etc/hadoop pyspark --master yarn --deploy-mode client

[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 19:15:28.816 NotebookApp] [nb_conda_kernels] enabled, 2 kernels found
[I 19:15:28.923 NotebookApp] ✓ nbpresent HTML export ENABLED
[W 19:15:28.923 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 19:15:28.986 NotebookApp] [nb_anacondacloud] enabled
[I 19:15:28.989 NotebookApp] [nb_conda] enabled
[I 19:15:28.990 NotebookApp] Serving notebooks from local directory: /hadoop/spark/conf
[I 19:15:28.990 NotebookApp] 0 active kernels
[I 19:15:28.990 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 19:15:28.990 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 19:15:44.862 NotebookApp] Creating new notebook in
[I 19:15:45.742 NotebookApp] Kernel started: 98d8605a-804a-47af-83fb-2efc8b5a3d60
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/11/20 19:15:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/11/20 19:15:51 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
[W 19:15:55.943 NotebookApp] Timeout waiting for kernel_info reply from 98d8605a-804a-47af-83fb-2efc8b5a3d60
18/11/20 19:16:11 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
18/11/20 19:16:11 ERROR client.TransportClient: Failed to send RPC 7790789781121901013 to /192.168.127.131:55928: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:11 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 7790789781121901013 to /192.168.127.131:55928: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
        at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
        at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:11 ERROR util.Utils: Uncaught exception in thread Thread-2
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:551)
        at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
        at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:517)
        at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1652)
        at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1921)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1317)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1920)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to send RPC 7790789781121901013 to /192.168.127.131:55928: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
        at io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
        at io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        ... 1 more
Caused by: java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:11 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
18/11/20 19:16:11 WARN spark.SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:236)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)
18/11/20 19:16:11 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
18/11/20 19:16:29 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
18/11/20 19:16:29 ERROR client.TransportClient: Failed to send RPC 6243011927050432229 to /192.168.127.131:59702: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:29 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 6243011927050432229 to /192.168.127.131:59702: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:852)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:738)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:733)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:725)
        at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:35)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1062)
        at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1051)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:29 ERROR util.Utils: Uncaught exception in thread Thread-2
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.requestTotalExecutors(CoarseGrainedSchedulerBackend.scala:551)
        at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.stop(YarnSchedulerBackend.scala:93)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:151)
        at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:517)
        at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1652)
        at org.apache.spark.SparkContext$$anonfun$stop$8.apply$mcV$sp(SparkContext.scala:1921)
        at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1317)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1920)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:236)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to send RPC 6243011927050432229 to /192.168.127.131:59702: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:852)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:738)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:733)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:725)
        at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:35)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1062)
        at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1116)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1051)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        ... 1 more
Caused by: java.nio.channels.ClosedChannelException
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/20 19:16:29 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /hadoop/spark/python/pyspark/shell.py:
[I 19:17:00.221 NotebookApp] Saving file at /Untitled.ipynb
^C[I 19:17:03.428 NotebookApp] interrupted
Serving notebooks from local directory: /hadoop/spark/conf
1 active kernels
The Jupyter Notebook is running at: http://localhost:8888/
Shutdown this notebook server (y/[n])? y
[C 19:17:04.983 NotebookApp] Shutdown confirmed
[I 19:17:04.983 NotebookApp] Shutting down kernels
[I 19:17:05.587 NotebookApp] Kernel shutdown: 98d8605a-804a-47af-83fb-2efc8b5a3d60
Two main errors appear in this log:

(1)

18/11/20 19:16:11 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

(2)

Caused by: java.io.IOException: Failed to send RPC 7790789781121901013 to /192.168.127.131:55928: java.nio.channels.ClosedChannelException
Searching for each of these errors turned up two common suggestions: insufficient memory, and the VM needing at least two CPU cores.
For the memory issue, I added two entries to yarn-site.xml — the last two entries shown in the screenshot below.
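The original screenshot is not preserved, so which two entries were added is not recorded. A pair of yarn-site.xml settings commonly recommended for this symptom — disabling YARN's physical/virtual memory checks so containers on a small VM are not killed — is sketched below; these are standard YARN property names, but whether they match the two entries in the screenshot is an assumption:

```xml
<!-- yarn-site.xml: stop YARN from killing containers that exceed the
     (tight) memory limits of a small VM. Assumed to correspond to the
     two entries mentioned above. -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```

Restart YARN (stop-yarn.sh then start-yarn.sh) after changing yarn-site.xml so the NodeManagers pick up the new values.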

I also changed the VM settings to give slave1 two processors, making it a dual-core machine.
The same errors still appeared, however. I kept making changes (I can no longer say exactly which one mattered), ran it again, and a different error showed up:

[root@master hadoop]# pyspark --master yarn

[TerminalIPythonApp] WARNING | Subcommand
`ipython notebook` is deprecated and will be removed in future versions.

[TerminalIPythonApp] WARNING | You likely
want to use `jupyter notebook` in the future

[I 21:04:49.200 NotebookApp]
[nb_conda_kernels] enabled, 2 kernels found

[I 21:04:49.310 NotebookApp] ✓ nbpresent HTML export ENABLED

[W 21:04:49.310 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'

[I 21:04:49.373 NotebookApp]
[nb_anacondacloud] enabled

[I 21:04:49.376 NotebookApp] [nb_conda]
enabled

[I 21:04:49.377 NotebookApp] Serving
notebooks from local directory: /hadoop/hadoop/etc/hadoop

[I 21:04:49.377 NotebookApp] 0 active
kernels

[I 21:04:49.377 NotebookApp] The Jupyter
Notebook is running at: http://localhost:8888/

[I 21:04:49.377 NotebookApp] Use Control-C
to stop this server and shut down all kernels (twice to skip confirmation).

[I 21:04:54.440 NotebookApp] Creating new
notebook in

[I 21:04:55.832 NotebookApp] Kernel
started: c526700a-7ee9-4bdc-9bf1-675db15d1799

SLF4J: Class path contains multiple SLF4J
bindings.

SLF4J: Found binding in
[jar:file:/hadoop/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in
[jar:file:/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See
http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type
[org.slf4j.impl.Log4jLoggerFactory]

Setting default log level to
"WARN".

To adjust logging level use sc.setLogLevel(newLevel).
For SparkR, use setLogLevel(newLevel).

18/11/20 21:04:59 WARN util.NativeCodeLoader:
Unable to load native-hadoop library for your platform... using builtin-java
classes where applicable

18/11/20 21:05:02 WARN yarn.Client: Neither
spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading
libraries under SPARK_HOME.

[W 21:05:05.954 NotebookApp] Timeout
waiting for kernel_info reply from c526700a-7ee9-4bdc-9bf1-675db15d1799

18/11/20 21:06:09 WARN hdfs.DFSClient:
DataStreamer Exception

org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /user/root/.sparkStaging/application_1542716519992_0009/__spark_libs__6100798743446340760.zip
could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s)
are excluded in this operation.

at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1562)

at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3245)

at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:663)

at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)

at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)

at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)

at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)

at
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)

at
java.security.AccessController.doPrivileged(Native Method)

at
javax.security.auth.Subject.doAs(Subject.java:422)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)

at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)

at
org.apache.hadoop.ipc.Client.call(Client.java:1470)

at
org.apache.hadoop.ipc.Client.call(Client.java:1401)

at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)

at
com.sun.proxy.$Proxy11.addBlock(Unknown Source)

at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)

at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at
java.lang.reflect.Method.invoke(Method.java:498)

at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)

at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at
com.sun.proxy.$Proxy12.addBlock(Unknown Source)

at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1528)

at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1345)

at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:587)

18/11/20 21:06:09 ERROR spark.SparkContext:
Error initializing SparkContext.

org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /user/root/.sparkStaging/application_1542716519992_0009/__spark_libs__6100798743446340760.zip
could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s)
are excluded in this operation.

at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1562)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3245)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:663)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
        at org.apache.hadoop.ipc.Client.call(Client.java:1470)
        at org.apache.hadoop.ipc.Client.call(Client.java:1401)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1528)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1345)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:587)

18/11/20 21:06:09 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/11/20 21:06:09 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
18/11/20 21:06:09 WARN spark.SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor). This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:236)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:214)
java.lang.Thread.run(Thread.java:748)
18/11/20 21:06:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.

[I 21:06:55.876 NotebookApp] Saving file at /Untitled.ipynb

18/11/20 21:07:16 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/.sparkStaging/application_1542716519992_0010/__spark_libs__8564260734942060287.zip could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and 1 node(s) are excluded in this operation.

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1562)
        ... (remaining frames identical to the stack trace above)

18/11/20 21:07:16 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/.sparkStaging/application_1542716519992_0010/__spark_libs__8564260734942060287.zip could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and 1 node(s) are excluded in this operation.

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1562)
        ... (remaining frames identical to the stack trace above)

18/11/20 21:07:16 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/11/20 21:07:16 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /hadoop/spark/python/pyspark/shell.py:
[I 21:07:36.291 NotebookApp] Saving file at /Untitled.ipynb
[I 21:07:42.092 NotebookApp] Kernel shutdown: c526700a-7ee9-4bdc-9bf1-675db15d1799
[W 21:07:42.095 NotebookApp] delete /Untitled.ipynb
^C[I 21:07:46.458 NotebookApp] interrupted
Serving notebooks from local directory: /hadoop/hadoop/etc/hadoop
0 active kernels
The Jupyter Notebook is running at: http://localhost:8888/
Shutdown this notebook server (y/[n])? y
[C 21:07:48.224 NotebookApp] Shutdown confirmed
[I 21:07:48.225 NotebookApp] Shutting down kernels
Following the clue in the log ("There are 0 datanode(s) running"), I checked HDFS disk usage with:

hadoop dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
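Every field at zero means the NameNode sees no registered datanodes at all, which matches the "0 datanode(s) running" error in the Spark log. This condition is easy to detect mechanically; the sketch below uses a saved copy of the report, and `check_report` is a hypothetical helper name, not a Hadoop command:

```shell
#!/bin/sh
# Hypothetical helper: inspect a saved `hadoop dfsadmin -report` dump and
# flag the all-zero case that means no datanode has registered.
check_report() {
    # "Configured Capacity: 0 (0 B)" only appears when zero datanodes are live.
    if grep -q '^Configured Capacity: 0 ' "$1"; then
        echo "NO DATANODES"
    else
        echo "OK"
    fi
}

# Demo against the broken report shown above:
printf 'Configured Capacity: 0 (0 B)\n' > /tmp/dfs_report.txt
check_report /tmp/dfs_report.txt    # → prints "NO DATANODES"
```

In practice you would capture the live output first (`hadoop dfsadmin -report > /tmp/dfs_report.txt`). A datanode process that starts but never registers often has a clusterID mismatch in its VERSION file, typically after a namenode reformat.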
So I reformatted the namenode. Since HDFS was implicated, I also edited hdfs-site.xml, changing the replication value from 1 to 2, and then ran start-all.sh again.
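For reference, the hdfs-site.xml change amounts to this one property (a sketch; note that a replication factor higher than the number of live datanodes just leaves blocks under-replicated, so with a single slave a value of 1 would actually suffice):

```xml
<!-- hdfs-site.xml (sketch): number of copies HDFS keeps of each block -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```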

[root@master bin]# hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Configured Capacity: 18238930944 (16.99 GB)
Present Capacity: 6707884032 (6.25 GB)
DFS Remaining: 6707879936 (6.25 GB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 192.168.127.131:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 11531046912 (10.74 GB)
DFS Remaining: 6707879936 (6.25 GB)
DFS Used%: 0.00%
DFS Remaining%: 36.78%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 20 21:26:11 CST 2018
Then in the terminal I ran:

pyspark --master yarn

To my pleasant surprise, it worked.
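To collect the working setup in one place, the launch environment can be sketched as below, assuming the same paths as earlier in this post (in practice these exports can live in spark-env.sh or ~/.bashrc instead of being typed inline):

```shell
#!/bin/sh
# Environment variables PySpark reads when deciding how to launch the driver.
export HADOOP_CONF_DIR=/hadoop/hadoop/etc/hadoop   # directory holding yarn-site.xml
export PYSPARK_DRIVER_PYTHON=jupyter               # drive the shell from Jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook         # launch it as a notebook server

# With HDFS healthy, this starts a notebook whose SparkContext runs on YARN:
#   pyspark --master yarn
# In a notebook cell, `sc.master` should then report 'yarn'.
```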
