1:spark下载,解压

[jifeng@jifeng01 hadoop]$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.2.0-bin-hadoop1.tgz
--2015-02-03 21:50:25--  http://d3kbcqa49mib13.cloudfront.net/spark-1.2.0-bin-hadoop1.tgz
正在解析主机 d3kbcqa49mib13.cloudfront.net... 54.192.150.88, 54.230.149.156, 54.230.150.31, ...
Connecting to d3kbcqa49mib13.cloudfront.net|54.192.150.88|:80... 已连接。
已发出 HTTP 请求,正在等待回应... 200 OK
长度:203977086 (195M) [application/x-compressed]
Saving to: `spark-1.2.0-bin-hadoop1.tgz'100%[===========================================================================>] 203,977,086  410K/s   in 7m 55s  2015-02-03 21:58:21 (419 KB/s) - `spark-1.2.0-bin-hadoop1.tgz' saved [203977086/203977086]
[jifeng@jifeng01 hadoop]$ tar zxf spark-1.2.0-bin-hadoop1.tgz 

2:scala下载

在jifeng04下载并复制到jifeng01

[jifeng@jifeng04 hadoop]$ wget http://downloads.typesafe.com/scala/2.11.4/scala-2.11.4.tgz?_ga=1.248348352.61371242.1418807768
--2015-02-03 21:55:35--  http://downloads.typesafe.com/scala/2.11.4/scala-2.11.4.tgz?_ga=1.248348352.61371242.1418807768
正在解析主机 downloads.typesafe.com... 54.230.151.79, 54.230.151.80, 54.230.151.162, ...
Connecting to downloads.typesafe.com|54.230.151.79|:80... 已连接。
已发出 HTTP 请求,正在等待回应... 200 OK
长度:26509669 (25M) [application/octet-stream]
Saving to: `scala-2.11.4.tgz?_ga=1.248348352.61371242.1418807768'100%[===========================================================================>] 26,509,669   320K/s   in 71s     2015-02-03 21:56:48 (366 KB/s) - `scala-2.11.4.tgz?_ga=1.248348352.61371242.1418807768' saved [26509669/26509669][jifeng@jifeng04 hadoop]$ ls
hadoop-1.2.1  scala-2.11.4.tgz?_ga=1.248348352.61371242.1418807768  tmp
[jifeng@jifeng04 hadoop]$ mv scala-2.11.4.tgz\?_ga\=1.248348352.61371242.1418807768 scala-2.11.4.tgz
[jifeng@jifeng04 hadoop]$ tar zxf scala-2.11.4.tgz
[jifeng@jifeng04 hadoop]$ ls
hadoop-1.2.1  scala-2.11.4  scala-2.11.4.tgz  tmp
[jifeng@jifeng04 hadoop]$ scp ./scala-2.11.4.tgz jifeng@jifeng01.sohudo.com:/home/jifeng/hadoop
scala-2.11.4.tgz                                                                   100%   25MB  25.3MB/s   00:01
[jifeng@jifeng04 hadoop]$ 

jifeng01 解压

[jifeng@jifeng01 hadoop]$ tar zxf scala-2.11.4.tgz

3:配置scala

配置scala环境变量

export SCALA_HOME=$HOME/hadoop/scala-2.11.4
export PATH=$PATH:$ANT_HOME/bin:$HBASE_HOME/bin:$SQOOP_HOME/bin:$HADOOP_HOME/bin:$MAHOUT_HOME/bin:$SCALA_HOME/bin

[jifeng@jifeng01 ~]$ vi ~/.bash_profile# .bash_profile# Get the aliases and functions
if [ -f ~/.bashrc ]; then. ~/.bashrc
fi# User specific environment and startup programsPATH=$PATH:$HOME/binexport PATH
export JAVA_HOME=$HOME/jdk1.7.0_45
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=$HOME/hadoop/hadoop-1.2.1
export HADOOP_CONF_DIR=$HOME/hadoop/hadoop-1.2.1/conf
export ANT_HOME=$HOME/apache-ant-1.9.4
export HBASE_HOME=$HOME/hbase-0.94.21
export SQOOP_HOME=$HOME/sqoop-1.99.3-bin-hadoop100
export CATALINA_HOME=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logsexport MAHOUT_HOME=$HOME/hadoop/mahout-distribution-0.9
export MAHOUT_CONF_DIR=$HOME/hadoop/mahout-distribution-0.9/confexport SCALA_HOME=$HOME/hadoop/scala-2.11.4
export PATH=$PATH:$ANT_HOME/bin:$HBASE_HOME/bin:$SQOOP_HOME/bin:$HADOOP_HOME/bin:$MAHOUT_HOME/bin:$SCALA_HOME/bin
~
~
".bash_profile" 28L, 889C 已写入
[jifeng@jifeng01 ~]$ source ~/.bash_profile

source ~/.bash_profile 配置文件马上生效

查看版本

[jifeng@jifeng01 hadoop]$ scala -version
Scala code runner version 2.11.4 -- Copyright 2002-2013, LAMP/EPFL
[jifeng@jifeng01 hadoop]$ scala
Welcome to Scala version 2.11.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45).
Type in expressions to have them evaluated.
Type :help for more information.scala> 

4:配置spark环境变量

加入环境变量

export SPARK_HOME=$HOME/hadoop/spark-1.2.0-bin-hadoop1
export PATH=$PATH:$SPARK_HOME/bin

[jifeng@jifeng01 ~]$ vi ~/.bash_profile    # .bash_profile# Get the aliases and functions
if [ -f ~/.bashrc ]; then. ~/.bashrc
fi# User specific environment and startup programsPATH=$PATH:$HOME/binexport PATH
export JAVA_HOME=$HOME/jdk1.7.0_45
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=$HOME/hadoop/hadoop-1.2.1
export HADOOP_CONF_DIR=$HOME/hadoop/hadoop-1.2.1/conf
export ANT_HOME=$HOME/apache-ant-1.9.4
export HBASE_HOME=$HOME/hbase-0.94.21
export SQOOP_HOME=$HOME/sqoop-1.99.3-bin-hadoop100
export CATALINA_HOME=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logsexport MAHOUT_HOME=$HOME/hadoop/mahout-distribution-0.9
export MAHOUT_CONF_DIR=$HOME/hadoop/mahout-distribution-0.9/confexport SCALA_HOME=$HOME/hadoop/scala-2.11.4
export SPARK_HOME=$HOME/hadoop/spark-1.2.0-bin-hadoop1
export PATH=$PATH:$ANT_HOME/bin:$HBASE_HOME/bin:$SQOOP_HOME/bin:$HADOOP_HOME/bin:$MAHOUT_HOME/bin:$SCALA_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin
".bash_profile" 30L, 978C 已写入
[jifeng@jifeng01 ~]$ source ~/.bash_profile

5:配置spark

复制配置文件 cp spark-env.sh.template spark-env.sh

[jifeng@jifeng01 ~]$ cd hadoop
[jifeng@jifeng01 hadoop]$ cd spark-1.2.0-bin-hadoop1
[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ ls
bin  conf  data  ec2  examples  lib  LICENSE  NOTICE  python  README.md  RELEASE  sbin
[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ cd conf
[jifeng@jifeng01 conf]$ ls
fairscheduler.xml.template  metrics.properties.template  spark-defaults.conf.template
log4j.properties.template   slaves.template              spark-env.sh.template
[jifeng@jifeng01 conf]$ cp spark-env.sh.template spark-env.sh

在spark-env.sh最后添加下面

export SCALA_HOME=/home/jifeng/hadoop/scala-2.11.4
export SPARK_MASTER_IP=jifeng01.sohudo.com
export SPARK_WORKER_MEMORY=2G
export JAVA_HOME=/home/jifeng/jdk1.7.0_45

# Generic options for the daemons used in the standalone deploy mode
# - SPARK_CONF_DIR      Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - SPARK_LOG_DIR       Where log files are stored.  (Default: ${SPARK_HOME}/logs)
# - SPARK_PID_DIR       Where the pid file is stored. (Default: /tmp)
# - SPARK_IDENT_STRING  A string representing this instance of spark. (Default: $USER)
# - SPARK_NICENESS      The scheduling priority for daemons. (Default: 0)
export SCALA_HOME=/home/jifeng/hadoop/scala-2.11.4
export SPARK_MASTER_IP=jifeng01.sohudo.com
export SPARK_WORKER_MEMORY=2G
export JAVA_HOME=/home/jifeng/jdk1.7.0_45
"spark-env.sh" 54L, 3378C 已写入
[jifeng@jifeng01 conf]$ cd ..

6:启动

[jifeng@jifeng01 sbin]$ cd ..
[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.master.Master-1-jifeng01.sohudo.com.out

通过 http://jifeng01:8080/ 看到对应界面

7:启动worker

sbin/start-slaves.sh park://jifeng01:7077

[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ sbin/start-slaves.sh park://jifeng01:7077
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.worker.Worker-1-jifeng01.sohudo.com.out[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ sbin/start-slaves.sh park://jifeng01.sohudo.com:7077
localhost: org.apache.spark.deploy.worker.Worker running as process 10273. Stop it first.
[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ 

看界面的变化

8:测试

进入交互模式

[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ master=spark://jifeng01.sohudo.com:7077 ./bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/03 23:05:32 INFO spark.SecurityManager: Changing view acls to: jifeng
15/02/03 23:05:32 INFO spark.SecurityManager: Changing modify acls to: jifeng
15/02/03 23:05:32 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jifeng); users with modify permissions: Set(jifeng)
15/02/03 23:05:32 INFO spark.HttpServer: Starting HTTP Server
15/02/03 23:05:32 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/03 23:05:32 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:46617
15/02/03 23:05:32 INFO util.Utils: Successfully started service 'HTTP class server' on port 46617.
Welcome to____              __/ __/__  ___ _____/ /___\ \/ _ \/ _ `/ __/  '_//___/ .__/\_,_/_/ /_/\_\   version 1.2.0/_/Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/02/03 23:05:36 INFO spark.SecurityManager: Changing view acls to: jifeng
15/02/03 23:05:36 INFO spark.SecurityManager: Changing modify acls to: jifeng
15/02/03 23:05:36 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jifeng); users with modify permissions: Set(jifeng)
15/02/03 23:05:37 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/03 23:05:37 INFO Remoting: Starting remoting
15/02/03 23:05:37 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@jifeng01.sohudo.com:41533]
15/02/03 23:05:37 INFO util.Utils: Successfully started service 'sparkDriver' on port 41533.
15/02/03 23:05:37 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/03 23:05:37 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/03 23:05:37 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150203230537-f1cb
15/02/03 23:05:37 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
15/02/03 23:05:37 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-0e198e02-fea8-49df-a1ab-8dd08e988cab
15/02/03 23:05:37 INFO spark.HttpServer: Starting HTTP Server
15/02/03 23:05:37 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/03 23:05:37 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:52156
15/02/03 23:05:37 INFO util.Utils: Successfully started service 'HTTP file server' on port 52156.
15/02/03 23:05:38 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/03 23:05:38 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/03 23:05:38 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/03 23:05:38 INFO ui.SparkUI: Started SparkUI at http://jifeng01.sohudo.com:4040
15/02/03 23:05:38 INFO executor.Executor: Using REPL class URI: http://10.5.4.54:46617
15/02/03 23:05:38 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@jifeng01.sohudo.com:41533/user/HeartbeatReceiver
15/02/03 23:05:38 INFO netty.NettyBlockTransferService: Server created on 39115
15/02/03 23:05:38 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/03 23:05:38 INFO storage.BlockManagerMasterActor: Registering block manager localhost:39115 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 39115)
15/02/03 23:05:38 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/03 23:05:38 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.scala> 

测试

输入命令:

val file=sc.textFile("hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt")
val count=file.flatMap(line=>line.split(" ")).map(word=>(word,1)).reduceByKey(_+_)
count.collect()

scala> val file=sc.textFile("hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt")
15/02/03 23:07:25 INFO storage.MemoryStore: ensureFreeSpace(32768) called with curMem=0, maxMem=278302556
15/02/03 23:07:25 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 32.0 KB, free 265.4 MB)
15/02/03 23:07:25 INFO storage.MemoryStore: ensureFreeSpace(5070) called with curMem=32768, maxMem=278302556
15/02/03 23:07:25 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.0 KB, free 265.4 MB)
15/02/03 23:07:25 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:39115 (size: 5.0 KB, free: 265.4 MB)
15/02/03 23:07:25 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/03 23:07:25 INFO spark.SparkContext: Created broadcast 0 from textFile at <console>:12
file: org.apache.spark.rdd.RDD[String] = hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt MappedRDD[1] at textFile at <console>:12scala> val count=file.flatMap(line=>line.split(" ")).map(word=>(word,1)).reduceByKey(_+_)
15/02/03 23:07:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/03 23:07:38 WARN snappy.LoadSnappy: Snappy native library not loaded
15/02/03 23:07:38 INFO mapred.FileInputFormat: Total input paths to process : 1
count: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:14scala> count.collect()
15/02/03 23:07:44 INFO spark.SparkContext: Starting job: collect at <console>:17
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Registering RDD 3 (map at <console>:14)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Got job 0 (collect at <console>:17) with 3 output partitions (allowLocal=false)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Final stage: Stage 1(collect at <console>:17)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Parents of final stage: List(Stage 0)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Missing parents: List(Stage 0)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[3] at map at <console>:14), which has no missing parents
15/02/03 23:07:44 INFO storage.MemoryStore: ensureFreeSpace(3584) called with curMem=37838, maxMem=278302556
15/02/03 23:07:44 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.5 KB, free 265.4 MB)
15/02/03 23:07:44 INFO storage.MemoryStore: ensureFreeSpace(2543) called with curMem=41422, maxMem=278302556
15/02/03 23:07:44 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.5 KB, free 265.4 MB)
15/02/03 23:07:44 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:39115 (size: 2.5 KB, free: 265.4 MB)
15/02/03 23:07:44 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
15/02/03 23:07:44 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:838
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Submitting 3 missing tasks from Stage 0 (MappedRDD[3] at map at <console>:14)
15/02/03 23:07:44 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 3 tasks
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, ANY, 1310 bytes)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, ANY, 1310 bytes)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, ANY, 1310 bytes)
15/02/03 23:07:44 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/02/03 23:07:44 INFO executor.Executor: Running task 2.0 in stage 0.0 (TID 2)
15/02/03 23:07:44 INFO executor.Executor: Running task 1.0 in stage 0.0 (TID 1)
15/02/03 23:07:44 INFO rdd.HadoopRDD: Input split: hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt:6+6
15/02/03 23:07:44 INFO rdd.HadoopRDD: Input split: hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt:0+6
15/02/03 23:07:44 INFO rdd.HadoopRDD: Input split: hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt:12+1
15/02/03 23:07:44 INFO executor.Executor: Finished task 2.0 in stage 0.0 (TID 2). 1897 bytes result sent to driver
15/02/03 23:07:44 INFO executor.Executor: Finished task 1.0 in stage 0.0 (TID 1). 1897 bytes result sent to driver
15/02/03 23:07:44 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1897 bytes result sent to driver
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 164 ms on localhost (1/3)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 160 ms on localhost (2/3)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 160 ms on localhost (3/3)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Stage 0 (map at <console>:14) finished in 0.185 s
15/02/03 23:07:44 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/02/03 23:07:44 INFO scheduler.DAGScheduler: looking for newly runnable stages
15/02/03 23:07:44 INFO scheduler.DAGScheduler: running: Set()
15/02/03 23:07:44 INFO scheduler.DAGScheduler: waiting: Set(Stage 1)
15/02/03 23:07:44 INFO scheduler.DAGScheduler: failed: Set()
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Missing parents for Stage 1: List()
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Submitting Stage 1 (ShuffledRDD[4] at reduceByKey at <console>:14), which is now runnable
15/02/03 23:07:44 INFO storage.MemoryStore: ensureFreeSpace(2112) called with curMem=43965, maxMem=278302556
15/02/03 23:07:44 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.1 KB, free 265.4 MB)
15/02/03 23:07:44 INFO storage.MemoryStore: ensureFreeSpace(1546) called with curMem=46077, maxMem=278302556
15/02/03 23:07:44 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1546.0 B, free 265.4 MB)
15/02/03 23:07:44 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:39115 (size: 1546.0 B, free: 265.4 MB)
15/02/03 23:07:44 INFO storage.BlockManagerMaster: Updated info of block broadcast_2_piece0
15/02/03 23:07:44 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:838
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Submitting 3 missing tasks from Stage 1 (ShuffledRDD[4] at reduceByKey at <console>:14)
15/02/03 23:07:44 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 3 tasks
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 3, localhost, PROCESS_LOCAL, 1056 bytes)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 4, localhost, PROCESS_LOCAL, 1056 bytes)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 1.0 (TID 5, localhost, PROCESS_LOCAL, 1056 bytes)
15/02/03 23:07:44 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 3)
15/02/03 23:07:44 INFO executor.Executor: Running task 2.0 in stage 1.0 (TID 5)
15/02/03 23:07:44 INFO executor.Executor: Running task 1.0 in stage 1.0 (TID 4)
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 3 blocks
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 3 blocks
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 3 blocks
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 8 ms
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 7 ms
15/02/03 23:07:44 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 8 ms
15/02/03 23:07:44 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 3). 820 bytes result sent to driver
15/02/03 23:07:44 INFO executor.Executor: Finished task 2.0 in stage 1.0 (TID 5). 820 bytes result sent to driver
15/02/03 23:07:44 INFO executor.Executor: Finished task 1.0 in stage 1.0 (TID 4). 990 bytes result sent to driver
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 3) in 44 ms on localhost (1/3)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 1.0 (TID 5) in 44 ms on localhost (2/3)
15/02/03 23:07:44 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 4) in 45 ms on localhost (3/3)
15/02/03 23:07:44 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Stage 1 (collect at <console>:17) finished in 0.048 s
15/02/03 23:07:44 INFO scheduler.DAGScheduler: Job 0 finished: collect at <console>:17, took 0.366077 s
res0: Array[(String, Int)] = Array((hadoop,1), (hello,1))scala> 

看到控制台有如下结果:res0: Array[(String, Int)] = Array((hadoop,1), (hello,1))

同时也可以将结果保存到HDFS上

scala>count.saveAsTextFile("hdfs://jifeng01.sohudo.com:9000/user/jifeng/out/test1.txt")

9:停止

./sbin/stop-master.sh

[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ ./sbin/stop-master.sh
stopping org.apache.spark.deploy.master.Master

Spark 1.2 Standalone Mode 单机安装相关推荐

  1. Spark笔记整理(一):spark单机安装部署、分布式集群与HA安装部署+spark源码编译...

    [TOC] spark单机安装部署 1.安装scala 解压:tar -zxvf soft/scala-2.10.5.tgz -C app/ 重命名:mv scala-2.10.5/ scala 配置 ...

  2. linux spark单节点环境搭建,Linux下基于Hadoop的Spark1.2单机安装

    一,安装环境 硬件:虚拟机 操作系统:Centos 6.4 64位 IP:10.51.121.10 主机名:datanode-4 安装用户:root Hadoop:Hadoop2.6,Hadoop2. ...

  3. spark (3)Spark Standalone集群安装介绍

    (1)初学者对于spark的几个疑问 http://aperise.iteye.com/blog/2302481 (2)spark开发环境搭建 http://aperise.iteye.com/blo ...

  4. Spark Standalone架构及安装部署

    文章目录 Standalone架构 Standalone群集部署 Standalone程序测试 Standalone架构 Standalone模式是Spark自带的一种集群模式,不同于前面本地模式启动 ...

  5. Windows下单机安装Spark开发环境

    机器:windows 10 64位. 因Spark支持java.python等语言,所以尝试安装了两种语言环境下的spark开发环境. 1.Java下Spark开发环境搭建 1.1.jdk安装 安装o ...

  6. 产品迭代更新 | 阿列夫科技基于Linkis+DataSphere Studio的单机安装部署实战

    作者:萧寒 GitHub ID :hx23840 阿列夫科技原来的技术平台是基于 Hadoop,Spark 平台搭建的,为了充分的满足业务需求,做了大量接口封装.但是随着业务发展,现有技术平台日渐满足 ...

  7. Spark 1.2 集群环境安装

    我是在单机环境下修改下配置完成的集群模式 单机安装查看:http://blog.csdn.net/wind520/article/details/43458925 参考官网配置:http://spar ...

  8. Centos Linux 单机安装 Hive 、使用 Hive

    Centos Linux 单机安装 Hive .使用 Hive 视频教程链接:https://www.bilibili.com/video/BV1Rv4y117NR/ 1. Hive 简介 hive ...

  9. linux arcgis10.4安装教程,ArcGIS 10.1 for Server安装教程系列—— Linux下的单机安装

    因为Linux具有稳定,功能强大等特性,因此常常被用来做为企业内部的服务器,我们的很多用户也是将ArcGIS Server安装在Linux上,但是对于初次接触Linux的用户,他们都觉得无从下手,Li ...

最新文章

  1. 卷积神经网络必读的100篇经典论文,包含检测/识别/分类/分割多个领域
  2. 更别致的词向量模型(一):simpler glove
  3. 如何对android菜单,Android菜单构造技巧
  4. @程序员,不要瞎努力!比起熬夜更可怕的是“熬日”!
  5. 一个Python小白5个小时爬虫经历,分享一下
  6. web端功能测试总结(一)
  7. python全栈生鲜电商_Vue+Django REST framework 打造生鲜电商项目(学习笔记一)
  8. iOS常用的第三方类库
  9. 苹果被拒的血泪史。。。(update 2015.11)
  10. 破解vysor为专业版
  11. 二次规划问题和MATLAB函数quadprog的使用
  12. Java Base64 加密与解密
  13. 关于字体的px和pt
  14. 8篇论文详解用户历史行为序列建模方法
  15. 亚马逊中东站好做吗?这或许是迄今为止最好的回答!
  16. 赛效:如何在线更改图片格式 图片格式在线转换方法介绍
  17. 1.2.1 python中的函数
  18. Golang big.int类型转int
  19. 基于Google 验证器 实现内网的双因素认证
  20. 解决Angular Kendo UI 导出PDF中文乱码

热门文章

  1. 【数据结构与算法-1】常用数据结构
  2. exfat文件系统_u盘文件系统exfat格式优缺点有哪些【详细介绍】
  3. android 仿微信聊天气泡显示图片,实现仿照微信聊天气泡里显示图片效果的自定义View...
  4. oa项目经验描述_OA系统为企业带来多少实用价值?移动OA又为企业解决哪些问题?...
  5. abstract类中不可以有private的成员_C++类成员的三种访问权限:public/protected/private...
  6. 微信小程序实现文件下载 以及微信小程序保存Excel
  7. Parameter-Efficient Fine-tuning 相关工作梳理
  8. CVPR 2021|可操控的GAN——Hijack-GAN
  9. 机器知道哪吒是部电影吗?解读阿里巴巴概念图谱AliCG
  10. CVPR 2021 | 天津大学提出PISE:形状与纹理解耦的人体图像生成与编辑方法