1. What is Hadoop

1.1 The history of Hadoop

Doug Cutting is the creator of Apache Lucene. The Apache Nutch project started in 2002 as a part of Apache Lucene. By 2005 all of Nutch's major algorithms had been ported to run on MapReduce and NDFS. In February 2006, MapReduce and NDFS were moved out of Nutch to become a Lucene subproject named Hadoop.

Hadoop is not an acronym but a made-up word. Project creator Doug Cutting explained the name: "The name is what my kid gave a stuffed yellow elephant. My naming criteria are that it be short, easy to pronounce and spell, not carry much meaning, and not be used elsewhere. Kids are good at exactly that."

1.2 Hadoop in the narrow sense

In my view, Hadoop in the narrow sense refers to the Apache Hadoop subproject, which consists of the following modules:

  • Hadoop Common: a set of components and interfaces for distributed file systems and general I/O
  • Hadoop Distributed File System (HDFS): a distributed file system
  • Hadoop YARN: a job scheduling and resource management framework
  • Hadoop MapReduce: a programming model for distributed, parallel processing of large data sets

Hadoop in the narrow sense solves three problems: HDFS handles distributed storage, YARN handles job scheduling and resource management, and MapReduce provides a programming model that lets developers write code for offline big-data processing.

1.3 Hadoop in the broad sense

In my view, Hadoop in the broad sense refers to the whole Hadoop ecosystem. Each subproject in the ecosystem exists to solve a particular class of problems; the main subprojects are shown in the figure below.

2. Two ways to deploy a Hadoop cluster

2.1 The namenode + secondarynamenode mode, supported by both hadoop1.x and hadoop2.x

2.2 The active namenode + standby namenode mode, supported only by hadoop2.x

2.3 The Hadoop website's documentation on cluster setup

1) Single-node Hadoop setup

http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html

2) Cluster setup

Cluster mode one (namenode + secondarynamenode, supported by both hadoop1.x and hadoop2.x):

http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/ClusterSetup.html

Cluster mode two (active namenode + standby namenode, supported only by hadoop2.x, also known as Hadoop HA). The documentation covers HDFS HA and YARN HA separately:

HDFS HA (zookeeper + journalnode): http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

HDFS HA (zookeeper + NFS): http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailability

YARN HA (zookeeper): http://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html

Production environments mostly use HDFS HA (zookeeper + journalnode: active NameNode + standby NameNode + JournalNode + DFSZKFailoverController + DataNode) together with YARN HA (zookeeper: active ResourceManager + standby ResourceManager + NodeManager). What I describe here is the namenode + secondarynamenode mode supported by both hadoop1.x and hadoop2.x. This mode is mainly suited to learning and experimentation: it needs fewer machines, but the namenode is a single point of failure.

3. Installing Hadoop

3.1 Required packages

  1. Java 1.7.x is required; a Sun/Oracle JDK is recommended. Hadoop 2.7.1 has been verified not to work with JDK 1.6, so JDK 1.7 is used here. Download: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
  2. ssh must be installed and sshd must be running, so that the Hadoop scripts can manage the remote Hadoop daemons.
  3. Hadoop package download: http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz

3.2 Environment

  1. Operating system: Red Hat Enterprise Linux Server release 5.8 (Tikanga)
  2. Master and slave servers: Master 192.168.181.66, Slave1 192.168.88.21, Slave2 192.168.88.22

3.3 Passwordless SSH login

SSH must be installed on each Linux node (Hadoop connects to the other machines in the cluster over SSH for its read/write operations); please install it yourself. Hadoop logs in to every node over SSH to manage it. I use the hadoop user: generate a key pair on every server, then merge the public keys into authorized_keys.

1. CentOS does not enable passwordless SSH login by default. Uncomment the following two lines in /etc/ssh/sshd_config on every server.

Before:

#RSAAuthentication yes
#PubkeyAuthentication yes

After (then run service sshd restart):

RSAAuthentication yes
PubkeyAuthentication yes

For the remaining steps see http://aperise.iteye.com/blog/2253544; a minimal sketch of the key setup is shown below.
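The original post does not list the key-generation commands themselves; the following is a minimal sketch of one common way to do it, assuming the hadoop user and the three hosts from section 3.2, run from the master:

Shell:

# On every node, generate a key pair for the hadoop user (empty passphrase)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

# Collect each node's public key into authorized_keys on the master,
# then distribute the merged file back to every node
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh hadoop@192.168.88.21 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh hadoop@192.168.88.22 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hadoop@192.168.88.21:~/.ssh/
scp ~/.ssh/authorized_keys hadoop@192.168.88.22:~/.ssh/

# Permissions matter: sshd ignores keys that are group/world writable
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys

# Verify that login no longer asks for a password
ssh hadoop@192.168.88.21 hostname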

3.4 Installing the JDK

Hadoop 2.7 requires JDK 7. With JDK 1.6, Hadoop fails at startup with the following error:

Console output:
  1. [hadoop@nmsc1 bin]# ./hdfs namenode -format
  2. Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/hadoop/hdfs/server/namenode/NameNode : Unsupported major.minor version 51.0
  3. at java.lang.ClassLoader.defineClass1(Native Method)
  4. at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
  5. at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
  6. at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
  7. at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
  8. at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
  9. at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
  10. at java.security.AccessController.doPrivileged(Native Method)
  11. at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
  12. at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  13. at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
  14. at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
  15. Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode.  Program will exit.

1. Download jdk-7u65-linux-x64.gz and place it at /opt/java/jdk-7u65-linux-x64.gz.

2. Extract it: tar -zxvf jdk-7u65-linux-x64.gz

3. Edit /etc/profile and append the following at the end of the file:

Shell:

export JAVA_HOME=/opt/java/jdk1.7.0_65
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

4. Apply the configuration: source /etc/profile

5. Run java -version to check that the JDK is configured correctly.

Console output:
  1. [hadoop@nmsc2 java]# java -version
  2. java version "1.7.0_65"
  3. Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
  4. Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
  5. [hadoop@nmsc2 java]#

3.5 Installing Hadoop 2.7

1. On the master only, download hadoop-2.7.1.tar.gz and place it at /opt/hadoop-2.7.1.tar.gz.

2. Extract it: tar -xzvf hadoop-2.7.1.tar.gz

3. Under /home, create the directories for data storage: hadoop/tmp, hadoop/hdfs, hadoop/hdfs/data and hadoop/hdfs/name (a sketch of the commands follows).
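The post does not show the commands for this step; a minimal sketch, assuming the hadoop user owns /home/hadoop:

Shell:

# hdfs/name and hdfs/data match dfs.namenode.name.dir / dfs.datanode.data.dir below
mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/hdfs/name
mkdir -p /home/hadoop/hdfs/data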

4. Edit core-site.xml under /opt/hadoop-2.7.1/etc/hadoop:

XML:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- Enable the HDFS trash: deleted files first go to the trash and are kept there for at most one day (1440 minutes) before being removed for good -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <property>
    <!-- URI of the NameNode, in the form hdfs://host:port/ -->
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.181.66:9000</value>
  </property>
  <property>
    <!-- hadoop.tmp.dir is the base directory the Hadoop file system relies on; many other paths default to locations under it. If hdfs-site.xml does not configure the namenode and datanode directories, they are placed under this path by default -->
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
  </property>
  <property>
    <!-- io.file.buffer.size sets the read/write buffer size used by Hadoop's I/O code, e.g. for SequenceFiles. A larger buffer gives higher throughput for disk and network I/O at the cost of more memory and latency. It should be a multiple of the system page size, in bytes; the default is 4 KB, 64 KB (65536) is common, and 128 KB is used here -->
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>200</value>
    <description>The number of server threads for the namenode.</description>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>100</value>
    <description>The number of server threads for the datanode.</description>
  </property>
</configuration>

5. Edit hdfs-site.xml under /opt/hadoop-2.7.1/etc/hadoop:

XML:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- dfs.namenode.name.dir is the local filesystem path where the NameNode stores the file system metadata. It is only used by the NameNode; DataNodes do not need it -->
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hdfs/name</value>
  </property>
  <property>
    <!-- dfs.datanode.data.dir is the local filesystem path where a DataNode stores its blocks. It does not have to be identical on every DataNode, since each machine's environment may differ, but using the same path everywhere keeps administration simpler -->
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hdfs/data</value>
  </property>
  <property>
    <!-- dfs.replication is the number of replicas kept for each block. For a real deployment it should be 3 (there is no hard upper limit, but more replicas rarely help and consume more space). Fewer than three replicas reduces reliability: a failure may cause data loss -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <!-- Many online configuration examples put the namenode and the secondaryNameNode on the same machine. That is fine for experiments but risky in production: if the host running the namenode fails, the whole file system can be damaged or, in the worst case, all files are lost. Here hadoop2.7 is configured with the namenode and secondaryNameNode on different machines, which has more practical value -->
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.181.66:9001</value>
  </property>
  <property>
    <!-- dfs.webhdfs.enabled must be set to true in the namenode's hdfs-site.xml, otherwise webhdfs commands that list file or directory status (LISTSTATUS, LISTFILESTATUS, ...) will not work, because that information is held by the namenode.
    The namenode's webhdfs uses port 50070 and the datanodes' webhdfs uses port 50075: listing files and directories goes through the namenode's IP and port 50070, while opening, uploading, modifying or downloading file content goes through a datanode's IP and port 50075. To issue all webhdfs operations against the namenode's IP and port without caring about this split, set dfs.webhdfs.enabled to true in hdfs-site.xml on all datanodes as well -->
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>107374182400</value>
    <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
    </description>
  </property>
  <property>
    <!-- HDFS client socket timeout; raise it generously, otherwise hbase will later crash frequently because of this timeout -->
    <name>dfs.client.socket-timeout</name>
    <value>600000</value>
  </property>
  <property>
    <!-- Maximum number of transfer threads a datanode may open; the default 4096 is too low and leads to "xcievers exceeded" errors -->
    <name>dfs.datanode.max.transfer.threads</name>
    <value>409600</value>
  </property>
</configuration>

6. Save mapred-site.xml.template under /opt/hadoop-2.7.1/etc/hadoop as mapred-site.xml (a sketch of this copy step follows) and change its contents to:
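A minimal sketch of that save-as step, assuming you are working in the Hadoop configuration directory:

Shell:

cd /opt/hadoop-2.7.1/etc/hadoop
cp mapred-site.xml.template mapred-site.xml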

XML:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- The new framework also supports third-party MapReduce runtimes (non-YARN frameworks such as SmartTalk/DGSG). Normally this value is set to yarn; if it is not configured, submitted jobs run in local mode instead of distributed mode -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Hadoop ships with a history server through which finished MapReduce jobs can be inspected: how many map and reduce tasks they used, submission time, start time, completion time and so on. The history server is not started by default; it can be started manually (see the sketch after this file).
  The relevant parameters live in mapred-site.xml: mapreduce.jobhistory.address and mapreduce.jobhistory.webapp.address default to 0.0.0.0:10020 and 0.0.0.0:19888 respectively, in host:port form, and can be adapted to your environment. After configuring them, restart the job history server and browse to the host configured in mapreduce.jobhistory.webapp.address to view past jobs -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.181.66:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.88.21:19888</value>
  </property>
</configuration>
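The start command itself is not given in the post; a minimal sketch using the script shipped in Hadoop 2.7's sbin directory:

Shell:

# start the MapReduce job history server on the configured host
cd /opt/hadoop-2.7.1/sbin
./mr-jobhistory-daemon.sh start historyserver
# then browse to the mapreduce.jobhistory.webapp.address configured above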
7. Edit yarn-site.xml under /opt/hadoop-2.7.1/etc/hadoop as follows:

XML:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <!-- Auxiliary service the NodeManagers must run so that MapReduce applications can shuffle map output -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <!-- Address on which NodeManagers (and clients) reach the ResourceManager -->
    <name>yarn.resourcemanager.address</name>
    <value>192.168.181.66:8032</value>
  </property>
  <property>
    <!-- Address of the ResourceManager's scheduler service, which the NodeManagers need to know -->
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.181.66:8030</value>
  </property>
  <property>
    <!-- NodeManagers report task status to the ResourceManager for resource tracking, so every NodeManager host needs the RM's resource-tracker address -->
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.181.66:8031</value>
  </property>
  <property>
    <!-- Address the ResourceManager exposes to administrators; admin commands are sent here. Default: ${yarn.resourcemanager.hostname}:8033 -->
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.181.66:8033</value>
  </property>
  <property>
    <!-- ResourceManager web UI address; cluster information can be viewed here in a browser. Default: ${yarn.resourcemanager.hostname}:8088 -->
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.181.66:8088</value>
  </property>
  <property>
    <!-- Total physical memory the NodeManager may use. This value cannot be changed dynamically once the NodeManager is running (Apache has been working on making it dynamic). The default is 8192 MB and YARN will assume that much even if the machine has less, so always configure it explicitly. If it is set below 1024 (1 GB) the NodeManager refuses to start and reports:
    NodeManager from slavenode2 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager. -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
</configuration>

8. Set JAVA_HOME in hadoop-env.sh and yarn-env.sh under /opt/hadoop-2.7.1/etc/hadoop; Hadoop will not start without it.

Shell:

export JAVA_HOME=/opt/java/jdk1.7.0_65

9. Edit slaves under /opt/hadoop-2.7.1/etc/hadoop: remove the default localhost and add the slave nodes.

Code:

#localhost
192.168.88.22
192.168.88.21

10. Copy the configured Hadoop tree to the corresponding locations on every node, using scp:

Shell:

chmod -R 777 /home/hadoop
chmod -R 777 /opt/hadoop-2.7.1
scp -r /opt/hadoop-2.7.1 192.168.88.22:/opt/
scp -r /home/hadoop 192.168.88.22:/home
scp -r /opt/hadoop-2.7.1 192.168.88.21:/opt/
scp -r /home/hadoop 192.168.88.21:/home

11. Start Hadoop on the master server; the slave nodes are brought up automatically. Go to the /opt/hadoop-2.7.1 directory.

(1) Initialize HDFS: ./hdfs namenode -format
Console output:
  1. [hadoop@nmsc2 bin]# cd /opt/hadoop-2.7.1/bin/
  2. [hadoop@nmsc2 bin]# ls
  3. container-executor  hadoop  hadoop.cmd  hdfs  hdfs.cmd  mapred  mapred.cmd  rcc  test-container-executor  yarn  yarn.cmd
  4. [hadoop@nmsc2 bin]# ./hdfs namenode -format
  5. 15/09/23 16:03:17 INFO namenode.NameNode: STARTUP_MSG:
  6. /************************************************************
  7. STARTUP_MSG: Starting NameNode
  8. STARTUP_MSG:   host = nmsc2/127.0.0.1
  9. STARTUP_MSG:   args = [-format]
  10. STARTUP_MSG:   version = 2.7.1
  11. STARTUP_MSG:   classpath = /opt/hadoop-2.7.1/etc/hadoop:/opt/hadoop-2.7.1/share/hadoop/common/lib/httpclient-4.2.5.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jersey-json-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/xz-1.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-logging-1.1.3.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-cli-1.2.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-compress-1.4.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/curator-framework-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jersey-server-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/curator-client-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/avro-1.7.4.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/gson-2.2.4.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/mockito-all-1.8.5.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/paranamer-2.3.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/guava-11.0.2.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jsch-0.1.42.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jsp-api-2.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/servlet-api-2.5.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-lang-2.6.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/hadoop-annotations-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/activation-1.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-codec-1.4.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-io-2.4.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/httpcore-4.2.5.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/xmlenc-0.52.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-collections-3.2.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-httpclient-3.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/netty-3.6.2.Final.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-net-3.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-math3-3.1.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jersey-core-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/zookeeper-3.4.6.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-digester-1.8.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jettison-1.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/junit-4.11.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jetty-util-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/stax-api-1.0-2.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-configuration-1.6.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/hadoop-auth-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/log4j-1.2.17.jar:/opt/hadoop-2.7.1/shar
e/hadoop/common/lib/protobuf-java-2.5.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jets3t-0.9.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/asm-3.2.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/hamcrest-core-1.3.jar:/opt/hadoop-2.7.1/share/hadoop/common/lib/jetty-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/common/hadoop-nfs-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1-tests.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/guava-11.0.2.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-io-2.4.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/asm-3.2.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/hadoop-hdfs-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/hadoop-hdfs-2.7.1-tests.jar:/opt/hadoop-2.7.1/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jersey-json-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/xz-1.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-cli-1.2.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jersey-server-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/guice-3.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/guava-11.0.2.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/aopalliance-1.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/servlet-api-2.5.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-lang-2.6.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/javax.inject-1.jar:/opt/hadoop-2.7.1/s
hare/hadoop/yarn/lib/activation-1.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-codec-1.4.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-io-2.4.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jersey-core-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jettison-1.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jersey-client-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/log4j-1.2.17.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/asm-3.2.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/jetty-6.1.26.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-common-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-common-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-registry-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-api-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-client-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/xz-1.0.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/guice-3.0.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/hadoop-annotations-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/junit-4.11.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/snappy-ja
va-1.0.4.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/asm-3.2.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-tests.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.1.jar:/opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.1.jar:/contrib/capacity-scheduler/*.jar
  12. STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a; compiled by 'jenkins' on 2015-06-29T06:04Z
  13. STARTUP_MSG:   java = 1.7.0_65
  14. ************************************************************/
  15. 15/09/23 16:03:17 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
  16. 15/09/23 16:03:17 INFO namenode.NameNode: createNameNode [-format]
  17. 15/09/23 16:03:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  18. Formatting using clusterid: CID-695216ce-3c4e-47e4-a31f-24a7e40d8791
  19. 15/09/23 16:03:18 INFO namenode.FSNamesystem: No KeyProvider found.
  20. 15/09/23 16:03:18 INFO namenode.FSNamesystem: fsLock is fair:true
  21. 15/09/23 16:03:18 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
  22. 15/09/23 16:03:18 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
  23. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
  24. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: The block deletion will start around 2015 Sep 23 16:03:18
  25. 15/09/23 16:03:18 INFO util.GSet: Computing capacity for map BlocksMap
  26. 15/09/23 16:03:18 INFO util.GSet: VM type       = 64-bit
  27. 15/09/23 16:03:18 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
  28. 15/09/23 16:03:18 INFO util.GSet: capacity      = 2^21 = 2097152 entries
  29. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
  30. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: defaultReplication         = 1
  31. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: maxReplication             = 512
  32. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: minReplication             = 1
  33. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
  34. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
  35. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
  36. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
  37. 15/09/23 16:03:18 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
  38. 15/09/23 16:03:18 INFO namenode.FSNamesystem: fsOwner             = hadoop(auth:SIMPLE)
  39. 15/09/23 16:03:18 INFO namenode.FSNamesystem: supergroup          = supergroup
  40. 15/09/23 16:03:18 INFO namenode.FSNamesystem: isPermissionEnabled = true
  41. 15/09/23 16:03:18 INFO namenode.FSNamesystem: HA Enabled: false
  42. 15/09/23 16:03:18 INFO namenode.FSNamesystem: Append Enabled: true
  43. 15/09/23 16:03:18 INFO util.GSet: Computing capacity for map INodeMap
  44. 15/09/23 16:03:18 INFO util.GSet: VM type       = 64-bit
  45. 15/09/23 16:03:18 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
  46. 15/09/23 16:03:18 INFO util.GSet: capacity      = 2^20 = 1048576 entries
  47. 15/09/23 16:03:18 INFO namenode.FSDirectory: ACLs enabled? false
  48. 15/09/23 16:03:18 INFO namenode.FSDirectory: XAttrs enabled? true
  49. 15/09/23 16:03:18 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
  50. 15/09/23 16:03:18 INFO namenode.NameNode: Caching file names occuring more than 10 times
  51. 15/09/23 16:03:18 INFO util.GSet: Computing capacity for map cachedBlocks
  52. 15/09/23 16:03:18 INFO util.GSet: VM type       = 64-bit
  53. 15/09/23 16:03:18 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
  54. 15/09/23 16:03:18 INFO util.GSet: capacity      = 2^18 = 262144 entries
  55. 15/09/23 16:03:18 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
  56. 15/09/23 16:03:18 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
  57. 15/09/23 16:03:18 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
  58. 15/09/23 16:03:18 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
  59. 15/09/23 16:03:18 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
  60. 15/09/23 16:03:18 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
  61. 15/09/23 16:03:18 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
  62. 15/09/23 16:03:18 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
  63. 15/09/23 16:03:18 INFO util.GSet: Computing capacity for map NameNodeRetryCache
  64. 15/09/23 16:03:18 INFO util.GSet: VM type       = 64-bit
  65. 15/09/23 16:03:18 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
  66. 15/09/23 16:03:18 INFO util.GSet: capacity      = 2^15 = 32768 entries
  67. 15/09/23 16:03:18 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1469452028-127.0.0.1-1442995398776
  68. 15/09/23 16:03:18 INFO common.Storage: Storage directory /home/hadoop/dfs/name has been successfully formatted.
  69. 15/09/23 16:03:19 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
  70. 15/09/23 16:03:19 INFO util.ExitUtil: Exiting with status 0
  71. 15/09/23 16:03:19 INFO namenode.NameNode: SHUTDOWN_MSG:
  72. /************************************************************
  73. SHUTDOWN_MSG: Shutting down NameNode at nmsc2/127.0.0.1
  74. ************************************************************/
  75. [hadoop@nmsc2 bin]#

(2) Start everything with ./start-all.sh, or start the parts separately with ./start-dfs.sh and ./start-yarn.sh

Console output:
  1. [hadoop@nmsc1 bin]# cd /opt/hadoop-2.7.1/sbin/
  2. [hadoop@nmsc1 sbin]# ./start-all.sh
  3. This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
  4. 15/09/23 16:48:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  5. Starting namenodes on [192.168.88.21]
  6. 192.168.88.21: starting namenode, logging to /opt/hadoop-2.7.1/logs/hadoop-hadoop-namenode-nmsc1.out
  7. 192.168.88.22: starting datanode, logging to /opt/hadoop-2.7.1/logs/hadoop-hadoop-datanode-nmsc2.out
  8. Starting secondary namenodes [192.168.88.21]
  9. 192.168.88.21: starting secondarynamenode, logging to /opt/hadoop-2.7.1/logs/hadoop-hadoop-secondarynamenode-nmsc1.out
  10. 15/09/23 16:48:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  11. starting yarn daemons
  12. resourcemanager running as process 5881. Stop it first.
  13. 192.168.88.22: starting nodemanager, logging to /opt/hadoop-2.7.1/logs/yarn-hadoop-nodemanager-nmsc2.out
  14. [hadoop@nmsc1 sbin]#

(3) To stop everything, run ./stop-all.sh

The call relationships among the Hadoop shell scripts are shown in the figure below:

(4) Run jps to see the running daemons:

Console output:
  1. [hadoop@nmsc1 sbin]# jps
  2. 14201 Jps
  3. 5881 ResourceManager
  4. 13707 NameNode
  5. 13924 SecondaryNameNode
  6. [hadoop@nmsc1 sbin]#

12. Web access. First open the required ports, or simply disable the firewall (a sketch of opening only the needed ports follows).

(1) Firewall commands:

systemctl stop firewalld.service (CentOS 7)

chkconfig iptables on (Red Hat, enable the firewall)

chkconfig iptables off (Red Hat, disable the firewall)
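If you prefer not to disable the firewall entirely, the following is a minimal sketch of opening just the ports used in this setup; the exact tool depends on your distribution (firewall-cmd assumes a CentOS 7 host with firewalld, iptables assumes older Red Hat releases like the one in section 3.2):

Shell:

# CentOS 7 / firewalld
firewall-cmd --permanent --add-port=9000/tcp    # fs.defaultFS (HDFS RPC)
firewall-cmd --permanent --add-port=50070/tcp   # NameNode web UI
firewall-cmd --permanent --add-port=8088/tcp    # ResourceManager web UI
firewall-cmd --reload

# Older Red Hat / iptables
iptables -I INPUT -p tcp --dport 50070 -j ACCEPT
iptables -I INPUT -p tcp --dport 8088 -j ACCEPT
service iptables save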

(2) Open http://192.168.181.66:8088/ in a browser (the ResourceManager web UI, where cluster information can be viewed)

(3) Open http://192.168.181.66:50070/ in a browser (the NameNode web UI)

(4) Open http://192.168.181.66:9001 in a browser (the SecondaryNameNode)

13. Installation is complete. This is only the beginning of a big-data application: the real work is to write programs against the Hadoop APIs, according to your own needs, and put HDFS and MapReduce to use.

3.6 Problems encountered

1) At startup the datanode reports "Problem connecting to server"

The fix is to correct /etc/hosts (a sketch follows); see http://blog.csdn.net/renfengjun/article/details/25320043 for details.
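The post does not show the hosts entries themselves; a minimal sketch using the addresses from section 3.2 — the host names master/slave1/slave2 are illustrative placeholders, replace them with your real host names (e.g. nmsc1/nmsc2 in the logs above):

Shell:

# /etc/hosts on every node: map each node's real IP to its hostname,
# and make sure the node's own hostname is NOT bound to 127.0.0.1
cat >> /etc/hosts <<'EOF'
192.168.181.66  master
192.168.88.21   slave1
192.168.88.22   slave2
EOF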

2) When starting YARN, the nodemanager fails to start with "doesn't satisfy minimum allocations, Sending SHUTDOWN signal". The cause is that yarn.nodemanager.resource.memory-mb in yarn-site.xml gives the nodemanager too little memory; it must not be lower than 1024 MB.

3) hbase error: util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Fix:

sudo rm -r /opt/hbase-1.2.1/lib/native

sudo mkdir /opt/hbase-1.2.1/lib/native

sudo mkdir /opt/hbase-1.2.1/lib/native/Linux-amd64-64

sudo cp -r /opt/hadoop-2.7.1/lib/native/* /opt/hbase-1.2.1/lib/native/Linux-amd64-64/

4) Creating an hbase table with a compression codec fails with "Compression algorithm 'snappy' previously failed test. Set hbase.table.sanity.checks to false at conf"

The create statement was: create 'signal1', { NAME => 'info', COMPRESSION => 'SNAPPY' }, SPLITS => ['00000000000000000000000000000000','10000000000000000000000000000000','20000000000000000000000000000000','30000000000000000000000000000000','40000000000000000000000000000000','50000000000000000000000000000000','60000000000000000000000000000000','70000000000000000000000000000000','80000000000000000000000000000000','90000000000000000000000000000000']

The fix is to add the following to hbase-site.xml:

<property>
  <name>hbase.table.sanity.checks</name>
  <value>false</value>
</property>

5) The nodemanager fails to start with the following error:

Log:
  1. 2016-01-26 18:45:10,891 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
  2. 2016-01-26 18:45:11,778 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
  3. org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: NodeManager from  slavery01 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.
  4. at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
  5. at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
  6. at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
  7. at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
  8. at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
  9. at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
  10. at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
  11. at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
  12. 2016-01-26 18:45:11,781 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: NodeManager from  slavery01 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.

Fix: in yarn-site.xml configure

XML:

<property>
  <!-- this value must not be less than 1024 -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>

6) WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Suggested fix: http://zhidao.baidu.com/link?url=_cOK3qt3yzgWwifuMGuZhSOTUyKTiYZfyHr3Xd1id345B9SvSIGsJ-mGLDsk4QseWmBnY5LjxgwHwjKQ4UTFtm8IV6J2im4QfSRh__MhzpW (a common workaround is sketched below)
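The linked page is not reproduced here. One workaround commonly suggested for this warning (an assumption on my part, not taken from the original post) is to point the JVM at Hadoop's native library directory in hadoop-env.sh; if the bundled native libraries do not match your platform, the warning is harmless and can also simply be ignored:

Shell:

# append to /opt/hadoop-2.7.1/etc/hadoop/hadoop-env.sh
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"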

7) Many people running single-node Hadoop hit the error "Hadoop hostname: Unknown host"

Fix: use ifconfig to find the machine's IP and hostname to find its host name, say 192.168.55.128 and hadoop. Add the line "192.168.55.128 hadoop" to /etc/hosts, then change localhost to hadoop in core-site.xml and mapred-site.xml. After that run ./hdfs namenode -format, and sbin/start-all.sh will work.

3.7 A fellow user's write-up on installing hadoop2.7 + hbase1.0 + hive1.2 is attached as "我学大数据技术(hadoop2.7+hbase1.0+hive1.2).pdf".

Other well-written articles:

Hadoop 2.7.1 distributed installation, preparation: http://my.oschina.net/allman90/blog/485352

Hadoop 2.7.1 distributed installation, installation: http://my.oschina.net/allman90/blog/486117

3.8 Common shell commands

Shell:

# List the files and directories under the HDFS path /user/
bin/hdfs dfs -ls /user/
# Upload the local file /opt/smsmessage.txt to the HDFS directory /user/
bin/hdfs dfs -put /opt/smsmessage.txt /user/
# Download the HDFS file /user/smsmessage.txt to the local directory /opt/
bin/hdfs dfs -get /user/smsmessage.txt /opt/
# Print the contents of the HDFS text file /user/smsmessage.txt
bin/hdfs dfs -cat /user/smsmessage.txt
# Print the HDFS file /user/smsmessage.txt as text
bin/hdfs dfs -text /user/smsmessage.txt
# Delete the HDFS file /user/smsmessage.txt
bin/hdfs dfs -rm /user/smsmessage.txt
# Before running a balance operation, the network bandwidth it may use can be limited, e.g. 10 MB = 10*1024*1024
bin/hdfs dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
# Run the WordCount example that ships with Hadoop; /input must already exist on HDFS and contain files, /output is created by the job
bin/hadoop jar /opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
# Check the health of the whole file system; note that it does not repair under-replicated blocks itself, that is done asynchronously by a separate NameNode thread
cd /opt/hadoop-2.7.1/bin
./hdfs fsck /
# Set the replication factor for everything under the HDFS root directory /
cd /opt/hadoop-2.7.1/bin
./hadoop fs -setrep -R 2 /
# the same can be done with
./hdfs dfs -setrep -R 2 /
# Print detailed information for every block of a file, including datanode and rack locations
cd /opt/hadoop-2.7.1/bin
./hadoop fsck /user/distribute-hadoop-boss/tmp/pgv/20090813/1000000103/input/JIFEN.QQ.COM.2009-08-13-18.30 -files -blocks -locations -racks
# Look up the values configured in hdfs-site.xml for dfs.client.block.write.replace-datanode-on-failure.enable and dfs.client.block.write.replace-datanode-on-failure.policy
cd /opt/hadoop-2.7.1/bin
./hdfs getconf -confKey dfs.client.block.write.replace-datanode-on-failure.enable
./hdfs getconf -confKey dfs.client.block.write.replace-datanode-on-failure.policy
# Start HDFS: reads slaves and the configuration files and starts the HDFS services on all nodes
cd /opt/hadoop-2.7.1/sbin
./start-dfs.sh
# Start YARN: reads slaves and the configuration files and starts the YARN services on all nodes
cd /opt/hadoop-2.7.1/sbin
./start-yarn.sh
# Start or stop a single daemon (namenode, secondarynamenode, journalnode, datanode) on the local machine only
./hadoop-daemon.sh start/stop namenode
./hadoop-daemon.sh start/stop secondarynamenode
./hadoop-daemon.sh start/stop journalnode
./hadoop-daemon.sh start/stop datanode
# Check whether HDFS is in safe mode
[hadoop@nmsc2 bin]$ cd /opt/hadoop-2.7.1/bin
[hadoop@nmsc2 bin]$ ./hdfs dfsadmin -safemode get
Safe mode is OFF
[hadoop@nmsc2 bin]$
# Leave safe mode
[hadoop@nmsc2 bin]$ cd /opt/hadoop-2.7.1/bin
[hadoop@nmsc2 bin]$ ./hdfs dfsadmin -safemode leave
Safe mode is OFF
[hadoop@nmsc2 bin]$
# Inspect the values of some configuration parameters
[hadoop@nmsc1 bin]$ cd /opt/hadoop-2.7.1/bin
[hadoop@nmsc1 bin]$ ./hdfs getconf -confKey dfs.datanode.handler.count
100
[hadoop@nmsc1 bin]$ ./hdfs getconf -confKey dfs.namenode.handler.count
200
[hadoop@nmsc1 bin]$ ./hdfs getconf -confKey dfs.namenode.avoid.read.stale.datanode
false
[hadoop@nmsc1 bin]$ ./hdfs getconf -confKey dfs.namenode.avoid.write.stale.datanode
false
[hadoop@nmsc1 bin]$
