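Setting Up Hadoop 2.6.4 in Pseudo-Distributed Mode
Preparation
Operating system
CentOS 7
Software environment
- JDK 1.7.0_79 (the jdk-7u79-linux-x64.tar.gz archive is used below)
- SSH, which normally ships with the system; if it is missing, install it first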
Disable the firewall
systemctl stop firewalld.service       # stop firewalld
systemctl disable firewalld.service    # keep firewalld from starting at boot
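Set the hostname
[root@localhost ~]# hostname localhost
A name set this way lasts only until the next reboot. On CentOS 7, hostnamectl set-hostname localhost makes it persistent.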
Installing the environment
Install the JDK
[root@localhost ~]# tar -xzvf jdk-7u79-linux-x64.tar.gz
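Configure the Java environment variables
[root@localhost ~]# vi /etc/profile
# add the following settings
JAVA_HOME=/root/jdk1.7.0_79
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export PATH
export CLASSPATH
# reload the profile so the variables take effect in the current shell
[root@localhost ~]# source /etc/profile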
Verify Java
[root@localhost ~]# java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
Output like the above means Java is installed and configured correctly.
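If java -version fails, the usual cause is a JAVA_HOME that does not point at the unpacked JDK. The following is a minimal sketch of a sanity check; check_java_home is a hypothetical helper name, not something the JDK or Hadoop provides.

```shell
# Hypothetical helper (not part of the JDK or Hadoop): succeeds only if the
# given directory looks like a JDK home, i.e. it contains an executable bin/java.
# With no argument it falls back to the current value of JAVA_HOME.
check_java_home() {
    local home="${1:-$JAVA_HOME}"
    if [ -z "$home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable found at $home/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $home"
}
```

Calling check_java_home /root/jdk1.7.0_79 after editing /etc/profile confirms the path used in this post before moving on to Hadoop.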
Installing Hadoop
Download Hadoop 2.6.4
Install Hadoop 2.6.4
[root@localhost ~]# tar -xzvf hadoop-2.6.4.tar.gz
Configure the Hadoop environment variables
[root@localhost ~]# vim /etc/profile
# add the following settings
export HADOOP_HOME=/root/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# reload the profile so the hadoop commands are on the PATH
[root@localhost ~]# source /etc/profile
[root@localhost ~]# vim /root/hadoop-2.6.4/etc/hadoop/hadoop-env.sh
# change the following setting
# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/root/jdk1.7.0_79
Verify Hadoop
[root@localhost ~]# hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /root/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar
Edit the Hadoop configuration files
All configuration files live in /root/hadoop-2.6.4/etc/hadoop. The Hadoop 2.6.4 archive ships only mapred-site.xml.template, so copy it to mapred-site.xml before editing.
<!-- core-site.xml -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
<!-- hdfs-site.xml -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
<!-- mapred-site.xml -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
<!-- yarn-site.xml -->
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
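For repeatable setups, the four one-property files can also be generated from the shell instead of edited by hand. This is only a sketch: write_site_xml and write_all_site_xml are hypothetical helper names, the default directory is the one used in this post, and the functions overwrite any existing files, so they suit a fresh install only.

```shell
# Hypothetical helper: write a Hadoop *-site.xml file containing one property.
# Arguments: target file, property name, property value.
write_site_xml() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
    <property>
        <name>$2</name>
        <value>$3</value>
    </property>
</configuration>
EOF
}

# Write the four files with the values from this post.
# The directory argument is optional; the fallback path is an assumption
# based on the install location /root/hadoop-2.6.4.
write_all_site_xml() {
    local conf_dir="${1:-${HADOOP_CONF_DIR:-/root/hadoop-2.6.4/etc/hadoop}}"
    write_site_xml "$conf_dir/core-site.xml"   fs.defaultFS                  hdfs://localhost:9000 &&
    write_site_xml "$conf_dir/hdfs-site.xml"   dfs.replication               1 &&
    write_site_xml "$conf_dir/mapred-site.xml" mapreduce.framework.name      yarn &&
    write_site_xml "$conf_dir/yarn-site.xml"   yarn.nodemanager.aux-services mapreduce_shuffle
}
```

Running write_all_site_xml with no argument targets the configuration directory named above; passing a scratch directory first makes it easy to inspect the generated XML before using it.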
Passwordless SSH login
[root@localhost ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@localhost ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Run the following command; if it does not ask for a password, the setup succeeded:
[root@localhost ~]# ssh localhost
Last login: Fri May 6 05:17:32 2016 from 192.168.154.1
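If ssh localhost still prompts for a password, one common cause is file permissions: with its default StrictModes setting, sshd ignores key files that other users could modify. A minimal sketch of a fix follows; fix_ssh_permissions is a hypothetical helper name, and it assumes authorized_keys already exists in the directory.

```shell
# Hypothetical helper: tighten permissions on an SSH directory so that sshd
# (with StrictModes enabled, which is its default) accepts the key files.
# With no argument it operates on the current user's ~/.ssh.
fix_ssh_permissions() {
    local ssh_dir="${1:-$HOME/.ssh}"
    chmod 700 "$ssh_dir" &&
    chmod 600 "$ssh_dir/authorized_keys"
}
```

After running fix_ssh_permissions, retry ssh localhost to see whether the prompt is gone.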
Running Hadoop
Format HDFS
[root@localhost ~]# hdfs namenode -format
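Formatting is meant to be done once; formatting again discards the existing file system metadata. Since this post does not set hadoop.tmp.dir or dfs.namenode.name.dir, the NameNode should be using the default location /tmp/hadoop-<user>/dfs/name. The sketch below checks whether that directory has already been formatted; namenode_is_formatted is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the NameNode storage directory has already
# been formatted, which is detected by the presence of current/VERSION.
# The default path assumes hadoop.tmp.dir and dfs.namenode.name.dir were
# left at their default values.
namenode_is_formatted() {
    local name_dir="${1:-/tmp/hadoop-$(id -un)/dfs/name}"
    [ -f "$name_dir/current/VERSION" ]
}
```

A guarded invocation would then read: namenode_is_formatted || hdfs namenode -format. Note that the default location is under /tmp, which the system may clean up, so a longer-lived install would normally set hadoop.tmp.dir explicitly.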
Start the NameNode, DataNode and YARN
[root@localhost ~]# start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /root/hadoop-2.6.4/logs/hadoop-root-namenode-localhost.out
localhost: starting datanode, logging to /root/hadoop-2.6.4/logs/hadoop-root-datanode-localhost.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/hadoop-2.6.4/logs/hadoop-root-secondarynamenode-localhost.out
[root@localhost ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /root/hadoop-2.6.4/logs/yarn-root-resourcemanager-localhost.out
localhost: starting nodemanager, logging to /root/hadoop-2.6.4/logs/yarn-root-nodemanager-localhost.out
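The start scripts return even if a daemon dies shortly afterwards, so it is worth confirming that all five processes are up. The JDK's jps tool lists running Java processes as "pid name" pairs; the sketch below reads that output and prints whichever expected daemons are absent. missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of each
# expected pseudo-distributed daemon that does not appear in it.
# Names are compared exactly, so NameNode is not confused with SecondaryNameNode.
missing_daemons() {
    awk '
        { seen[$2] = 1 }    # each jps line has the form: <pid> <main class name>
        END {
            n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }
    '
}
```

Usage: jps | missing_daemons. Empty output means all five daemons are running; for any name that is printed, the matching log file under /root/hadoop-2.6.4/logs is the place to look.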
Upload test files to HDFS
First create test1.txt and test2.txt in /root/test, containing "hello world" and "hello hadoop" respectively, and save them.
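The two files can also be created from the shell rather than in an editor. A minimal sketch follows; make_test_files is a hypothetical helper name, and the default directory is the /root/test path used in this post.

```shell
# Hypothetical helper: create the two sample input files.
# Each file holds one line ending in a newline, giving 12 and 13 bytes.
make_test_files() {
    local dir="${1:-/root/test}"
    mkdir -p "$dir" &&
    printf 'hello world\n'  > "$dir/test1.txt" &&
    printf 'hello hadoop\n' > "$dir/test2.txt"
}
```

The resulting sizes of 12 and 13 bytes match the sizes reported by the hadoop fs -ls input listing after the upload.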
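Use the following commands to upload the files to the input directory on HDFS. The relative path input resolves to /user/root/input, so the home directory /user/root is created first in case it does not yet exist on a freshly formatted file system:
[root@localhost ~]# hadoop fs -mkdir -p /user/root
[root@localhost ~]# hadoop fs -put /root/test/ input
[root@localhost ~]# hadoop fs -ls input
Found 2 items
-rw-r--r-- 1 root supergroup 12 2016-05-06 06:35 input/test1.txt
-rw-r--r-- 1 root supergroup 13 2016-05-06 06:35 input/test2.txt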
Run the wordcount demo
Enter the following command and wait for the job to finish:
[root@localhost ~]# hadoop jar /root/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount input output
16/05/06 06:44:15 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/06 06:44:16 INFO input.FileInputFormat: Total input paths to process : 2
16/05/06 06:44:17 INFO mapreduce.JobSubmitter: number of splits:2
16/05/06 06:44:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1462530786445_0001
16/05/06 06:44:18 INFO impl.YarnClientImpl: Submitted application application_1462530786445_0001
16/05/06 06:44:18 INFO mapreduce.Job: The url to track the job: http://server1:8088/proxy/application_1462530786445_0001/
16/05/06 06:44:18 INFO mapreduce.Job: Running job: job_1462530786445_0001
16/05/06 06:44:33 INFO mapreduce.Job: Job job_1462530786445_0001 running in uber mode : false
16/05/06 06:44:33 INFO mapreduce.Job: map 0% reduce 0%
16/05/06 06:44:52 INFO mapreduce.Job: map 50% reduce 0%
16/05/06 06:44:53 INFO mapreduce.Job: map 100% reduce 0%
16/05/06 06:45:03 INFO mapreduce.Job: map 100% reduce 100%
16/05/06 06:45:03 INFO mapreduce.Job: Job job_1462530786445_0001 completed successfully
16/05/06 06:45:04 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=55
        FILE: Number of bytes written=320242
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=249
        HDFS: Number of bytes written=25
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=34487
        Total time spent by all reduces in occupied slots (ms)=7744
        Total time spent by all map tasks (ms)=34487
        Total time spent by all reduce tasks (ms)=7744
        Total vcore-milliseconds taken by all map tasks=34487
        Total vcore-milliseconds taken by all reduce tasks=7744
        Total megabyte-milliseconds taken by all map tasks=35314688
        Total megabyte-milliseconds taken by all reduce tasks=7929856
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=41
        Map output materialized bytes=61
        Input split bytes=224
        Combine input records=4
        Combine output records=4
        Reduce input groups=3
        Reduce shuffle bytes=61
        Reduce input records=4
        Reduce output records=3
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=364
        CPU time spent (ms)=3990
        Physical memory (bytes) snapshot=515538944
        Virtual memory (bytes) snapshot=2588155904
        Total committed heap usage (bytes)=296755200
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=25
    File Output Format Counters
        Bytes Written=25
View the results
[root@localhost ~]# hadoop fs -ls output
Found 2 items
-rw-r--r-- 1 root supergroup 0 2016-05-06 06:45 output/_SUCCESS
-rw-r--r-- 1 root supergroup 25 2016-05-06 06:45 output/part-r-00000
[root@localhost ~]# hadoop fs -cat output/part-r-00000
hadoop 1
hello 2
world 1
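To cross-check the job output without Hadoop, the same count can be computed locally on the original files. This is a rough sketch, not how the wordcount example is implemented: it splits on whitespace with awk and sorts in byte order, which happens to agree with the job for this simple input. local_wordcount is a hypothetical helper name.

```shell
# Hypothetical helper: count whitespace-separated words in the given files
# and print "word<TAB>count" lines sorted by word in byte order.
local_wordcount() {
    awk '
        { for (i = 1; i <= NF; i++) count[$i]++ }
        END { for (w in count) printf "%s\t%d\n", w, count[w] }
    ' "$@" | LC_ALL=C sort
}
```

Running local_wordcount /root/test/test1.txt /root/test/test2.txt should list hadoop 1, hello 2 and world 1, the same three words and counts as part-r-00000.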
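With that, the pseudo-distributed setup is complete.
A fully distributed setup is covered in a separate article.
Original article; when reposting, please credit: reposted from xdlysk's blog
Permalink: Setting Up Hadoop in Pseudo-Distributed Mode [http://www.xdlysk.com/article/572c956642c817300e0f7ab1]
Reposted from: https://www.cnblogs.com/xdlysk/p/5514082.html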