Yunfan Big Data Academy (云帆大数据学院): Compiling Hadoop 2.2.0 from Source
2.1 Download Addresses
1. Apache Hadoop (100% open source) download:
- http://hadoop.apache.org/releases.html
- SVN: http://svn.apache.org/repos/asf/hadoop/common/branches/
2. CDH (Cloudera's Distribution including Apache Hadoop, 100% open source) download:
- http://archive.cloudera.com/cdh4/cdh/4/ (tar.gz files!)
- http://archive.cloudera.com/cdh5/cdh/ (tar.gz files!)
2.2 Notes on the Official Release
(1) Official site: http://hadoop.apache.org
(2) Download the Hadoop package
(3) Problems with the official release
The official binaries are built in a 32-bit Linux environment, so running them on 64-bit Linux produces an error:
- Warning: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- The native libraries inside the official binary package are 32-bit, which you can verify with:
$ file $HADOOP_PREFIX/lib/native/libhadoop.so.1.0.0
The output shows the library is 32-bit:
libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x9eb1d49b05f67d38454e42b216e053a27ae8bac9, not stripped
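The 32-bit/64-bit check above is easy to script. The helper below is only a sketch: the library path and the `file` output format are assumptions based on a typical GNU/Linux system.

```shell
#!/bin/sh
# Classify a native library as 32- or 64-bit from a line of `file` output.
elf_bits() {
    case "$1" in
        *"ELF 32-bit"*) echo 32 ;;
        *"ELF 64-bit"*) echo 64 ;;
        *)              echo unknown ;;
    esac
}

# Typical use against the Hadoop native library (path is an assumption):
# elf_bits "$(file "$HADOOP_PREFIX/lib/native/libhadoop.so.1.0.0")"
```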
2.3 Official Build Notes
The hadoop-2.2.0-src.tar.gz package contains a BUILDING.txt file that documents the build steps in detail.
Build instructions for Hadoop
----------------------------------------------------------------------------------
Requirements:
* Unix System (this walkthrough uses 64-bit CentOS 6.4)
* JDK 1.6+
* Maven 3.0 or later (3.0.5 recommended)
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
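Before starting, it can save time to confirm the prerequisite tools are on the PATH. A minimal sketch (the binary names java, mvn, protoc, and cmake are the usual ones; adjust for your system):

```shell
#!/bin/sh
# Quick presence check for the build prerequisites listed above.
have() { command -v "$1" >/dev/null 2>&1; }

for tool in java mvn protoc cmake; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: MISSING" >&2
    fi
done
```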
----------------------------------------------------------------------------------
Maven main modules:
hadoop (Main Hadoop project)
-hadoop-project (Parent POM for all Hadoop Maven modules.)
(All plugins & dependencies versions are defined here.)
-hadoop-project-dist (Parent POM for modules that generate distributions.)
-hadoop-annotations (Generates the Hadoop doclet used to generate the Javadocs)
-hadoop-assemblies (Maven assemblies used by the different modules)
-hadoop-common-project (Hadoop Common)
-hadoop-hdfs-project (Hadoop HDFS)
-hadoop-mapreduce-project (Hadoop MapReduce)
-hadoop-tools (Hadoop tools like Streaming, Distcp, etc.)
-hadoop-dist (Hadoop distribution assembler)
----------------------------------------------------------------------------------
Where to run Maven from?
It can be run from any module. The only catch is that if not run from trunk, all modules that are not part of the build run must be installed in the local Maven cache or available in a Maven repository.
----------------------------------------------------------------------------------
Maven build goals:
* Clean : mvn clean
* Compile : mvn compile [-Pnative]
* Run tests : mvn test [-Pnative]
* Create JAR : mvn package
* Run findbugs : mvn compile findbugs:findbugs
* Run checkstyle : mvn compile checkstyle:checkstyle
* Install JAR in M2 cache : mvn install
* Deploy JAR to Maven repo : mvn deploy
* Run clover : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
* Run Rat : mvn apache-rat:check
* Build javadocs : mvn javadoc:javadoc
* Build distribution : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
* Change Hadoop version : mvn versions:set -DnewVersion=NEWVERSION
Build options:
* Use -Pnative to compile/bundle native code
* Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
* Use -Psrc to create a project source TAR.GZ
* Use -Dtar to create a TAR with the distribution (using -Pdist)
Snappy build options:
Snappy is a compression library that can be utilized by the native code. It is currently an optional component, meaning that Hadoop can be built with or without this dependency.
* Use -Drequire.snappy to fail the build if libsnappy.so is not found. If this option is not specified and the snappy library is missing, we silently build a version of libhadoop.so that cannot make use of snappy. This option is recommended if you plan on making use of snappy and want to get more repeatable builds.
* Use -Dsnappy.prefix to specify a nonstandard location for the libsnappy header files and library files. You do not need this option if you have installed snappy using a package manager.
* Use -Dsnappy.lib to specify a nonstandard location for the libsnappy library files. Similarly to snappy.prefix, you do not need this option if you have installed snappy using a package manager.
* Use -Dbundle.snappy to copy the contents of the snappy.lib directory into the final tar file. This option requires that -Dsnappy.lib is also given, and it ignores the -Dsnappy.prefix option.
---------------------------------------------------------------------------------
Building components separately
If you are building a submodule directory, all the hadoop dependencies this submodule has will be resolved as all other 3rd party dependencies. This is, from the Maven cache or from a Maven repository (if not available in the cache or the SNAPSHOT 'timed out').
An alternative is to run 'mvn install -DskipTests' from Hadoop source top level once; and then work from the submodule. Keep in mind that SNAPSHOTs time out after a while, using the Maven '-nsu' will stop Maven from trying to update SNAPSHOTs from external repos.
----------------------------------------------------------------------------------
Protocol Buffer compiler
The version of Protocol Buffer compiler, protoc, must match the version of the protobuf JAR.
If you have multiple versions of protoc in your system, you can set in your build shell the HADOOP_PROTOC_PATH environment variable to point to the one you want to use for the Hadoop build. If you don't define this environment variable, protoc is looked up in the PATH.
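The lookup rule described here can be expressed as a small helper, preferring HADOOP_PROTOC_PATH when it is set and otherwise falling back to the protoc found on PATH. This is a sketch for illustration, not part of the Hadoop build itself:

```shell
#!/bin/sh
# Resolve which protoc binary the Hadoop build would use:
# HADOOP_PROTOC_PATH wins if set; otherwise look protoc up in PATH.
resolve_protoc() {
    if [ -n "$HADOOP_PROTOC_PATH" ]; then
        echo "$HADOOP_PROTOC_PATH"
    else
        command -v protoc || echo "protoc not found" >&2
    fi
}

# Example (the install path below is an assumption):
# HADOOP_PROTOC_PATH=/opt/protobuf-2.5.0/bin/protoc mvn package -Pdist,native -DskipTests -Dtar
```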
----------------------------------------------------------------------------------
Importing projects to eclipse
When you import the project to eclipse, install hadoop-maven-plugins at first.
$ cd hadoop-maven-plugins
$ mvn install
Then, generate eclipse project files.
$ mvn eclipse:eclipse -DskipTests
At last, import to eclipse by specifying the root directory of the project via
[File] > [Import] > [Existing Projects into Workspace].
----------------------------------------------------------------------------------
Building distributions:
Create binary distribution without native code and without documentation: (binary only)
$ mvn package -Pdist -DskipTests -Dtar
Create binary distribution with native code and with documentation: (binary + native libraries + docs)
$ mvn package -Pdist,native,docs -DskipTests -Dtar
Create source distribution: (source only)
$ mvn package -Psrc -DskipTests
Create source and binary distributions with native code and documentation: (source + binary + native libraries + docs)
$ mvn package -Pdist,native,docs,src -DskipTests -Dtar
Create a local staging version of the website (in /tmp/hadoop-site)
$ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site
----------------------------------------------------------------------------------
Handling out-of-memory errors in builds
If the build process fails with an out-of-memory error, you should be able to fix it by increasing the memory used by Maven, which can be done via the environment variable MAVEN_OPTS.
Here is an example setting to allocate between 256 MB and 512 MB of heap space to Maven:
export MAVEN_OPTS="-Xms256m -Xmx512m"
----------------------------------------------------------------------------------
2.4 Build Steps
Step 1: Install VMware 10 (omitted)
Step 2: Install a 64-bit Linux OS (omitted)
This walkthrough uses 64-bit CentOS 6.4. Download: http://www.centoscn.com/CentosSoft/
Step 3: Get the Linux VM online
(1) Set the VMware virtual machine network mode to NAT
(2) Configure the Linux network to obtain an address from the DHCP server, sharing the host's network
(3) Test: ping www.baidu.com
Step 4: Install the JDK
Note: BUILDING.txt requires JDK 1.6 or later, and it must be a 64-bit build (this environment uses jdk-6u45-linux-x64.bin)
(1) Use an SFTP tool (WinSCP or FileZilla) to upload jdk-6u45-linux-x64.bin to the /software/ directory on the Linux system
(2) Install the JDK
cd /software/
chmod u+x jdk-6u45-linux-x64.bin     # grant execute permission
mkdir /workDir                       # create an install directory (personal preference)
cp jdk-6u45-linux-x64.bin /workDir   # copy to workDir
cd /workDir                          # switch to the install directory
./jdk-6u45-linux-x64.bin             # run the self-extracting installer
mv jdk1.6.0_45 jdk6u45               # rename the folder for convenience
(3) Configure environment variables
vi /etc/profile
Add the following:
export JAVA_HOME=/workDir/jdk6u45
export PATH=.:$PATH:$JAVA_HOME/bin
(4) Apply the environment variables
source /etc/profile
(5) Verify the JDK installation
java -version
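To check the 1.6+ requirement in a script rather than by eye, a helper can pull the major version out of the classic `1.x.y_zz` version scheme. This is a sketch; the way `java -version` output is quoted is an assumption:

```shell
#!/bin/sh
# Extract the JDK major version ("6" from "1.6.0_45") so a script can
# verify the JDK 1.6+ requirement. Assumes the classic "1.x.y_zz" scheme.
java_major() {
    echo "$1" | sed -n 's/^1\.\([0-9][0-9]*\).*/\1/p'
}

# Typical use:
# ver=$(java -version 2>&1 | sed -n 's/.*"\(.*\)".*/\1/p')
# [ "$(java_major "$ver")" -ge 6 ] || echo "JDK too old" >&2
```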
Step 5: Install dependency packages
yum install autoconf -y
yum install automake -y
yum install libtool -y
yum install cmake -y
yum install ncurses-devel -y
yum install openssl-devel -y
yum install gcc -y
yum install gcc-c++ -y
yum install lzo-devel -y
yum install zlib-devel -y
Note: -y answers "yes" to every prompt during installation
Verify:
rpm -qa | grep autoconf
[About yum]:
yum (Yellowdog Updater, Modified) is a shell front-end package manager used on Fedora, Red Hat, and SUSE. Built on RPM, it downloads RPM packages from configured repositories and installs them, resolving dependencies automatically and installing all required packages in one pass, with no need to download and install each one by hand. yum provides short, memorable commands for searching, installing, and removing individual packages, package groups, or everything at once.
The general form of a yum command is: yum [options] [command] [package...]
[options] is optional and includes -h (help), -y (answer "yes" to all prompts), -q (quiet mode), and so on. [command] is the operation to perform, and [package ...] names its targets.
- Some common commands:
Install the fastest-mirror plugin: yum install yum-fastestmirror
Install the yum GUI plugin: yum install yumex
List installable package groups: yum grouplist
- Installation
yum install                 install everything
yum install package1        install the package package1
yum groupinstall group1     install the package group group1
Step 6: Install Maven
(1) Maven version: download apache-maven-3.0.5-bin.tar.gz
Note: do not use the newer Maven 3.1.1; the Hadoop 2.2.0 source has compatibility problems with Maven 3.1.x, which cause:
java.lang.NoClassDefFoundError: org/sonatype/aether/graph/DependencyFilter
Maven 3.0.5 is recommended.
(2) Download
URL: http://maven.apache.org/download.cgi
Choose apache-maven-3.0.5-bin.tar.gz
(3) Upload to Linux and extract into the install directory
tar -zxvf apache-maven-3.0.5-bin.tar.gz -C /workDir
(4) Set environment variables
vi /etc/profile
Add:
export MAVEN_HOME=/workDir/apache-maven-3.0.5
export PATH=$PATH:$MAVEN_HOME/bin
Run: source /etc/profile (or . /etc/profile)
Verify:
mvn -v
Step 7: Configure a domestic Maven mirror
(1) Edit the settings.xml file
Go to the install directory /workDir/apache-maven-3.0.5/conf
* Add to the <mirrors> section:
<mirror>
<id>nexus-osc</id>
<mirrorOf>*</mirrorOf>
<name>Nexus osc</name>
<url>http://maven.oschina.net/content/groups/public/</url>
</mirror>
* Add to the <profiles> section:
<profile>
<id>jdk-1.6</id>
<activation>
<jdk>1.6</jdk>
</activation>
<repositories>
<repository>
<id>nexus</id>
<name>local private nexus</name>
<url>http://maven.oschina.net/content/groups/public/</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>nexus</id>
<name>local private nexus</name>
<url>http://maven.oschina.net/content/groups/public/</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</pluginRepository>
</pluginRepositories>
</profile>
(2) Copy the configuration
Note: copy settings.xml into the user's home directory so that every Maven invocation picks up this configuration
cd /home/hadoop     # check whether the home directory contains a .m2 folder; create it if not
mkdir .m2
cp /workDir/apache-maven-3.0.5/conf/settings.xml ~/.m2     # copy the file
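The commands above can be wrapped in a small function that also creates ~/.m2 when it is missing. The paths are the ones used in this walkthrough and may differ on your machine:

```shell
#!/bin/sh
# Install a Maven settings.xml into a user's per-user Maven directory,
# creating the .m2 folder if it does not exist yet.
install_settings() {
    src=$1; home=$2
    mkdir -p "$home/.m2"
    cp "$src" "$home/.m2/settings.xml"
}

# Typical use for this walkthrough's layout:
# install_settings /workDir/apache-maven-3.0.5/conf/settings.xml "$HOME"
```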
(3) Configure DNS
vi /etc/resolv.conf
Set the following:
nameserver 8.8.8.8
nameserver 8.8.4.4
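Since the build downloads everything over the network, it is worth confirming the DNS configuration actually took. A minimal sketch that just looks for a nameserver line:

```shell
#!/bin/sh
# Confirm that a resolv.conf-style file declares at least one nameserver
# (needed because the build fetches dependencies from the network).
has_nameserver() {
    grep -q '^nameserver[[:space:]]' "$1"
}

# has_nameserver /etc/resolv.conf && echo "DNS configured"
```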
Step 8: Install protobuf
(1) Download protobuf-2.5.0.tar.gz
https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
(2) Extract into the install directory
cd /software
tar -zxvf protobuf-2.5.0.tar.gz -C /workDir
(3) Install the following three dependencies (skip if already installed)
yum install gcc -y
yum install gcc-c++ -y
yum install make -y
Note: if these three packages are missing, the build fails with:
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.2.0:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: 'protoc --version' did not return a version -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-common
(4) Run the configure script
Enter the source directory and run configure:
cd /workDir/protobuf-2.5.0     # enter the source directory
./configure                    # run the configure script
(5) Build and install
make && make check && make install
Note: building protobuf requires the gcc and gcc-c++ packages (skip this if they were installed earlier)
(6) Configure environment variables
vi /etc/profile
Add:
export PROTOBUF_HOME=/workDir/protobuf-2.5.0
export PATH=$PATH:$PROTOBUF_HOME/bin
Apply the configuration:
source /etc/profile (or . /etc/profile)
Verify:
protoc --version
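BUILDING.txt requires protoc to match the protobuf JAR version exactly, so a strict check against 2.5.0 is safer than eyeballing the output. A sketch that assumes the usual `libprotoc X.Y.Z` output format:

```shell
#!/bin/sh
# Check that a `protoc --version` line reports exactly the 2.5.0
# release that Hadoop 2.2.0's protobuf JAR expects.
protoc_ok() {
    # $1: output of `protoc --version`, e.g. "libprotoc 2.5.0"
    [ "${1#libprotoc }" = "2.5.0" ]
}

# protoc_ok "$(protoc --version)" || echo "wrong protoc version" >&2
```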
Step 9: Install findbugs-3.0.0
(1) Download findbugs-3.0.0.tar.gz
http://sourceforge.jp/projects/sfnet_findbugs/releases/
(2) Extract into the install directory
cd /software
tar -zxvf findbugs-3.0.0.tar.gz -C /workDir
(3) Set environment variables
vi /etc/profile
Add the following:
export FINDBUGS_HOME=/workDir/findbugs-3.0.0
export PATH=$PATH:$FINDBUGS_HOME/bin
(4) Apply the environment variables
source /etc/profile (or . /etc/profile)
(5) Verify
findbugs -version
Important note:
If verification fails with a version error, the JDK is incompatible: findbugs 2.5.0 and findbugs 3.0.0 are compiled against JDK 7 or later, so JDK 7 must be installed on the Linux box first.
Step 10: Compile the hadoop-2.2.0 source
(1) Download hadoop-2.2.0-src.tar.gz
http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
(2) Extract into the install directory
cd /software
tar -zxvf hadoop-2.2.0-src.tar.gz -C /workDir
(3) Patch the source
- Important: the hadoop-2.2.0 source has a bug, documented on the Apache JIRA:
JIRA: https://issues.apache.org/jira/browse/HADOOP-10110
- The fix:
Index: hadoop-common-project/hadoop-auth/pom.xml
===================================================================
--- hadoop-common-project/hadoop-auth/pom.xml (revision 1543124)
+++ hadoop-common-project/hadoop-auth/pom.xml (working copy)
@@ -54,6 +54,11 @@
</dependency>
<dependency>
<groupId>org.mortbay.jetty</groupId>
+ <artifactId>jetty-util</artifactId>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
<scope>test</scope>
</dependency>
As the official fix above shows, edit the pom.xml file under $HADOOP_SRC_HOME/hadoop-common-project/hadoop-auth and add the following after line 55:
<dependency>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
<scope>test</scope>
</dependency>
Otherwise the build fails with:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
[ERROR] /home/chuan/trunk/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[84,13] cannot access org.mortbay.component.AbstractLifeCycle
[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
(4) Build
The official build instructions say:
Create source and binary distributions with native code and documentation: (source + binary + native libraries + docs)
$ mvn package -Pdist,native,docs,src -DskipTests -Dtar
This walkthrough builds the binary distribution with native code:
cd /workDir/hadoop-2.2.0-src
mvn package -DskipTests -Pdist,native -Dtar
Note: if the build runs out of memory, raise the Maven heap size:
export MAVEN_OPTS="-Xms256m -Xmx512m"
The build takes quite a while, since Maven downloads the dependencies from the network.
When the following output appears, the build succeeded:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:53.144s
[INFO] Finished at: Fri Nov 22 16:58:32 CST 2013
[INFO] Final Memory: 70M/239M
[INFO] ------------------------------------------------------------------------
Step 11: After the build
1. Inspect the build output
The build output is placed in hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
cd /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
ll     # list the build output
The hadoop-2.2.0 directory now contains:
drwxr-xr-x. 2 root root 4096 Aug 11 12:00 bin
drwxr-xr-x. 3 root root 4096 Aug 11 12:00 etc
drwxr-xr-x. 2 root root 4096 Aug 11 12:00 include
drwxr-xr-x. 3 root root 4096 Aug 11 12:00 lib
drwxr-xr-x. 2 root root 4096 Aug 11 12:00 libexec
drwxr-xr-x. 2 root root 4096 Aug 11 12:00 sbin
drwxr-xr-x. 4 root root 4096 Aug 11 12:00 share
Enter the bin directory and run the hadoop script to check the version:
cd bin
./hadoop version
The full version information is printed:
[root@localhost bin]# ./hadoop version
Hadoop 2.2.0
Subversion Unknown -r Unknown
Compiled by root on 2014-08-11T18:34Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/common/
hadoop-common-2.2.0.jar
2. Check the native library build
cd /workDir/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
file lib/native/*
The libraries are now 64-bit:
[root@localhost hadoop-2.2.0]# file lib//native/*
lib//native/libhadoop.a: current ar archive
lib//native/libhadooppipes.a: current ar archive
lib//native/libhadoop.so: symbolic link to `libhadoop.so.1.0.0'
lib//native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
lib//native/libhadooputils.a: current ar archive
lib//native/libhdfs.a: current ar archive
lib//native/libhdfs.so: symbolic link to `libhdfs.so.0.0.0'
lib//native/libhdfs.so.0.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
At this point, the build is complete!
Reposted from: https://blog.51cto.com/yfteach01/1629703