阿里云一键部署 Hadoop 分布式集群
Hadoop是一个能够让用户轻松架构和使用的开源分布式计算框架,一种可靠、高效、可伸缩的方式进行数据处理。本文主要目的是为大家提供一种非常简单的方法,在阿里云上部署Hadoop集群。
通过<阿里云ROS资源编排服务>,将VPC、NAT Gateway、ECS创建和Hadoop部署过程自动化,使大家能够非常方便地部署一个Hadoop集群。本文创建的Hadoop集群包含三个节点:master.hadoop,slave1.hadoop,slave2.hadoop。
UserData中安装Hadoop四部曲
正常安装Hadoop分为四步:配置ssh无密码登录,安装Java JDK,安装配置Hadoop tar包,和启动测试集群。
1. 配置ssh无密码登录
集群中的所有主机应该可以无密码ssh登录,因此3台ECS都应该执行如下命令,保证所有的主机中临时密钥和公钥均相同,这样3台ECS就可以无密码登录了。
"ssh-keygen -t rsa -P '' -f '/root/.ssh/id_rsa' \n","cd /root/.ssh \n","mv id_rsa bak.id_rsa \n","mv id_rsa.pub bak.id_rsa.pub \n","echo '-----BEGIN RSA PRIVATE KEY-----' > id_rsa \n","echo 'MIIEowIBAAKCAQEAzfQ/QHwWB1njU9+Wu3RYi9g+g5rydSpAE0klefTJuZjtcaic' >> id_rsa \n","echo 'SCeBN5avih8UToZ148+Ef2YzOtoosqluZpoYLCPSpAqr8pmviBJIU3vfu9mnDG9L' >> id_rsa \n","echo 'oevT6K8w3wCBRCmqu+vc0Tpju/EeCuYK3+8w7e2I6F3+zIYzhhX3qmkocje7ACnV' >> id_rsa \n","echo 'yh2DB/2m3sogTMc0RT+5y3kJAnC6TOIlGsYjOOkPbEF3Ifn1o4ZjOFOmQlcRJer4' >> id_rsa \n","echo '1E6rTdXAxS2uDNMFBCf9Xyx7O9J+ELTCAXc4h7AE6WLdQb5Apzv4t1KswCAtRenP' >> id_rsa \n","echo '1xGcUYY8I/JUT2VvBWtQJennrk9jrPZUFDcmcwIDAQABAoIBAAwzDZQaRYvF7UtI' >> id_rsa \n","echo 'kTslVyFhe8J76SS7jfQWfxvMPi66OkZjQG6duG+8g0VhNei42j7WSfjp6trvlT2P' >> id_rsa \n","echo '/7QgKJJkxNNmtmy2Ycljm9kmG0ibSePYq9g5ieHcjr6G3yFUfoKHJBtYpBO74pWu' >> id_rsa \n","echo 'rrI5DuLpERUCjFc9E8w7fOIhPH4XZ6wk/EmPxHTgxZk+aMvqptyPSbUyiUOvCiZf' >> id_rsa \n","echo 'MD0ircs9vgtslMVDlz9m6CoiNz6B3Yf6eLRoGGMiGnsQzZHIfnHCMX8i65Jc+TvQ' >> id_rsa \n","echo 'fLopIHzwBwI55xpcOIBRgYiEAQJhLsSNSFugoxMcwe9RalTGGS21HOQu4b3ZylKM' >> id_rsa \n","echo '8ofEVKECgYEA/Y4MzN04wAsM1yNuHN+9sdiVLG28LWH3dgpcXqa9gyNsWs9Gf8uf' >> id_rsa \n","echo 'qbuvQGeKDRXCW93wO1jO+pCYVrkyY3l+KhBKIqmkJT5gFsa8dBUvEBLALiHNg3+o' >> id_rsa \n","echo 'jR2Vqsemk8kMZA8zfJ3FKcMb1pt4S2GqepqsdC3DgzIIsxufCh7jSzsCgYEAz/Cv' >> id_rsa \n","echo 'Z7gAdSFC8q6QxFxSyhfZwGA6QW6ZU7rZv9ZdGySvZg+vHbpNVmQ0BAQzJui3Qs9u' >> id_rsa \n","echo 'XQMpYafXWzKsPzG6ZWvYXTF7fuxlovvG8Qd2A2QnGtGMB9YtQMHVqbsUDwxMjiTn' >> id_rsa \n","echo 'VBZILkDf+WCwQ98P5UMoumI0goIcNZ+AXhcqrikCgYBkSwLvKfYfqH9Mvfv5Odsr' >> id_rsa \n","echo '9NKUv1c20FB1BYYh/mxp6eIbTW/CbwXZup6IqCvoHxpBAlna77b3T6iibSDsTgtE' >> id_rsa \n","echo 'kirw6Q8/mBukBrpWZGa4QeJ4nPBQuncuUmx4H/7Y6CaZkZW5DiMF8OIbEmYT0y7+' >> id_rsa \n","echo 'zh222r9CLtFYH23aL/uSLwKBgQCS1xyG2eE41aw5RBznDWtJW15iA5If8sJD5ocu' >> id_rsa \n","echo 'eWp2aImUQS8ghxdmEozI6U5WA7CmdWUyObFXTPc/Z6FLXwqJ5IZ+CRt0neuIFNSA' >> id_rsa \n","echo 'EQy9iFQ1FBUW06BRQpBns7yOg9jr6BOTxchjIV0I9caDp1nKRIrWU9NQ9iCFnYVA' >> id_rsa \n","echo '7IsvQQKBgFxgF7UOhwaiMb/ATSuhm2v9kVvRPEO9umdo7YJ9I4L09lYbAtpcnusQ' >> id_rsa \n","echo 'fIROYL25VeEMgYcQyInc3Fm/sgJdbXQnUy+3QbbCcBCmcLj27LPQyxuu7p9hbPPT' >> id_rsa \n","echo 'Mxx37OmYvOkSVTQz0T9HfDGvOJgt4t4cXD4T/7ewk62p6jdpSQrt' >> id_rsa \n","echo '-----END RSA PRIVATE KEY-----' >> id_rsa \n","echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDN9D9AfBYHWeNT35a7dFiL2D6DmvJ1KkATSSV59Mm5mO1xqJxIJ4E3lq+KHxROhnXjz4R/ZjM62iiyqW5mmhgsI9KkCqvyma+IEkhTe9+72acMb0uh69PorzDfAIFEKaq769zROmO78R4K5grf7zDt7YjoXf7MhjOGFfeqaShyN7sAKdXKHYMH/abeyiBMxzRFP7nLeQkCcLpM4iUaxiM46Q9sQXch+fWjhmM4U6ZCVxEl6vjUTqtN1cDFLa4M0wUEJ/1fLHs70n4QtMIBdziHsATpYt1BvkCnO/i3UqzAIC1F6c/XEZxRhjwj8lRPZW8Fa1Al6eeuT2Os9lQUNyZz root@iZ2zee53wf4ndvajz30cdvZ' > id_rsa.pub \n","cp id_rsa.pub authorized_keys \n","chmod 600 authorized_keys \n","chmod 600 id_rsa \n","sed -i 's/# StrictHostKeyChecking ask/StrictHostKeyChecking no/' /etc/ssh/ssh_config \n","sed -i 's/GSSAPIAuthentication yes/GSSAPIAuthentication no/' /etc/ssh/ssh_config \n","service sshd restart \n",
为了保证安全,防止对外泄露密钥和公钥。我们要在master上执行下述命令,替换掉公开的临时密钥与公钥:
"scp -r /root/.ssh/bak.* root@$ipaddr_slave1:/root/.ssh/ \n","scp -r /root/.ssh/bak.* root@$ipaddr_slave2:/root/.ssh/ \n","ssh root@$ipaddr_slave1 \"cd /root/.ssh; rm -f id_rsa; mv bak.id_rsa id_rsa; rm -f id_rsa.pub; mv bak.id_rsa.pub id_rsa.pub; rm -f authorized_keys; cp id_rsa.pub authorized_keys; chmod 600 authorized_keys; chmod 600 id_rsa; exit\" \n","ssh root@$ipaddr_slave2 \"cd /root/.ssh; rm id_rsa; mv bak.id_rsa id_rsa; rm id_rsa.pub; mv bak.id_rsa.pub id_rsa.pub; rm authorized_keys; cp id_rsa.pub authorized_keys; chmod 600 authorized_keys; chmod 600 id_rsa; exit\" \n","cd /root/.ssh \n rm id_rsa \n mv bak.id_rsa id_rsa \n rm id_rsa.pub \n mv bak.id_rsa.pub id_rsa.pub \n rm authorized_keys \n cp id_rsa.pub authorized_keys \n chmod 600 authorized_keys \n chmod 600 id_rsa \n cd \n",
2. 安装配置JDK
master 上安装JDK,并远程控制在slaves上安装JDK。
"yum -y install aria2 \n","aria2c --header='Cookie: oraclelicense=accept-securebackup-cookie' $JdkUrl \n","mkdir -p $JAVA_HOME \ntar zxvf jdk*-x64.tar.gz -C $JAVA_HOME \ncd $JAVA_HOME \nmv jdk*.*/* ./ \nrmdir jdk*.* \n","echo >> /etc/profile \n","echo export JAVA_HOME=$JAVA_HOME >> /etc/profile \n","echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile \n","echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile \n","echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile \n","source /etc/profile \n","ssh root@$ipaddr_slave1 \"mkdir -p $JAVA_HOME; exit\" \n","ssh root@$ipaddr_slave2 \"mkdir -p $JAVA_HOME; exit\" \n","scp -r $JAVA_HOME/* root@$ipaddr_slave1:$JAVA_HOME \n","scp -r $JAVA_HOME/* root@$ipaddr_slave2:$JAVA_HOME \n","ssh root@$ipaddr_slave1 \"echo >> /etc/profile; echo export JAVA_HOME=$JAVA_HOME >> /etc/profile; echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile; echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile; echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile; source /etc/profile; exit\" \n","ssh root@$ipaddr_slave2 \"echo >> /etc/profile; echo export JAVA_HOME=$JAVA_HOME >> /etc/profile; echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile; echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile; echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile; source /etc/profile; exit\" \n"," \n",
3. 安装配置Hadoop
下载安装Hadoop:
"aria2c $HadoopUrl \n","mkdir -p $HADOOP_HOME \ntar zxvf hadoop-*.tar.gz -C $HADOOP_HOME \ncd $HADOOP_HOME \nmv hadoop-*.*/* ./ \nrmdir hadoop-*.* \n","echo >> /etc/profile \n","echo export HADOOP_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_MAPRED_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_COMMON_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native >> /etc/profile \n","echo export YARN_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_HDFS_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop >> /etc/profile \n","echo export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native >> /etc/profile \n","echo export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH >> /etc/profile \n","echo export CLASSPATH=.:$CLASSPATH:$HADOOP__HOME/lib >> /etc/profile \n","source /etc/profile \n"," \n","mkdir -p $HADOOP_HOME/tmp \n","mkdir -p $HADOOP_HOME/dfs/name \n","mkdir -p $HADOOP_HOME/dfs/data \n", ###
此处要对这4个文件进行正确配置,具体内容请参考模板Hadoop_Distributed_Env_3_ecs.json.core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml
###"ssh root@$ipaddr_slave1 \"mkdir -p $HADOOP_HOME; exit\" \n","ssh root@$ipaddr_slave2 \"mkdir -p $HADOOP_HOME; exit\" \n","scp -r $HADOOP_HOME/* root@$ipaddr_slave1:$HADOOP_HOME \n","scp -r $HADOOP_HOME/* root@$ipaddr_slave2:$HADOOP_HOME \n"," \n","cd $HADOOP_HOME/etc/hadoop \n","echo $ipaddr_slave1 > slaves \n","echo $ipaddr_slave2 >> slaves \n",
4. 启动测试集群
最后格式化HDFS,关闭防火墙,启动集群。
"hadoop namenode -format \n","systemctl stop firewalld \n","start-dfs.sh \n","start-yarn.sh \n",
急速部署Hadoop集群
一键部署Hadoop集群>>
注意:
- 必须确保可以正确下载Jdk和Hadoop tar 包,我们可以选择类似如下的URL:
- http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.tar.gz (Must set 'Cookie: oraclelicense=accept-securebackup-cookie' in Header, when download.)
- http://mirrors.hust.edu.cn/apache/hadoop/core/hadoop-2.7.1/hadoop-2.7.1.tar.gz
- 利用该模板创建时,只能选择CentOS系统;
- 为了防止Timeout 失败,可以设置为120分钟;
- 我们选择的数据中心在上海。
测试部署结果
创建完成后,查看资源栈概况:
浏览器中输入图中的的WebsiteUrl,得到如下结果,则部署成功:
ROS示例模板
Hadoop_Distributed_Env_3_ecs.json:通过该模板可以一键部署上面的集群。
Hadoop_Distributed_ecsgroup.json: 该模板允许用户指定slaves节点的数量。
阿里云一键部署 Hadoop 分布式集群相关推荐
- 【云原生】阿里云服务器部署 Docker Swarm集群
- Hadoop分布式集群的安装与部署实训总结报告
目录 前言 一.Hadoop平台框介绍 1.Hadoop的架构 2.HDFS:遵循主从架构,它具有以下元素. 2.1 名称节点 -Namenode 2.2 数据节点 - Datanode 2.3 块 ...
- Hadoop集群安装部署_分布式集群安装_02
文章目录 一.上传与 解压 1. 上传安装包 2. 解压hadoop安装包 二.修改hadoop相关配置文件 2.1. hadoop-env.sh 2.2. core-site.xml 2.3. hd ...
- CentOS下部署Hadoop高性能集群
目录: •Hadoop 概述 •实战1:部署Hadoop高性能集群 Hadoop是什么 Hadoop是Lucene创始人Doug Cutting,根据Google的相关内容山寨出来的分布式文件系统和对 ...
- docker-compose部署MinIO分布式集群
docker-compose部署MinIO分布式集群 文章目录 docker-compose部署MinIO分布式集群 概述 纠删码 部署 配置 概述 MinIO是全球领先的对象存储先锋,目前在全世界有 ...
- 搭建Hadoop分布式集群的详细教程
目录 写在前面 一.创建虚拟机,安装Centos 二.VMware VMnet8模式共享主机网络配置 三.克隆集群节点HadoopSlave1与HadoopSlave2 四.Linux系统配置 五.H ...
- 阿里云环境下搭建HadoopHA集群
阿里云环境下搭建HadoopHA集群 1. HadoopHA介绍 1.1 hadoop高可用集群的简介 hadoop是一个海量数据存储和计算的平台,能够存储PB级以上的数据,并且利用MapRedu ...
- hadoop分布式集群搭建
hadoop集群搭建前的准备(一定要读):https://blog.51cto.com/14048416/2341450 hadoop分布式集群搭建: 1. 集群规划: 2.具体步骤: (1)上传安装 ...
- 【转】Hadoop分布式集群搭建hadoop2.6+Ubuntu16.04
https://www.cnblogs.com/caiyisen/p/7373512.html 前段时间搭建Hadoop分布式集群,踩了不少坑,网上很多资料都写得不够详细,对于新手来说搭建起来会遇到很 ...
- Hadoop分布式集群搭建hadoop2.6+Ubuntu16.04
前段时间搭建Hadoop分布式集群,踩了不少坑,网上很多资料都写得不够详细,对于新手来说搭建起来会遇到很多问题.以下是自己根据搭建Hadoop分布式集群的经验希望给新手一些帮助.当然,建议先把HDFS ...
最新文章
- Matlab200以内所有质数,Matlab 中求质数表
- Ubuntu下bpf纯c程序的编写与运行
- 现代谱估计:多窗口谱
- LinuxC高级编程——线程
- mysql聚合索引跟非聚合索引的区别_聚集索引和非聚集索引的区别有哪些
- c 语言 pthread_create_哪种编程语言又快又省电?有人对比了27种语言
- UI-Day02--昨日作业代码(二)
- boost.asio无锁异步并发
- markdown写小于等于号(等于贴着角)\leqslant
- 『科学计算_理论』矩阵求导
- 机器学习面试-其他重要算法
- 应用开发之Linq和EF
- ACM-百度之星资格赛之Energy Conversion——hdu4823
- 【笔记】android 系统常用user id列表
- arcgis 实验教程 第二章 ArcCatalog 简单操作--创建自己独特的工具箱
- matlab fft 作图,Matlab绘图示例
- 7.sqli-labs-Less7
- 把数字翻译成字符串——python
- Rebranding (字典序替换 思维)
- CAN发送和接收数据(回环测试,ok)