Hadoop是一个能够让用户轻松架构和使用的开源分布式计算框架,一种可靠、高效、可伸缩的方式进行数据处理。本文主要目的是为大家提供一种非常简单的方法,在阿里云上部署Hadoop集群。
通过<阿里云ROS资源编排服务>,将VPC、NAT Gateway、ECS创建和Hadoop部署过程自动化,使大家能够非常方便地部署一个Hadoop集群。本文创建的Hadoop集群包含三个节点:master.hadoop,slave1.hadoop,slave2.hadoop。

UserData中安装Hadoop四部曲

正常安装Hadoop分为四步:配置ssh无密码登录,安装Java JDK,安装配置Hadoop tar包,和启动测试集群。

1. 配置ssh无密码登录

集群中的所有主机应该可以无密码ssh登录,因此3台ECS都应该执行如下命令,保证所有的主机中临时密钥和公钥均相同,这样3台ECS就可以无密码登录了。

    "ssh-keygen -t rsa -P '' -f '/root/.ssh/id_rsa' \n","cd /root/.ssh \n","mv id_rsa bak.id_rsa \n","mv id_rsa.pub bak.id_rsa.pub \n","echo '-----BEGIN RSA PRIVATE KEY-----' > id_rsa \n","echo 'MIIEowIBAAKCAQEAzfQ/QHwWB1njU9+Wu3RYi9g+g5rydSpAE0klefTJuZjtcaic' >> id_rsa \n","echo 'SCeBN5avih8UToZ148+Ef2YzOtoosqluZpoYLCPSpAqr8pmviBJIU3vfu9mnDG9L' >> id_rsa \n","echo 'oevT6K8w3wCBRCmqu+vc0Tpju/EeCuYK3+8w7e2I6F3+zIYzhhX3qmkocje7ACnV' >> id_rsa \n","echo 'yh2DB/2m3sogTMc0RT+5y3kJAnC6TOIlGsYjOOkPbEF3Ifn1o4ZjOFOmQlcRJer4' >> id_rsa \n","echo '1E6rTdXAxS2uDNMFBCf9Xyx7O9J+ELTCAXc4h7AE6WLdQb5Apzv4t1KswCAtRenP' >> id_rsa \n","echo '1xGcUYY8I/JUT2VvBWtQJennrk9jrPZUFDcmcwIDAQABAoIBAAwzDZQaRYvF7UtI' >> id_rsa \n","echo 'kTslVyFhe8J76SS7jfQWfxvMPi66OkZjQG6duG+8g0VhNei42j7WSfjp6trvlT2P' >> id_rsa \n","echo '/7QgKJJkxNNmtmy2Ycljm9kmG0ibSePYq9g5ieHcjr6G3yFUfoKHJBtYpBO74pWu' >> id_rsa \n","echo 'rrI5DuLpERUCjFc9E8w7fOIhPH4XZ6wk/EmPxHTgxZk+aMvqptyPSbUyiUOvCiZf' >> id_rsa \n","echo 'MD0ircs9vgtslMVDlz9m6CoiNz6B3Yf6eLRoGGMiGnsQzZHIfnHCMX8i65Jc+TvQ' >> id_rsa \n","echo 'fLopIHzwBwI55xpcOIBRgYiEAQJhLsSNSFugoxMcwe9RalTGGS21HOQu4b3ZylKM' >> id_rsa \n","echo '8ofEVKECgYEA/Y4MzN04wAsM1yNuHN+9sdiVLG28LWH3dgpcXqa9gyNsWs9Gf8uf' >> id_rsa \n","echo 'qbuvQGeKDRXCW93wO1jO+pCYVrkyY3l+KhBKIqmkJT5gFsa8dBUvEBLALiHNg3+o' >> id_rsa \n","echo 'jR2Vqsemk8kMZA8zfJ3FKcMb1pt4S2GqepqsdC3DgzIIsxufCh7jSzsCgYEAz/Cv' >> id_rsa \n","echo 'Z7gAdSFC8q6QxFxSyhfZwGA6QW6ZU7rZv9ZdGySvZg+vHbpNVmQ0BAQzJui3Qs9u' >> id_rsa \n","echo 'XQMpYafXWzKsPzG6ZWvYXTF7fuxlovvG8Qd2A2QnGtGMB9YtQMHVqbsUDwxMjiTn' >> id_rsa \n","echo 'VBZILkDf+WCwQ98P5UMoumI0goIcNZ+AXhcqrikCgYBkSwLvKfYfqH9Mvfv5Odsr' >> id_rsa \n","echo '9NKUv1c20FB1BYYh/mxp6eIbTW/CbwXZup6IqCvoHxpBAlna77b3T6iibSDsTgtE' >> id_rsa \n","echo 'kirw6Q8/mBukBrpWZGa4QeJ4nPBQuncuUmx4H/7Y6CaZkZW5DiMF8OIbEmYT0y7+' >> id_rsa \n","echo 'zh222r9CLtFYH23aL/uSLwKBgQCS1xyG2eE41aw5RBznDWtJW15iA5If8sJD5ocu' >> id_rsa \n","echo 'eWp2aImUQS8ghxdmEozI6U5WA7CmdWUyObFXTPc/Z6FLXwqJ5IZ+CRt0neuIFNSA' >> id_rsa \n","echo 'EQy9iFQ1FBUW06BRQpBns7yOg9jr6BOTxchjIV0I9caDp1nKRIrWU9NQ9iCFnYVA' >> id_rsa \n","echo '7IsvQQKBgFxgF7UOhwaiMb/ATSuhm2v9kVvRPEO9umdo7YJ9I4L09lYbAtpcnusQ' >> id_rsa \n","echo 'fIROYL25VeEMgYcQyInc3Fm/sgJdbXQnUy+3QbbCcBCmcLj27LPQyxuu7p9hbPPT' >> id_rsa \n","echo 'Mxx37OmYvOkSVTQz0T9HfDGvOJgt4t4cXD4T/7ewk62p6jdpSQrt' >> id_rsa \n","echo '-----END RSA PRIVATE KEY-----' >> id_rsa \n","echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDN9D9AfBYHWeNT35a7dFiL2D6DmvJ1KkATSSV59Mm5mO1xqJxIJ4E3lq+KHxROhnXjz4R/ZjM62iiyqW5mmhgsI9KkCqvyma+IEkhTe9+72acMb0uh69PorzDfAIFEKaq769zROmO78R4K5grf7zDt7YjoXf7MhjOGFfeqaShyN7sAKdXKHYMH/abeyiBMxzRFP7nLeQkCcLpM4iUaxiM46Q9sQXch+fWjhmM4U6ZCVxEl6vjUTqtN1cDFLa4M0wUEJ/1fLHs70n4QtMIBdziHsATpYt1BvkCnO/i3UqzAIC1F6c/XEZxRhjwj8lRPZW8Fa1Al6eeuT2Os9lQUNyZz root@iZ2zee53wf4ndvajz30cdvZ' > id_rsa.pub \n","cp id_rsa.pub authorized_keys \n","chmod 600 authorized_keys \n","chmod 600 id_rsa \n","sed -i 's/#   StrictHostKeyChecking ask/StrictHostKeyChecking no/' /etc/ssh/ssh_config \n","sed -i 's/GSSAPIAuthentication yes/GSSAPIAuthentication no/' /etc/ssh/ssh_config \n","service sshd restart \n",

为了保证安全,防止对外泄露密钥和公钥。我们要在master上执行下述命令,替换掉公开的临时密钥与公钥:

    "scp -r /root/.ssh/bak.*  root@$ipaddr_slave1:/root/.ssh/  \n","scp -r /root/.ssh/bak.*  root@$ipaddr_slave2:/root/.ssh/  \n","ssh root@$ipaddr_slave1 \"cd /root/.ssh; rm -f id_rsa; mv bak.id_rsa id_rsa; rm -f id_rsa.pub; mv bak.id_rsa.pub id_rsa.pub; rm -f authorized_keys; cp id_rsa.pub authorized_keys; chmod 600 authorized_keys; chmod 600 id_rsa; exit\" \n","ssh root@$ipaddr_slave2 \"cd /root/.ssh; rm id_rsa; mv bak.id_rsa id_rsa; rm id_rsa.pub; mv bak.id_rsa.pub id_rsa.pub; rm authorized_keys; cp id_rsa.pub authorized_keys; chmod 600 authorized_keys; chmod 600 id_rsa; exit\" \n","cd /root/.ssh \n rm id_rsa \n mv bak.id_rsa id_rsa \n rm id_rsa.pub \n mv bak.id_rsa.pub id_rsa.pub \n rm authorized_keys \n cp id_rsa.pub authorized_keys \n chmod 600 authorized_keys \n chmod 600 id_rsa \n cd \n",

2. 安装配置JDK

master 上安装JDK,并远程控制在slaves上安装JDK。

    "yum -y install aria2 \n","aria2c --header='Cookie: oraclelicense=accept-securebackup-cookie' $JdkUrl \n","mkdir -p $JAVA_HOME \ntar zxvf jdk*-x64.tar.gz -C $JAVA_HOME \ncd $JAVA_HOME \nmv jdk*.*/* ./ \nrmdir jdk*.* \n","echo  >> /etc/profile \n","echo export JAVA_HOME=$JAVA_HOME >> /etc/profile \n","echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile \n","echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile \n","echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile \n","source /etc/profile \n","ssh root@$ipaddr_slave1 \"mkdir -p $JAVA_HOME; exit\" \n","ssh root@$ipaddr_slave2 \"mkdir -p $JAVA_HOME; exit\" \n","scp -r $JAVA_HOME/*  root@$ipaddr_slave1:$JAVA_HOME \n","scp -r $JAVA_HOME/*  root@$ipaddr_slave2:$JAVA_HOME \n","ssh root@$ipaddr_slave1 \"echo >> /etc/profile; echo export JAVA_HOME=$JAVA_HOME >> /etc/profile; echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile; echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile; echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile; source /etc/profile; exit\" \n","ssh root@$ipaddr_slave2 \"echo >> /etc/profile; echo export JAVA_HOME=$JAVA_HOME >> /etc/profile; echo export JRE_HOME=$JAVA_HOME/jre >> /etc/profile; echo export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH >> /etc/profile; echo export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH >> /etc/profile; source /etc/profile; exit\" \n"," \n",

3. 安装配置Hadoop

下载安装Hadoop:

    "aria2c $HadoopUrl \n","mkdir -p $HADOOP_HOME \ntar zxvf hadoop-*.tar.gz -C $HADOOP_HOME \ncd $HADOOP_HOME \nmv hadoop-*.*/* ./ \nrmdir hadoop-*.* \n","echo  >> /etc/profile \n","echo export HADOOP_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_MAPRED_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_COMMON_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native >> /etc/profile \n","echo export YARN_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_HDFS_HOME=$HADOOP_HOME >> /etc/profile \n","echo export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop >> /etc/profile \n","echo export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native >> /etc/profile \n","echo export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH >> /etc/profile \n","echo export CLASSPATH=.:$CLASSPATH:$HADOOP__HOME/lib >> /etc/profile \n","source /etc/profile \n"," \n","mkdir -p $HADOOP_HOME/tmp \n","mkdir -p $HADOOP_HOME/dfs/name \n","mkdir -p $HADOOP_HOME/dfs/data \n",
###
    此处要对这4个文件进行正确配置,具体内容请参考模板Hadoop_Distributed_Env_3_ecs.json.core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml
###"ssh root@$ipaddr_slave1 \"mkdir -p $HADOOP_HOME; exit\" \n","ssh root@$ipaddr_slave2 \"mkdir -p $HADOOP_HOME; exit\" \n","scp -r $HADOOP_HOME/*  root@$ipaddr_slave1:$HADOOP_HOME \n","scp -r $HADOOP_HOME/*  root@$ipaddr_slave2:$HADOOP_HOME \n"," \n","cd $HADOOP_HOME/etc/hadoop \n","echo $ipaddr_slave1 > slaves \n","echo $ipaddr_slave2 >> slaves \n",

4. 启动测试集群

最后格式化HDFS,关闭防火墙,启动集群。

    "hadoop namenode -format \n","systemctl stop  firewalld \n","start-dfs.sh \n","start-yarn.sh \n",

急速部署Hadoop集群

一键部署Hadoop集群>>

注意:

  • 必须确保可以正确下载Jdk和Hadoop tar 包,我们可以选择类似如下的URL:

    • http://download.oracle.com/otn-pub/java/jdk/8u121-b13/e9e7ea248e2c4826b92b3f075a80e441/jdk-8u121-linux-x64.tar.gz (Must set 'Cookie: oraclelicense=accept-securebackup-cookie' in Header, when download.)
    • http://mirrors.hust.edu.cn/apache/hadoop/core/hadoop-2.7.1/hadoop-2.7.1.tar.gz
  • 利用该模板创建时,只能选择CentOS系统;
  • 为了防止Timeout 失败,可以设置为120分钟;
  • 我们选择的数据中心在上海。

测试部署结果

创建完成后,查看资源栈概况:

浏览器中输入图中的的WebsiteUrl,得到如下结果,则部署成功:

ROS示例模板

Hadoop_Distributed_Env_3_ecs.json:通过该模板可以一键部署上面的集群。
Hadoop_Distributed_ecsgroup.json:  该模板允许用户指定slaves节点的数量。

阿里云一键部署 Hadoop 分布式集群相关推荐

  1. 【云原生】阿里云服务器部署 Docker Swarm集群

  2. Hadoop分布式集群的安装与部署实训总结报告

    目录 前言 一.Hadoop平台框介绍 1.Hadoop的架构 2.HDFS:遵循主从架构,它具有以下元素. 2.1 名称节点 -Namenode 2.2 数据节点 - Datanode 2.3 块 ...

  3. Hadoop集群安装部署_分布式集群安装_02

    文章目录 一.上传与 解压 1. 上传安装包 2. 解压hadoop安装包 二.修改hadoop相关配置文件 2.1. hadoop-env.sh 2.2. core-site.xml 2.3. hd ...

  4. CentOS下部署Hadoop高性能集群

    目录: •Hadoop 概述 •实战1:部署Hadoop高性能集群 Hadoop是什么 Hadoop是Lucene创始人Doug Cutting,根据Google的相关内容山寨出来的分布式文件系统和对 ...

  5. docker-compose部署MinIO分布式集群

    docker-compose部署MinIO分布式集群 文章目录 docker-compose部署MinIO分布式集群 概述 纠删码 部署 配置 概述 MinIO是全球领先的对象存储先锋,目前在全世界有 ...

  6. 搭建Hadoop分布式集群的详细教程

    目录 写在前面 一.创建虚拟机,安装Centos 二.VMware VMnet8模式共享主机网络配置 三.克隆集群节点HadoopSlave1与HadoopSlave2 四.Linux系统配置 五.H ...

  7. 阿里云环境下搭建HadoopHA集群

    阿里云环境下搭建HadoopHA集群 1. HadoopHA介绍 1.1 hadoop高可用集群的简介 ​ hadoop是一个海量数据存储和计算的平台,能够存储PB级以上的数据,并且利用MapRedu ...

  8. hadoop分布式集群搭建

    hadoop集群搭建前的准备(一定要读):https://blog.51cto.com/14048416/2341450 hadoop分布式集群搭建: 1. 集群规划: 2.具体步骤: (1)上传安装 ...

  9. 【转】Hadoop分布式集群搭建hadoop2.6+Ubuntu16.04

    https://www.cnblogs.com/caiyisen/p/7373512.html 前段时间搭建Hadoop分布式集群,踩了不少坑,网上很多资料都写得不够详细,对于新手来说搭建起来会遇到很 ...

  10. Hadoop分布式集群搭建hadoop2.6+Ubuntu16.04

    前段时间搭建Hadoop分布式集群,踩了不少坑,网上很多资料都写得不够详细,对于新手来说搭建起来会遇到很多问题.以下是自己根据搭建Hadoop分布式集群的经验希望给新手一些帮助.当然,建议先把HDFS ...

最新文章

  1. Matlab200以内所有质数,Matlab 中求质数表
  2. Ubuntu下bpf纯c程序的编写与运行
  3. 现代谱估计:多窗口谱
  4. LinuxC高级编程——线程
  5. mysql聚合索引跟非聚合索引的区别_聚集索引和非聚集索引的区别有哪些
  6. c 语言 pthread_create_哪种编程语言又快又省电?有人对比了27种语言
  7. UI-Day02--昨日作业代码(二)
  8. boost.asio无锁异步并发
  9. markdown写小于等于号(等于贴着角)\leqslant
  10. 『科学计算_理论』矩阵求导
  11. 机器学习面试-其他重要算法
  12. 应用开发之Linq和EF
  13. ACM-百度之星资格赛之Energy Conversion——hdu4823
  14. 【笔记】android 系统常用user id列表
  15. arcgis 实验教程 第二章 ArcCatalog 简单操作--创建自己独特的工具箱
  16. matlab fft 作图,Matlab绘图示例
  17. 7.sqli-labs-Less7
  18. 把数字翻译成字符串——python
  19. Rebranding (字典序替换 思维)
  20. CAN发送和接收数据(回环测试,ok)

热门文章

  1. mysql支付账单怎么设计_订单与支付设计
  2. ANE实现总结(一)
  3. C语言 AES算法 加密解密
  4. java根据业务排序利用Comparator.comparing自定义排序规则
  5. 876. 链表的中间结点【我亦无他唯手熟尔】
  6. 开源工作流引擎 Workflow Core 的研究和使用教程
  7. TKinter —— GUI in python  4. Handing User Event 小组件 赋功能 (概念 必看!)
  8. WAP中推送技术的分析与设计(转)
  9. xdoj-81-字符串查找
  10. nfc门禁卡的复制和迁移