HDFS NameNode web UI: http://localhost:50070/
YARN ResourceManager web UI: http://localhost:8088/cluster

1 Troubleshooting

➜  /Users/zhaoshuai11/work/hadoop-2.7.3 hadoop fs -ls /
22/07/30 09:43:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

The warning is harmless. To silence it, add this line to Hadoop's log4j.properties:

log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR

22/07/30 09:58:59 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Why does running an MR job contact YARN first?

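2 HDFS: a distributed file system

Large data volumes: scale out across many machines; capacity is unlimited in theory.
Metadata records: metadata maps each file to its storage locations, so a file can be located quickly.
Block storage: a file is split into blocks stored on different machines, and blocks are processed in parallel for efficiency.
Replication: copies are kept on different machines; the redundancy keeps data safe.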
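
The block and replica ideas above can be sketched in a few lines of Python. This is only an illustration: the 128 MB block size and replication factor 3 are assumed defaults, the DataNode names are made up, and the round-robin placement is a toy stand-in for HDFS's real placement policy.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumption: default block size
REPLICATION = 3                 # assumption: default replication factor

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the byte length of every block a file of file_size bytes needs."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])

def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Toy placement: block i is copied to `replication` distinct nodes, round-robin."""
    replication = min(replication, len(datanodes))
    return {
        i: [datanodes[(i + r) % len(datanodes)] for r in range(replication)]
        for i in range(num_blocks)
    }

# A 300 MB file on four hypothetical DataNodes.
blocks = split_into_blocks(300 * 1024 * 1024)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
for block_id, nodes in placement.items():
    print(block_id, blocks[block_id], nodes)
```

The mapping from block to nodes is the kind of information the metadata records hold, which is why a file can be located without scanning the machines.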







➜  /Users/zhaoshuai11 hadoop fs -put java_error_in_idea_14849.log /itcast
➜  /Users/zhaoshuai11 hadoop fs -ls -h /itcast
Found 4 items
-rw-r--r--   1 zhaoshuai11 supergroup         21 2022-07-30 09:55 /itcast/hello.txt
-rw-r--r--   1 zhaoshuai11 supergroup    641.9 K 2022-07-30 09:48 /itcast/java_error_in_idea_11861.log
-rw-r--r--   1 zhaoshuai11 supergroup     98.9 K 2022-07-30 10:34 /itcast/java_error_in_idea_14849.log
drwxr-xr-x   - zhaoshuai11 supergroup          0 2022-07-30 09:59 /itcast/wordcount
➜  /Users/zhaoshuai11 hadoop fs -cat /itcast/hello.txt
hadoop hadoop hadoop
➜  /Users/zhaoshuai11 hadoop fs -get /itcast/wordcount/output/part-r-00000 ./
➜  /Users/zhaoshuai11 ll
-rw-r--r--   1 zhaoshuai11  staff     9B  7 30 10:41 part-r-00000
➜  /Users/zhaoshuai11 cat part-r-00000
hadoop  3

Merging small files:

➜  /Users/zhaoshuai11 hadoop fs -appendToFile login.sh login-ext.sh /itcast/hello.txt
➜  /Users/zhaoshuai11 hadoop fs -cat /itcast/hello.txt
hadoop hadoop hadoop
#!/usr/bin/expect -f
set host "relay.baidu-int.com"
set username "zhaoshuai11"
set password "zs19961211."
spawn ssh $username@$host
expect "*Please input user's password*" {send "$password\r"}
interact#!/bin/sh
basepath=$(cd `dirname $0`; pwd)
export LC_CTYPE=en_US
#expect脚本所在位置
filepath=$1
if [ -f $filepath ]; then
expect $filepath
else
echo "$filepath not exits"
fi%
➜  /Users/zhaoshuai11
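
appendToFile appends the source files to the target byte for byte and adds no separators. Because login.sh has no trailing newline, its last word is fused with the first line of login-ext.sh, which is why `interact#!/bin/sh` appears as a single token in these notes. A minimal Python sketch of that behaviour, using shortened, hypothetical stand-ins for the three files:

```python
def append_to_file(existing: bytes, *sources: bytes) -> bytes:
    """Mimic appendToFile: append each source to the target, byte for byte."""
    return existing + b"".join(sources)

# Shortened, hypothetical stand-ins for hello.txt, login.sh and login-ext.sh.
hello = b"hadoop hadoop hadoop\n"                    # 21 bytes, as in the listing
login = b"spawn ssh $username@$host\ninteract"      # note: no trailing newline
login_ext = b"#!/bin/sh\nexport LC_CTYPE=en_US\n"

merged = append_to_file(hello, login, login_ext)
tokens = merged.decode().split()
print(tokens)
```

Ending every small file with a newline before merging avoids the fused token.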

3 HDFS roles explained

4 Uploading a file to HDFS

5 MR

6 MR characteristics

7 MR limitations

8 MR instance processes

9 MR official example: estimating pi



➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop jar hadoop-mapreduce-examples-2.7.3.jar pi 2 2
Number of Maps  = 2
Samples per Map = 2
Wrote input for Map #0
Wrote input for Map #1
Starting Job
22/07/30 15:38:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/07/30 15:38:57 INFO input.FileInputFormat: Total input paths to process : 2
22/07/30 15:38:58 INFO mapreduce.JobSubmitter: number of splits:2
22/07/30 15:38:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1659144775026_0003
22/07/30 15:38:59 INFO impl.YarnClientImpl: Submitted application application_1659144775026_0003
22/07/30 15:38:59 INFO mapreduce.Job: The url to track the job: http://127.0.0.1:8088/proxy/application_1659144775026_0003/
22/07/30 15:38:59 INFO mapreduce.Job: Running job: job_1659144775026_0003
22/07/30 15:39:09 INFO mapreduce.Job: Job job_1659144775026_0003 running in uber mode : false
22/07/30 15:39:09 INFO mapreduce.Job:  map 0% reduce 0%
22/07/30 15:39:15 INFO mapreduce.Job:  map 100% reduce 0%
22/07/30 15:39:24 INFO mapreduce.Job:  map 100% reduce 100%
22/07/30 15:39:25 INFO mapreduce.Job: Job job_1659144775026_0003 completed successfully
22/07/30 15:39:25 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=50
        FILE: Number of bytes written=357444
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=540
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=11
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=8543
        Total time spent by all reduces in occupied slots (ms)=4822
        Total time spent by all map tasks (ms)=8543
        Total time spent by all reduce tasks (ms)=4822
        Total vcore-milliseconds taken by all map tasks=8543
        Total vcore-milliseconds taken by all reduce tasks=4822
        Total megabyte-milliseconds taken by all map tasks=8748032
        Total megabyte-milliseconds taken by all reduce tasks=4937728
    Map-Reduce Framework
        Map input records=2
        Map output records=4
        Map output bytes=36
        Map output materialized bytes=56
        Input split bytes=304
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=56
        Reduce input records=4
        Reduce output records=0
        Spilled Records=8
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=201
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=534773760
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=236
    File Output Format Counters
        Bytes Written=97
Job Finished in 28.108 seconds
Estimated value of Pi is 4.00000000000000000000
➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce
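
An estimate of exactly 4 looks wrong, but it follows from the tiny sample size: `pi 2 2` means 2 maps × 2 samples = 4 points. The sketch below reproduces the idea in Python, assuming the bundled example draws its points from a Halton sequence (bases 2 and 3) in the unit square and counts how many fall inside the inscribed circle; the function names are made up for this note. With 4 points all of them land inside, so the estimate is 4 × 4/4 = 4; more samples bring the value close to 3.14.

```python
def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence for `base`."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result

def estimate_pi(num_samples):
    """Estimate pi as 4 * (points inside the circle) / (all points)."""
    inside = 0
    for i in range(1, num_samples + 1):
        # Shift the unit square so the circle of radius 0.5 is centred at 0.
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    return 4.0 * inside / num_samples

print(estimate_pi(2 * 2))        # same sample count as `pi 2 2`
print(estimate_pi(10 * 1000))    # same sample count as `pi 10 1000`
```

In the real job each map task counts its own slice of the points and the single reduce task adds the counts up, which matches the 2 map tasks and 1 reduce task in the counters.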

22/07/30 15:38:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Here the client is requesting resources from YARN.

10 MR wordcount: counting words


➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /itcast/hello.txt /itcast/hello-count
22/07/30 15:49:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/07/30 15:49:26 INFO input.FileInputFormat: Total input paths to process : 1
22/07/30 15:49:27 INFO mapreduce.JobSubmitter: number of splits:1
22/07/30 15:49:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1659144775026_0004
22/07/30 15:49:27 INFO impl.YarnClientImpl: Submitted application application_1659144775026_0004
22/07/30 15:49:27 INFO mapreduce.Job: The url to track the job: http://127.0.0.1:8088/proxy/application_1659144775026_0004/
22/07/30 15:49:27 INFO mapreduce.Job: Running job: job_1659144775026_0004
22/07/30 15:49:37 INFO mapreduce.Job: Job job_1659144775026_0004 running in uber mode : false
22/07/30 15:49:37 INFO mapreduce.Job:  map 0% reduce 0%
22/07/30 15:49:43 INFO mapreduce.Job:  map 100% reduce 0%
22/07/30 15:49:49 INFO mapreduce.Job:  map 100% reduce 100%
22/07/30 15:49:50 INFO mapreduce.Job: Job job_1659144775026_0004 completed successfully
22/07/30 15:49:50 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=607
        FILE: Number of bytes written=238801
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=510
        HDFS: Number of bytes written=441
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3442
        Total time spent by all reduces in occupied slots (ms)=3708
        Total time spent by all map tasks (ms)=3442
        Total time spent by all reduce tasks (ms)=3708
        Total vcore-milliseconds taken by all map tasks=3442
        Total vcore-milliseconds taken by all reduce tasks=3708
        Total megabyte-milliseconds taken by all map tasks=3524608
        Total megabyte-milliseconds taken by all reduce tasks=3796992
    Map-Reduce Framework
        Map input records=18
        Map output records=47
        Map output bytes=591
        Map output materialized bytes=607
        Input split bytes=103
        Combine input records=47
        Combine output records=40
        Reduce input groups=40
        Reduce shuffle bytes=607
        Reduce input records=40
        Reduce output records=40
        Spilled Records=80
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=122
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=332398592
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=407
    File Output Format Counters
        Bytes Written=441
➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce

➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop fs -get /itcast/hello-count ./

➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce/hello-count cat part-r-00000
"$filepath 1
"$password\r"}    1
"*Please   1
"relay.baidu-int.com" 1
"zhaoshuai11" 1
"zs19961211." 1
#!/usr/bin/expect   1
#expect脚本所在位置   1
$0`;   1
$filepath   2
$username@$host    1
-f  2
LC_CTYPE=en_US 1
[   1
];  1
`dirname   1
basepath=$(cd  1
echo    1
else    1
exits" 1
expect  2
export  1
fi  1
filepath=$1    1
hadoop  3
host    1
if  1
input   1
interact#!/bin/sh   1
not 1
password    1
password*" 1
pwd)    1
set 3
spawn   1
ssh 1
then    1
user's 1
username    1
{send   1
➜  /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce/hello-count
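
What the wordcount job does can be sketched in plain Python: map emits a (word, 1) pair per token, the pairs are grouped and sorted by key, and reduce sums each group. This is a single-process illustration, not the Hadoop code; the function names are made up, and splitting on whitespace is assumed to match the example's tokenizer (it explains tokens such as `"$password\r"}` in the output above).

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group the values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return [reduce_phase(key, values) for key, values in shuffle_phase(pairs)]

for word, count in word_count(["hadoop hadoop hadoop"]):
    print(f"{word}\t{count}")
```

Sorting the keys as plain strings also reproduces the ordering seen in part-r-00000, where tokens starting with punctuation come before the lowercase words.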

11 Map



12 Reduce


Reduce tasks actively pull (fetch) the output of the map tasks.
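
A rough Python sketch of this pull model: every map task splits its output into one partition per reducer, and every reduce task fetches its own partition from all map tasks before merging. The function names are hypothetical, and CRC32 is only a deterministic stand-in for the hash Hadoop applies to the key.

```python
import zlib
from collections import defaultdict

def partition(key, num_reducers):
    """Pick the reducer for a key (CRC32 stands in for Hadoop's key hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def run_map_task(records, num_reducers):
    """A map task writes its output split into one partition per reducer."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition(key, num_reducers)].append((key, value))
    return partitions

def run_reduce_task(reducer_id, map_outputs):
    """A reduce task pulls its partition from every map task, then merges."""
    fetched = []
    for output in map_outputs:
        fetched.extend(output.get(reducer_id, []))
    totals = defaultdict(int)
    for key, value in sorted(fetched):
        totals[key] += value
    return dict(totals)

map_outputs = [
    run_map_task([("hadoop", 1), ("yarn", 1)], num_reducers=2),
    run_map_task([("hadoop", 1), ("hdfs", 1)], num_reducers=2),
]
results = [run_reduce_task(r, map_outputs) for r in range(2)]
print(results)
```

Because the partition is a function of the key alone, all values for one word end up at the same reducer no matter which map task produced them.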

13 shuffle



14 YARN

15 Interaction flow when a program is submitted to a YARN cluster

16 The resource scheduler and scheduling policies







