Hadoop 2.x Introductory Programming Example: MaxTemperature


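  • Hadoop 2.x Introductory Programming Example: MaxTemperature

    • I. Preparation
    • II. Writing the code
      • 1. Create the Map class
      • 2. Create the Reduce class
      • 3. Create the main method
      • 4. Export as MaxTemp.jar and upload it to the server that runs the program
    • III. Running the program
      • 1. Copy sample.txt to HDFS
      • 2. Run the program
      • 3. View the results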

Note: the following applies to both the 2.x and 1.x versions and has been tested on 2.4.1 and 1.2.0.

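I. Preparation

1. Set up a pseudo-distributed Hadoop environment; see the official documentation, or http://blog.csdn.net/jediael_lu/article/details/38637277

2. Prepare a data file named sample.txt with the following contents: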

123456798676231190101234567986762311901012345679867623119010123456798676231190101234561+00121534567890356
123456798676231190101234567986762311901012345679867623119010123456798676231190101234562+01122934567890456
123456798676231190201234567986762311901012345679867623119010123456798676231190101234562+02120234567893456
123456798676231190401234567986762311901012345679867623119010123456798676231190101234561+00321234567803456
123456798676231190101234567986762311902012345679867623119010123456798676231190101234561+00429234567903456
123456798676231190501234567986762311902012345679867623119010123456798676231190101234561+01021134568903456
123456798676231190201234567986762311902012345679867623119010123456798676231190101234561+01124234578903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+04121234678903456
123456798676231190301234567986762311905012345679867623119010123456798676231190101234561+00821235678903456
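The mapper in the next section reads three fixed-width fields from each record: the year at character offsets 15 to 18, the signed air temperature starting at offset 87, and a one-character quality code at offset 92. The following is a minimal plain-Java sketch (no Hadoop dependency) that applies those offsets to the first line of sample.txt. The class and method names `SampleRecordFields`, `year`, `temperature`, and `quality` are hypothetical and exist only for this illustration.

```java
// Illustrative helper only; it is not part of the MapReduce job.
class SampleRecordFields {

    // Year: characters at offsets 15..18 (substring's end index is exclusive).
    static String year(String line) {
        return line.substring(15, 19);
    }

    // Temperature: sign at offset 87, four digits at offsets 88..91.
    static int temperature(String line) {
        // Skip an explicit '+' sign, as the mapper does.
        int start = (line.charAt(87) == '+') ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    // Quality code: the single character at offset 92.
    static String quality(String line) {
        return line.substring(92, 93);
    }

    public static void main(String[] args) {
        // First line of sample.txt, split only to keep the source readable.
        String line = "1234567986762311901012345679867623119010"
                + "1234567986762311901012345679867623119010"
                + "1234561+00121534567890356";
        System.out.println(year(line) + " " + temperature(line) + " " + quality(line));
    }
}
```

For the first sample line this prints `1901 12 1`: year 1901, temperature 12, quality code 1.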

II. Writing the code

1. Create the Map class

package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends
        Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
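The mapper emits a record only when the temperature is not the MISSING sentinel (9999) and the quality code is one of 0, 1, 4, 5, or 9. This is why the run log further down reports 9 map input records but only 8 map output records: the second line of sample.txt carries quality code 2 and is dropped. The sketch below isolates that condition in plain Java; the class name `ReadingFilter` and the method `isValid` are hypothetical names used only here.

```java
// Illustrative only: the same validity test the mapper applies before writing.
class ReadingFilter {

    static final int MISSING = 9999;

    // True when the reading is present and its quality code is in [01459].
    static boolean isValid(int airTemperature, String quality) {
        return airTemperature != MISSING && quality.matches("[01459]");
    }

    public static void main(String[] args) {
        System.out.println(isValid(12, "1"));   // first sample line: kept
        System.out.println(isValid(112, "2"));  // second sample line: dropped, quality 2
        System.out.println(isValid(9999, "1")); // a missing reading would be dropped
    }
}
```

Running it prints `true`, `false`, `false`.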

2. Create the Reduce class

package org.jediael.hadoopDemo.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
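To see what the reducer computes without a cluster, here is a rough plain-Java sketch that groups (year, temperature) pairs by year and keeps the maximum, using the same Math.max comparison. It is not Hadoop code: the class name `LocalMaxTemperature`, the method `maxByYear`, and the constant `SAMPLE_PAIRS` are assumptions for illustration. The pairs are the eight records the mapper would emit for sample.txt.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a local stand-in for the shuffle plus MaxTemperatureReducer.
class LocalMaxTemperature {

    // (year, temperature) pairs the mapper would emit for sample.txt,
    // in file order; the second input line is absent because it is filtered out.
    static final String[][] SAMPLE_PAIRS = {
        {"1901", "12"}, {"1902", "212"}, {"1904", "32"}, {"1901", "42"},
        {"1905", "102"}, {"1902", "112"}, {"1903", "412"}, {"1903", "82"}
    };

    // Groups pairs by year and keeps the largest temperature for each year.
    static Map<String, Integer> maxByYear(String[][] pairs) {
        Map<String, Integer> result = new TreeMap<>(); // keys kept in sorted order
        for (String[] pair : pairs) {
            result.merge(pair[0], Integer.parseInt(pair[1]), Math::max);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : maxByYear(SAMPLE_PAIRS).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The printed pairs (1901 42, 1902 212, 1903 412, 1904 32, 1905 102) match the job output shown in the results section.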

3. Create the main method

package org.jediael.hadoopDemo.maxtemperature;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

4. Export the project as MaxTemp.jar and upload it to the server that will run the program.

III. Running the program

1. Copy sample.txt to HDFS (this example places it in the root directory)

hadoop fs -put sample.txt /

2. Run the program

export HADOOP_CLASSPATH=MaxTemp.jar
hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10

Note that the output directory must not already exist; otherwise the job will fail.

3. View the results

(1) Job output

[jediael@jediael44 code]$  hadoop fs -cat output10/*
14/07/09 14:51:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1901    42
1902    212
1903    412
1904    32
1905    102

(2) Console output during the run

[jediael@jediael44 code]$  hadoop org.jediael.hadoopDemo.maxtemperature.MaxTemperature /sample.txt output10
14/07/09 14:50:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/09 14:50:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/07/09 14:50:42 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
14/07/09 14:50:43 INFO input.FileInputFormat: Total input paths to process : 1
14/07/09 14:50:43 INFO mapreduce.JobSubmitter: number of splits:1
14/07/09 14:50:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1404888618764_0001
14/07/09 14:50:44 INFO impl.YarnClientImpl: Submitted application application_1404888618764_0001
14/07/09 14:50:44 INFO mapreduce.Job: The url to track the job: http://jediael44:8088/proxy/application_1404888618764_0001/
14/07/09 14:50:44 INFO mapreduce.Job: Running job: job_1404888618764_0001
14/07/09 14:50:57 INFO mapreduce.Job: Job job_1404888618764_0001 running in uber mode : false
14/07/09 14:50:57 INFO mapreduce.Job:  map 0% reduce 0%
14/07/09 14:51:05 INFO mapreduce.Job:  map 100% reduce 0%
14/07/09 14:51:15 INFO mapreduce.Job:  map 100% reduce 100%
14/07/09 14:51:15 INFO mapreduce.Job: Job job_1404888618764_0001 completed successfully
14/07/09 14:51:16 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=94
        FILE: Number of bytes written=185387
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1051
        HDFS: Number of bytes written=43
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5812
        Total time spent by all reduces in occupied slots (ms)=7023
        Total time spent by all map tasks (ms)=5812
        Total time spent by all reduce tasks (ms)=7023
        Total vcore-seconds taken by all map tasks=5812
        Total vcore-seconds taken by all reduce tasks=7023
        Total megabyte-seconds taken by all map tasks=5951488
        Total megabyte-seconds taken by all reduce tasks=7191552
    Map-Reduce Framework
        Map input records=9
        Map output records=8
        Map output bytes=72
        Map output materialized bytes=94
        Input split bytes=97
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=94
        Reduce input records=8
        Reduce output records=5
        Spilled Records=16
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=154
        CPU time spent (ms)=1450
        Physical memory (bytes) snapshot=303112192
        Virtual memory (bytes) snapshot=1685733376
        Total committed heap usage (bytes)=136515584
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=954
    File Output Format Counters
        Bytes Written=43
