Test data:

[hadoop@h201 mapreduce]$ more counttext.txt
hello mama
hello baba
hello word
cai wen wei
mama baba jiejie gege
gege jiejie didi
meimei jiejie
didi mama
ayi shushu
ayi mama
hello mama
hello baba
hello word
cai wen wei
mama baba jiejie gege
gege jiejie didi
meimei jiejie
didi mama
ayi shushu
ayi mama
hello mama
hello baba
hello word
cai wen wei
mama baba jiejie gege
gege jiejie didi
meimei jiejie
didi mama
ayi shushu
ayi mama
hello mama
hello baba
hello word
cai wen wei
mama baba jiejie gege
gege jiejie didi
meimei jiejie
didi mama
ayi shushu
ayi mama
hello mama
hello baba
hello word
cai wen wei
mama baba jiejie gege
gege jiejie didi
meimei jiejie
didi mama
ayi shushu
ayi mama
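
Before the job can run, this file has to exist in HDFS at the path the job will read. Assuming the working directory shown above (the exact target path is inferred from the input URI used later, not shown in the original), an upload would look like:

hadoop fs -put counttext.txt /user/hadoop/counttext.txt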

vim WordCount2.java

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount2 {
    private static final String INPUT_PATH = "hdfs://h201:9000/user/hadoop/counttext.txt";
    private static final String OUTPUT_PATH = "hdfs://h201:9000/user/hadoop/output";

    public static class WordCount2Mapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Split each input line on spaces and emit (word, 1) for every token.
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String[] words = value.toString().split(" ");
            for (String str : words) {
                word.set(str);
                context.write(word, one);
            }
        }
    }

    public static class WordCount2Reducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Sum the counts for each word. Summing val.get() (rather than merely
        // counting the values) keeps the reducer correct even if it is later
        // reused as a combiner, where incoming values may be partial sums.
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable val : values) {
                total += val.get();
            }
            context.write(key, new IntWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.jar", "wc1.jar");    // deprecated key; mapreduce.job.jar is the replacement
        Job job = new Job(conf, "wordcount"); // deprecated constructor; see the compiler note below
        job.setJarByClass(WordCount2.class);
        job.setMapperClass(WordCount2Mapper.class);
        job.setReducerClass(WordCount2Reducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Hard-coded alternative; addInputPaths() accepts multiple comma-separated paths:
        // FileInputFormat.addInputPath(job, new Path(INPUT_PATH));
        // FileOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
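
Because the reducer sums IntWritable values, it could also serve as a combiner, pre-aggregating map output before the shuffle. This is an optional addition, not part of the original listing; the job counters further down (Combine input records=0) show it was not enabled here. The change is one line in the job setup:

job.setCombinerClass(WordCount2Reducer.class);

With it enabled, the shuffle would carry one record per distinct word per map task (13 here) instead of all 120 map output records.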

[hadoop@h201 mapreduce]$ /usr/jdk1.7.0_25/bin/javac WordCount2.java
Note: WordCount2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
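
The deprecation note refers to the Job constructor. On Hadoop 2.x the non-deprecated form is the static factory:

Job job = Job.getInstance(conf, "wordcount");

The old constructor still works, which is why this is only a warning and compilation succeeds.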
[hadoop@h201 mapreduce]$ ls
counttext.txt  WordCount2.class  WordCount2.java  WordCount2$WordCount2Mapper.class  WordCount2$WordCount2Reducer.class
[hadoop@h201 mapreduce]$ /usr/jdk1.7.0_25/bin/jar cvf wc1.jar WordCount2*class
added manifest
adding: WordCount2.class(in = 1531) (out= 815)(deflated 46%)
adding: WordCount2$WordCount2Mapper.class(in = 1831) (out= 783)(deflated 57%)
adding: WordCount2$WordCount2Reducer.class(in = 1623) (out= 670)(deflated 58%)
[hadoop@h201 mapreduce]$ ls
counttext.txt  wc1.jar  WordCount2.class  WordCount2.java  WordCount2$WordCount2Mapper.class  WordCount2$WordCount2Reducer.class
[hadoop@h201 mapreduce]$ hadoop jar wc1.jar WordCount2 hdfs://h201:9000/user/hadoop/counttext.txt hdfs://h201:9000/user/hadoop/output
18/03/09 23:33:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/09 23:33:39 INFO client.RMProxy: Connecting to ResourceManager at h201/192.168.121.132:8032
18/03/09 23:33:55 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
18/03/09 23:34:05 INFO input.FileInputFormat: Total input paths to process : 1
18/03/09 23:34:06 INFO mapreduce.JobSubmitter: number of splits:1
18/03/09 23:34:06 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/03/09 23:34:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1516635595760_0001
18/03/09 23:34:21 INFO impl.YarnClientImpl: Submitted application application_1516635595760_0001
18/03/09 23:34:21 INFO mapreduce.Job: The url to track the job: http://h201:8088/proxy/application_1516635595760_0001/
18/03/09 23:34:21 INFO mapreduce.Job: Running job: job_1516635595760_0001
18/03/09 23:35:32 INFO mapreduce.Job: Job job_1516635595760_0001 running in uber mode : false
18/03/09 23:35:32 INFO mapreduce.Job:  map 0% reduce 0%
18/03/09 23:36:33 INFO mapreduce.Job:  map 100% reduce 0%
18/03/09 23:36:45 INFO mapreduce.Job:  map 100% reduce 100%
18/03/09 23:36:47 INFO mapreduce.Job: Job job_1516635595760_0001 completed successfully
18/03/09 23:36:47 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=1366
                FILE: Number of bytes written=221143
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=747
                HDFS: Number of bytes written=101
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=55286
                Total time spent by all reduces in occupied slots (ms)=8704
                Total time spent by all map tasks (ms)=55286
                Total time spent by all reduce tasks (ms)=8704
                Total vcore-seconds taken by all map tasks=55286
                Total vcore-seconds taken by all reduce tasks=8704
                Total megabyte-seconds taken by all map tasks=56612864
                Total megabyte-seconds taken by all reduce tasks=8912896
        Map-Reduce Framework
                Map input records=50
                Map output records=120
                Map output bytes=1120
                Map output materialized bytes=1366
                Input split bytes=107
                Combine input records=0
                Combine output records=0
                Reduce input groups=13
                Reduce shuffle bytes=1366
                Reduce input records=120
                Reduce output records=13
                Spilled Records=240
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=1264
                CPU time spent (ms)=4210
                Physical memory (bytes) snapshot=223772672
                Virtual memory (bytes) snapshot=2148155392
                Total committed heap usage (bytes)=136712192
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=640
        File Output Format Counters
                Bytes Written=101
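
The JobResourceUploader warning in the log above suggests implementing the Tool interface. A minimal sketch of what that change could look like (class and job names match the listing above; the sketch itself is an assumption, not the original author's code; remaining imports and job setup are unchanged):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount2 extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration that ToolRunner has already
        // populated from generic command-line options.
        Job job = Job.getInstance(getConf(), "wordcount");
        // ... same mapper/reducer/path setup as in the listing above ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new WordCount2(), args));
    }
}

ToolRunner parses generic options such as -D key=value before they reach run(), which is exactly the command-line parsing the warning says is being skipped.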
[hadoop@h201 mapreduce]$ hadoop fs -lsr /user/hadoop/output
lsr: DEPRECATED: Please use 'ls -R' instead.
18/03/09 23:37:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   2 hadoop supergroup          0 2018-03-09 23:36 /user/hadoop/output/_SUCCESS
-rw-r--r--   2 hadoop supergroup        101 2018-03-09 23:36 /user/hadoop/output/part-r-00000
[hadoop@h201 mapreduce]$ hadoop fs -cat /user/hadoop/output/part-r-00000
18/03/09 23:39:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ayi     10
baba    10
cai     5
didi    10
gege    10
hello   15
jiejie  15
mama    20
meimei  5
shushu  5
wei     5
wen     5
word    5
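
These counts can be sanity-checked against the input: mama, for example, appears four times in each of the five repeated ten-line blocks, giving 20, as reported. A quick local cross-check with standard shell tools, run in the directory holding counttext.txt:

tr ' ' '\n' < counttext.txt | sort | uniq -c

The per-word totals should match the part-r-00000 output above.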

Reposted from: https://www.cnblogs.com/jieran/p/8537012.html
