Hadoop MapReduce in Java: writing counting results to the job log via Counters

package vitamin.user_static_table;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Mapper;

public class GetUserStaticMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    private LongWritable out = new LongWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        String[] tks = line.split("\t");
        if (tks.length < 5) {
            return;
        }
        // total number of valid records
        Counter number1 = context.getCounter("NULL", "all");
        number1.increment(1L);
        context.write(new Text("uid_" + tks[1] + "_" + tks[0]), out);
        // records where both the sex field and the age field are filled in
        if (!(tks[2].equals("_") || tks[3].equals("_"))) {
            Counter number2 = context.getCounter("NULL", "c_" + tks[1]);
            number2.increment(1L);
            if (tks[2].equals("1")) {
                Counter number3 = context.getCounter("NULL", "sex_c_" + tks[1]);
                number3.increment(1L);
            }
            if (tks[3].equals("1")) {
                Counter number4 = context.getCounter("NULL", "age_c_" + tks[1]);
                number4.increment(1L);
            }
        }
        // group-1 records with a non-empty, non-zero tag count;
        // tag_c_1 accumulates the tag counts themselves
        if (tks[1].equals("1") && !tks[4].equals("_") && !tks[4].equals("0")) {
            Counter number5 = context.getCounter("NULL", "tag_c1_sum");
            number5.increment(1L);
            Counter number6 = context.getCounter("NULL", "tag_c_1");
            number6.increment(Integer.parseInt(tks[4]) * 1L);
        }
    }
}
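To make the mapper's counter rules concrete, here is a plain-Python simulation (illustration only, not Hadoop code; the two sample records are made up) that applies the same branching logic and shows which counters each record increments:

```python
from collections import Counter

def count_line(tks, counters):
    """Apply the same counter rules as the Java mapper to one split record."""
    if len(tks) < 5:
        return
    counters["all"] += 1
    # both sex (tks[2]) and age (tks[3]) fields present
    if not (tks[2] == "_" or tks[3] == "_"):
        counters["c_" + tks[1]] += 1
        if tks[2] == "1":
            counters["sex_c_" + tks[1]] += 1
        if tks[3] == "1":
            counters["age_c_" + tks[1]] += 1
    # group-1 record with a non-empty, non-zero tag count
    if tks[1] == "1" and tks[4] not in ("_", "0"):
        counters["tag_c1_sum"] += 1
        counters["tag_c_1"] += int(tks[4])

counters = Counter()
sample = [
    "u1\t1\t1\t1\t3",  # group 1, sex and age flags set, 3 tags
    "u2\t0\t_\t_\t_",  # group 0, all optional fields missing
]
for line in sample:
    count_line(line.strip().split("\t"), counters)

print(dict(counters))
# → {'all': 2, 'c_1': 1, 'sex_c_1': 1, 'age_c_1': 1, 'tag_c1_sum': 1, 'tag_c_1': 3}
```

The first record hits every branch; the second only increments `all`, matching the `tks[2].equals("_")` guard in the Java code.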

The counts go straight into the log file and never pass through a reducer. The output looks like this (log lines 269-277 are the emitted counter values):

251         Total time spent by all maps in occupied slots (ms)=7766444
252         Total time spent by all reduces in occupied slots (ms)=0
253         Total time spent by all map tasks (ms)=3883222
254         Total vcore-seconds taken by all map tasks=3883222
255         Total megabyte-seconds taken by all map tasks=5964628992
258         Map output records=21091153
261         Failed Shuffles=0
262         Merged Map outputs=0
263         GC time elapsed (ms)=15595
264         CPU time spent (ms)=1460320
265         Physical memory (bytes) snapshot=168560390144
266         Virtual memory (bytes) snapshot=951342964736
267         Total committed heap usage (bytes)=468841398272
268     NULL
269         age_c_0=108143
270         age_c_1=7596379
271         all=21091153
272         c_0=1258386
273         c_1=19832601
274         sex_c_0=1138055
275         sex_c_1=18447951
276         tag_c1_sum=19175427
277         tag_c_1=6952428947
278     File Input Format Counters
279         Bytes Read=244379652
280     File Output Format Counters
281         Bytes Written=205508096
282 Job2 done...
283 All Jobs Finished !
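Since the counters only ever appear in the job log, they can be scraped out of it afterwards. A minimal sketch (assuming the pasted log format above, including the leading log-line numbers) that collects the `name=value` pairs of one counter group:

```python
import re

def scrape_counters(log_text, group):
    """Collect name=value counter lines under the given group header."""
    counters = {}
    in_group = False
    for raw in log_text.splitlines():
        # drop an optional leading log-line number such as "269 "
        line = re.sub(r"^\s*\d+\s+", "", raw).strip()
        if line == group:
            in_group = True
            continue
        if in_group:
            if "=" in line:
                name, value = line.split("=", 1)
                counters[name.strip()] = int(value)
            else:
                break  # reached the next counter group
    return counters

log = """268     NULL
269         age_c_0=108143
270         age_c_1=7596379
271         all=21091153
278     File Input Format Counters"""

print(scrape_counters(log, "NULL"))
# → {'age_c_0': 108143, 'age_c_1': 7596379, 'all': 21091153}
```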

Counting via the error output in Hadoop Streaming (Python)

import sys

if __name__ == "__main__":
    for line in sys.stdin:
        line = line.strip()
        if 'uid_0' in line:
            # report to the framework's counter through stderr
            sys.stderr.write("reporter:counter:group,keep_0,1\n")
            # print 'keep_0' + '\t' + '1'
        elif 'uid_1' in line:
            # print 'keep_1' + '\t' + '1'
            sys.stderr.write("reporter:counter:group,keep_1,1\n")
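The mapper can be checked locally by aggregating its stderr lines the way the streaming framework does. A sketch (assuming the `reporter:counter:<group>,<counter>,<amount>` protocol; the sample lines are made up):

```python
def aggregate_reporter_lines(stderr_lines):
    """Sum reporter:counter amounts per (group, counter), as the framework does."""
    prefix = "reporter:counter:"
    totals = {}
    for line in stderr_lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue  # ordinary stderr output, not a counter update
        group, name, amount = line[len(prefix):].split(",")
        key = (group, name)
        totals[key] = totals.get(key, 0) + int(amount)
    return totals

lines = [
    "reporter:counter:group,keep_0,1",
    "reporter:counter:group,keep_1,1",
    "reporter:counter:group,keep_1,1",
]
print(aggregate_reporter_lines(lines))
# → {('group', 'keep_0'): 1, ('group', 'keep_1'): 2}
```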

The Hadoop job in this example needs no reducer; the output is as follows (keep_0 and keep_1 are the counters):

425 17/08/09 09:05:25 INFO mapreduce.Job:  map 100% reduce 100%
426 17/08/09 09:05:38 INFO mapreduce.Job: Job job_1493284708000_147282 completed successfully
427 17/08/09 09:05:38 INFO mapreduce.Job: Counters: 53
428     File System Counters
429         FILE: Number of bytes read=768
430         FILE: Number of bytes written=28933956
431         FILE: Number of read operations=0
432         FILE: Number of large read operations=0
433         FILE: Number of write operations=0
434         HDFS: Number of bytes read=374470086
435         HDFS: Number of bytes written=0
436         HDFS: Number of read operations=684
437         HDFS: Number of large read operations=0
438         HDFS: Number of write operations=256
439     Job Counters
440         Killed map tasks=2
441         Launched map tasks=102
442         Launched reduce tasks=128
443         Data-local map tasks=14
444         Rack-local map tasks=88
445         Total time spent by all maps in occupied slots (ms)=3509175
446         Total time spent by all reduces in occupied slots (ms)=9734817
447         Total time spent by all map tasks (ms)=701835
448         Total time spent by all reduce tasks (ms)=3244939
449         Total vcore-seconds taken by all map tasks=701835
450         Total vcore-seconds taken by all reduce tasks=3244939
451         Total megabyte-seconds taken by all map tasks=3593395200
452         Total megabyte-seconds taken by all reduce tasks=9968452608
453     Map-Reduce Framework
454         Map input records=13186141
455         Map output records=0
456         Map output bytes=0
457         Map output materialized bytes=76800
458         Input split bytes=14800
459         Combine input records=0
460         Combine output records=0
461         Reduce input groups=0
462         Reduce shuffle bytes=76800
463         Reduce input records=0
466         Shuffled Maps =12800
467         Failed Shuffles=0
470         CPU time spent (ms)=546840
471         Physical memory (bytes) snapshot=137153257472
472         Virtual memory (bytes) snapshot=656820338688
473         Total committed heap usage (bytes)=335173124096
474     Shuffle Errors
475         BAD_ID=0
476         CONNECTION=0
477         IO_ERROR=0
478         WRONG_LENGTH=0
479         WRONG_MAP=0
480         WRONG_REDUCE=0
481     group
482         keep_0=149982
483         keep_1=13036159
484     File Input Format Counters
485         Bytes Read=374455286
486     File Output Format Counters
487         Bytes Written=0
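For reference, a map-only streaming job like the one above is typically submitted with the reducer count forced to zero. A hedged sketch of the command (the jar path, input/output paths, and mapper file name are placeholders that vary by cluster and distribution):

```
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -D mapred.reduce.tasks=0 \
    -input /path/to/input \
    -output /path/to/output \
    -mapper "python mapper.py" \
    -file mapper.py
```

With zero reduce tasks the mapper's stdout goes straight to HDFS, while the `reporter:counter:` lines on stderr are picked up by the framework and appear in the job log as shown above.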
