MapReduce week2 test: computing balances by partition (exercise)

static {
    System.setProperty("hadoop.home.dir", "E:/x3/hadoop-2.9.2");
}

// map
public static class MyMapper extends Mapper<LongWritable, Text, Text, Week2Data> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each input line is tab-separated; the first four characters of the first field are the year.
        String lineValue = value.toString();
        String[] split = lineValue.split("\t");
        String year = split[0].substring(0, 4);
        // Balance = second field minus third field.
        int diff = Integer.parseInt(split[1]) - Integer.parseInt(split[2]);
        context.write(new Text(year), new Week2Data(split[0], diff));
    }
}

// Custom partitioner
public static class MyPartitioner extends Partitioner<Text, Week2Data> {
    @Override
    public int getPartition(Text key, Week2Data value, int numPartitions) {
        // Mask the sign bit so a negative hash cannot yield a negative partition index.
        return (key.hashCode() & Integer.MAX_VALUE) % 127 % numPartitions;
    }
}

// Custom grouping comparator
public static class MyGroupComparble extends WritableComparator {
    protected MyGroupComparble() {
        super(Week2Data.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        Week2Data data1 = (Week2Data) a;
        Week2Data data2 = (Week2Data) b;
        return data1.getYear().compareTo(data2.getYear());
    }
}

// reduce
public static class MyReduce extends Reducer<Text, Week2Data, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Week2Data> values, Context context) throws IOException, InterruptedException {
        // Emit one output line per record, with the value left empty.
        for (Week2Data value : values) {
            context.write(new Text(value.getYear() + "   " + value.getBalance()), new Text(""));
        }
    }
}

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
    // job
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "week2-test2");
    // Input path
    FileInputFormat.addInputPath(job, new Path(args[0]));
    // map phase
    job.setMapperClass(MyMapper.class);
    job.setMapOutputValueClass(Week2Data.class);
    job.setMapOutputKeyClass(Text.class);
    // shuffle phase
    // Use the custom partitioner
    job.setPartitionerClass(MyPartitioner.class);
    // Use the custom grouping comparator (disabled: it casts keys to Week2Data, but the map output key is Text)
    // job.setGroupingComparatorClass(MyGroupComparble.class);
    // reduce phase
    // Set the number of reduce tasks (one per partition)
    job.setNumReduceTasks(2);
    job.setReducerClass(MyReduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // Delete the output path if it already exists
    FileSystem fs = FileSystem.get(conf);
    if (fs.exists(new Path(args[1]))) {
        fs.delete(new Path(args[1]), true);
    }
    // Output path
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Submit the job and wait for it to finish
    boolean result = job.waitForCompletion(true);
    System.out.println(result);
}
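
To check what the mapper and the partitioner compute without running a Hadoop job, the same arithmetic can be reproduced in plain Java. This is only a sketch: the class name BalancePartitionSketch, the sample input line, and the use of String.hashCode() as a stand-in for Text.hashCode() are assumptions that do not come from the exercise. The sketch masks the sign bit before taking the remainder, because a negative hash would otherwise give a negative partition index.

```java
class BalancePartitionSketch {

    // Year = first four characters of the first tab-separated field.
    static String yearOf(String line) {
        return line.split("\t")[0].substring(0, 4);
    }

    // Balance = second field minus third field, both parsed as integers.
    static int balanceOf(String line) {
        String[] fields = line.split("\t");
        return Integer.parseInt(fields[1]) - Integer.parseInt(fields[2]);
    }

    // Maps any int hash to an index in [0, numPartitions).
    // Clearing the sign bit first keeps the remainder non-negative.
    static int partitionFor(int hash, int numPartitions) {
        return (hash & Integer.MAX_VALUE) % 127 % numPartitions;
    }

    public static void main(String[] args) {
        // Hypothetical input line; the real data set is not shown in the exercise.
        String line = "2019-03-15\t500\t120";
        String year = yearOf(line);
        int balance = balanceOf(line);
        // String.hashCode() is used here only as a stand-in for Text.hashCode().
        int partition = partitionFor(year.hashCode(), 2);
        System.out.println(year + " balance=" + balance + " partition=" + partition);
    }
}
```

With two reduce tasks, every year lands in partition 0 or 1, and all records for the same year go to the same reducer because the index depends only on the key.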

Entity class

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class Week2Data implements WritableComparable<Week2Data> {

    private String year;
    private Integer balance;

    public Week2Data(String year, Integer balance) {
        this.year = year;
        this.balance = balance;
    }

    // Hadoop needs a no-argument constructor to deserialize the object.
    public Week2Data() {
    }

    public String getYear() {
        return year;
    }

    public void setYear(String year) {
        this.year = year;
    }

    public Integer getBalance() {
        return balance;
    }

    public void setBalance(Integer balance) {
        this.balance = balance;
    }

    // Sort by balance in descending order.
    @Override
    public int compareTo(Week2Data o) {
        return o.balance.compareTo(this.getBalance());
    }

    // Fields must be written and read in the same order.
    @Override
    public void write(DataOutput output) throws IOException {
        output.writeUTF(year);
        output.writeInt(balance);
    }

    @Override
    public void readFields(DataInput input) throws IOException {
        this.year = input.readUTF();
        this.balance = input.readInt();
    }

}
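
The write/readFields pair and the descending compareTo can be tried out with only the JDK, since DataOutputStream and DataInputStream implement the same DataOutput and DataInput interfaces. The following is a sketch under that assumption, not part of the exercise: the class BalanceRecordSketch and the sample values are hypothetical, and it does not implement Hadoop's Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BalanceRecordSketch implements Comparable<BalanceRecordSketch> {
    final String year;
    final int balance;

    BalanceRecordSketch(String year, int balance) {
        this.year = year;
        this.balance = balance;
    }

    // Serializes the fields in a fixed order: UTF string first, then the int.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(year);
            out.writeInt(balance);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static BalanceRecordSketch fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String year = in.readUTF();
            int balance = in.readInt();
            return new BalanceRecordSketch(year, balance);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Descending by balance: the larger balance sorts first.
    @Override
    public int compareTo(BalanceRecordSketch other) {
        return Integer.compare(other.balance, this.balance);
    }

    public static void main(String[] args) {
        // Hypothetical sample values.
        BalanceRecordSketch copy = fromBytes(new BalanceRecordSketch("2019", 380).toBytes());
        System.out.println(copy.year + " " + copy.balance);

        List<BalanceRecordSketch> records = new ArrayList<>();
        records.add(new BalanceRecordSketch("2018", 100));
        records.add(new BalanceRecordSketch("2019", 380));
        records.add(new BalanceRecordSketch("2020", -40));
        Collections.sort(records);
        for (BalanceRecordSketch r : records) {
            System.out.println(r.year + " " + r.balance);
        }
    }
}
```

Swapping the read order in fromBytes would corrupt the record, which is why readFields has to mirror write field by field.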
