Distributed and Parallel Computing: Log Mining (Java)

Log Mining: Data Processing and Billing Statistics

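1. Read the log in the attachment and count the vehicle trips for the parking lot that corresponds to your own student ID. A trip is one matched in/out pair: one in record plus one out record. In this log every in record has exactly one matching out record, so no entry or exit record is missing.

2. Compute the cumulative parked time, in seconds, for your parking lot. For each plate, exit time minus entry time is the length of that stay in seconds; sum this over all plates in your lot.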
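The two tasks above can be sketched as a single pass over the log. This is a minimal illustration, not the exercise's reference solution. It assumes each line has the form timestamp,lotId,plate,... in chronological order, which is the layout the code in this post appears to rely on. The class name TripStats, the sample plates, and the second lot ID are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TripStats {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Returns {trips, totalSeconds} for one lot. Assumes lines are in time order. */
    static long[] summarize(List<String> lines, String lotId) {
        Map<String, LocalDateTime> parked = new HashMap<>(); // plate -> entry time
        long trips = 0, totalSeconds = 0;
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length < 3 || !f[1].equals(lotId)) continue; // other lots are skipped
            LocalDateTime t = LocalDateTime.parse(f[0], FMT);
            LocalDateTime entry = parked.remove(f[2]);
            if (entry == null) {
                parked.put(f[2], t);          // first record for the plate: entry
            } else {
                trips++;                      // second record: exit, one trip completed
                totalSeconds += Duration.between(entry, t).getSeconds();
            }
        }
        return new long[] {trips, totalSeconds};
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(        // made-up sample records
                "2020-01-01 08:00:00,2018250,A123,in",
                "2020-01-01 08:10:00,9999999,B456,in",
                "2020-01-01 09:30:00,2018250,A123,out");
        long[] r = summarize(sample, "2018250");
        System.out.println(r[0] + " trips, " + r[1] + " seconds"); // 1 trips, 5400 seconds
    }
}
```

Because in and out records are guaranteed to alternate per plate, the sketch does not need to read an explicit in/out field; the first record seen for a plate is treated as the entry and the next as the exit.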

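Building on the log mining exercise, charge a fee for each vehicle's stay.

The billing rules are:

1. A vehicle that leaves within 30 minutes of entering is not charged.

2. A stay of up to 2 hours (inclusive) costs 10 yuan; from the third hour on, each hour costs 2 yuan.

3. There is no cap on the fee.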
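The rules above can be written as one small function. This is a sketch under two assumptions that the rules do not spell out: a stay of exactly 30 minutes is still free, and a started hour beyond the second is charged as a full hour. Both match what the code in this post does. The class and method names are hypothetical.

```java
class ParkingFee {
    /** Fee in yuan for a stay of the given length in seconds. */
    static int fee(long seconds) {
        if (seconds <= 30 * 60) return 0;      // up to 30 minutes: free (assumed inclusive)
        if (seconds <= 2 * 3600) return 10;    // up to 2 hours inclusive: flat 10 yuan
        long hours = (seconds + 3599) / 3600;  // started hours, rounded up (assumption)
        return (int) (10 + (hours - 2) * 2);   // 2 yuan per hour from the third hour, no cap
    }

    public static void main(String[] args) {
        System.out.println(fee(20 * 60));   // 0
        System.out.println(fee(90 * 60));   // 10
        System.out.println(fee(150 * 60));  // 12
    }
}
```

Working in whole seconds with integer arithmetic avoids the floating-point rounding that a division followed by Math.ceil can introduce at the hour boundaries.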

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;

public class CountPark {
    // Revenue in yuan per month, keyed by "yyyy-MM"
    static Map<String, Integer> profit = new HashMap<>();
    // Plate -> entry timestamp of a car that is currently parked
    static Map<String, String> map = new HashMap<>();
    // cnt: completed trips (in/out pairs); time: cumulative parked seconds
    static long cnt = 0, time = 0;
    static SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    static SimpleDateFormat sdf2 = new SimpleDateFormat("yyyy-MM");

    public static void readTxt(String fileName) {
        System.out.println(fileName);
        // Read and parse failures are caught and printed (rethrowing would also work);
        // try-with-resources closes the reader.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), "gbk"))) {
            String line;
            while ((line = br.readLine()) != null) {
                // As used below: temp[0] = timestamp, temp[1] = parking lot ID, temp[2] = plate
                String[] temp = line.split(",");
                if (temp[1].equals("2018250")) {
                    if (map.containsKey(temp[2])) {
                        // Second record for this plate: the car is leaving
                        long cur = sdf.parse(temp[0]).getTime();
                        long pre = sdf.parse(map.get(temp[2])).getTime();
                        String month = sdf2.format(sdf.parse(temp[0]));
                        double curTime = ((double) (cur - pre)) / 1000; // seconds parked
                        int cost = 0;
                        if (curTime > 1800) { // longer than 30 minutes
                            cost += 10; // flat 10 yuan covers the first 2 hours
                            double hour = Math.ceil(curTime / (60 * 60)); // hours, rounded up
                            if (hour > 2) // beyond 2 hours: 2 yuan per extra hour
                                cost += (hour - 2) * 2;
                        }
                        profit.put(month, profit.getOrDefault(month, 0) + cost);
                        time += (long) curTime; // accumulate parked seconds
                        cnt++; // one in/out pair is one trip
                        map.remove(temp[2]);
                    } else {
                        // First record for this plate: remember the entry time
                        map.put(temp[2], temp[0]);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        long c = System.currentTimeMillis();
        readTxt("cars2.txt");
        System.out.println("Run time (seconds): " + (double) (System.currentTimeMillis() - c) / 1000);
        System.out.println("Trips: " + cnt + ", cumulative parked seconds: " + time);
        for (String s : profit.keySet())
            System.out.println(s + ":" + profit.get(s));
    }
}

Log Mining: Parallel Refactoring

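1. Parallelize file reading (data partitioning)

How: split the original large log file by time (for example, by date) into several smaller files, then read each day's data separately.

2. Parallelize processing (task partitioning)

How: run the scans and statistics for each date independently; records that span dates are kept in a shared area and paired afterwards.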
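The two steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not the implementation used in this post: it assumes lines of the form timestamp,lotId,plate,in|out (the fourth field is an assumption), already filtered to one lot and in chronological order. Each day is a separate task, and records whose partner lies in another day go into a shared map that is paired after all tasks finish. All names here are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

class DailyParallelSketch {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static long epochSeconds(String ts) {
        return LocalDateTime.parse(ts, FMT).toEpochSecond(ZoneOffset.UTC);
    }

    /** Total parked seconds over all plates. */
    static long totalSeconds(List<String> lines, int threads) {
        // Data partitioning: one group per calendar day (first 10 characters).
        Map<String, List<String>> byDay = lines.stream()
                .collect(Collectors.groupingBy(l -> l.substring(0, 10)));
        // Shared area for records whose partner is in another day.
        ConcurrentHashMap<String, Queue<Long>> carry = new ConcurrentHashMap<>();
        LongAdder total = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<String> day : byDay.values())
                futures.add(pool.submit(() -> processDay(day, carry, total)));
            for (Future<?> f : futures) {
                try {
                    f.get(); // wait for the task and surface its failure, if any
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
        } finally {
            pool.shutdown();
        }
        // Cross-day records: sorted per plate they alternate in, out, in, out, ...
        for (Queue<Long> q : carry.values()) {
            List<Long> ts = new ArrayList<>(q);
            Collections.sort(ts);
            for (int i = 0; i + 1 < ts.size(); i += 2) total.add(ts.get(i + 1) - ts.get(i));
        }
        return total.sum();
    }

    // Task partitioning: pair in/out inside one day, defer everything else.
    static void processDay(List<String> day, ConcurrentHashMap<String, Queue<Long>> carry,
                           LongAdder total) {
        Map<String, Long> parked = new HashMap<>(); // plate -> entry time within this day
        for (String line : day) {
            String[] f = line.split(",");
            long t = epochSeconds(f[0]);
            if (f[3].equals("in")) {
                parked.put(f[2], t);
            } else {
                Long in = parked.remove(f[2]);
                if (in != null) total.add(t - in);
                else carry.computeIfAbsent(f[2], k -> new ConcurrentLinkedQueue<>()).add(t);
            }
        }
        // Cars still parked at the end of the day leave in a later partition.
        parked.forEach((plate, t) ->
                carry.computeIfAbsent(plate, k -> new ConcurrentLinkedQueue<>()).add(t));
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(          // made-up sample records
                "2020-01-01 08:00:00,2018250,A123,in",
                "2020-01-01 09:00:00,2018250,A123,out",
                "2020-01-01 23:00:00,2018250,B456,in",
                "2020-01-02 01:00:00,2018250,B456,out",
                "2020-01-02 10:00:00,2018250,B456,in",
                "2020-01-02 10:30:00,2018250,B456,out");
        System.out.println(totalSeconds(sample, 4)); // 12600
    }
}
```

Only the shared map and the running total are touched by several threads, so they use ConcurrentHashMap.computeIfAbsent and LongAdder; the per-day map stays thread-local and needs no locking.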

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.*;
import java.util.concurrent.*;

public class CountPark2 {
    // Month ("yyyy-MM") -> raw log lines of that month for this parking lot
    static Map<String, ArrayList<String>> profit = new HashMap<>();
    static SimpleDateFormat formatter = new SimpleDateFormat("HH:mm:ss.SSS");
    static SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    static SimpleDateFormat sdf2 = new SimpleDateFormat("yyyy-MM");

    // Data split, step 1: read the large log and group this lot's lines by month.
    public static void readTxt(String fileName) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), "gbk"))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] temp = line.split(",");
                if (temp[1].equals("2018250")) {
                    String month = sdf2.format(sdf.parse(temp[0]));
                    if (!profit.containsKey(month)) profit.put(month, new ArrayList<>());
                    profit.get(month).add(line);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Data split, step 2: write one month's lines to its own file (relative path).
    // Written as UTF-8 because ReadFileThread reads the split files as UTF-8.
    public static void writeTxt(String content, String filename) {
        File writename = new File("src/logminning/reso/" + filename);
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(writename), StandardCharsets.UTF_8))) {
            out.write(content); // lines are already terminated with \r\n
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Run once before main to produce the monthly files (2020-01.txt, ...).
    // The ".txt" suffix matches the file names that main reads.
    public static void handle() {
        readTxt("cars2.txt");
        for (String c : profit.keySet()) {
            StringBuilder stringBuilder = new StringBuilder();
            for (String t : profit.get(c)) stringBuilder.append(t).append("\r\n");
            writeTxt(stringBuilder.toString(), c + ".txt");
        }
    }

    public static void main(String[] args) {
        // Plate -> timestamps (ms) of all its records, kept in ascending order
        ConcurrentHashMap<String, PriorityBlockingQueue<Long>> concurrentHashMap = new ConcurrentHashMap<>();
        ExecutorService executorService = Executors.newFixedThreadPool(8);

        // Stage 1: read the six monthly files in parallel.
        CountDownLatch countDownLatch = new CountDownLatch(6);
        System.out.println("Start ReadFile:" + formatter.format(new Date()));
        for (int i = 1; i <= 6; i++) {
            executorService.execute(new ReadFileThread(concurrentHashMap,
                    "src/logminning/reso/2020-0" + i + ".txt", countDownLatch));
        }
        try {
            countDownLatch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("End ReadFile:" + formatter.format(new Date()));

        /* Debug check: list plates that do not have exactly two records.
        int used = 0;
        for (String c : concurrentHashMap.keySet()) {
            if (concurrentHashMap.get(c).size() == 2) continue;
            System.out.println(c + ':' + concurrentHashMap.get(c).size());
            used += concurrentHashMap.get(c).size();
        }
        System.out.println(used);
        */

        // Stage 2: six handlers, each covering a five-letter range of plates
        // (i = 0, 5, ..., 25; the last range holds only 'Z').
        Res res = new Res();
        countDownLatch = new CountDownLatch(6);
        for (int i = 0; i < 26; i += 5) {
            executorService.execute(new handler(concurrentHashMap, countDownLatch, i, res));
        }
        try {
            countDownLatch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("End Process:" + formatter.format(new Date()));
        // Cumulative parked seconds : number of trips
        System.out.println(res.getTar() + ":" + res.getUsed());
        executorService.shutdown();
    }
}

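import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;

public class handler implements Runnable {
    CountDownLatch countDownLatch;
    ConcurrentHashMap<String, PriorityBlockingQueue<Long>> concurrentHashMap;
    int s; // start of this handler's letter range, as an offset from 'A'
    Res res;

    public handler(ConcurrentHashMap<String, PriorityBlockingQueue<Long>> concurrentHashMap,
                   CountDownLatch countDownLatch, int s, Res res) {
        this.s = s;
        this.res = res;
        this.countDownLatch = countDownLatch;
        this.concurrentHashMap = concurrentHashMap;
    }

    @Override
    public void run() {
        try {
            for (String c : concurrentHashMap.keySet()) {
                // Task split: this handler owns the plates whose second character
                // lies in the five-letter range starting at 'A' + s.
                int cur = c.charAt(1) - 'A';
                if (cur >= s && cur < s + 5) {
                    PriorityBlockingQueue<Long> priorityQueue = concurrentHashMap.get(c);
                    // Timestamps come out in ascending order, so two consecutive values
                    // are one in/out pair. Requiring two elements keeps take() from
                    // blocking forever on an unpaired record.
                    while (priorityQueue.size() >= 2) {
                        long c1 = priorityQueue.take(), c2 = priorityQueue.take();
                        res.add((c2 - c1) / 1000); // parked seconds for this trip
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            countDownLatch.countDown(); // always signal completion
        }
    }
}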

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.text.SimpleDateFormat;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;

public class ReadFileThread implements Runnable {
    private ConcurrentHashMap<String, PriorityBlockingQueue<Long>> concurrentHashMap;
    private String fileName;
    CountDownLatch countDownLatch;
    // SimpleDateFormat is not thread-safe, so each reader keeps its own instance.
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    public ReadFileThread(ConcurrentHashMap<String, PriorityBlockingQueue<Long>> concurrentHashMap,
                          String filename, CountDownLatch countDownLatch) {
        this.concurrentHashMap = concurrentHashMap;
        this.fileName = filename;
        this.countDownLatch = countDownLatch;
    }

    @Override
    public void run() {
        // Read and parse failures are caught and printed (rethrowing would also work).
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), "UTF-8"))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] temp = line.split(",");
                long cur = sdf.parse(temp[0]).getTime();
                // computeIfAbsent creates the queue atomically. The earlier
                // containsKey/put sequence let two threads replace each other's queue,
                // and synchronized (this) did not help because each thread locks
                // its own ReadFileThread instance.
                concurrentHashMap.computeIfAbsent(temp[2], k -> new PriorityBlockingQueue<>()).add(cur);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            countDownLatch.countDown(); // always signal completion
        }
    }
}

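public class Res {
    private long tar = 0; // cumulative parked seconds
    private int used = 0; // number of trips added

    public long getTar() {
        return tar;
    }

    public int getUsed() {
        return used;
    }

    // Called from several handler threads, hence synchronized.
    public synchronized void add(long c) {
        tar += c;
        used++;
    }
}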
